Alibaba's new AI models can understand images, hold more complex conversations

by delta
  • Alibaba on Friday launched two new artificial intelligence models, Qwen-VL and Qwen-VL-Chat, that the company says can understand images and carry out more complex conversations.
  • One example Alibaba gave is a hospital sign. Qwen-VL-Chat is able to answer questions about which floor of the building certain hospital departments are on by interpreting an image of the sign.
  • This AI push comes from Alibaba’s cloud division, which is looking to reignite growth as it prepares to go public.

Alibaba on Friday launched two new artificial intelligence models that the company says can understand images and carry out more complex conversations than its previous products, as the global race for leadership in the technology heats up.

The Chinese technology giant said that its two new models, Qwen-VL and Qwen-VL-Chat, will be open source — meaning that researchers, academics and companies worldwide can use them to create their own AI apps without needing to train their own systems, therefore saving time and expense.

Alibaba said that Qwen-VL can respond to open-ended queries related to different images and generate picture captions.

Qwen-VL-Chat, meanwhile, caters to more “complex interaction,” according to Alibaba, such as comparing multiple image inputs and answering several rounds of questions. Tasks that Alibaba says Qwen-VL-Chat can perform include writing stories and creating images based on photos that a user inputs, as well as solving mathematical equations shown in a picture.

One example Alibaba gave is of an input featuring a hospital sign in the Chinese language. The AI can answer questions about the locations of certain hospital departments by interpreting the image of the sign.

So far, much of generative AI — where the technology generates responses based on human inputs — has focused on responding to text. The latest version of OpenAI’s ChatGPT also has the ability to understand images and respond in text, much like Qwen-VL-Chat.

Alibaba’s two latest models are built upon the company’s large language model, Tongyi Qianwen, released earlier this year. An LLM is an AI model that is trained on huge amounts of data and underpins chatbot applications.

The Hangzhou-headquartered company open-sourced two other AI models this month. While the open-source distribution does not earn Alibaba any licensing fees, it will help the company get more users for its AI models at a time when the firm’s cloud division is looking to reignite growth as it prepares to go public.
