Capital News

Alibaba And Microsoft’s AI System Competition Of “Look At The Picture And Talk”, WIMI AI Vision Is Outstanding

April 15, 2020, 11:01

With continued progress in materials science, perceptual artificial intelligence, 5G, cloud and other network technologies, robots in many forms, such as nursing robots, bionic robots, social robots and housekeeper robots, will emerge in household management, education and health services, bringing new lifestyles to people. The super vision enabled by 5G, AR/VR, holographic technology, machine learning and other new technologies will help us break through the limitations of space, image and time, and give us new capabilities.

Recently, AI from China has broken the world record in visual dialogue, the ability to look at a picture and talk about it. In the second global AI Visual Dialogue competition, Alibaba AI defeated 10 teams, including Microsoft and Seoul University, to win the championship.

The visual dialogue competition was initiated by the Georgia Institute of Technology, Facebook Artificial Intelligence Laboratory (FAIR) and other institutions in the United States, in collaboration with CVPR, a leading global academic conference on vision technology. It is currently one of the most authoritative competitions in the field of visual dialogue.

The results show that Alibaba AI won the championship with an accuracy of 74.57%, raising the record set in the previous competition by 16.82% and surpassing Microsoft AI, which scored 64.78%. On the same data set, human accuracy is only 64.27%, so the AI outperformed humans. Traditional visual AI mainly targets the detection and recognition of objects. It is weak at understanding and reasoning about the logical relationships between objects in complex scenes, cannot answer complex questions about how the objects in an image relate to one another, and has difficulty turning image information into language that humans can understand. This means that to achieve visual dialogue, traditional visual AI needs a language model to support “speaking” after it has learned to “look at the picture”. Alibaba AI’s breakthrough lies in its “recursive exploration dialogue model”.

Microsoft has also opened to the public an AI system that can “look at pictures and speak”. Users can upload a picture on the official website www.captionbot.ai and, after a short wait, see the system’s description of it. Its accuracy is not low, but it still has room for improvement. Given the once-popular “black question mark” reaction image as an example, the AI quickly gave a literal answer: “I think it’s basketball player Nick Young showing his teeth and smiling.”

Visual dialogue, represented by “looking at pictures and speaking”, has been a rapidly rising research direction in AI in recent years. Its purpose is to teach machines to discuss visual content with humans in natural language. This gives machines the ability to understand and reason about the real visual world, and means that AI’s cognitive ability will rise to a new level.

Like the barrier-free human-machine dialogue systems depicted in science fiction films such as “2001: A Space Odyssey” and “The Wandering Earth”, AI with intelligent visual dialogue ability is drawing closer as the technology continues to break through.

According to the latest global semiannual AI systems spending guide released by IDC, the Asia Pacific region is expected to spend $5.5 billion on artificial intelligence (AI) systems in 2019, an increase of nearly 80% over 2018. With active investment from all sectors in projects that use AI software, spending is expected to increase to $15.06 billion in 2022. AI applications are growing rapidly in the Asia Pacific region because its digital ecosystem is well suited to the development of this new technology.
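
The arithmetic behind these figures can be checked with a short sketch. It assumes that the “nearly 80%” increase is exactly 80% and that spending grows at a constant annual rate between 2019 and 2022. Both are simplifying assumptions rather than figures from IDC, and the function names are illustrative only.

```python
def implied_prior_year(current: float, growth: float) -> float:
    """Spending in the prior year implied by a year-over-year growth rate."""
    return current / (1.0 + growth)


def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Constant annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the article, in billions of US dollars.
SPEND_2019 = 5.5
SPEND_2022 = 15.06

# Assumption: "nearly 80%" is treated as exactly 80%.
spend_2018 = implied_prior_year(SPEND_2019, 0.80)
# Assumption: growth is spread evenly over the three years 2019 -> 2022.
cagr = compound_annual_growth_rate(SPEND_2019, SPEND_2022, 2022 - 2019)

if __name__ == "__main__":
    print(f"Implied 2018 spending: about ${spend_2018:.2f} billion")
    print(f"Implied annual growth 2019-2022: about {cagr:.1%}")
```

Under these assumptions, 2018 spending comes to roughly $3.06 billion, and the forecast implies growth of roughly 40% a year from 2019 to 2022.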

Human body recognition refers to using computer technology to detect human bodies in video streams or images, describe their attributes and recognize their behavior. It includes human body detection and tracking, human body key point positioning, human image segmentation, human behavior recognition, re-identification (ReID) technology, gesture recognition, etc.

WIMI Hologram Cloud (NASDAQ:WIMI) focuses on computer vision holographic cloud services, covering multiple links of holographic AR technology, including holographic computer vision AI synthesis, holographic visual presentation, holographic interactive software development, holographic AR online and offline advertising, holographic AR SDK payment, 5G holographic communication software development, holographic face recognition development, and holographic AI face change development. It is a comprehensive holographic cloud technology solution provider.

WIMI Hologram Cloud (NASDAQ:WIMI) attaches great importance to research and development of the underlying technology. After years of work in the field of computer vision, WIMI has built up and refined its technology from 0 to 1. Its AI vision technology is now mature enough for deployment, and various technical indicators have reached an industry-leading level. With the technology mature, WIMI Hologram Cloud (NASDAQ:WIMI) is focusing on planning applications from 1 to n, in which robots are a very important part. The company’s products are aimed mainly at business application scenarios in five professional fields: home entertainment, optical theater, performance systems, commercial publishing systems and advertising display systems.

The holographic image processing capability of WIMI Hologram Cloud (NASDAQ:WIMI) is regularly optimized and improved. It includes two core technologies: holographic AI face recognition and holographic AI face change. Thanks to the development of its video processing and recognition technology, the company’s AR advertising and holographic imaging services, which are based on image detection, recognition, template matching, and dynamic image fusion and replacement, are currently in a leading position in the industry.

Holographic face change technology is based on 3D layer replacement. It includes AI-based image recognition and dynamic fusion processing, real-time image tracking, and replacement of a face with other faces. The technology replaces the face in each video frame, synthesizes the video and adds back the original audio. WIMI Hologram Cloud (NASDAQ:WIMI) has verified these technical modules in its holographic AR plug-in advertising application and will continue to develop and upgrade them. The company believes that this technology will bring new business growth to celebrity advertising, film distribution, live video streaming and other applications.

Figure: holographic 3D face recognition technology

https://note.youdao.com/yws/public/resource/5f7da2ccec5799c6e7d4ac4ad1f08e79/xmlnote/16D7F37A4C07451596EEEAEE69108688/14474

At the same time, the robot industry urgently needs AI vision technology to break through technical bottlenecks and solve existing problems of cost, intelligence and so on. With the image semantic technology of WIMI Hologram Cloud (NASDAQ:WIMI), a robot can recognize and segment images. Combined with depth data, this helps the robot build a semantic map and perceive and understand its surroundings in three dimensions. It gives the robot AI interaction ability from recognition to decision-making, enabling functions such as tracking a target pedestrian or delivering meals to the door as a service robot, and greatly improves the robot’s level of intelligence.
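
The idea of combining per-pixel labels with depth data can be sketched as follows. This is a minimal illustration and not WIMI’s implementation. It assumes a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), a depth image in meters, and a label image already produced by some segmentation model. All names and values are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def build_semantic_map(
    depth: Sequence[Sequence[float]],
    labels: Sequence[Sequence[str]],
    fx: float, fy: float, cx: float, cy: float,
) -> Dict[str, List[Point3D]]:
    """Back-project each labeled pixel into 3D camera coordinates.

    depth[v][u] is the distance along the optical axis in meters
    (0 means no reading); labels[v][u] is the class name for that pixel.
    """
    semantic_map: Dict[str, List[Point3D]] = defaultdict(list)
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if z <= 0:  # skip pixels with no valid depth
                continue
            # Pinhole camera model: pixel (u, v) at depth z -> (x, y, z).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            semantic_map[label].append((x, y, z))
    return dict(semantic_map)


def centroid(points: List[Point3D]) -> Point3D:
    """Average position of a labeled object, e.g. a pedestrian to follow."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


# Toy 2x3 frame; values are invented for illustration.
DEPTH = [[2.0, 2.0, 0.0],
         [4.0, 2.0, 4.0]]
LABELS = [["person", "person", "floor"],
          ["floor", "person", "floor"]]

toy_map = build_semantic_map(DEPTH, LABELS, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
person_position = centroid(toy_map["person"])
```

A robot could then steer toward person_position to follow the target, or merge the maps from successive frames into a larger map. Both steps are beyond this sketch.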

Recently, China’s Huawei Technology Co., Ltd. released its global industry outlook report, GIV 2025, which holds that the intelligent world is approaching faster and is within reach. The report forecasts that by 2025 intelligent technology will reach every person, every family and every organization: 58% of the global population will have access to 5G networks, 14% of families will have a “robot housekeeper”, and 97% of large enterprises will use artificial intelligence. Zhang Hongxi, chief marketing officer of Huawei’s global information and communication technology infrastructure business, said: “Human exploration will never stop. We need to fly higher from earth to space, see further from the past to the future, and think deeper from innovation to creation. Today, the changes brought about by the fourth industrial revolution, led by artificial intelligence, 5G and cloud computing, are transforming all walks of life and accelerating the arrival of the intelligent world.”

Media Contact
Company Name: OBNewsOnline Inc
Contact Person: Matt Smith
Phone: (+44) 20 8383 1211
Address: 218-993 Harold Street
City: London W2E 3LT
State: England
Country: United Kingdom
Website: https://observernewsonline.com