Has GPT-3 freaked you out? A monster 1.75-trillion-parameter language AI called Wu Dao 2.0 has just launched in China. It’s 10x larger than GPT-3.

These are exciting times for AI. OpenAI stunned the world with GPT-3 a year ago. Two weeks ago, Google unveiled LaMDA and MUM, two AIs that will change chatbots and search engines, respectively. And just a few days ago, on June 1, Wu Dao 2.0 was presented at the Beijing Academy of Artificial Intelligence (BAAI) conference.

Wu Dao 2.0 is now the largest neural network ever developed and arguably the most powerful. Although its full potential and limitations have not yet been revealed, expectations are high, and rightfully so.

I’ll go over the details of Wu Dao 2.0 that are now known, including what it is, what it can accomplish, and what its developers have promised for the future. Enjoy!

Wu Dao 2.0: Key Differences from GPT-3

Wu Dao, which translates to “Enlightenment,” is another GPT-like language model. Jack Clark, policy director at OpenAI, calls this trend of copying GPT-3 “model diffusion.” Of all the copies, however, Wu Dao 2.0 is the largest, with a striking 1.75 trillion parameters (10x GPT-3).

According to Coco Feng’s reporting for the South China Morning Post, Wu Dao 2.0 was trained on 4.9TB of high-quality text and image data, which dwarfs GPT-3’s 570GB training dataset. It’s worth noting, however, that OpenAI researchers curated those 570GB down from 45TB of raw data.

The training data is divided into:

1.2TB of Chinese text data in Wu Dao Corpora.
2.5TB of Chinese graphic data.
1.2TB of English text data in the Pile dataset.
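
As a quick sanity check, the figures quoted above can be put side by side. The sketch below only redoes the article's arithmetic. GPT-3's 175 billion parameters is its published size (the article only implies it through "10x"), and treating 1TB as 1000GB is an assumption made here.

```python
# Figures quoted in the article. Assumption: decimal units (1 TB = 1000 GB).
WU_DAO_PARAMS = 1.75e12
GPT3_PARAMS = 175e9  # GPT-3's published size; the article only implies it via "10x"

wu_dao_data_tb = {
    "Chinese text (Wu Dao Corpora)": 1.2,
    "Chinese graphic data": 2.5,
    "English text (the Pile)": 1.2,
}
GPT3_CLEAN_GB = 570  # GPT-3 training set after cleaning
GPT3_RAW_TB = 45     # raw data that the 570GB was cleaned from

param_ratio = WU_DAO_PARAMS / GPT3_PARAMS
total_tb = sum(wu_dao_data_tb.values())
data_ratio = total_tb * 1000 / GPT3_CLEAN_GB
kept_fraction = GPT3_CLEAN_GB / (GPT3_RAW_TB * 1000)

if __name__ == "__main__":
    print(f"Parameters: {param_ratio:.1f}x GPT-3")
    print(f"Wu Dao 2.0 training data: {total_tb:.1f} TB")
    print(f"Training data: {data_ratio:.1f}x GPT-3's cleaned set")
    print(f"Share of raw data GPT-3 kept: {kept_fraction:.1%}")
```

The three parts add up to the 4.9TB total, which is roughly 8.6 times GPT-3's cleaned dataset but only about a ninth of the 45TB of raw data OpenAI started from.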

Wu Dao 2.0 is multimodal. It can learn from both text and images and handle tasks that involve both types of data, something GPT-3 cannot do. In recent years, we’ve seen a shift away from AI systems that specialize in a single type of information and toward multimodality.

Computer vision and natural language processing, the two major subfields of deep learning, are expected to merge in future systems. The world is multimodal, and people have multiple senses, so it makes sense to design AIs that imitate this capability.

Mixture of Experts
Wu Dao 2.0 was trained with FastMoE, a system similar to Google’s Mixture of Experts (MoE). The idea is to train different models within a larger model, one for each modality. Using a gating system, the larger model can choose which of those models to consult for each type of task.
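
To make the gating idea concrete, here is a minimal pure-Python sketch of a generic top-k mixture-of-experts layer. It is not FastMoE's API or BAAI's implementation. The class name `ToyMoE`, the random weights, and the choice of simple scaling vectors as stand-ins for expert sub-networks are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ToyMoE:
    """Hypothetical toy layer: a gate picks the top-k experts for each input."""

    def __init__(self, dim, num_experts, top_k=2, seed=0):
        rng = random.Random(seed)
        # One gating weight vector per expert (random here, learned in practice).
        self.gate = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(num_experts)]
        # Each "expert" is a per-dimension scaling vector standing in
        # for a full sub-network.
        self.experts = [[rng.uniform(-1, 1) for _ in range(dim)]
                        for _ in range(num_experts)]
        self.top_k = top_k

    def route(self, x):
        """Return (expert_index, weight) pairs for the top-k scoring experts."""
        scores = [dot(w, x) for w in self.gate]
        top = sorted(range(len(scores)), key=lambda i: scores[i],
                     reverse=True)[:self.top_k]
        weights = softmax([scores[i] for i in top])
        return list(zip(top, weights))

    def forward(self, x):
        """Combine the outputs of the selected experts, weighted by the gate."""
        out = [0.0] * len(x)
        for idx, weight in self.route(x):
            expert_out = [e * xi for e, xi in zip(self.experts[idx], x)]
            out = [o + weight * eo for o, eo in zip(out, expert_out)]
        return out


if __name__ == "__main__":
    moe = ToyMoE(dim=4, num_experts=8, top_k=2)
    x = [1.0, 0.5, -0.2, 0.3]
    print("selected experts and weights:", moe.route(x))
    print("layer output:", moe.forward(x))
```

Only the selected experts run for a given input, which is why this kind of layer can hold far more parameters than it uses on any single example.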

FastMoE is more accessible than Google’s MoE because it is open-source and doesn’t require specific hardware. It allowed BAAI researchers to remove the training bottlenecks that had kept models like GPT-3 from reaching the 1-trillion-parameter milestone. On BAAI’s official WeChat blog, they wrote that “[FastMoE] is simple to use, versatile, high-performance, and allows large-scale parallel training.” Future large AI systems will very likely rely on training frameworks like this one.

The amazing skills of Wu Dao 2.0
Kyle Wiggers highlighted Wu Dao 2.0’s multimodal capabilities in a VentureBeat article: it can “perform natural language processing, text generation, image recognition, and image generating activities, as well as caption photographs and create nearly photorealistic artwork, given natural language descriptions.”

Wu Dao 2.0 can “both produce alt text based off of a static image and generate nearly photorealistic visuals based on natural language descriptions,” according to Andrew Tarantola’s article for Engadget. Similar to DeepMind’s AlphaFold, it can also predict the 3D structures of proteins.

According to lead researcher Tang Jie, the system “has [been] close to breaking past the Turing test, and competing with humans.” He also highlighted Wu Dao 2.0’s abilities in “poetry creation, couplets, text summaries, human setting questions and replies, painting.”

Neither GPT-3 nor any other AI model now in existence has anything on Wu Dao 2.0. Its multitasking skills and multimodal makeup make it the most versatile AI to date. These results suggest that multimodal, multitask systems will dominate the future.

Benchmark successes
According to BAAI, Wu Dao 2.0 achieved or exceeded state-of-the-art (SOTA) levels on 9 benchmark tasks that are well-known in the AI community (benchmark: achievement).

ImageNet (zero-shot): SOTA, surpassing OpenAI CLIP.
LAMA (factual and commonsense knowledge): surpassed AutoPrompt.
LAMBADA (cloze tasks): surpassed Microsoft Turing NLG.
SuperGLUE: SOTA, surpassing OpenAI GPT-3.
UC Merced Land Use (zero-shot): SOTA, surpassing OpenAI CLIP.
MS COCO (text-to-image generation): surpassed OpenAI DALL-E.
MS COCO (English graphic retrieval): surpassed OpenAI CLIP and Google ALIGN.
MS COCO (multilingual graphic retrieval): surpassed UC2 (best multilingual and multimodal pre-trained model).
Multi 30K (multilingual graphic retrieval): surpassed UC2.

The results are undeniably impressive: Wu Dao 2.0 performs very well on key benchmarks across tasks and modalities. However, BAAI has not published quantitative comparisons between Wu Dao 2.0 and the previous SOTA models on these benchmarks. Until a paper is published, we’ll have to wait to find out just how good Wu Dao 2.0 really is.

A virtual student
Hua Zhibing, Wu Dao 2.0’s child, is China’s first virtual student. She can learn continuously, write poetry, and draw pictures, and she will eventually learn to code. Unlike GPT-3, Wu Dao 2.0 can learn new tasks over time without forgetting what it has already learned. This trait appears to bring AI a step closer to human memory and learning mechanisms.

Tang Jie even went so far as to say that Hua Zhibing has “some talent in logic and emotional contact.” People’s Daily Online quoted Peng Shuang, a researcher in Tang’s lab, as saying that “the virtual female will have a higher EQ and be able to communicate like a human.”

Many people got carried away by the results when they started playing with GPT-3, describing it as “sentient,” “generally intelligent,” and “capable of understanding,” among other things. So far, there is no evidence to support any of this. Now it is Wu Dao 2.0’s turn to show the world whether it is really capable of the logic and emotional contact its developers describe. For the moment, I’d be cautious and hold off on drawing any conclusions.