A Look into Constructing LLMs in China: Insights from an Alibaba Employee

Tech companies in China are actively gathering resources and talent to close the gap with OpenAI, and researchers on both sides of the Pacific turn out to share similar experiences. A recent post from an Alibaba researcher offers a rare glimpse into the process of building large language models (LLMs) at the e-commerce company, one of many Chinese internet giants working to reach the level of ChatGPT.

On X, Binyuan Hui, a natural language processing researcher on Qwen, Alibaba’s large language model team, shared his daily routine. The post gained attention because it closely resembled the schedule of Jason Wei, a researcher at OpenAI, which had recently gone viral.

A comparison of their daily routines reveals remarkable similarities: both wake up at 9 a.m. and go to bed around 1 a.m. The day begins with meetings, followed by coding, training models, and collaborating with colleagues. Even after returning home, both continue running experiments at night and thinking about ways to improve their models until bedtime.

There are notable differences in how they describe their free time. Hui said he spends his leisure time reading research papers and browsing X to stay updated on current events. And, as one commenter observed, Hui does not have a glass of wine when he returns home, as Wei does.

In China’s current LLM industry, highly skilled graduates of prestigious universities are joining technology companies in large numbers to develop competitive AI models, and a rigorous work schedule like this is not uncommon.

Hui’s busy schedule can be seen as a reflection of a personal determination to keep up with, and possibly even surpass, Silicon Valley companies in AI (or at least to project that determination on social media). This sets it apart from the long hours typical of other Chinese internet businesses, such as video games and e-commerce, which are driven by a significant amount of operational work.

Even the highly regarded AI investor and computer scientist Kai-Fu Lee puts in an immense amount of effort. In a conversation in November about his newly founded LLM unicorn 01.AI, Lee acknowledged that working late hours is the norm, but said his team members put in the hard work willingly. That same day, one of his employees had messaged him at 2:15 a.m. to share their enthusiasm about being part of 01.AI’s mission.

This visible display of a strong work ethic reflects the urgency with which the country’s technology companies are pursuing their goals and, consequently, the rapid pace at which they are rolling out LLMs.

For instance, Qwen offers a collection of foundation models trained on both English and Chinese data. The largest has 72 billion parameters, the learned values that encode what a model has absorbed from its training data and that determine how it generates responses in context. To put that into perspective, OpenAI’s GPT-3 has 175 billion parameters, while its latest LLM, GPT-4, is reported to have about 1.7 trillion, a figure OpenAI has not confirmed. However, whether a higher parameter count is actually valuable depends largely on the purpose of a specific LLM.

In addition, the team has moved quickly to launch commercial applications. In April of last year, Alibaba incorporated Qwen into its corporate communication tool DingTalk and its e-commerce platform Tmall.

Currently, there is no clear frontrunner in China’s LLM industry. Both venture capital firms and corporate investors are spreading their bets across several potential leaders. Along with developing its own LLM platform, Alibaba has been actively investing in startups such as 01.AI, among others.

In response to the competition, Alibaba has been striving to carve out a distinct position, and its recent push into multilingual capabilities could prove to be a valuable differentiator. In December, the company launched a language model named SeaLLM, which can handle multiple Southeast Asian languages, including Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Tagalog, and Burmese. Through its cloud computing business and its acquisition of the e-commerce platform Lazada, Alibaba has established a significant presence in the region and could integrate SeaLLM into these services in the future.
