The maker of the world’s largest chip has made a big AI breakthrough
Cerebras Systems, the company behind the world’s largest processor, has broken the record for the most complex AI model trained on a single device.
Using one CS-2 system, powered by the company’s wafer-sized chip (the WSE-2), Cerebras can now train AI models with 20 billion parameters thanks to new optimizations at the software level.
The firm says the breakthrough will solve one of the most frustrating problems for AI engineers: the need to partition massive models across thousands of GPUs. The result is an opportunity to drastically cut the time spent developing and training new models.
Cerebras brings AI to the masses
In sub-disciplines such as Natural Language Processing (NLP), model performance correlates linearly with the number of parameters. In other words, the larger the model, the better the final result.
Traditionally, developing large-scale AI products has involved spreading a model across a large number of GPUs or accelerators, either because the model has too many parameters to fit in memory, or because the available compute is insufficient to handle the training workload.
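To make the memory problem concrete, here is a rough back-of-the-envelope sketch (our illustration, not from the article): at half precision, each parameter occupies two bytes, so the weights alone of a 20-billion-parameter model quickly outgrow a single conventional accelerator.

```python
# Back-of-the-envelope estimate (illustrative only): memory needed just to
# store a model's weights, which shows why large models are usually
# partitioned across many GPUs.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in GB (fp16 = 2 bytes per parameter)."""
    return num_params * bytes_per_param / 1e9

for name, params in [("GPT-J (6B)", 6e9),
                     ("20B-parameter model", 20e9),
                     ("GPT-3 (175B)", 175e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights in fp16")

# Even at 2 bytes per parameter, a 20B model needs ~40 GB for its weights,
# before gradients, optimizer state, and activations are counted -- more than
# most single GPUs offer, hence the multi-GPU splitting described above.
```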
“This process is painful, often taking months to complete,” explained Cerebras. “To make matters worse, the process is unique to each network and compute cluster pair, so the work is not portable across different compute clusters or neural networks. It is entirely bespoke.”
Although the most complex models contain many more than 20 billion parameters, the ability to train relatively large-scale AI models on a single CS-2 device eliminates these barriers for many, accelerating development for existing players and democratizing access for those previously unable to participate in the space.
“Cerebras’ ability to bring large language models to the masses with cost-efficient, easy access heralds an exciting new era in AI. It gives organizations that cannot afford to spend tens of millions an easy and inexpensive on-ramp to major league NLP,” said Dan Olds, chief research officer at Intersect360 Research.
“It will be interesting to see the new applications and discoveries CS-2 customers make as they train GPT-3 and GPT-J class models on large datasets.”
What’s more, Cerebras indicated that its CS-2 system may be able to handle even larger models with “trillions of parameters” in the future, and connecting multiple CS-2 systems together could pave the way for AI networks larger than the human brain.