Jointly written by researchers from Huawei and Chinese AI infrastructure start-up SiliconFlow, the paper described CloudMatrix 384 as a specialised “AI supernode” purpose-built to handle large-scale AI workloads.
Huawei expected CloudMatrix “to reshape the foundation of AI infrastructure”, according to the paper released this week. The system consists of 384 Ascend 910C neural processing units (NPUs) and 192 Kunpeng server central processing units (CPUs), interconnected through a unified bus that provides ultra-high bandwidth and low latency.
The paper also presented an advanced large language model (LLM) serving solution, dubbed CloudMatrix-Infer, that leverages this infrastructure. According to the researchers, it surpassed the performance of some of the world’s most prominent systems when running DeepSeek’s 671-billion-parameter R1 reasoning model.
Data centres are facilities that house large-capacity servers and data-storage systems, backed by multiple power sources and high-bandwidth internet connections. A growing number of enterprises rely on them to host or manage the computing infrastructure behind their AI projects.