Research teams from the University of Bologna and CINECA (the largest supercomputing center in Italy) have been researching and developing RISC-V supercomputers. Recently, the team demonstrated that this relatively new ISA can run high-performance computing workloads, which lays the foundation for building RISC-V supercomputers. The team used SiFive's Freedom U740 SoC as the basis, and the researchers named their RISC-V cluster "Monte Cimone".
To create a supercomputer, you need hardware that fits together like Lego bricks. These bricks are assembled into clusters and consist of motherboards, processors, memory, and storage. The Italian researchers decided to depart from the usual Intel/AMD solutions and use processors based on the RISC-V ISA instead.
Monte Cimone consists of four dual-board servers, each in a 1U form factor. Each board carries a SiFive Freedom U740 SoC, which includes four U74 cores running at 1.4 GHz and an S7 management core. In total there are 8 nodes and 32 U74 RISC-V cores.
Each node has 16 GB of 64-bit DDR4 memory running at 1866 MT/s, a PCIe Gen 3 x8 bus running at 7.8 GB/s, a Gigabit Ethernet port, and a USB 3.2 Gen 1 interface. The system is powered by two 250 W PSUs, which leaves headroom for future expansion and the addition of accelerator cards.
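The node and core counts, and the peak memory bandwidth implied by the DDR4 figures, follow from simple arithmetic. The short Python sketch below works them out from the numbers quoted in the article; the constant and function names are illustrative only and do not come from the researchers.

```python
# Back-of-the-envelope figures for Monte Cimone, using only numbers
# quoted in the article. Names are illustrative, not from the project.

SERVERS = 4
BOARDS_PER_SERVER = 2
U74_CORES_PER_BOARD = 4

DDR4_TRANSFERS_PER_SEC = 1866e6  # 1866 MT/s
DDR4_BUS_WIDTH_BITS = 64


def cluster_counts():
    """Return (nodes, application cores) for the whole cluster."""
    nodes = SERVERS * BOARDS_PER_SERVER
    return nodes, nodes * U74_CORES_PER_BOARD


def peak_memory_bandwidth_gb_s():
    """Theoretical peak DRAM bandwidth of one node in GB/s."""
    bytes_per_transfer = DDR4_BUS_WIDTH_BITS // 8
    return DDR4_TRANSFERS_PER_SEC * bytes_per_transfer / 1e9


if __name__ == "__main__":
    nodes, cores = cluster_counts()
    print(f"{nodes} nodes, {cores} U74 cores")
    print(f"peak DRAM bandwidth per node: {peak_memory_bandwidth_gb_s():.3f} GB/s")
```

A 64-bit bus moves 8 bytes per transfer, so 1866 MT/s corresponds to a theoretical peak of 14.928 GB/s per node.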
The Italian team benchmarked the system using HPL and STREAM to determine the machine's floating-point performance and memory bandwidth. Although the results are modest, they mark a starting point for RISC-V in HPC.
Each node delivers a sustained 1.86 GFLOPS in HPL, which would add up to 14.88 GFLOPS across the cluster under perfect linear scaling. In practice, however, the whole cluster runs at 85% efficiency and delivers 12.65 GFLOPS. Each node has a theoretical peak memory bandwidth of 14.928 GB/s; however, the measured STREAM result is 7760 MB/s.
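The relationship between these figures can be checked with a few lines of Python. This is only a sketch based on the numbers reported above; the function names are made up for illustration.

```python
# Scaling efficiency and memory-bandwidth utilisation, computed from
# the benchmark figures quoted in the article. Names are illustrative.

PER_NODE_GFLOPS = 1.86     # sustained HPL performance of one node
NODES = 8
CLUSTER_GFLOPS = 12.65     # measured HPL performance of the full cluster
PEAK_BW_MB_S = 14928.0     # theoretical peak memory bandwidth per node
STREAM_BW_MB_S = 7760.0    # measured STREAM bandwidth per node


def ideal_cluster_gflops(per_node, nodes):
    """Aggregate performance if scaling were perfectly linear."""
    return per_node * nodes


def fraction(measured, ideal):
    """Measured value as a fraction of the ideal value."""
    return measured / ideal


if __name__ == "__main__":
    ideal = ideal_cluster_gflops(PER_NODE_GFLOPS, NODES)
    print(f"ideal aggregate: {ideal:.2f} GFLOPS")
    print(f"HPL scaling efficiency: {fraction(CLUSTER_GFLOPS, ideal):.0%}")
    print(f"STREAM vs. peak bandwidth: {fraction(STREAM_BW_MB_S, PEAK_BW_MB_S):.0%}")
```

In other words, the cluster keeps about 85% of the ideal aggregate HPL performance, while STREAM reaches only about 52% of the theoretical memory bandwidth of a node.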
These results show two things. First, the RISC-V HPC software stack is mature, but further optimization and faster chips are needed to tackle major tasks such as weather simulation. Second, scaling in the HPC world is very difficult and requires careful optimization so that hardware and software work together and the whole system scales well.