The Microarchitecture of DOJO, Tesla’s Exa-Scale Computer
2023; Institute of Electrical and Electronics Engineers; Volume: 43; Issue: 3; Language: English
10.1109/mm.2023.3258906
ISSN: 1937-4143
Authors: Emil Talpes, Debjit Das Sarma, Douglas F. Williams, Sahil Arora, Thomas Kunjan, Benjamin Floering, Ankit Jalote, Christopher Hsiong, Chandrasekhar Poorna, Vaidehi Samant, John Sicilia, Anantha Kumar Nivarti, Raghuvir Ramachandran, Tim Fischer, Ben Herzberg, Bill McGee, Ganesh Venkataramanan, Pete Banon
Topic(s): Advanced Data Storage Technologies
Abstract: The Tesla-built DOJO system is a scalable solution targeted at machine learning training applications. It is based on the D1 custom compute chip, which packs together 354 independent processors, resulting in 362 TFLOPS of compute and 440 MB of internal static random-access memory storage. While maintaining full programmability, DOJO emphasizes distribution of resources and an extremely high-bandwidth interconnect, allowing it to scale from small systems all the way to exaFLOP supercomputers.
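The per-chip figures in the abstract can be put in perspective with some back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes compute and SRAM are split evenly across the 354 processors, treats 362 TFLOPS as a peak rate that scales linearly with chip count, and ignores interconnect and utilization losses. The constant and function names are hypothetical and do not come from the paper.

```python
import math

# Figures quoted in the abstract for a single D1 chip.
D1_PROCESSORS = 354
D1_PEAK_TFLOPS = 362.0
D1_SRAM_MB = 440.0


def per_processor(total, processors=D1_PROCESSORS):
    """Average share of a chip-level resource per processor (assumes an even split)."""
    return total / processors


def chips_for_target(target_flops, chip_tflops=D1_PEAK_TFLOPS):
    """Minimum whole number of chips whose combined peak rate reaches target_flops.

    Assumes ideal linear scaling, which real systems do not achieve.
    """
    return math.ceil(target_flops / (chip_tflops * 1e12))


if __name__ == "__main__":
    print(f"SRAM per processor:    {per_processor(D1_SRAM_MB):.2f} MB")
    print(f"Compute per processor: {per_processor(D1_PEAK_TFLOPS):.2f} TFLOPS")
    print(f"Chips for 1 exaFLOP:   {chips_for_target(1e18)}")
```

Under these assumptions each processor has roughly 1.24 MB of SRAM and about 1 TFLOPS of peak compute, and reaching one exaFLOP of peak compute takes on the order of 2,800 D1 chips.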