This is the best summary I could come up with:
The eight-core, 528-thread chip that Intel used for the demonstration stole the spotlight thanks to its unique architecture, which sports 66 threads per core and enables up to 1TB/s of data throughput.
Intel’s PIUMA (Programmable Integrated Unified Memory Architecture) chip is part of the DARPA HIVE program, which focuses on improving performance in petabyte-scale graph analytics work to unlock a 1000X improvement in performance-per-watt in hyper-sparse workloads.
After characterizing the target workloads, Intel concluded that it needed to craft an architecture that solved the challenges associated with extreme stress on the memory subsystem, deep pipelines, branch predictors, and out-of-order logic created by the workload.
Intel fabbed the chip on TSMC’s 7nm process with 27.6 billion transistors spanning a 316mm^2 die.
The eight cores, which consume 1.2 billion transistors, run down the center of the die, flanked by eight custom memory controllers with an 8-byte access granularity.
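To see why the 8-byte access granularity matters for sparse workloads, here is a rough back-of-the-envelope sketch. It assumes, purely for comparison, a conventional 64-byte cache line (that figure is not from the article), and the function name `useful_fraction` is made up for illustration.

```python
# Assumption for comparison only: many mainstream CPUs fetch memory in
# 64-byte cache lines. The 8-byte figure is the one quoted in the summary.
CONVENTIONAL_LINE_BYTES = 64
CHIP_ACCESS_BYTES = 8

def useful_fraction(wanted_bytes: int, fetch_bytes: int) -> float:
    """Fraction of fetched bytes that are actually used when each access
    wants `wanted_bytes` from an unrelated address (a sparse pattern)."""
    # Round the request up to a whole number of fetch units.
    fetched = -(-wanted_bytes // fetch_bytes) * fetch_bytes
    return wanted_bytes / fetched

# Chasing 8-byte graph pointers scattered across memory:
conventional = useful_fraction(8, CONVENTIONAL_LINE_BYTES)
fine_grained = useful_fraction(8, CHIP_ACCESS_BYTES)

# Thread count quoted in the summary: 8 cores x 66 threads per core.
total_threads = 8 * 66

print(conventional, fine_grained, total_threads)
```

Under these assumptions, a 64-byte fetch wastes most of the bandwidth on a scattered 8-byte read, while an 8-byte fetch wastes none. Real behavior depends on caching and access patterns the summary doesn't describe.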
The promise of optical interconnects has fueled a growing body of research as the industry looks to future data transport methods that offer better bandwidth, latency, and power consumption than traditional chip-to-chip communication techniques.
The original article contains 655 words, the summary contains 182 words. Saved 72%. I’m a bot and I’m open source!