The 10-petaflop "Keisoku" supercomputer, slated to become operational in 2012, will now be based on scalar processors from Fujitsu.

Stride directed prefetching in scalar processors. In Proceedings of the 25th Annual International Symposium on Microarchitecture (MICRO 25, Portland, OR, Dec.

Finally, massively parallel architectures rely on large numbers of relatively weak scalar processors and give very high performance when a large number of relatively small independent computations are required.

The current trend in workstation architectures toward several very fast scalar processors running in parallel and sharing the same physical memory space provides a very good computational architecture for high-performance visualization.

Critical to this capability are high-performance three-dimensional graphics workstations with multiple scalar processors having total floating-point power of hundreds of megaflops; operating systems that support regularly scheduled preemptive lightweight threads, so that processes are reliably and regularly scheduled; and very large (more than 10 gigabytes) physical memory capability.

The modularly designed UNIX-based Model 522 uses two 64-bit ECL RISC scalar processors and two vector coprocessors, which are based on a Cray vector instruction set.

In ORNL's case, the QMC/DCA code (specifically, an individual Markov chain in the Monte Carlo simulation) does not scale well on scalar processors, and the algorithm the team was using became too computationally inefficient when expanded beyond four-site clusters.

Fujitsu's supercomputer provides two 2.5-Gflop vector units shared by four scalar processors. Every four or five years the number of processors is doubled, providing a gain of 18% per year.
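
The 18% figure in the excerpt above can be checked with a compound-growth calculation. The following is a minimal sketch, not part of the quoted source; the function name `annual_growth_rate` is a hypothetical helper introduced only for illustration, and it assumes growth compounds once per year.

```python
def annual_growth_rate(factor: float, years: float) -> float:
    """Compound annual growth rate implied by multiplying by `factor` over `years`."""
    return factor ** (1.0 / years) - 1.0


# Doubling (factor = 2) over the four- and five-year periods named in the excerpt.
for years in (4, 5):
    rate = annual_growth_rate(2.0, years)
    print(f"doubling every {years} years: {rate:.1%} per year")
```

Under this assumption, doubling every four years works out to roughly 19% per year and doubling every five years to roughly 15% per year, so the quoted 18% sits near the four-year end of the range.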

At the end of the day, the conventional scalar processors used in most HPC systems are fundamentally limited in their ability to extract more parallelism from application codes.

Multi-core CPU designs increase the number of processors in a system, but still employ traditional scalar processors that cannot effectively exploit the parallelism in many computations.

Members of the Fujitsu VP-2000 series [4] also have separate vector and scalar processors, as did their predecessors, the VP-200 and VP-400.

Other approaches to large multiprocessors do not have vector facilities, and hence may not be viable or performance/price competitive for highly vectorized applications since automatic compilation or parallel constructs to use a large number of scalar processors don't appear to offer the speed of the vector approach for these applications.