The numbers in the black corners of these boxes correspond to the articles at the bottom that describe the routines and how they are optimized using the Intel Streaming SIMD Extensions. The blue balloons show how many times faster the SIMD optimized routines are relative to reference implementations in C/C++ when comparing the hot cache clock cycle counts on an Intel(R) Pentium(R) 4 Processor on 90nm Technology.
BLAS Basic Linear Algebra Subprograms * FFT Fast Fourier Transform * GPGPU General-purpose Computing on Graphics Processing Units * GPU Graphics Processing Unit * KNC Knights Corner * MIC Many Integrated Core * MIMD Multiple Instruction Multiple Data * MPI Message Passing Interface * SIMD Single Instruction, Multiple Data * SIMT Single Instruction Multiple Thread * SM Streaming Multiprocessor * SMP Symmetric Multiprocessing * SSE Streaming SIMD Extensions * TACC Texas Advanced Computing Center
Streaming SIMD extensions applied to boundary element codes.
"Two of the Pentium's important DSP acceleration capabilities are multimedia extensions (MMX) and streaming SIMD extensions (SSE) where the SIMD acronym means single instruction/multiple data," Mr.
He pointed out that the Streaming SIMD Extensions 3 instructions in the Intel platform were unique to Xeon.
It is fully compatible with previous x86 processors and provides plenty of features like support for MMX technology and for streaming SIMD extensions, thus ensuring very good floating point performance.
Other features include a 100MHz system bus, 256KB full-speed Advanced Transfer Cache and Streaming SIMD extensions for higher system performance.
Users performing multimedia creation tasks (video coding, MP3 ripping) will likely see some gain from the P4's new Streaming SIMD Extensions 2 (SSE2) instructions, provided the application being used takes advantage of them.
All Fire GL professional graphics accelerators are fully optimized for Intel's Streaming SIMD Extensions (SSE) and AMD Athlon(TM) processor-based systems with 3DNow!(TM) technology.
Acronyms ACML AMD Core Math Library | COTS Commodity off the Shelf | CUDA Compute Unified Device Architecture | GPGPU General-Purpose Computation on Graphics Processing Units | GPU Graphics Processing Unit | MKL Math Kernel Library | NSSP Single Source Shortest Path Problem with Non-negative Weights | SIMD Single Instruction, Multiple Data | SSE Streaming SIMD Extensions | TLB Translation Lookaside Buffer
Both models incorporate Internet Streaming SIMD extensions - advanced microprocessor instructions which combine with the processors' faster CPU speeds to increase performance over previous versions.