Each SM of the GK110 has 192 single-precision CUDA cores, 64 double-precision units, 32 special function units (SFUs), and 32 load/store units, yielding a throughput of 192 single-precision floating-point, 64 double-precision floating-point, 160 32-bit integer add, and 32 32-bit integer multiply instructions per clock cycle [4, 25].
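
The per-SM, per-clock figures above scale linearly with the number of SMs and the clock rate. The following is a minimal sketch of that arithmetic; the SM count (15) and clock (745 MHz) in the example call are assumptions chosen for illustration and are not stated in the text, and the function name is hypothetical.

```python
# Per-SM, per-clock instruction throughput as quoted in the text for GK110.
PER_SM_PER_CLOCK = {
    "sp_float": 192,
    "dp_float": 64,
    "int32_add": 160,
    "int32_mul": 32,
}


def device_throughput(sm_count, clock_hz):
    """Return instructions per second for the whole device, by instruction type."""
    return {op: n * sm_count * clock_hz for op, n in PER_SM_PER_CLOCK.items()}


# Assumed values, not taken from the text: 15 SMs at 745 MHz.
rates = device_throughput(sm_count=15, clock_hz=745_000_000)
for op, rate in rates.items():
    print(f"{op}: {rate / 1e9:.1f} G instructions/s")
```

These are instruction rates, not FLOP rates; counting a fused multiply-add as two floating-point operations would double the floating-point entries.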
Even the TITAN X had 28 functional SMs, which means that both cards have the same number of CUDA cores.
For example, the NVIDIA® TK1 SoC combines a quad-core Cortex-A15 CPU at 2.1 GHz with a GPU with 192 CUDA cores. The unified memory between the GPU and CPU avoids the PCIe bottleneck when the two share data.
Until this type of memory was introduced, a streaming multiprocessor had to store data in the low-speed global memory and retrieve it afterwards to guarantee coherence among the CUDA cores.
It hosts the NVIDIA GeForce GTX 950M processor, which offers 640 CUDA cores, 1.271 TFLOPS of floating-point performance, and 4 GB of GDDR5 graphics memory.
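
A peak floating-point figure like the one quoted above is usually derived as cores × clock × operations per core per cycle. The sketch below assumes the common convention of 2 FLOPs per core per cycle (one fused multiply-add) and reads the quoted figure as 1.271 TFLOPS; the clock rate is not given in the text, so it is back-computed here rather than taken from a specification. The function names are hypothetical.

```python
def peak_flops(cuda_cores, clock_hz, flops_per_core_per_cycle=2):
    """Theoretical peak FLOP/s; the default of 2 assumes one FMA per core per clock."""
    return cuda_cores * clock_hz * flops_per_core_per_cycle


def implied_clock_hz(peak, cuda_cores, flops_per_core_per_cycle=2):
    """Clock rate that would produce the given peak under the same assumption."""
    return peak / (cuda_cores * flops_per_core_per_cycle)


# 640 CUDA cores and 1.271 TFLOPS, as quoted in the text.
clock = implied_clock_hz(1.271e12, 640)
print(f"{clock / 1e6:.0f} MHz")  # prints "993 MHz"
```

Under these assumptions the quoted core count and peak performance are mutually consistent with a clock of roughly 993 MHz.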
Moreover, the parallelized code is tested on an NVIDIA GRID GPU (Kepler GK104) device with 1,536 CUDA cores, an 800 MHz system clock, and 4 GB of RAM.
Experiments were conducted on a PC equipped with a 2.2 GHz Intel CPU and an NVIDIA GeForce GTX 550 Ti GPU with 1024 MB of GDDR5 global memory and 192 CUDA cores. The software was written in C++ using version 5.0 of the CUDA driver and SDK.
The GTX Titan Z graphics card boasts two NVIDIA GeForce GPUs running at an 876 MHz boost clock, 12 GB of GDDR5 on-board memory, and a combined total of 5760 CUDA cores to give fast, smooth, stutter-free visuals even when powering multi-monitor gaming rigs or 4K/UHD (ultra-high definition) monitors.
An MP has 8 CUDA cores. There are 112 CUDA cores.
A total of 16 SMs, each with 32 SPs, form the architecture of 512 CUDA cores. The GTX580 has a frequency of 1.54 GHz and a global memory size of 1.5 GB.
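
The core counts quoted here follow from multiplying the number of multiprocessors by the cores in each one. The sketch below checks the 16 × 32 = 512 figure and runs the inverse calculation for the 8-cores-per-MP case; treating the 112-core device as built from those 8-core MPs is an assumption, since the text does not state the MP count. The function names are hypothetical.

```python
def total_cores(sm_count, cores_per_sm):
    """Total CUDA cores on a device with identical multiprocessors."""
    return sm_count * cores_per_sm


def sm_count_for(total, cores_per_sm):
    """Number of multiprocessors implied by a total core count."""
    if total % cores_per_sm != 0:
        raise ValueError("total is not a multiple of cores_per_sm")
    return total // cores_per_sm


print(total_cores(16, 32))   # 512, matching the figure in the text
print(sm_count_for(112, 8))  # 14, assuming all 112 cores sit in 8-core MPs
```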