GPU Hardware
Revision as of 16:28, 13 February 2013
- One nVidia Tesla S1070 with four GPU cards (4 x 240 = 960 GPU cores in total) for programs written to use this architecture (CUDA compute capability 1.3).
Output of the nVidia SDK deviceQuery sample for one of the four GPUs:
Device 0: "Tesla T10 Processor"
CUDA Driver Version / Runtime Version 4.0 / 4.0
CUDA Capability Major/Minor version number: 1.3
Total amount of global memory: 4096 MBytes (4294770688 bytes)
(30) Multiprocessors x ( 8) CUDA Cores/MP: 240 CUDA Cores
GPU Clock Speed: 1.44 GHz
Memory Clock rate: 800.00 Mhz
Memory Bus Width: 512-bit
Max Texture Dimension Size (x,y,z) 1D=(8192), 2D=(65536,32768), 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(8192) x 512, 2D=(8192,8192) x 512
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 2147483647 bytes
Texture alignment: 256 bytes
Concurrent copy and execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: No
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): No
Device PCI Bus ID / PCI location ID: 11 / 0
Compute Mode:
< Exclusive Process (many threads in one process is able to use ::cudaSetDevice() with this device) >
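The headline figures for the S1070 can be re-derived from the deviceQuery values above. The following is a minimal sketch, not part of the original page. It assumes that all four GPUs in the unit match Device 0, and that the memory is double data rate (two transfers per clock), which is the usual convention for estimating theoretical peak bandwidth. The variable names are illustrative only.

```python
# Values copied from the deviceQuery output above (Device 0, "Tesla T10 Processor").
MULTIPROCESSORS = 30
CORES_PER_MP = 8
GLOBAL_MEM_MB = 4096
MEM_CLOCK_HZ = 800_000_000   # 800.00 MHz as reported
BUS_WIDTH_BITS = 512

# Assumption: the S1070 holds four GPUs identical to Device 0.
GPUS_IN_S1070 = 4
# Assumption: double-data-rate memory, i.e. two transfers per memory clock.
TRANSFERS_PER_CLOCK = 2

cores_per_gpu = MULTIPROCESSORS * CORES_PER_MP
total_cores = GPUS_IN_S1070 * cores_per_gpu
total_mem_mb = GPUS_IN_S1070 * GLOBAL_MEM_MB

# Theoretical peak: transfers per second times bytes moved per transfer.
peak_bw_bytes_per_s = TRANSFERS_PER_CLOCK * MEM_CLOCK_HZ * BUS_WIDTH_BITS // 8

print(f"CUDA cores per GPU:      {cores_per_gpu}")
print(f"CUDA cores in the unit:  {total_cores}")
print(f"Global memory, all GPUs: {total_mem_mb} MB")
print(f"Peak bandwidth per GPU:  {peak_bw_bytes_per_s / 1e9:.1f} GB/s (theoretical)")
```

The bandwidth figure is an upper bound computed from the reported clock and bus width; measured throughput will be lower.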
- One nVidia Tesla C2075 GPU processor (448 GPU cores, CUDA compute capability 2.0)
Output of the nVidia SDK deviceQuery sample:
Device 0: "Tesla C2075"
CUDA Driver Version / Runtime Version 5.0 / 4.0
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 6144 MBytes (6442123264 bytes)
(14) Multiprocessors x (32) CUDA Cores/MP: 448 CUDA Cores
GPU Clock Speed: 1.15 GHz
Memory Clock rate: 1566.00 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 786432 bytes
Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 3 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
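The two devices differ in their block and grid limits, so a kernel launch shape that works on the C2075 may be rejected on the T10. The sketch below is an illustration, not part of the original page. It checks a launch shape against the limits listed in the two deviceQuery outputs. The `LIMITS` table and the `launch_fits` helper are hypothetical names, and the check covers only thread and dimension limits, not shared memory or register usage.

```python
# Limits copied from the two deviceQuery outputs above.
LIMITS = {
    "Tesla T10 Processor": {
        "max_threads_per_block": 512,
        "max_block_dim": (512, 512, 64),
        "max_grid_dim": (65535, 65535, 1),
    },
    "Tesla C2075": {
        "max_threads_per_block": 1024,
        "max_block_dim": (1024, 1024, 64),
        "max_grid_dim": (65535, 65535, 65535),
    },
}


def launch_fits(limits, block, grid):
    """Return True if the (x, y, z) block and grid shapes respect the limits."""
    # Every dimension must be at least 1.
    if any(d < 1 for d in tuple(block) + tuple(grid)):
        return False
    # Each block dimension has its own cap ...
    if any(b > m for b, m in zip(block, limits["max_block_dim"])):
        return False
    # ... and the total thread count per block is capped as well.
    if block[0] * block[1] * block[2] > limits["max_threads_per_block"]:
        return False
    # Each grid dimension has its own cap.
    if any(g > m for g, m in zip(grid, limits["max_grid_dim"])):
        return False
    return True


# Hypothetical launch shapes used only to exercise the check.
shapes = [
    ((16, 16, 1), (64, 64, 1)),  # 256 threads per block, 2D grid
    ((32, 32, 1), (64, 64, 1)),  # 1024 threads per block
    ((8, 8, 8), (8, 8, 8)),      # 512 threads per block, 3D grid
]

for block, grid in shapes:
    for name, limits in LIMITS.items():
        verdict = "fits" if launch_fits(limits, block, grid) else "does not fit"
        print(f"block={block} grid={grid}: {verdict} on {name}")
```

With these shapes, only the first one fits both devices: the second exceeds the T10 limit of 512 threads per block, and the third needs a 3D grid, which the T10 does not allow (its grid z dimension is limited to 1).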