tech:slurm

Differences

This shows you the differences between two versions of the page.

tech:slurm [2020/05/27 11:47] – [Example mnist] kohofer
tech:slurm [2020/06/22 16:12] kohofer
Line 735:

   squeue
 +
 +
 +----
 +
 +
 +===== CUDA NVIDIA TESLA =====
 +
 +root@gpu03:/usr/local/cuda/samples/bin/x86_64/linux# deviceQuery 
 +deviceQuery Starting...
 +
 + CUDA Device Query (Runtime API) version (CUDART static linking)
 +
 +Detected 2 CUDA Capable device(s)
 +
 +Device 0: "Tesla V100-PCIE-32GB"
 +  CUDA Driver Version / Runtime Version          10.2 / 10.2
 +  CUDA Capability Major/Minor version number:    7.0
 +  Total amount of global memory:                 32510 MBytes (34089730048 bytes)
 +  (80) Multiprocessors, ( 64) CUDA Cores/MP:     5120 CUDA Cores
 +  GPU Max Clock rate:                            1380 MHz (1.38 GHz)
 +  Memory Clock rate:                             877 Mhz
 +  Memory Bus Width:                              4096-bit
 +  L2 Cache Size:                                 6291456 bytes
 +  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
 +  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
 +  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
 +  Total amount of constant memory:               65536 bytes
 +  Total amount of shared memory per block:       49152 bytes
 +  Total number of registers available per block: 65536
 +  Warp size:                                     32
 +  Maximum number of threads per multiprocessor:  2048
 +  Maximum number of threads per block:           1024
 +  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
 +  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
 +  Maximum memory pitch:                          2147483647 bytes
 +  Texture alignment:                             512 bytes
 +  Concurrent copy and kernel execution:          Yes with 7 copy engine(s)
 +  Run time limit on kernels:                     No
 +  Integrated GPU sharing Host Memory:            No
 +  Support host page-locked memory mapping:       Yes
 +  Alignment requirement for Surfaces:            Yes
 +  Device has ECC support:                        Enabled
 +  Device supports Unified Addressing (UVA):      Yes
 +  Device supports Compute Preemption:            Yes
 +  Supports Cooperative Kernel Launch:            Yes
 +  Supports MultiDevice Co-op Kernel Launch:      Yes
 +  Device PCI Domain ID / Bus ID / location ID:   0 / 59 / 0
 +  Compute Mode:
 +     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
 +
 +Device 1: "Tesla V100-PCIE-32GB"
 +  CUDA Driver Version / Runtime Version          10.2 / 10.2
 +  CUDA Capability Major/Minor version number:    7.0
 +  Total amount of global memory:                 32510 MBytes (34089730048 bytes)
 +  (80) Multiprocessors, ( 64) CUDA Cores/MP:     5120 CUDA Cores
 +  GPU Max Clock rate:                            1380 MHz (1.38 GHz)
 +  Memory Clock rate:                             877 Mhz
 +  Memory Bus Width:                              4096-bit
 +  L2 Cache Size:                                 6291456 bytes
 +  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
 +  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
 +  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
 +  Total amount of constant memory:               65536 bytes
 +  Total amount of shared memory per block:       49152 bytes
 +  Total number of registers available per block: 65536
 +  Warp size:                                     32
 +  Maximum number of threads per multiprocessor:  2048
 +  Maximum number of threads per block:           1024
 +  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
 +  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
 +  Maximum memory pitch:                          2147483647 bytes
 +  Texture alignment:                             512 bytes
 +  Concurrent copy and kernel execution:          Yes with 7 copy engine(s)
 +  Run time limit on kernels:                     No
 +  Integrated GPU sharing Host Memory:            No
 +  Support host page-locked memory mapping:       Yes
 +  Alignment requirement for Surfaces:            Yes
 +  Device has ECC support:                        Enabled
 +  Device supports Unified Addressing (UVA):      Yes
 +  Device supports Compute Preemption:            Yes
 +  Supports Cooperative Kernel Launch:            Yes
 +  Supports MultiDevice Co-op Kernel Launch:      Yes
 +  Device PCI Domain ID / Bus ID / location ID:   0 / 175 / 0
 +  Compute Mode:
 +     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
 +> Peer access from Tesla V100-PCIE-32GB (GPU0) -> Tesla V100-PCIE-32GB (GPU1) : Yes
 +> Peer access from Tesla V100-PCIE-32GB (GPU1) -> Tesla V100-PCIE-32GB (GPU0) : Yes
 +
 +deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 2
 +Result = PASS
 +
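The derived figures in the listing above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the input values are copied from the Device 0 block, the variable names are made up for this example, and the bandwidth estimate assumes double-data-rate memory (two transfers per memory clock cycle), which the `deviceQuery` output itself does not state.

```python
# Sanity-check derived numbers from the deviceQuery listing (Device 0).
# Inputs are copied from the output above; names are illustrative only.

multiprocessors = 80              # "(80) Multiprocessors"
cores_per_mp = 64                 # "( 64) CUDA Cores/MP"
global_mem_bytes = 34089730048    # "Total amount of global memory" in bytes
mem_clock_mhz = 877               # "Memory Clock rate"
bus_width_bits = 4096             # "Memory Bus Width"

# Total CUDA cores = multiprocessors x cores per multiprocessor.
cuda_cores = multiprocessors * cores_per_mp

# The "MBytes" figure matches bytes divided by 2**20 (i.e. MiB).
global_mem_mib = global_mem_bytes // 2**20

# Theoretical peak memory bandwidth.
# ASSUMPTION: double data rate, so 2 transfers per clock cycle.
bytes_per_transfer = bus_width_bits // 8
peak_bandwidth_bytes_s = 2 * mem_clock_mhz * 10**6 * bytes_per_transfer
peak_bandwidth_gb_s = peak_bandwidth_bytes_s / 1e9

print(f"CUDA cores:        {cuda_cores}")             # 5120
print(f"Global memory:     {global_mem_mib} MiB")     # 32510 MiB
print(f"Peak bandwidth:    {peak_bandwidth_gb_s:.1f} GB/s (theoretical)")
```

The first two results agree with the values `deviceQuery` prints (5120 CUDA Cores, 32510 MBytes). The bandwidth figure is a theoretical upper bound under the stated assumption, not a measurement.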
  
  
/data/www/wiki.inf.unibz.it/data/pages/tech/slurm.txt · Last modified: 2022/11/24 16:17 by kohofer