tech:slurm

Differences

This shows you the differences between two versions of the page.

tech:slurm [2020/06/22 16:12] kohofertech:slurm [2020/06/30 17:23] – [Create modules file] kohofer
Line 391: Line 391:
 ==== Create modules file ==== ==== Create modules file ====
  
 +**PYTHON**
  
   cd /opt/modules/modulefiles/   cd /opt/modules/modulefiles/
Line 407: Line 408:
  
 </code> </code>
-   
  
 +**CUDA**
  
 +  vi /opt/modules/modulefiles/cuda-10.2
 +
 +<code>
 +#%Module1.0
 +proc ModulesHelp { } {
 +global dotversion
 +
 +puts stderr "\tcuda-10.2"
 +}
 +
 +module-whatis "cuda-10.2"
 +
 +set     prefix  /usr/local/cuda-10.2
 +
 +setenv          CUDA_HOME       $prefix
 +prepend-path    PATH            $prefix/bin
 +prepend-path    LD_LIBRARY_PATH $prefix/lib64
 +</code>
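
For readers who want to see what the cuda-10.2 modulefile changes in a session, here is a rough shell equivalent. It is only a sketch of the effect of `setenv` and `prepend-path`; the real `module load` also records the changes so that `module unload` can revert them, which this snippet does not do.

```shell
#!/bin/sh
# Approximate effect of "module load cuda-10.2" (sketch, not the module system itself).
prefix=/usr/local/cuda-10.2

# setenv CUDA_HOME $prefix
export CUDA_HOME="$prefix"

# prepend-path PATH $prefix/bin
export PATH="$prefix/bin${PATH:+:$PATH}"

# prepend-path LD_LIBRARY_PATH $prefix/lib64
# (the ${VAR:+...} form avoids a trailing colon when the variable was empty)
export LD_LIBRARY_PATH="$prefix/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "CUDA_HOME=$CUDA_HOME"
```

After loading the real module, `echo $CUDA_HOME` and `which nvcc` should point into /usr/local/cuda-10.2, assuming CUDA 10.2 is installed under that prefix.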
  
 ===== GCC ===== ===== GCC =====
Line 740: Line 759:
  
  
-===== CUDA NVIDIA TESLA =====+===== CUDA NVIDIA TESLA Infos =====
  
-root@gpu03:/usr/local/cuda/samples/bin/x86_64/linux# deviceQuery +=== nvidia-smi === 
 + 
 + 
 +  root@gpu02:~# watch nvidia-smi 
 + 
 +<code> 
 +Every 2.0s: nvidia-smi                                           gpu02: Mon Jun 22 17:49:14 2020 
 + 
 +Mon Jun 22 17:49:14 2020 
 ++-----------------------------------------------------------------------------+ 
 +| NVIDIA-SMI 440.64.00    Driver Version: 440.64.00    CUDA Version: 10.2     | 
 +|-------------------------------+----------------------+----------------------+ 
 +| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC | 
 +| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. | 
 +|===============================+======================+======================| 
 +|   0  Tesla V100-PCIE...  On   | 00000000:3B:00.0 Off |                    0 | 
 +| N/A   53C    P0   139W / 250W |  31385MiB / 32510MiB |     69%      Default | 
 ++-------------------------------+----------------------+----------------------+ 
 +|   1  Tesla V100-PCIE...  On   | 00000000:AF:00.0 Off |                    0 | 
 +| N/A   35C    P0    26W / 250W |      0MiB / 32510MiB |      0%      Default | 
 ++-------------------------------+----------------------+----------------------+ 
 + 
 ++-----------------------------------------------------------------------------+ 
 +| Processes:                                                       GPU Memory | 
 +|  GPU       PID   Type   Process name                             Usage      | 
 +|=============================================================================| 
 +|    0      8627      C   /opt/anaconda3/bin/python3                 31373MiB | 
 ++-----------------------------------------------------------------------------+ 
 + 
 +</code> 
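
The memory columns in the table above can also be read in a script-friendly form. The sketch below assumes the CSV output of `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits` (one "used, total" line per GPU, in MiB). The helper name `gpu_mem_percent` is hypothetical, and the sample values are copied from the output above.

```shell
#!/bin/sh
# Print memory usage per GPU as a percentage.
# Input: lines of "used, total" in MiB, e.g. from
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
gpu_mem_percent() {
    awk -F', *' '{ printf "%.1f\n", 100 * $1 / $2 }'
}

# Sample values taken from the nvidia-smi table (GPU 0 and GPU 1)
printf '31385, 32510\n0, 32510\n' | gpu_mem_percent
```

On a GPU node, the `printf` line would be replaced by the `nvidia-smi` query shown in the comment.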
 + 
 +=== deviceQuery === 
 + 
 + 
 +deviceQuery has to be built with make before it can be run: 
 + 
 +  root@gpu03:~# cd /usr/local/cuda/samples/1_Utilities/deviceQuery 
 +  make 
 + 
 +Add the directory containing the compiled binary to the system-wide PATH: 
 + 
 +  vi /etc/environment 
 + 
 +Append this directory to the end of the existing PATH entry, separated by a colon: 
 + 
 +  /usr/local/cuda/samples/bin/x86_64/linux/release 
 + 
 +Then source the file so the change takes effect in the current shell: 
 + 
 +  source /etc/environment 
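
To try the PATH change in a single session without editing /etc/environment, something like the following could be used. It is a minimal sketch: `path_append` is a hypothetical helper, not part of CUDA or the system, and it appends the directory only if it is not already in PATH.

```shell
#!/bin/sh
# Append a directory to PATH unless it is already listed (hypothetical helper).
path_append() {
    dir=$1
    case ":$PATH:" in
        *":$dir:"*) ;;                       # already present, nothing to do
        *) PATH="${PATH:+$PATH:}$dir" ;;     # append, avoiding a leading colon
    esac
    export PATH
}

path_append /usr/local/cuda/samples/bin/x86_64/linux/release

# Check whether the binary is now found; it only exists after running make.
command -v deviceQuery || echo "deviceQuery not found; was it built?"
```

Calling `path_append` again with the same directory leaves PATH unchanged, so the snippet is safe to put in a shell startup file.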
 + 
 +<code> 
 +root@gpu03:~# deviceQuery 
 deviceQuery Starting... deviceQuery Starting...
  
Line 825: Line 896:
 deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 2 deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 2
 Result = PASS Result = PASS
- +</code>
  
 ===== Links ===== ===== Links =====
/data/www/wiki.inf.unibz.it/data/pages/tech/slurm.txt · Last modified: 2022/11/24 16:17 by kohofer