====== Slurm ======
===== Installation =====

===== Controller name: slurm-ctrl =====
If a compute node is <color #ed1c24>down</color>:

<code>
sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*      up   ...
gpu*        up   ...
gpu*        up   ...
</code>

Bring it back to service:
  scontrol update nodename=gpu02 state=idle
  scontrol update nodename=gpu03 state=idle
  scontrol update nodename=gpu02 state=resume
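Before resuming a node, it can help to see why Slurm marked it down; scontrol shows the node's state and the reason field (gpu02 here simply mirrors the examples above):

  scontrol show node gpu02 | grep -E "State|Reason"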
<code>
sinfo -o "..."
NODELIST ...
gpu[02-03] ...
gpu04 ...
hpcmoi01,...
</code>
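The format string is configurable; a hedged example that prints one line per node with its partition, state, and GRES (all of the % codes below are standard sinfo format fields):

  sinfo -N -o "%N %P %t %G"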
===== Compute Nodes =====
A compute node is a machine which receives jobs to execute from the controller; it runs the slurmd service.
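To check what hardware a compute node will advertise to the controller, slurmd can print the configuration it detects (run this on the node itself):

  slurmd -C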
  chown root: /...
=== Directories ===

Be sure that the NFS-mounted partitions are all there:
<code>
/data
/...
/...
/...
/scratch
</code>
===== Modify user accounts =====

Display the accounts created:
  # Show also associations in the accounts
  sacctmgr show account -s
  # Show all columns separated by the pipe | symbol
  sacctmgr show account -s -P
  # Show users with their associations
  sacctmgr show user -s
Add user:

  sacctmgr add user <username> account=<account>
Modify user, give 12000 minutes/200 hours for usage:

  sacctmgr modify user where user=<username> set GrpTRESMins=cpu=12000
Modify user by removing it from a certain account:

  sacctmgr remove user where user=<username> and account=<account>
Delete user:

<code>
sacctmgr delete user ivmilan
Deleting users...
ivmilan
Would you like to commit changes? (You have 30 seconds to decide)
(N/y): y
</code>
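To double-check what is left for a user after such changes, the associations and minute limits can be listed (the format fields below are standard sacctmgr output columns):

  sacctmgr show assoc where user=<username> format=user,account,grptresmins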
Restart the services:

  systemctl restart slurmctld.service
  systemctl restart slurmdbd.service

Check status:

  systemctl status slurmctld.service
  systemctl status slurmdbd.service
+ | |||
+ | ==== Submit a job to a specific node using Slurm' | ||
+ | |||
+ | To run a job on a specific Node use this option in the job script | ||
+ | |||
+ | #SBATCH --nodelist=gpu03 | ||
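Put together, a minimal job script pinned to gpu03 could look like this (the job name, output file, and gpu partition are illustrative placeholders, not values taken from the cluster configuration):

<code>
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=test.out
#SBATCH --partition=gpu
#SBATCH --nodelist=gpu03

hostname
</code>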
====== Modules ======

The Environment Modules package provides for the dynamic modification of a user's environment via modulefiles.
Installing Modules on Unix.

Login into slurm-ctrl and become root:

  ssh slurm-ctrl
  sudo -i
Download modules:

  curl -LJO https://...
  tar xfz modules-4.6.0.tar.gz
  cd modules-4.6.0
  ./configure --prefix=/...
  make
  make install
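If the configure prefix is not a location the shell already knows about, the module command has to be initialised before it can be used; a minimal sketch, where both paths are assumptions to adjust to the prefix chosen above:

  # prefix below is the Modules default; adjust to the --prefix used above
  source /usr/local/Modules/init/bash
  # assumed example location for locally maintained modulefiles
  module use /opt/modules/modulefiles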
----
===== SPACK =====

Add different python versions using spack!

1. First see which python versions are available:
<code>
root@slurm-ctrl:~# spack versions python
==> Safe versions (already checksummed):
  3.8.2  3.7.7  3.7.4  3.7.1  3.6.7  3.6.4  3.6.1  3.5.2  3.4.10
  3.8.1  3.7.6  3.7.3  3.7.0  3.6.6  3.6.3  3.6.0  3.5.1  3.4.3
  3.8.0  3.7.5  3.7.2  3.6.8  3.6.5  3.6.2  3.5.7  3.5.0  3.3.6
==> Remote versions (not yet checksummed):
  3.10.0a6
  3.10.0a5
  ...
</code>
2. Now select the python version you would like to install:
<code>
root@slurm-ctrl:~# spack install python@3.8.2
==> 23834: Installing libiconv
==> Using cached archive: /...
==> Staging archive: /...
==> Created stage in /...
==> No patches needed for libiconv
==> 23834: libiconv: Building libiconv [AutotoolsPackage]
==> 23834: libiconv: Executing phase: 'autoreconf'
==> 23834: libiconv: Executing phase: 'configure'
==> 23834: libiconv: Executing phase: 'build'
==> 23834: libiconv: Executing phase: 'install'
==> 23834: libiconv: Successfully installed libiconv
  Fetch: 0.04s.  ...
[+] /...
==> 23834: Installing libbsd
...
...
...
==> 23834: Installing python
==> Fetching https://...
############################################################################################################ 100.0%
==> Staging archive: /...
==> Created stage in /...
==> Ran patch() for python
==> 23834: python: Building python [AutotoolsPackage]
==> 23834: python: Executing phase: 'autoreconf'
==> 23834: python: Executing phase: 'configure'
==> 23834: python: Executing phase: 'build'
==> 23834: python: Executing phase: 'install'
==> 23834: python: Successfully installed python
  Fetch: 1.81s.  ...
[+] /...
</code>
This will take some minutes, depending on the version.
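The install prefix spack chose, which is needed for the modulefile in the next step, can be printed afterwards:

  spack find -p python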
3. Now you need to add a modulefile:
  root@slurm-ctrl:~# vi .../python-3.8.2

<code>
#%Module1.0
proc ModulesHelp { } {
    global dotversion

    puts stderr "..."
}

module-whatis "..."

set          ...
set-alias    ...

prepend-path PATH ...
prepend-path LD_LIBRARY_PATH ...
</code>
4. The new module should now be available:

<code>
root@slurm-ctrl:~# module avail
-------------------------------------------- /... --------------------------------------------
anaconda3   ...
bzip        ...
cuda-10.2   ...
cuda-11.0   ...
...
</code>
5. Load the new module:

  root@slurm-ctrl:~# module load python-3.8.2
6. Verify it works:

<code>
root@slurm-ctrl:~# python3
Python 3.8.2 (default, Mar 19 2021, 11:05:37)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
</code>
7. Unload the new module:

  module unload python-3.8.2
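module list shows what is currently loaded, and module purge unloads everything at once:

  module list
  module purge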
===== Python =====

==== Python 3.7.7 ====

  cd /...
  mkdir /...
  wget https://www.python.org/ftp/python/3.7.7/Python-3.7.7.tar.xz
  tar xfJ Python-3.7.7.tar.xz
  cd Python-3.7.7/
  ./configure --prefix=/...
  make
  make install
==== Python 2.7.18 ====

  cd /...
  mkdir /...
  wget https://www.python.org/ftp/python/2.7.18/Python-2.7.18.tgz
  tar xfz Python-2.7.18.tgz
  cd Python-2.7.18
  ./configure --prefix=/...
  make
  make install
==== Create modules file ====

**PYTHON**

  cd /...
  vi python-2.7.18
<code>
#%Module1.0
proc ModulesHelp { } {
    global dotversion

    puts stderr "..."
}

module-whatis "..."
prepend-path PATH /...
</code>
**CUDA**

  vi /...
<code>
#%Module1.0
proc ModulesHelp { } {
    global dotversion

    puts stderr "..."
}

module-whatis "..."

set          ...

setenv       ...
prepend-path PATH ...
prepend-path LD_LIBRARY_PATH ...
</code>
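A quick check that the CUDA modulefile behaves as intended (nvcc ships with the CUDA toolkit, and cuda-10.2 appears in the module avail listing above):

  module load cuda-10.2
  nvcc --version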
===== GCC =====

This takes a long time!

Commands to run to compile gcc-6.1.0:

  wget https://ftp.gnu.org/gnu/gcc/gcc-6.1.0/gcc-6.1.0.tar.bz2
  tar xfj gcc-6.1.0.tar.bz2
  cd gcc-6.1.0
  ./...
  ./configure --prefix=/...
  make
After some time an error occurs, and the make process stops!

<code>
...
In file included from ../...
./md-unwind-support.h: In function 'x86_64_fallback_frame_state':
./md-unwind-support.h:...: error: dereferencing pointer to incomplete type 'struct ucontext'
     sc = (struct sigcontext *) (void *) &uc_->uc_mcontext;
                                          ^~
../...
</code>
To fix do: [[https://...]]

  vi /.../md-unwind-support.h

and replace the line:

<code>
ucontext_t *uc_ = context->cfa;
</code>

old line: /* struct ucontext *uc_ = context->cfa; */
  make

Next error:

<code>
../...
...
</code>
To fix see: [[https://...]] or [[https://...]]

Amend the files according to the solution above!
Next error:

<code>
...
checking for unzip... unzip
configure: error: cannot find neither zip nor jar, cannot continue
Makefile:...
...
</code>
  apt install unzip zip

and run make again!

  make
Next error:

<code>
...
In file included from ../...
../...
./...
...
</code>
Edit the file:

  vi /...

and apply the same ucontext fix there:

<code>
// kh
ucontext_t *_uc = (ucontext_t *)_p; \
//struct ucontext *_uc = (struct ucontext *)_p; \
// kh
</code>
Next error:

<code php>
...
In file included from ../...
./...
...
../...
...
</code>
===== Examples =====

==== Example mnist ====

A simple example that uses an NVIDIA GPU!

The example consists of the following files:

  * README.md
  * requirements.txt
  * main.job
  * main.py

Create a folder mnist and place the 4 files in there.
  mkdir mnist

  cat README.md

<code>
# Basic MNIST Example

```bash
pip install -r requirements.txt
python main.py
# CUDA_VISIBLE_DEVICES=2 python main.py
```
</code>
  cat requirements.txt

<code>
torch
torchvision
</code>
  cat main.job

<code>
#!/bin/bash

#SBATCH --job-name=mnist
#SBATCH --output=mnist.out
#SBATCH --error=mnist.err

#SBATCH --partition gpu
#SBATCH --gres=gpu
#SBATCH --mem-per-cpu=4gb
#SBATCH --nodes 2
#SBATCH --time=00:...

#SBATCH --ntasks=10

#SBATCH --mail-type=ALL
#SBATCH --mail-user=<email>

ml load miniconda3
python3 main.py
</code>
Replace <email> with your email address!
  cat main.py

<code python>
from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.optim.lr_scheduler import StepLR


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output


def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))


def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


def main():
    # Training settings
    parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
    parser.add_argument('--batch-size', type=int, default=64, metavar='N',
                        help='input batch size for training (default: 64)')
    parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',
                        help='input batch size for testing (default: 1000)')
    parser.add_argument('--epochs', type=int, default=14, metavar='N',
                        help='number of epochs to train (default: 14)')
    parser.add_argument('--lr', type=float, default=1.0, metavar='LR',
                        help='learning rate (default: 1.0)')
    parser.add_argument('--gamma', type=float, default=0.7, metavar='M',
                        help='Learning rate step gamma (default: 0.7)')
    parser.add_argument('--no-cuda', action='store_true', default=False,
                        help='disables CUDA training')
    parser.add_argument('--seed', type=int, default=1, metavar='S',
                        help='random seed (default: 1)')
    parser.add_argument('--log-interval', type=int, default=10, metavar='N',
                        help='how many batches to wait before logging training status')

    parser.add_argument('--save-model', action='store_true', default=False,
                        help='For Saving the current Model')
    args = parser.parse_args()
    use_cuda = not args.no_cuda and torch.cuda.is_available()

    torch.manual_seed(args.seed)

    device = torch.device("cuda" if use_cuda else "cpu")

    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('../data', train=True, download=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.batch_size, shuffle=True, **kwargs)
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('../data', train=False,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ])),
        batch_size=args.test_batch_size, shuffle=True, **kwargs)

    model = Net().to(device)
    optimizer = optim.Adadelta(model.parameters(), lr=args.lr)

    scheduler = StepLR(optimizer, step_size=1, gamma=args.gamma)
    for epoch in range(1, args.epochs + 1):
        train(args, model, device, train_loader, optimizer, epoch)
        test(args, model, device, test_loader)
        scheduler.step()

    if args.save_model:
        torch.save(model.state_dict(), "mnist_cnn.pt")


if __name__ == '__main__':
    main()
</code>
Once you have all files, launch this command on slurm-ctrl:

  sbatch main.job

Check your job with:

  squeue
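squeue only lists pending and running jobs; once the job has finished, the accounting database can be queried instead (the job id placeholder is whatever sbatch printed):

  squeue -u $USER
  sacct -j <jobid> --format=JobID,JobName,State,Elapsed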
----

===== CUDA NVIDIA TESLA Info =====
=== nvidia-smi ===

  root@gpu02:~# watch nvidia-smi

<code>
Every 2.0s: nvidia-smi

Mon Jun 22 17:49:14 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: ...        CUDA Version: ...        |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  ...                      | ...                  | ...                  |
| N/A  ...                      | ...                  | ...                  |
+-------------------------------+----------------------+----------------------+
|   1  ...                      | ...                  | ...                  |
| N/A  ...                      | ...                  | ...                  |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      8627      C   /...                                     ...        |
+-----------------------------------------------------------------------------+
</code>
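For scripting or logging, nvidia-smi can also emit selected fields as CSV instead of the full table (the query fields below are standard nvidia-smi query options):

  nvidia-smi --query-gpu=name,temperature.gpu,utilization.gpu,memory.used --format=csv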
=== deviceQuery ===

To run deviceQuery it is necessary to make it first!

  root@gpu03:~# cd /...
  make

Add the PATH to the system-wide environment:

  vi /etc/environment

Add this to the end:

  /...

Next enable/activate the new PATH:

  source /etc/environment
<code>
root@gpu03:...# ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: "Tesla V100-PCIE-32GB"
  CUDA Driver Version / Runtime Version          ...
  CUDA Capability Major/Minor version number:    ...
  Total amount of global memory:                 ...
  (80) Multiprocessors, ...
  GPU Max Clock rate:                            1380 MHz (1.38 GHz)
  Memory Clock rate:                             877 Mhz
  Memory Bus Width:                              ...
  L2 Cache Size:                                 ...
  Maximum Texture Dimension Size (x,y,z)         ...
  Maximum Layered 1D Texture Size, (num) layers  ...
  Maximum Layered 2D Texture Size, (num) layers  ...
  Total amount of constant memory:               ...
  Total amount of shared memory per block:       ...
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  ...
  Maximum number of threads per block:           ...
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, ...)
  Maximum memory pitch:                          ...
  Texture alignment:                             ...
  Concurrent copy and kernel execution:          ...
  Run time limit on kernels:                     ...
  Integrated GPU sharing Host Memory:            ...
  Support host page-locked memory mapping:       ...
  Alignment requirement for Surfaces:            ...
  Device has ECC support:                        ...
  Device supports Unified Addressing (UVA):      ...
  Device supports Compute Preemption:            ...
  Supports Cooperative Kernel Launch:            ...
  Supports MultiDevice Co-op Kernel Launch:      ...
  Device PCI Domain ID / Bus ID / location ID:   0 / 59 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "Tesla V100-PCIE-32GB"
  CUDA Driver Version / Runtime Version          ...
  CUDA Capability Major/Minor version number:    ...
  Total amount of global memory:                 ...
  (80) Multiprocessors, ...
  GPU Max Clock rate:                            1380 MHz (1.38 GHz)
  Memory Clock rate:                             877 Mhz
  Memory Bus Width:                              ...
  L2 Cache Size:                                 ...
  Maximum Texture Dimension Size (x,y,z)         ...
  Maximum Layered 1D Texture Size, (num) layers  ...
  Maximum Layered 2D Texture Size, (num) layers  ...
  Total amount of constant memory:               ...
  Total amount of shared memory per block:       ...
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  ...
  Maximum number of threads per block:           ...
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, ...)
  Maximum memory pitch:                          ...
  Texture alignment:                             ...
  Concurrent copy and kernel execution:          ...
  Run time limit on kernels:                     ...
  Integrated GPU sharing Host Memory:            ...
  Support host page-locked memory mapping:       ...
  Alignment requirement for Surfaces:            ...
  Device has ECC support:                        ...
  Device supports Unified Addressing (UVA):      ...
  Device supports Compute Preemption:            ...
  Supports Cooperative Kernel Launch:            ...
  Supports MultiDevice Co-op Kernel Launch:      ...
  Device PCI Domain ID / Bus ID / location ID:   0 / 175 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from Tesla V100-PCIE-32GB (GPU0) -> Tesla V100-PCIE-32GB (GPU1) : Yes
> Peer access from Tesla V100-PCIE-32GB (GPU1) -> Tesla V100-PCIE-32GB (GPU0) : Yes

deviceQuery, CUDA Driver = CUDART, ...
Result = PASS
</code>