PyTorch
Free, open-source machine learning framework from Facebook (Meta), based on the Torch library. - Home / GitHub
see also
- PyTorch 2.0 - Get Started
Install
Docker images
You can pull a pre-built Docker image from Docker Hub and run it with docker v19.03+ (the example below uses podman):
$ podman run --gpus all --rm -ti --ipc=host -v /home/yves/DEV/:/workspace pytorch/pytorch:latest
PyTorch uses shared memory to share data between processes, so if torch multiprocessing is used (e.g. for data loaders with multiple workers), the default shared memory segment size that the container runs with is not enough. Increase it with --ipc=host, as in the command above, or with --shm-size.
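To see how much shared memory a container actually has, the size of /dev/shm can be checked from Python. This is a minimal sketch using only the standard library. The helper name shm_size_mib is hypothetical, and it assumes shared memory is mounted at /dev/shm, the usual location on Linux.

```python
import os
import shutil


def shm_size_mib(path="/dev/shm"):
    """Return the total size of the shared memory mount in MiB, or None if absent."""
    # Assumption: shared memory is mounted at /dev/shm, as on most Linux systems.
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.total / (1024 * 1024)


if __name__ == "__main__":
    size = shm_size_mib()
    if size is None:
        print("/dev/shm not found")
    else:
        print(f"/dev/shm total size: {size:.0f} MiB")
```

Run inside the container, this typically reports 64 MiB with the container runtime's default setting, and the host's value when started with --ipc=host.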
Install with pip, with ROCm as a prerequisite.
# 1.3GB Download
$ pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.2
...
Successfully installed torch-1.13.1+rocm5.2 torchaudio-0.13.1+rocm5.2 torchvision-0.14.1+rocm5.2 typing-extensions-4.4.0
Verify installation
Using the Python REPL: run python3, then copy and paste the code below.
import torch
x = torch.rand(5, 3)
print(x)
=> should output a tensor
Additionally, to check whether your GPU driver and CUDA/ROCm are enabled and accessible by PyTorch, run the following commands (the ROCm build of PyTorch uses the same torch.cuda semantics at the Python API level):
torch.cuda.is_available() # Does PyTorch see any GPUs?
torch.cuda.device_count()
torch.cuda.current_device()
torch.cuda.get_device_name(0)
#
torch.rand(10).device # Are tensors stored on GPU by default?
torch.set_default_tensor_type(torch.cuda.FloatTensor) # Set default tensor type to CUDA
=> the first four calls should return True / 1 / 0 / 'AMD Radeon RX 580 Series'
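The checks above can also be gathered into one script, so the result is easy to paste into a bug report. This is a sketch only: gpu_report and its dictionary keys are hypothetical names, and it uses just the torch.cuda calls already listed. The import is guarded so the script still reports something when torch is not installed.

```python
def gpu_report():
    """Collect the installation checks into one dict; works even if torch is missing."""
    report = {
        "torch_installed": False,
        "gpu_available": False,
        "device_count": 0,
        "device_names": [],
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment; return the defaults.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # On a ROCm build, torch.cuda.* reports the AMD GPU.
    if torch.cuda.is_available():
        report["gpu_available"] = True
        report["device_count"] = torch.cuda.device_count()
        report["device_names"] = [
            torch.cuda.get_device_name(i) for i in range(report["device_count"])
        ]
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```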
If it crashes with “hipErrorNoBinaryForGpu: Unable to find code object for all current devices!”, set the override in the shell before starting Python: export HSA_OVERRIDE_GFX_VERSION=10.3.0
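The same override can be set from inside a script instead of the shell. The sketch below assumes the ROCm runtime reads the variable when it initializes, which is why the variable is set before torch is imported. apply_gfx_override is a hypothetical helper name, and 10.3.0 is the value from the note above.

```python
import os


def apply_gfx_override(version="10.3.0"):
    """Set HSA_OVERRIDE_GFX_VERSION unless it is already set; return the value in effect."""
    # setdefault keeps any value that was already exported in the shell.
    return os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", version)


if __name__ == "__main__":
    print("HSA_OVERRIDE_GFX_VERSION =", apply_gfx_override())
    # Assumption: import torch only after this point, so the runtime
    # sees the variable when it starts up.
```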
Benchmarking
Monitor in one console with watch -n 1 rocm-smi, and with htop in another.
import torch
from torchvision.models import efficientnet_b0
from pytorch_benchmark import benchmark
torch.set_default_tensor_type(torch.cuda.FloatTensor) # enable GPU (warning: this crashes the GPU)
model = efficientnet_b0()
sample = torch.randn(8, 3, 224, 224) # (B, C, H, W)
results = benchmark(model, sample, num_runs=100)
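As a cross-check on the benchmark above, a model call can also be timed by hand. This is a rough sketch, not a replacement for pytorch_benchmark: time_runs and its parameters are hypothetical names, and the helper uses only the standard library so it can time any callable. The commented usage shows explicit device placement instead of set_default_tensor_type; whether that avoids the crash noted above is untested.

```python
import statistics
import time


def time_runs(fn, num_runs=100, warmup=10, sync=None):
    """Time fn() over num_runs calls and return simple statistics in seconds.

    sync is an optional callable run after each call, e.g. a function that
    waits for pending GPU work so the measured time includes it.
    """
    for _ in range(warmup):  # warm-up calls are not measured
        fn()
    if sync is not None:
        sync()

    timings = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        timings.append(time.perf_counter() - start)

    return {
        "runs": num_runs,
        "mean_s": statistics.mean(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


# Possible usage with PyTorch (requires torch and torchvision):
#   model = efficientnet_b0().to("cuda").eval()
#   sample = torch.randn(8, 3, 224, 224, device="cuda")
#   with torch.no_grad():
#       print(time_runs(lambda: model(sample), sync=torch.cuda.synchronize))

if __name__ == "__main__":
    print(time_runs(lambda: sum(range(10_000)), num_runs=20, warmup=2))
```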