Radeon Open Compute (ROCm)
ROCm is focused on using AMD GPUs to accelerate computational tasks such as machine learning, engineering workloads, and scientific computing.
Links: github / Doc / HN
Confirm You Have a ROCm-Capable GPU (v5.0)
- ROCm officially supports AMD GPUs that use the following chips:
- GFX8 GPUs / “Polaris 10” chips, such as on the AMD Radeon RX 580
- ROCm does not officially support gfx8 devices. It works on some of them, but it is not officially tested or validated there.
- Installation of ROCm and TensorFlow on Ubuntu 20.04 LTS for Radeon RX580 - 2023
$ sudo lshw -class display
*-display
description: VGA compatible controller
product: Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:07:00.0
logical name: /dev/fb0
version: e7
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi vga_controller bus_master cap_list rom fb
configuration: depth=32 driver=amdgpu latency=0 resolution=2560,1440
resources: irq:72 memory:e0000000-efffffff memory:f0000000-f01fffff ioport:e000(size=256) memory:fcf00000-fcf3ffff memory:c0000-dffff
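ROCm runs on top of the amdgpu kernel driver, which the listing above reports as driver=amdgpu. As a minimal sketch, the driver name can be pulled out of the lshw output with a small filter. The function name display_driver is hypothetical and not part of lshw or ROCm.

```shell
# Hypothetical helper: print the kernel driver named in lshw's
# "configuration:" line(s). Reads lshw output from stdin.
display_driver() {
    sed -n 's/.*configuration:.*driver=\([^ ]*\).*/\1/p'
}

# Usage (requires lshw and root privileges):
#   sudo lshw -class display | display_driver
```

If this prints something other than amdgpu, the GPU is probably bound to a different driver and the later checks are likely to come up empty.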
ROCm Installation
ROCm Installation Guide v5.4.2 - Ubuntu v22.04 - current setup
$ mkdir ROCm && cd ROCm
$ wget https://repo.radeon.com/amdgpu-install/5.4.2/ubuntu/jammy/amdgpu-install_5.4.50402-1_all.deb
$ sudo apt-get install ./amdgpu-install_5.4.50402-1_all.deb
# then run the installer (lengthy download)
$ sudo amdgpu-install --usecase=rocm
Testing the ROCm Installation
$ apt show rocm-libs -a
After restarting the system, run the following commands to verify that the ROCm installation is successful. If you see your GPUs listed, you are good to go!
$ /opt/rocm/bin/rocminfo
# lists the CPU and GPU agents found
ROCk module is loaded
...
Agent 1
Name: AMD Ryzen 5 3600 6-Core Processor
...
Agent 2
...
Name: gfx803
Marketing Name: Radeon RX 580 Series
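To see at a glance which GPU targets rocminfo detected, the gfx names can be filtered out of its output. This is only a sketch: gfx_targets is a hypothetical helper, and it assumes the agent name appears on a line whose first field is exactly "Name:", as in the listing above.

```shell
# Hypothetical helper: print the unique gfx target names found in
# rocminfo output read from stdin. "Marketing Name:" lines are skipped
# because their first field is "Marketing", not "Name:".
gfx_targets() {
    awk '$1 == "Name:" && $2 ~ /^gfx/ { print $2 }' | sort -u
}

# Usage:
#   /opt/rocm/bin/rocminfo | gfx_targets
```

An empty result would suggest that only CPU agents were found.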
$ /opt/rocm/opencl/bin/clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (3513.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 0
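The AMD platform reports 0 devices. Installing mesa-opencl-icd adds the Mesa Clover platform, which does detect the GPU (the AMD platform still reports 0 devices):
$ sudo apt install mesa-opencl-icd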
$ /opt/rocm/opencl/bin/clinfo
Number of platforms: 2
...
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 0
Platform Name: Clover
Number of devices: 1
Platform ID: 0x7faa6198fb40
Name: Radeon RX 580 Series (polaris10, LLVM 13.0.1, DRM 3.48, 5.15.0-58-l
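Since the interesting part of the clinfo output is which platform actually exposes a device, a short filter can condense it to one line per platform. This is a rough sketch under the assumption that each "Number of devices" line is preceded by the matching "Platform Name" line, as in the output above. The function name opencl_devices_per_platform is hypothetical.

```shell
# Hypothetical helper: summarize clinfo output (stdin) as
# "<platform name>: <device count>", one line per platform.
opencl_devices_per_platform() {
    awk -F: '
        /Platform Name/     { name = $2; sub(/^[ \t]+/, "", name) }
        /Number of devices/ { n = $2; gsub(/[ \t]/, "", n); print name ": " n }
    '
}

# Usage:
#   /opt/rocm/opencl/bin/clinfo | opencl_devices_per_platform
```

A platform name that itself contains a colon would be cut short, which is acceptable for a quick check but not for anything more robust.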
$ rocm-smi
$ radeontop # display some GPU metrics
Consider nvtop as an alternative monitor:
$ nvtop
ROCm Libraries
Install the HIP libraries to compile code against ROCm. Without them the build fails with “fatal error: ‘hipblas/hipblas.h’ file not found”.
$ sudo apt install rocm-hip-libraries
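A quick way to confirm that the package fixed the missing header is to test for the file directly. The sketch below assumes headers live under an include/ directory inside the ROCm root (/opt/rocm here, matching the paths used elsewhere in these notes). Both the layout and the helper name has_rocm_header are assumptions.

```shell
# Hypothetical helper: succeed if a header exists under <root>/include.
#   $1: header path relative to include/ (e.g. hipblas/hipblas.h)
#   $2: ROCm root, assumed to default to /opt/rocm
has_rocm_header() {
    [ -f "${2:-/opt/rocm}/include/$1" ]
}

# Usage:
#   has_rocm_header hipblas/hipblas.h && echo "hipBLAS header found"
```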
ROCm Examples
- missing lib hipcub => TBD
Older notes
- In this guide we stick to ROCm 3.5.1 (AMD dropped official support for the RX580). PyTorch: 1.7.0a0+898eb06; ROCm version: 3.5.1
Kernel installation for GPU support
Use a Docker image so that the Python/TensorFlow versions stay isolated from the host.
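A container needs access to the GPU device nodes, which is usually done by passing /dev/kfd and /dev/dri through and adding the video group. The sketch below only prints such a command so it can be reviewed before running. The helper name rocm_docker_cmd is hypothetical, and the default image rocm/tensorflow is an illustrative assumption, not a tested choice for the RX 580.

```shell
# Hypothetical helper: print (not run) a docker command that exposes
# the AMD GPU device nodes to a container.
#   $1: image name; "rocm/tensorflow" is only an illustrative default
rocm_docker_cmd() {
    echo "docker run -it --device=/dev/kfd --device=/dev/dri --group-add video ${1:-rocm/tensorflow}"
}

# Usage:
#   rocm_docker_cmd                # print the command with the default image
#   eval "$(rocm_docker_cmd)"      # run it once it looks right
```

Pinning an image tag that matches the installed ROCm version is likely to matter here, but which tag works is not covered by these notes.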
Alternatives
PlaidML
As a component under Keras, PlaidML can accelerate training workloads with customized or automatically-generated Tile code. It works especially well on GPUs, and it doesn’t require use of CUDA/cuDNN on Nvidia hardware, while achieving comparable performance.
OpenCL
- using AMD GPU - OpenCL