Nvidia GPU & TensorFlow for ML in Ubuntu 24.04 LTS
Andrew Antonopoulos
Senior Solutions Architect at Sony Professional Solutions Europe
TensorFlow announced that it would stop supporting GPUs natively on Windows. The last version with native support was 2.10; from 2.11 onwards, you need to use WSL2 (Windows Subsystem for Linux 2), which lets you run Ubuntu or other Linux distros inside Windows.
Linux fans, who obviously don't like WSL2, can use TensorFlow with an Nvidia GPU by following the steps below.
A very important step is to know which versions of TensorFlow, Python, CUDA and cuDNN work together. While I was testing my setup, I used TensorFlow 2.12, CUDA 11.8 and cuDNN 8.6, with Python installed through Anaconda.
More information about compatible versions can be found at the following URL: Tensorflow supported versions
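Before installing anything, it can help to check that the Python interpreter fits the TensorFlow release. The following is a minimal sketch rather than part of my original setup: the names TF_2_12 and python_is_supported are hypothetical, the CUDA and cuDNN values are the ones used in this article, and the Python range (3.8 to 3.11) is an assumption taken from TensorFlow's tested build configurations for 2.12.

```python
import sys

# Configuration assumed for TensorFlow 2.12. CUDA and cuDNN are the versions
# used in this article; the Python range is an assumption taken from
# TensorFlow's tested build configurations.
TF_2_12 = {
    "python_min": (3, 8),
    "python_max": (3, 11),
    "cuda": "11.8",
    "cudnn": "8.6",
}


def python_is_supported(version: tuple[int, int], config: dict = TF_2_12) -> bool:
    """Return True if (major, minor) lies inside the supported Python range."""
    return config["python_min"] <= version <= config["python_max"]


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    status = "supported" if python_is_supported(current) else "not supported"
    print(f"Python {current[0]}.{current[1]} is {status} by TensorFlow 2.12")
```

Ubuntu 24.04 ships Python 3.12 by default, so you will likely need a separate environment with an older interpreter (for example through Anaconda) for pip to find TensorFlow 2.12.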
The GPU that I used was a GeForce RTX 4060 Ti 16GB. You can check which GPU you have with the command:
lspci | grep -i nvidia
which, in my case, returned the following results:
01:00.0 VGA compatible controller: NVIDIA Corporation AD106 [GeForce RTX 4060 Ti 16GB] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 22bd (rev a1)
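If you want to run the same check from a script, the filtering can be reproduced in Python. This is only a sketch: the helper names find_nvidia_devices and read_lspci are hypothetical, and the code simply filters whatever text lspci prints.

```python
import shutil
import subprocess


def find_nvidia_devices(lspci_output: str) -> list[str]:
    """Return the lspci lines that mention NVIDIA (case-insensitive)."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if "nvidia" in line.lower()
    ]


def read_lspci() -> str:
    """Run lspci if it is available; return an empty string otherwise."""
    if shutil.which("lspci") is None:
        return ""
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    for device in find_nvidia_devices(read_lspci()):
        print(device)
```

An empty result means that lspci is missing or that no Nvidia device was detected on the PCI bus.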
To avoid issues, it is better to remove all previous or old drivers:
sudo apt purge 'nvidia*' -y
sudo apt remove 'nvidia-*' -y
sudo rm -f /etc/apt/sources.list.d/cuda*
sudo apt autoremove -y && sudo apt autoclean -y
sudo rm -rf /usr/local/cuda*
and update/upgrade the system:
sudo apt update && sudo apt upgrade -y
Additionally, you will need to install some necessary packages:
sudo apt install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
Now you will need to install the latest GPU drivers:
# First add the graphics drivers PPA repository
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
# Find the recommended driver versions for your GPU
ubuntu-drivers devices
# List the drivers
ubuntu-drivers list
# Install the driver
sudo ubuntu-drivers install
At this stage you will need to reboot the PC. After logging in, check that the GPU driver has been installed by using this command:
nvidia-smi
which should return a table with the driver and GPU details.
In that output, you can see the Nvidia driver (550.54 in my case), the GPU and the running processes. During model training, you should see the GPU usage increasing for the Python process. Now you will need to install CUDA 11.8 (the commands below use Nvidia's repository for Ubuntu 22.04):
# Download the pin file
sudo wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
# Move it into the apt preferences folder
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
# This is one command
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
# This is also one command
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
# Update and upgrade
sudo apt update && sudo apt upgrade -y
# installing CUDA-11.8
sudo apt install cuda-11-8 -y
and set the paths:
# Add the bin folder to the bashrc file
echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc
# Add the lib64 folder to the bashrc file
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
# Execute the commands placed in the file
source ~/.bashrc
# Update the shared library cache
sudo ldconfig
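To confirm that the new shell session really picked up both variables, a small check like the one below can help. It is a sketch under stated assumptions: the helper name dir_on_path is hypothetical, and the two directories are the CUDA 11.8 locations used in this article.

```python
import os


def dir_on_path(path_value: str, directory: str) -> bool:
    """Return True if directory is one of the entries of a PATH-style string."""
    wanted = os.path.normpath(directory)
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    # Directories assumed from the export lines added to ~/.bashrc
    checks = {
        "PATH": "/usr/local/cuda-11.8/bin",
        "LD_LIBRARY_PATH": "/usr/local/cuda-11.8/lib64",
    }
    for variable, directory in checks.items():
        found = dir_on_path(os.environ.get(variable, ""), directory)
        print(f"{variable}: {directory} {'found' if found else 'missing'}")
```

If one of them is reported as missing, open a new terminal or run source ~/.bashrc again before continuing.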
Now it is time for cuDNN 8.6. You will need to register for the Nvidia developer program by using this URL: https://developer.nvidia.com/developer-program/signup
# Store the file name in a variable
CUDNN_TAR_FILE="cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz"
# Download the file
sudo wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0/local_installers/11.8/cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz
# Extract the archive
sudo tar -xvf "${CUDNN_TAR_FILE}"
and copy the files into the appropriate folders:
# Copy the headers and libraries into the CUDA toolkit directory
sudo cp -P cudnn-linux-x86_64-8.6.0.163_cuda11-archive/include/cudnn*.h /usr/local/cuda-11.8/include/
sudo cp -P cudnn-linux-x86_64-8.6.0.163_cuda11-archive/lib/libcudnn* /usr/local/cuda-11.8/lib64/
# Make them readable for all users
sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*
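One way to double-check which cuDNN release was unpacked is to read the version macros from the header cudnn_version.h, which is part of the cuDNN 8 archive. The snippet below is a sketch: the function name parse_cudnn_version is hypothetical, and the header path assumes the archive was extracted in the current folder.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text: str):
    """Extract (major, minor, patch) from the text of cudnn_version.h, or None."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


if __name__ == "__main__":
    # Assumed location: the include folder of the extracted archive
    header = Path("cudnn-linux-x86_64-8.6.0.163_cuda11-archive/include/cudnn_version.h")
    if header.is_file():
        print(parse_cudnn_version(header.read_text()))
    else:
        print(f"{header} not found")
```

For the archive used here, the result should start with 8 and 6, matching cuDNN 8.6.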
At this stage, you should be able to see the correct CUDA version (11.8), for example with nvcc --version.
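The CUDA toolkit version can also be read from a script by parsing the output of nvcc --version, which contains a "release X.Y" field. This is a minimal sketch and the function name parse_nvcc_release is hypothetical; it assumes nvcc is on the PATH after the exports added to ~/.bashrc.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output: str):
    """Return the 'release X.Y' value from nvcc --version output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    if shutil.which("nvcc") is None:
        print("nvcc not found on PATH")
    else:
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True, check=False
        )
        print("CUDA toolkit release:", parse_nvcc_release(result.stdout))
```

Keep in mind that nvidia-smi reports the highest CUDA version the driver supports, which can differ from the installed toolkit that nvcc reports.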
The remaining task is to install Anaconda with Python; you can find the instructions on this site: Anaconda for Ubuntu 24.04
The required TensorFlow version is 2.12; the following commands install it (I also had to install numpy, typing-extensions and pip):
pip install "tensorflow==2.12.*"
pip install numpy==1.24.3
pip install typing-extensions==4.5.0
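After the installation, the versions that ended up in the environment can be compared with the ones requested. The following sketch uses importlib.metadata from the standard library; the names EXPECTED, installed_version and matches_pin are hypothetical, and the comparison is a simple prefix match rather than a full version-specifier check.

```python
from importlib import metadata

# Versions requested in this article; TensorFlow is matched on the "2.12." prefix
EXPECTED = {
    "tensorflow": "2.12.",
    "numpy": "1.24.3",
    "typing-extensions": "4.5.0",
}


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(found, expected_prefix: str) -> bool:
    """Return True if a version was found and starts with the expected prefix."""
    return found is not None and found.startswith(expected_prefix)


if __name__ == "__main__":
    for package, expected in EXPECTED.items():
        found = installed_version(package)
        state = "ok" if matches_pin(found, expected) else "check"
        print(f"{package}: installed={found}, expected prefix={expected} -> {state}")
```

Run it with the same Python interpreter (for example, inside the same Anaconda environment) that you used for pip, otherwise it will inspect a different set of packages.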
To verify the whole procedure, you can use the following Python scripts, which can also be found on GitHub (Python for GPU Check):
# Get the files
git clone https://github.com/gokul-a-krishnan/python-gpu-check
# Access the folder
cd python-gpu-check/
cd tensorflow/
# Execute the script
python3 check.py
The important output from this script is the information for the GPU and the confirmation for the cuDNN version:
2024-05-13 17:57:09.802997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14030 MB memory: -> device: 0, name: NVIDIA GeForce RTX 4060 Ti, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-05-13 17:57:10.938643: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:424] Loaded cuDNN version 8600
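The number 8600 in the last line is how cuDNN 8.x encodes its version: major * 1000 + minor * 100 + patch, so 8600 stands for 8.6.0. The sketch below decodes that value and, if TensorFlow is importable, lists the GPUs with tf.config.list_physical_devices. The helper names decode_cudnn_version and cudnn_from_log are hypothetical, and this is not the script from the repository above.

```python
import re


def decode_cudnn_version(code: int) -> tuple[int, int, int]:
    """Split a cuDNN 8.x version code (major*1000 + minor*100 + patch)."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def cudnn_from_log(line: str):
    """Find 'Loaded cuDNN version N' in a TensorFlow log line and decode N."""
    match = re.search(r"Loaded cuDNN version (\d+)", line)
    return decode_cudnn_version(int(match.group(1))) if match else None


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # An empty list means TensorFlow cannot see any GPU
        print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))
```

If the list of GPUs is empty although nvidia-smi shows the card, the usual suspects are the CUDA and cuDNN versions or the LD_LIBRARY_PATH setting.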
There are two options to monitor the GPU while you are training models. The first is nvidia-smi, refreshed here every 5 seconds:
nvidia-smi -l 5
The second is Mission Center, an amazing graphical tool for monitoring system resources.
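If you prefer to log the nvidia-smi numbers from a script (for example next to your training metrics), its CSV query mode is easier to parse than the default table. This is a sketch only: the names QUERY, parse_gpu_stats and read_gpu_stats are hypothetical, and it relies on the --query-gpu and --format=csv,noheader,nounits options of nvidia-smi.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi: utilisation in percent, memory in MiB
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_text: str) -> list[dict]:
    """Parse 'csv,noheader,nounits' rows into one dict per GPU."""
    stats = []
    for row in csv_text.strip().splitlines():
        fields = [field.strip() for field in row.split(",")]
        if len(fields) != 3:
            continue  # skip anything that is not a complete row
        try:
            util, used, total = (int(field) for field in fields)
        except ValueError:
            continue  # skip rows with non-numeric values such as [N/A]
        stats.append({"util_percent": util, "used_mib": used, "total_mib": total})
    return stats


def read_gpu_stats() -> list[dict]:
    """Query nvidia-smi once; return an empty list if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_stats(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for index, gpu in enumerate(read_gpu_stats()):
        print(f"GPU {index}: {gpu['util_percent']}% busy, "
              f"{gpu['used_mib']}/{gpu['total_mib']} MiB")
```

Calling read_gpu_stats in a loop with a short sleep gives roughly the same view as nvidia-smi -l 5, but in a form you can write to a file.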
Hopefully, you found this article interesting and it helps you with your ML tasks.
programmer at Mexmash
8 months ago: Hi, thank you for this post. On "pip install tensorflow==2.12*" I get an error: "Could not find a version that satisfies the requirement (from versions 2.16.0rc0, 2.16.1, 2.17.0rc" no matching distribution. What can I do?
programmer at Mexmash
8 months ago: The path is not sudo wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0.163/local_installers/11.8/cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz. It is: sudo wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0/local_installers/11.8/cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz