Nvidia GPU & TensorFlow for ML in Ubuntu 24.04 LTS

Andrew Antonopoulos

Senior Solutions Architect at Sony Professional Solutions Europe

Published: May 13, 2024

TensorFlow announced that it would stop supporting GPUs natively on Windows. The last version with native GPU support was 2.10; from 2.11 onwards, you need to use WSL2, the Windows Subsystem for Linux. With WSL2, you can run Ubuntu or other Linux distros in Windows.

Linux fans, who obviously don't like WSL2, can use TensorFlow with an Nvidia GPU by following the steps below.

A very important step is to know which versions of TensorFlow, Python, CUDA and cuDNN work together. While testing my setup, I used the following versions:

TensorFlow 2.12.0
CUDA Toolkit 11.8
cuDNN SDK 8.6.0
Python 3.11.7
Nvidia GPU drivers (latest - 550.54)

More information about these versions can be found on the following page: TensorFlow supported versions

The GPU that I used was a GeForce RTX 4060 Ti 16GB; you can check which GPU you have with the command:

lspci | grep -i nvidia

which, in my case, returned the following results:

01:00.0 VGA compatible controller: NVIDIA Corporation AD106 [GeForce RTX 4060 Ti 16GB] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 22bd (rev a1)
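If you prefer to do the same check from a script, the lspci output can be filtered in Python. This is only a minimal sketch: the helper name nvidia_devices is hypothetical, and it assumes the model name appears in square brackets, as in the output above.

```python
import re
import shutil
import subprocess


def nvidia_devices(lspci_output):
    """Return the bracketed model names of NVIDIA devices in lspci output."""
    names = []
    for line in lspci_output.splitlines():
        if "nvidia" not in line.lower():
            continue  # not an NVIDIA device
        match = re.search(r"\[([^\]]+)\]", line)
        if match:  # devices without a bracketed name are skipped
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    if shutil.which("lspci") is None:
        print("lspci not found on PATH")
    else:
        output = subprocess.run(["lspci"], capture_output=True, text=True).stdout
        print(nvidia_devices(output))
```

Lines without a bracketed name, such as the audio device above, are ignored on purpose.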

To avoid issues, it is better to remove all previous or old drivers first:

# Quote the patterns so the shell does not expand them against local files
sudo apt purge 'nvidia*' -y
sudo apt remove 'nvidia-*' -y
sudo rm -f /etc/apt/sources.list.d/cuda*
sudo apt autoremove -y && sudo apt autoclean -y
sudo rm -rf /usr/local/cuda*

Then update and upgrade the system:

sudo apt update && sudo apt upgrade -y

You will also need to install some necessary packages:

sudo apt install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev

Now you will need to install the latest GPU drivers:

# First add the graphics-drivers PPA
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update

# Find the recommended driver versions for your hardware
ubuntu-drivers devices

# List the drivers
ubuntu-drivers list

# Install the driver (needs root)
sudo ubuntu-drivers install

At this stage you will need to reboot the PC. After logging in, check that the GPU driver has been installed by using this command:

nvidia-smi

The output shows the Nvidia driver version (550.54 in my case), the GPU and the running processes. During model training, you should see the GPU usage increasing for the Python process. Now you will need to install CUDA 11.8:

# Download the pin file
sudo wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin

# Move it into the apt preferences folder
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600

# Fetch the repository key (this is one command)
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub

# Add the repository (this is also one command)
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"

# Update and upgrade
sudo apt update && sudo apt upgrade -y

# Install CUDA 11.8
sudo apt install cuda-11-8 -y

Then set the paths:

# Add the bin folder to the bashrc file
echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc

# Add the lib64 folder to the bashrc file
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc

# Execute the commands placed in the file
source ~/.bashrc

# Update the shared library cache
sudo ldconfig
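To confirm that the new shell actually picked up both entries, a small Python check can inspect the environment. This is a sketch under the assumption that CUDA lives in /usr/local/cuda-11.8 as above; the function name missing_cuda_paths is hypothetical.

```python
import os

CUDA_HOME = "/usr/local/cuda-11.8"  # install prefix used in this article


def missing_cuda_paths(env, cuda_home=CUDA_HOME):
    """Return the CUDA directories absent from PATH / LD_LIBRARY_PATH."""
    expected = {
        "PATH": f"{cuda_home}/bin",
        "LD_LIBRARY_PATH": f"{cuda_home}/lib64",
    }
    missing = []
    for var, directory in expected.items():
        # Split the colon-separated list and ignore empty entries
        entries = [e.rstrip("/") for e in env.get(var, "").split(":") if e]
        if directory not in entries:
            missing.append(f"{var}: {directory}")
    return missing


if __name__ == "__main__":
    problems = missing_cuda_paths(os.environ)
    print("OK" if not problems else "Missing -> " + ", ".join(problems))
```

Run it in a fresh terminal; if something is reported as missing, the export lines probably did not reach ~/.bashrc or the file has not been sourced yet.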

Now it is time for cuDNN 8.6, but you will first need to register for the Nvidia developer program by using this URL: https://developer.nvidia.com/developer-program/signup

# Store the file name in a variable
CUDNN_TAR_FILE="cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz"

# Download the file
sudo wget "https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0/local_installers/11.8/${CUDNN_TAR_FILE}"

# Extract it
sudo tar -xvf "${CUDNN_TAR_FILE}"

Then copy the files into the appropriate folders:

# Copy the headers and libraries into the CUDA toolkit directory
sudo cp -P cudnn-linux-x86_64-8.6.0.163_cuda11-archive/include/cudnn*.h /usr/local/cuda-11.8/include/

sudo cp -P cudnn-linux-x86_64-8.6.0.163_cuda11-archive/lib/libcudnn* /usr/local/cuda-11.8/lib64/

# Make the files readable by all users
sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*

At this stage, you should be able to see the correct CUDA version (11.8).
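One way to check this, assuming the nvcc compiler from the CUDA toolkit is on your PATH, is to read the release number from the output of nvcc --version. The sketch below does that in Python; the helper name cuda_release is hypothetical, and it assumes the output contains a "release X.Y" fragment.

```python
import re
import shutil
import subprocess


def cuda_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    if shutil.which("nvcc") is None:
        print("nvcc not found on PATH")
    else:
        output = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True
        ).stdout
        print("CUDA release:", cuda_release(output))
```

For this setup the expected result is (11, 8). Note that the CUDA version shown by nvidia-smi is the highest version the driver supports, which is not necessarily the installed toolkit version.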

The remaining task is to install Anaconda with Python; the instructions can be found on this site: Anaconda for Ubuntu 24.04

The required TensorFlow version is 2.12; the following commands install it from within the Anaconda Python 3.11 environment (I also had to install numpy, typing-extensions and pip):

pip install "tensorflow==2.12.*"
pip install numpy==1.24.3
pip install typing-extensions==4.5.0
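One possible cause of a "no matching distribution" error from pip at this step is the Python version: TensorFlow 2.12 wheels target Python 3.8 to 3.11, while Ubuntu 24.04 ships Python 3.12 by default. The sketch below is a small guard you could run in the environment before installing; the function name python_ok_for_tf212 is hypothetical.

```python
import sys

# Assumption: TensorFlow 2.12 wheels are published for CPython 3.8 to 3.11.
SUPPORTED = ((3, 8), (3, 11))


def python_ok_for_tf212(version=None):
    """Return True if the (major, minor) version is in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    low, high = SUPPORTED
    return low <= major_minor <= high


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_ok_for_tf212():
        print(f"Python {current} should work with TensorFlow 2.12")
    else:
        print(f"Python {current} is outside 3.8-3.11; use a 3.11 environment")
```

If the check fails, creating a dedicated Python 3.11 environment in Anaconda and running pip from there is the simplest way forward.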

To verify the whole procedure, you can use the following Python script (also available on GitHub: Python for GPU Check):

# Get the files
git clone https://github.com/gokul-a-krishnan/python-gpu-check

# Access the folder
cd python-gpu-check/
cd tensorflow/

# Execute the script
python3 check.py

The important output from this script is the information for the GPU and the confirmation for the cuDNN version:

2024-05-13 17:57:09.802997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1635] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14030 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4060 Ti, pci bus id: 0000:01:00.0, compute capability: 8.9

2024-05-13 17:57:10.938643: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:424] Loaded cuDNN version 8600
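The number 8600 in the last line is cuDNN's packed version number. As a sketch, and assuming the cuDNN 8.x encoding (major * 1000 + minor * 100 + patch), it can be decoded as shown below. The helper names are hypothetical; tf.config.list_physical_devices is the TensorFlow call for listing the GPUs it can see.

```python
import re


def loaded_cudnn_version(log_text):
    """Decode 'Loaded cuDNN version NNNN' from a log into 'major.minor.patch'.

    Assumes the cuDNN 8.x encoding: major * 1000 + minor * 100 + patch.
    """
    match = re.search(r"Loaded cuDNN version (\d+)", log_text)
    if match is None:
        return None
    major, rest = divmod(int(match.group(1)), 1000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


def visible_gpus():
    """Return the GPU device names TensorFlow sees, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    line = "I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:424] Loaded cuDNN version 8600"
    print(loaded_cudnn_version(line))
```

Calling visible_gpus() in the TensorFlow environment should return a non-empty list when the driver, CUDA and cuDNN are all found; an empty list usually means one of the libraries could not be loaded.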

There are two options to monitor the GPU while you are training models:

Use nvidia-smi and let it refresh every 5 seconds:

nvidia-smi -l 5

Use Mission Center, an amazing tool that you will need to install via Flathub: Mission Center
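Returning to the first option: if you want the numbers in a script (for logging next to your training metrics, for example), nvidia-smi can print selected fields as CSV. The sketch below polls it from Python. The function names are hypothetical, and it assumes the GPU reports numeric values for all three fields.

```python
import shutil
import subprocess
import time

QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_stats(csv_line):
    """Parse one line of `nvidia-smi --query-gpu=... --format=csv,noheader,nounits`."""
    util, used, total = (int(field.strip()) for field in csv_line.split(","))
    return {"util_percent": util, "mem_used_mib": used, "mem_total_mib": total}


def sample_gpus():
    """Return one stats dict per GPU, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return [parse_gpu_stats(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    for _ in range(3):  # three samples, 5 seconds apart
        stats = sample_gpus()
        if stats is None:
            print("nvidia-smi not found on PATH")
            break
        print(stats)
        time.sleep(5)
```

The 5-second interval mirrors the nvidia-smi -l 5 command above; adjust the loop count and interval to suit your training run.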

Hopefully you found this article interesting and it helps you with your ML tasks.

#tensorflow #ubuntu #machinelearning #nvidia

Oybek Hodjaev

programmer at Mexmash

8 months ago

Hi, thank you for this post. On "pip install tensorflow==2.12*" I get an error: "Could not find a version that satisfies the requirement (from versions 2.16.0rc0, 2.16.1, 2.17.0rc" no matching distribution. What can I do?

Oybek Hodjaev

programmer at Mexmash

8 months ago

The path is not sudo wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0.163/local_installers/11.8/cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz; it is: sudo wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.6.0/local_installers/11.8/cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz
