00-System environment

  • Operating system: Ubuntu 22.04.4 LTS
  • GPU: RTX 3090 (a 2080 Ti also works)
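Before looking up a driver, it can help to confirm the OS release and GPU model on the target machine. The following is a minimal sketch, not part of the original steps: the helper names `os_pretty_name` and `nvidia_devices` are hypothetical, and it assumes `/etc/os-release` exists and that `lspci` (from pciutils) is installed.

```shell
#!/usr/bin/env bash
# Print PRETTY_NAME from an os-release style file (defaults to /etc/os-release).
os_pretty_name() {
    local file="${1:-/etc/os-release}"
    # Source in a subshell so the file's variables do not leak into this shell.
    ( . "$file" && printf '%s\n' "$PRETTY_NAME" )
}

# Filter `lspci` output (read from stdin) down to NVIDIA devices.
nvidia_devices() {
    grep -i 'nvidia' || true
}

# Only run the checks where the required file and tool are available.
if [ -r /etc/os-release ]; then
    echo "OS: $(os_pretty_name)"
fi
if command -v lspci >/dev/null 2>&1; then
    lspci | nvidia_devices
fi
```

The GPU name printed by `lspci` is the one to enter on the driver lookup page.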

01-Download the driver

Open https://www.nvidia.cn/drivers/lookup/, look up the driver that matches your GPU and operating system, and download it.

The downloaded file is NVIDIA-Linux-x86_64-550.127.05.run

According to its supported-products list, this driver covers 20-series, 30-series, and 40-series cards.
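The version embedded in the file name is what `nvidia-smi` should report later, so it is handy to extract it once. A minimal sketch follows; `driver_version_from_filename` is a hypothetical helper, and the integrity check relies on the `--check` option of the self-extracting .run archive, which is not used in the original steps.

```shell
#!/usr/bin/env bash
# Extract the driver version from a .run installer file name.
driver_version_from_filename() {
    local name="${1##*/}"        # drop any directory part
    name="${name%.run}"          # drop the .run suffix
    printf '%s\n' "${name##*-}"  # keep what follows the last '-'
}

installer=NVIDIA-Linux-x86_64-550.127.05.run
echo "Driver version: $(driver_version_from_filename "$installer")"

# If the file is in the current directory, verify the embedded archive.
if [ -f "$installer" ]; then
    sh "$installer" --check
fi
```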

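02-Pre-installation preparation

  • Upload the driver file
    Upload the driver file downloaded in the previous step to the server and make it executable: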
sudo chmod a+x NVIDIA-Linux-x86_64-550.127.05.run
  • Install gcc
sudo apt update
sudo apt install build-essential

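If a configuration dialog appears during installation, keep every option at its default and press Tab to select OK.

Verify the installation:

gcc --version
# Output like the following means gcc is installed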
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
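  • Uninstall any old driver

Run this step whether or not an NVIDIA driver was installed before:

# Also use this to clean up after a failed installation.
# The pattern is quoted so the shell passes it to apt-get unexpanded.
sudo apt-get remove --purge '^nvidia-.*'
# Make sure nothing is left behind
sudo sh NVIDIA-Linux-x86_64-550.127.05.run --uninstall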
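  • Disable the nouveau driver
    Nouveau is an open-source graphics driver for NVIDIA cards, developed by a third-party community. Many Linux distributions ship it by default so that users can reach the desktop right after installing the system. Because it interferes with installing NVIDIA's official driver, disable it first.
sudo vi /etc/modprobe.d/blacklist.conf

Append the following lines at the end of the file: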

blacklist nouveau
options nouveau modeset=0
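  • Rebuild the initramfs and reboot the server
    The blacklist only takes effect at boot once the initramfs has been regenerated.
sudo update-initramfs -u
sudo reboot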
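After the reboot, a quick pre-flight check can confirm that the machine is ready for the installer. This is a minimal sketch with hypothetical helper names (`nouveau_loaded`, `kernel_headers_present`). The header check is an addition that the original steps do not mention: the installer compiles a kernel module, so it assumes headers for the running kernel are reachable under `/lib/modules/<release>/build`.

```shell
#!/usr/bin/env bash
# Succeeds if `lsmod` output on stdin lists the nouveau module.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Succeeds if the build directory for the given kernel release exists.
# The optional second argument overrides /lib/modules (useful for testing).
kernel_headers_present() {
    local release="$1" root="${2:-/lib/modules}"
    [ -d "$root/$release/build" ]
}

if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded: recheck the blacklist and the initramfs"
    else
        echo "nouveau is not loaded"
    fi
fi

if kernel_headers_present "$(uname -r)"; then
    echo "kernel headers found"
else
    echo "kernel headers missing, try: sudo apt install linux-headers-$(uname -r)"
fi
```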

03-Install the driver

Run the following commands:

# Set the locale before launching the installer
export LANG=C.UTF-8
export LANGUAGE=en


sudo ./NVIDIA-Linux-x86_64-550.127.05.run --no-opengl-files --no-x-check


1. There appears to already be a driver installed on your system (version:      
  390.42).  As part of installing this driver (version: 390.42), the existing  
  driver will be uninstalled.  Are you sure you want to continue?

                 Continue installation      Abort installation 

(Select Continue installation)

2. Nouveau can usually be disabled by adding files to the modprobe configuration directories and rebuilding the initramfs.

  Would you like nvidia-installer to attempt to create these modprobe configuration files for you?

                             No                                  Yes                                                               
(Select Yes)

3. One or more modprobe configuration files to disable Nouveau have been written.  You will need to reboot your system and possibly rebuild the initramfs before these changes can take effect.  Note
  if you later wish to reenable Nouveau, you will need to delete these files: /usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf, /etc/modprobe.d/nvidia-installer-disable-nouveau.conf

                                           OK
(Select OK, then reboot)

4. nvidia-installer is not able to perform some of the sanity checks which detect potential installation problems while Nouveau is loaded. Would you like to continue installation without these sanity
  checks, or abort installation, confirm that Nouveau has been properly disabled, and attempt installation again later?

                    Abort installation                      Continue installation                          
(Select Continue installation)

5. Install NVIDIA's 32-bit compatibility libraries?

                        No                                           Yes
(Select No)

6. The initramfs will likely need to be rebuilt due to the following condition(s):
  * nvidia-installer attempted to disable Nouveau.
  * Nouveau is present in the initramfs.

  Would you like to rebuild the initramfs?

                 Do not rebuild initramfs                      Rebuild initramfs
(Select Rebuild initramfs)

7. Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X?  Any pre-existing X configuration
  file will be backed up.

                       No                                           Yes
(Select Yes)

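Note: only the prompts that offer a choice are listed above. For any other prompt or warning that offers a single option, accept the default.

When the installer finishes, reboot the server: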

sudo reboot

04-Verify the driver

nvidia-smi

# Output similar to the following means the installation succeeded
Mon Nov  4 09:41:56 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.05             Driver Version: 550.127.05     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:1B:00.0 Off |                  N/A |
| 30%   38C    P0            107W /  350W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off |   00000000:3D:00.0 Off |                  N/A |
| 30%   34C    P0             95W /  350W |       1MiB /  24576MiB |      1%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 3090        Off |   00000000:3E:00.0 Off |                  N/A |
| 30%   38C    P0            104W /  350W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 3090        Off |   00000000:88:00.0 Off |                  N/A |
| 30%   36C    P0            104W /  350W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA GeForce RTX 3090        Off |   00000000:89:00.0 Off |                  N/A |
| 30%   38C    P0            100W /  350W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA GeForce RTX 3090        Off |   00000000:B1:00.0 Off |                  N/A |
| 30%   37C    P0            109W /  350W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA GeForce RTX 3090        Off |   00000000:B2:00.0 Off |                  N/A |
| 30%   36C    P0            102W /  350W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
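The table above is meant for reading. For a scripted check, `nvidia-smi` can also emit CSV via `--query-gpu` and `--format=csv,noheader`. The sketch below assumes that output shape (fields separated by a comma and a space). `all_gpus_on_driver` is a hypothetical helper that succeeds only if at least one GPU is listed and every GPU reports the expected driver version.

```shell
#!/usr/bin/env bash
# Read "index, name, driver_version" rows on stdin and compare the
# third field of every row against the expected version in $1.
all_gpus_on_driver() {
    awk -F', ' -v want="$1" '
        { seen++; if ($3 != want) bad++ }
        END { exit !(seen > 0 && bad == 0) }
    '
}

expected=550.127.05
if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-gpu=index,name,driver_version --format=csv,noheader \
            | all_gpus_on_driver "$expected"; then
        echo "all GPUs report driver $expected"
    else
        echo "no GPU found, or a GPU reports a different driver version"
    fi
fi
```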

FAQ

1. Installation error: ERROR: An error occurred while performing the step: "Building kernel modules". See /var/log/nvidia-installer.log for details.

Cause: /var/log/nvidia-installer.log shows that the build failed because of a gcc version mismatch.
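The installer log is long, so filtering it helps locate the relevant lines. A minimal sketch: `installer_log_hints` is a hypothetical helper, and the search pattern (lines mentioning "error" or "compiler") is an assumption about what is worth reading, not a list of messages the installer is guaranteed to print.

```shell
#!/usr/bin/env bash
# Print, with line numbers, the log lines that mention an error or the compiler.
installer_log_hints() {
    local log="${1:-/var/log/nvidia-installer.log}"
    grep -n -i -E 'error|compiler' "$log"
}

if [ -r /var/log/nvidia-installer.log ]; then
    installer_log_hints || true   # grep exits non-zero when nothing matches
fi
```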

Fix: upgrade gcc to version 12

sudo apt install gcc-12

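During the upgrade it turned out that gcc-12 was already installed on this machine; the system's default gcc had simply not been switched over. Manually repoint the gcc symlink at the gcc-12 binary: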

sudo ln -s -f /usr/bin/gcc-12 /usr/bin/gcc

Run the following to confirm that gcc now reports version 12:

gcc --version
gcc (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0
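To spot this problem before running the installer, the default gcc can be compared with the compiler recorded in `/proc/version`. This sketch assumes the mismatch is between the compiler that built the running kernel and the default `gcc`, and that `/proc/version` uses the layout seen on Ubuntu kernels; `kernel_gcc_major` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Extract the major GCC version from a /proc/version style string on stdin.
# It takes the x.y.z number that follows the parenthesised distro tag after "gcc".
kernel_gcc_major() {
    sed -nE 's/.*gcc[^)]*\) ([0-9]+)\.[0-9]+\.[0-9]+.*/\1/p'
}

if [ -r /proc/version ] && command -v gcc >/dev/null 2>&1; then
    kernel_major=$(kernel_gcc_major < /proc/version)
    default_major=$(gcc -dumpversion | cut -d. -f1)
    echo "kernel built with gcc ${kernel_major:-unknown}, default gcc is $default_major"
    if [ -n "$kernel_major" ] && [ "$kernel_major" != "$default_major" ]; then
        echo "mismatch: switch the default gcc to gcc-$kernel_major before installing"
    fi
fi
```

Note that /usr/bin/gcc belongs to the gcc package, so a later package upgrade may point the symlink back at the distribution default.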