Environment Setup#
Below are setup instructions for different operating systems and development environments. My own development environment is:
- OS: Windows 11 24H2
- IDE/Editor: Visual Studio Community 2022 17.14.15 (September 2025), VSCode 1.104.1
- Compiler: MSVC 19.44.35217
- CMake: 4.1.1
- CUDA Toolkit: 13.0.1
- Clangd: 21.1.1
- Docker: Docker version 28.4.0, build d8eb465
And the hardware:
- GPU: NVIDIA GeForce RTX 5060 Ti 16GB (driver version 581.29, CUDA 13.0 compatible, compute capability 12.0)
Windows + Visual Studio#
On Windows, you first need a C++ compiler. I recommend installing Visual Studio Community 2022 and selecting the "Desktop development with C++" workload during installation. Then install the CUDA Toolkit, picking a CUDA version that is compatible with your GPU driver. Once installed, run nvcc --version in a terminal to verify that CUDA is set up correctly.
Next, open Visual Studio and create a new project from a template whose name contains "CUDA" (in my case, "CUDA 13.0 Runtime"). This generates a preconfigured CUDA project containing a kernel.cu file. You can write your CUDA code in this file and use Visual Studio's build system (MSBuild) to compile and run the program.
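If you just want a quick smoke test of the whole toolchain, something along the following lines will do. This is a minimal sketch of my own, not the exact code the template generates:

```cuda
// Minimal CUDA smoke test (my own sketch, not the template's kernel.cu).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void hello_kernel() {
    // Runs once per thread; printf works in device code on all modern GPUs.
    printf("Hello from block %d, thread %d\n", blockIdx.x, threadIdx.x);
}

int main() {
    hello_kernel<<<2, 4>>>();                    // launch 2 blocks of 4 threads
    cudaError_t err = cudaDeviceSynchronize();   // wait and surface launch errors
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If the program prints eight greetings, the compiler, runtime, and driver are all working together.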
If you want to add CUDA support to an existing project instead, that takes considerably more work; refer to the official NVIDIA documentation to configure the project properties by hand.
Windows + CMake + VSCode + IntelliSense#
Setting up CUDA with CMake on Windows is fairly simple. First, make sure CMake and the CUDA Toolkit are installed. To enable CUDA support, add CUDA to the LANGUAGES list in your CMakeLists.txt.
```cmake
cmake_minimum_required(VERSION 3.24)
set(CMAKE_CUDA_ARCHITECTURES native)
project(hello_cuda LANGUAGES CXX CUDA)
add_executable(hello_cuda main.cpp kernel.cu)
```
In add_executable, main.cpp is your main program file and kernel.cu is your CUDA file (a minimal sketch of both files follows the note below). Then run the following commands in a terminal to generate the build files:
```sh
mkdir build
cd build
cmake ..
cmake --build .
```
This produces a hello_cuda.exe executable that you can run to test your CUDA code.
Note:
- Setting CUDA_ARCHITECTURES to native requires CMake 3.24 or later. With an older CMake you have to set the CUDA architectures by hand, e.g. set(CMAKE_CUDA_ARCHITECTURES 61); the right value depends on your GPU's compute capability.
- You may see find_package(CUDA REQUIRED) in some tutorials, but that is the legacy approach and is no longer recommended.
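For completeness, here is one way the two source files could look. This is my own sketch using the file names from the CMakeLists above; scale_kernel and scale_on_gpu are made-up names, not part of any template:

```cuda
// kernel.cu -- device code plus a host-callable launcher (hypothetical example)
#include <cuda_runtime.h>

__global__ void scale_kernel(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;   // guard against the partial last block
}

// Called from main.cpp; hides all CUDA types behind a plain C++ signature.
void scale_on_gpu(float* host_data, float factor, int n) {
    float* d = nullptr;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, host_data, n * sizeof(float), cudaMemcpyHostToDevice);
    scale_kernel<<<(n + 255) / 256, 256>>>(d, factor, n);   // 256 threads/block
    cudaMemcpy(host_data, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
}

// main.cpp -- ordinary C++, compiled by the host compiler (MSVC):
//
//     #include <cstdio>
//     void scale_on_gpu(float* host_data, float factor, int n);
//     int main() {
//         float v[4] = {1, 2, 3, 4};
//         scale_on_gpu(v, 2.0f, 4);
//         std::printf("%f %f %f %f\n", v[0], v[1], v[2], v[3]);
//     }
```

Keeping the kernel launch inside kernel.cu means main.cpp needs no CUDA headers, so the host compiler can build it unmodified.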
In Visual Studio Code, you can use the CMake Tools extension to manage your CMake project. Make sure the extension is installed, then open the folder containing CMakeLists.txt. The extension detects and configures the project automatically, and you can use the CMake commands in the command palette to configure and build it.
The configure and build log appears in the Output panel, where you can inspect the compiler invocations and any errors. If everything goes well, you should see output similar to:
```
[main] Configuring project: Tensor
[proc] Executing command: "C:\Program Files\CMake\bin\cmake.EXE" -DCMAKE_EXPORT_COMPILE_COMMANDS:BOOL=TRUE --no-warn-unused-cli -S C:/Users/lvhao/Projects/ai-programming/hw2/Tensor -B c:/Users/lvhao/Projects/ai-programming/hw2/Tensor/build -G "Visual Studio 17 2022" -T host=x64 -A x64
[cmake] Not searching for unused variables given on the command line.
[cmake] -- The CXX compiler identification is MSVC 19.44.35217.0
[cmake] -- The CUDA compiler identification is NVIDIA 13.0.48 with host compiler MSVC 19.44.35217.0
[cmake] -- Detecting CXX compiler ABI info
[cmake] -- Detecting CXX compiler ABI info - done
[cmake] -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe - skipped
[cmake] -- Detecting CXX compile features
[cmake] -- Detecting CXX compile features - done
[cmake] -- Detecting CUDA compiler ABI info
[cmake] -- Detecting CUDA compiler ABI info - done
[cmake] -- Check for working CUDA compiler: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/bin/nvcc.exe - skipped
[cmake] -- Detecting CUDA compile features
[cmake] -- Detecting CUDA compile features - done
[cmake] -- Configuring done (10.0s)
[cmake] -- Generating done (0.0s)
[cmake] -- Build files have been written to: C:/Users/lvhao/Projects/ai-programming/hw2/Tensor/build
```
By installing the NVIDIA Nsight Visual Studio Code Edition extension, you get a better CUDA development experience, including:
- Declarative language configuration for CUDA syntax highlighting, bracket matching, code folding, auto-indentation, etc.
- C++ language server extensions to support CUDA-specific language features.
- Debugger adapter to provide CUDA debugging.
- Debugger views to provide CUDA-specific debugging information.
- IDE extensions to add productivity enhancements to the VS Code environment.
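If you want something concrete to try the debugger on, a tiny kernel like the following gives you per-thread state to step through: set a breakpoint inside square and inspect i in each thread. This is my own example, not one from the Nsight documentation:

```cuda
// A small target for Nsight / cuda-gdb breakpoints (hypothetical example).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void square(int* v, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        v[i] = v[i] * v[i];   // break here; each thread has its own i and v[i]
    }
}

int main() {
    int h[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    int* d = nullptr;
    cudaMalloc(&d, sizeof(h));
    cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    square<<<1, 8>>>(d, 8);
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    cudaFree(d);
    for (int x : h) std::printf("%d ", x);   // expect: 0 1 4 9 16 25 36 49
    std::printf("\n");
    return 0;
}
```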
Windows + WSL2#
See the CUDA on WSL User Guide — CUDA on WSL 13.0 documentation.
Windows + WSL2 + Docker Desktop + VSCode + Dev Containers#
This is a very convenient option: you need neither the CUDA Toolkit nor Docker CE and the NVIDIA Container Toolkit inside WSL2. All you need is Docker Desktop on Windows with WSL 2 integration enabled; follow Get started with Docker containers on WSL for the installation steps.
Once installed, run a CUDA sample from a WSL terminal to verify that Docker is set up correctly and can access the GPU:
```sh
docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
```
The output should look something like:
```
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
	-fullscreen       (run n-body simulation in fullscreen mode)
	-fp64             (use double precision floating point values for simulation)
	-hostmem          (stores simulation data in host memory)
	-benchmark        (run benchmark to measure performance)
	-numbodies=<N>    (number of bodies (>= 1) to run in simulation)
	-device=<d>       (where d=0,1,2.... for the CUDA device to use)
	-numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
	-compare          (compares simulation results running once on the default GPU and once on the CPU)
	-cpu              (run n-body simulation on the CPU)
	-tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 12.0 is undefined.  Default to use 128 Cores/SM
MapSMtoArchName for SM 12.0 is undefined.  Default to use Ampere
GPU Device 0: "Ampere" with compute capability 12.0

> Compute 12.0 CUDA device: [NVIDIA GeForce RTX 5060 Ti]
36864 bodies, total time for 10 iterations: 19.088 ms
= 711.939 billion interactions per second
= 14238.788 single-precision GFLOP/s at 20 flops per interaction
```
For CUDA development we need a variant of the nvidia/cuda image whose tag contains devel. Pick a flavor with or without cuDNN as needed (the cuDNN variants are typically several hundred MB larger); I use nvidia/cuda:13.0.1-cudnn-devel-ubuntu24.04. All images and tags can be browsed on NVIDIA NGC.
Three flavors of images are provided:
- base: Includes the CUDA runtime (cudart)
- runtime: Builds on the base and includes the CUDA math libraries, and NCCL. A runtime image that also includes cuDNN is available. Some images may also include TensorRT.
- devel: Builds on the runtime and includes headers, development tools for building CUDA images. These images are particularly useful for multi-stage builds.
Pull the image in a terminal:
```sh
docker pull nvidia/cuda:13.0.1-cudnn-devel-ubuntu24.04
```
or, from NGC:
```sh
docker pull nvcr.io/nvidia/cuda:13.0.1-cudnn-devel-ubuntu24.04
```
Then create a .devcontainer folder in the project directory, containing three files: devcontainer.json, Dockerfile, and reinstall-cmake.sh.
.devcontainer/devcontainer.json:
```json
// modified from https://github.com/mirzaim/cuda-devcontainer/blob/main/.devcontainer/devcontainer.json
{
    "name": "cuda-dev",
    "build": { "dockerfile": "Dockerfile" },
    "customizations": {
        "vscode": {
            "extensions": [
                "ms-vscode.cpptools",
                "ms-vscode.cmake-tools",
                "nvidia.nsight-vscode-edition"
            ]
        }
    },
    // expose the GPU and allow ptrace so debuggers work inside the container
    "runArgs": [
        "--gpus=all",
        "--cap-add=SYS_PTRACE",
        "--security-opt", "seccomp=unconfined"
    ],
    "containerEnv": { "TZ": "Asia/Shanghai" },
    "hostRequirements": { "gpu": true },
    "workspaceMount": "source=${localWorkspaceFolder},target=/workspaces/${localWorkspaceFolderBasename},type=bind",
    "workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}"
}
```
.devcontainer/Dockerfile:
```dockerfile
FROM nvidia/cuda:13.0.1-cudnn-devel-ubuntu24.04

ARG CMAKE_VERSION="4.1.1"
ENV TZ=Asia/Shanghai

COPY ./reinstall-cmake.sh /tmp/reinstall-cmake.sh

# Base tools + time zone
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
    && apt-get install -y --no-install-recommends \
        wget curl git ninja-build tzdata ca-certificates gnupg \
    && ln -snf /usr/share/zoneinfo/$TZ /etc/localtime \
    && echo $TZ > /etc/timezone \
    && rm -rf /var/lib/apt/lists/*

# Reinstall CMake (optional)
RUN if [ "${CMAKE_VERSION}" != "none" ]; then \
        chmod +x /tmp/reinstall-cmake.sh && /tmp/reinstall-cmake.sh "${CMAKE_VERSION}"; \
    fi \
    && rm -f /tmp/reinstall-cmake.sh
```
.devcontainer/reinstall-cmake.sh:
```bash
#!/usr/bin/env bash
set -e

CMAKE_VERSION=${1:-"none"}
if [ "${CMAKE_VERSION}" = "none" ]; then
    echo "No CMake version specified, skipping CMake reinstallation"
    exit 0
fi

# Clean up the temporary download directory on exit
cleanup() {
    EXIT_CODE=$?
    set +e
    [[ -n "${TMP_DIR}" ]] && rm -rf "${TMP_DIR}"
    exit $EXIT_CODE
}
trap cleanup EXIT

echo "Installing CMake ${CMAKE_VERSION} ..."
apt-get -y purge --auto-remove cmake || true
mkdir -p /opt/cmake

# Map Debian architecture names to CMake release names
arch=$(dpkg --print-architecture)
case "$arch" in
    amd64) ARCH=x86_64 ;;
    arm64) ARCH=aarch64 ;;
    *) echo "Unsupported architecture: $arch"; exit 1 ;;
esac

TMP_DIR=$(mktemp -d -t cmake-XXXXXXXXXX)
cd "$TMP_DIR"

BIN="cmake-${CMAKE_VERSION}-linux-${ARCH}.sh"
SUM="cmake-${CMAKE_VERSION}-SHA-256.txt"

# Download the installer and its checksum, then verify
curl -sSL "https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/${BIN}" -O
curl -sSL "https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/${SUM}" -O
sha256sum -c --ignore-missing "$SUM"

sh "$BIN" --prefix=/opt/cmake --skip-license

ln -sf /opt/cmake/bin/cmake  /usr/local/bin/cmake
ln -sf /opt/cmake/bin/ctest  /usr/local/bin/ctest
ln -sf /opt/cmake/bin/cpack  /usr/local/bin/cpack
ln -sf /opt/cmake/bin/ccmake /usr/local/bin/ccmake
```
On Linux you may run into permission problems with the mounted files; see Add a non-root user to a container for the fix.
Install the Dev Containers extension (formerly Remote - Containers) in VSCode. Then open your project folder, press F1, and choose Dev Containers: Open Folder in Container.... VSCode detects the configuration, builds the image, and starts the container. You then get a ready-made development environment inside the container, with all code shared between the container and the host.
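Once inside the container, a quick device query makes a simple end-to-end check. This is a minimal sketch of my own, not one of the CUDA samples:

```cuda
// List the GPUs visible inside the dev container.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("Device %d: %s, compute capability %d.%d, %.1f GiB\n",
                    i, prop.name, prop.major, prop.minor,
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```

Compile it with nvcc inside the container; if it reports your GPU, the --gpus=all run argument is doing its job.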
See also: