1. CUDA Toolkit Major Components
This section provides an overview of the major components of the CUDA Toolkit and points to their locations after installation.
- Compiler
- The CUDA-C and CUDA-C++ compiler, nvcc, is found in the bin/ directory. It is built on top of the NVVM optimizer, which is itself built on top of the LLVM compiler infrastructure. Developers who want to target NVVM directly can do so using the Compiler SDK, which is available in the nvvm/ directory. A minimal build example appears after the list of compiler-internal files below.
- Please note that the following files are compiler-internal and subject to change without any prior notice.
- any file in include/crt and bin/crt
- include/common_functions.h, include/device_double_functions.h, include/device_functions.h, include/host_config.h, include/host_defines.h, and include/math_functions.h
- nvvm/bin/cicc
- bin/cudafe++, bin/bin2c, and bin/fatbinary
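- As a hedged illustration of a typical nvcc build (not taken from the toolkit documentation; the file name, flags, and architecture value below are arbitrary examples), a minimal CUDA C++ source file might look like this:

```cuda
// saxpy.cu -- a minimal kernel, shown only to illustrate an nvcc build.
// Example build command (file name and -arch value are illustrative):
//   nvcc -arch=sm_35 saxpy.cu -o saxpy
#include <cstdio>

__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));   // unified memory keeps the sketch short
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("y[0] = %f\n", y[0]);                // expect 4.0
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```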
- Tools
- The following development tools are available in the bin/ directory (except for Nsight Visual Studio Edition (VSE), which is installed as a plug-in to Microsoft Visual Studio). A hedged build-and-debug sketch appears after this list.
- IDEs: nsight (Linux, Mac), Nsight VSE (Windows)
- Debuggers: cuda-memcheck, cuda-gdb (Linux), Nsight VSE (Windows)
- Profilers: nvprof, nvvp, Nsight VSE (Windows)
- Utilities: cuobjdump, nvdisasm, gpu-library-advisor
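- As a hedged sketch of how these tools might be used together (the commands and file names below are illustrative, not from the toolkit documentation), the following program contains a deliberate out-of-bounds write that cuda-memcheck can detect when the program is built with debug information:

```cuda
// oob.cu -- deliberately writes past the end of an array so that cuda-memcheck
// has something to report. Example commands (illustrative):
//   nvcc -g -G oob.cu -o oob     # -g: host debug info, -G: device debug info
//   cuda-memcheck ./oob          # reports the invalid global writes
//   cuda-gdb ./oob               # Linux: step through the kernel interactively
#include <cstdio>

__global__ void fill(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    data[i] = i;                  // no bounds check: the last threads write past n
}

int main() {
    const int n = 100;
    int *data;
    cudaMalloc(&data, n * sizeof(int));
    fill<<<1, 128>>>(data, n);    // 128 threads, only 100 valid elements
    cudaDeviceSynchronize();
    printf("last error: %s\n", cudaGetErrorString(cudaGetLastError()));
    cudaFree(data);
    return 0;
}
```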
- Libraries
- The scientific and utility libraries listed below are available in the lib/ directory (DLLs on Windows are in bin/), and their interfaces are available in the include/ directory. A short occupancy-query sketch follows the list.
- cublas (BLAS)
- cublas_device (BLAS Kernel Interface)
- cuda_occupancy (Kernel Occupancy Calculation [header file implementation])
- cudadevrt (CUDA Device Runtime)
- cudart (CUDA Runtime)
- cufft (Fast Fourier Transform [FFT])
- cupti (Profiling Tools Interface)
- curand (Random Number Generation)
- cusolver (Dense and Sparse Direct Linear Solvers and Eigen Solvers)
- cusparse (Sparse Matrix)
- npp (NVIDIA Performance Primitives [image and signal processing])
- nvblas ("Drop-in" BLAS)
- nvcuvid (CUDA Video Decoder [Windows, Linux])
- nvgraph (CUDA nvGRAPH [accelerated graph analytics])
- nvml (NVIDIA Management Library)
- nvrtc (CUDA Runtime Compilation)
- nvtx (NVIDIA Tools Extension)
- thrust (Parallel Algorithm Library [header file implementation])
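- As a hedged sketch of the occupancy-calculation facility (this uses the CUDA Runtime occupancy API cudaOccupancyMaxActiveBlocksPerMultiprocessor rather than the standalone cuda_occupancy header; the kernel and block size are arbitrary examples):

```cuda
// occupancy.cu -- queries how many blocks of a simple kernel can be resident
// per multiprocessor, illustrating the occupancy-calculation facility.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void axpy(float a, const float *x, float *y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int blockSize = 256;
    int numBlocks = 0;

    // Maximum resident blocks per SM for this kernel at the chosen block size
    // (0 bytes of dynamic shared memory assumed).
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&numBlocks, axpy, blockSize, 0);

    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    printf("Theoretical occupancy: %d blocks x %d threads per SM (%d SMs)\n",
           numBlocks, blockSize, prop.multiProcessorCount);
    return 0;
}
```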
- CUDA Samples
- Code samples that illustrate how to use various CUDA and library APIs are available in the samples/ directory on Linux and Mac, and are installed to C:\ProgramData\NVIDIA Corporation\CUDA Samples on Windows. On Linux and Mac, the samples/ directory is read-only and the samples must be copied to another location if they are to be modified. Further instructions can be found in the Getting Started Guides for Linux and Mac.
- Documentation
- The most current version of these release notes can be found online at http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html. Also, the version.txt file in the root directory of the toolkit contains its version and build number.
- Documentation can be found in PDF form in the doc/pdf/ directory, or in HTML form at doc/html/index.html and online at http://docs.nvidia.com/cuda/index.html.
- CUDA Driver
- Running a CUDA application requires a system with at least one CUDA-capable GPU and a driver that is compatible with the CUDA Toolkit. For more information on GPU products that are CUDA-capable, visit https://developer.nvidia.com/cuda-gpus. Each release of the CUDA Toolkit requires a minimum version of the CUDA driver. The CUDA driver is backward compatible, meaning that applications compiled against a particular version of CUDA will continue to work on subsequent (later) driver releases. More information on compatibility can be found at https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html#cuda-runtime-and-driver-api-version. A small version-query sketch appears at the end of this section.
Table 1. CUDA Toolkit and Compatible Driver Versions

| CUDA Toolkit | Linux x86_64 Driver Version | Windows x86_64 Driver Version |
| --- | --- | --- |
| CUDA 7.0 (7.0.28) | >= 346.46 | >= 347.62 |
| CUDA 7.5 (7.5.16) | >= 352.31 | >= 353.66 |
| CUDA 8.0 (8.0.44) | >= 367.48 | >= 369.30 |
| CUDA 8.0 (8.0.61 GA2) | >= 375.26 | >= 376.51 |
| CUDA 9.0 (9.0.76) | >= 384.81 | >= 385.54 |
| CUDA 9.1 (9.1.85) | >= 390.46 | >= 391.29 |
| CUDA 9.2 (9.2.88) | >= 396.26 | >= 397.44 |
| CUDA 9.2 (9.2.148 Update 1) | >= 396.37 | >= 398.26 |

- For convenience, the NVIDIA driver is installed as part of the CUDA Toolkit installation. Note that this driver is for development purposes and is not recommended for use in production with Tesla GPUs. For running CUDA applications in production with Tesla GPUs, it is recommended to download the latest driver for Tesla GPUs from the NVIDIA driver downloads site at http://www.nvidia.com/drivers.
- During the installation of the CUDA Toolkit, the installation of the NVIDIA driver may be skipped on Windows (when using the interactive or silent installation) or on Linux (by using meta packages). For more information on customizing the install process on Windows, see http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#install-cuda-software. For meta packages on Linux, see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#package-manager-metas.
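- As a hedged sketch (not part of the toolkit documentation), the installed driver and runtime versions can be queried with cudaDriverGetVersion and cudaRuntimeGetVersion and compared against Table 1:

```cuda
// version_check.cu -- prints the CUDA driver and runtime versions so they can be
// compared against the minimum driver versions listed in Table 1.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driverVersion = 0, runtimeVersion = 0;
    cudaDriverGetVersion(&driverVersion);    // highest CUDA version the driver supports
    cudaRuntimeGetVersion(&runtimeVersion);  // version of the CUDA runtime in use

    // Values are encoded as 1000*major + 10*minor, e.g. 9020 for CUDA 9.2.
    printf("Driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
           driverVersion / 1000, (driverVersion % 100) / 10,
           runtimeVersion / 1000, (runtimeVersion % 100) / 10);
    return 0;
}
```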
- CUDA-GDB Sources
- CUDA-GDB sources are available as follows:
- For CUDA Toolkit 7.0 and newer, in the installation directory extras/. The directory is created by default during the toolkit installation unless the .rpm or .deb package installer is used. In this case, the cuda-gdb-src package must be manually installed.
- For CUDA Toolkit 6.5, 6.0, and 5.5, at https://github.com/NVIDIA/cuda-gdb.
- For CUDA Toolkit 5.0 and earlier, at ftp://download.nvidia.com/CUDAOpen64/.
- Upon request by sending an e-mail to oss-requests@nvidia.com.
2. Release Notes
The release notes for the CUDA Toolkit can be found online at http://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html.
Notices
Acknowledgments
NVIDIA extends thanks to Professor Mike Giles of Oxford University for providing the initial code for the optimized version of the device implementation of the double-precision exp() function found in this release of the CUDA toolkit.
NVIDIA acknowledges Scott Gray for his work on small-tile GEMM kernels for Pascal. These kernels were originally developed for OpenAI and have been included in cuBLAS since version 8.0.61.2.
Notice
ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, "MATERIALS") ARE BEING PROVIDED "AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.
Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.