forked from mirrors/nixpkgs
cudaPackages: add docs
parent 8e800cedaf
commit bfaefd0873
@ -68,16 +68,45 @@ All new projects should use the CUDA redistributables available in [`cudaPackage
### Updating CUDA redistributables {#updating-cuda-redistributables}

1. Go to NVIDIA's index of CUDA redistributables: <https://developer.download.nvidia.com/compute/cuda/redist/>
2. Copy the `redistrib_*.json` corresponding to the release to `pkgs/development/compilers/cudatoolkit/redist/manifests`.
3. Generate the `redistrib_features_*.json` file by running:
2. Make a note of the new version of CUDA available.
3. Run

```bash
nix run github:ConnorBaker/cuda-redist-find-features -- <path to manifest>
nix run github:connorbaker/cuda-redist-find-features -- \
   download-manifests \
   --log-level DEBUG \
   --version <newest CUDA version> \
   https://developer.download.nvidia.com/compute/cuda/redist \
   ./pkgs/development/cuda-modules/cuda/manifests
```

That command will generate the `redistrib_features_*.json` file in the same directory as the manifest.
This will download a copy of the manifest for the new version of CUDA.
4. Run

4. Include the path to the new manifest in `pkgs/development/compilers/cudatoolkit/redist/extension.nix`.
```bash
nix run github:connorbaker/cuda-redist-find-features -- \
   process-manifests \
   --log-level DEBUG \
   --version <newest CUDA version> \
   https://developer.download.nvidia.com/compute/cuda/redist \
   ./pkgs/development/cuda-modules/cuda/manifests
```

This will generate a `redistrib_features_<newest CUDA version>.json` file in the same directory as the manifest.
5. Update the `cudaVersionMap` attribute set in `pkgs/development/cuda-modules/cuda/extension.nix`.
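
Once the `download-manifests` and `process-manifests` steps have run, the manifests directory should contain both the downloaded manifest and its feature manifest. The following is a minimal sketch of a sanity check and is not part of Nixpkgs or `cuda-redist-find-features`: `check_manifests` is a hypothetical helper, and it assumes the files are named `redistrib_<version>.json` and `redistrib_features_<version>.json`, following the patterns mentioned above.

```bash
# Hypothetical helper: check that both manifests for a version exist.
# Assumes the naming scheme redistrib_<version>.json and
# redistrib_features_<version>.json.
check_manifests() {
  local dir="$1"
  local version="$2"
  local missing=0
  local name
  for name in "redistrib_${version}.json" "redistrib_features_${version}.json"; do
    if [ -f "${dir}/${name}" ]; then
      echo "found: ${name}"
    else
      # Report absent files on stderr and remember the failure.
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, `check_manifests ./pkgs/development/cuda-modules/cuda/manifests "<newest CUDA version>"` returns a non-zero status if either file is absent.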

### Updating cuTensor {#updating-cutensor}

1. Repeat the steps in [Updating CUDA redistributables](#updating-cuda-redistributables) with the following changes:
   - Use the index of cuTensor redistributables: <https://developer.download.nvidia.com/compute/cutensor/redist>
   - Use the newest version of cuTensor available instead of the newest version of CUDA.
   - Use `pkgs/development/cuda-modules/cutensor/manifests` instead of `pkgs/development/cuda-modules/cuda/manifests`.
   - Skip the step of updating `cudaVersionMap` in `pkgs/development/cuda-modules/cuda/extension.nix`.
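
Applying those substitutions to the two `cuda-redist-find-features` commands from the CUDA section gives the commands printed by the sketch below. `cutensor_commands` is a hypothetical helper that only prints the commands so they can be reviewed before being run; it assumes that `download-manifests` and `process-manifests` accept the cuTensor index and manifests directory in the same positions as the CUDA ones.

```bash
# Hypothetical helper: print (do not run) the download and process commands
# for a given cuTensor version, using the cuTensor index and directory.
cutensor_commands() {
  local version="$1"
  local index="https://developer.download.nvidia.com/compute/cutensor/redist"
  local manifests="./pkgs/development/cuda-modules/cutensor/manifests"
  local step
  for step in download-manifests process-manifests; do
    printf 'nix run github:connorbaker/cuda-redist-find-features -- %s --log-level DEBUG --version %s %s %s\n' \
      "${step}" "${version}" "${index}" "${manifests}"
  done
}
```

Running `cutensor_commands "<newest cuTensor version>"` prints one line per step, in the order in which the steps should be executed.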

### Updating supported compilers and GPUs {#updating-supported-compilers-and-gpus}

1. Update `nvcc-compatibilities.nix` in `pkgs/development/cuda-modules/` to include the newest release of NVCC, as well as any newly supported host compilers.
2. Update `gpus.nix` in `pkgs/development/cuda-modules/` to include any new GPUs supported by the new release of CUDA.

### Updating the CUDA Toolkit runfile installer {#updating-the-cuda-toolkit}

@ -99,7 +128,7 @@ All new projects should use the CUDA redistributables available in [`cudaPackage
nix store prefetch-file --hash-type sha256 <link>
```

4. Update `pkgs/development/compilers/cudatoolkit/versions.toml` to include the release.
4. Update `pkgs/development/cuda-modules/cudatoolkit/releases.nix` to include the release.

### Updating the CUDA package set {#updating-the-cuda-package-set}

@ -107,7 +136,7 @@ All new projects should use the CUDA redistributables available in [`cudaPackage
- NOTE: Changing the default CUDA package set should occur in a separate PR, allowing time for additional testing.

2. Successfully build the closure of the new package set, updating `pkgs/development/compilers/cudatoolkit/redist/overrides.nix` as needed. Below are some common failures:
2. Successfully build the closure of the new package set, updating `pkgs/development/cuda-modules/cuda/overrides.nix` as needed. Below are some common failures:

| Unable to ... | During ... | Reason | Solution | Note |
| --- | --- | --- | --- | --- |

32
pkgs/development/cuda-modules/README.md
Normal file
@ -0,0 +1,32 @@
# cuda-modules

> [!NOTE]
> This document is meant to help CUDA maintainers understand the structure of the CUDA packages in Nixpkgs. It is not meant to be a user-facing document.
> For a user-facing document, see [the CUDA section of the manual](../../../doc/languages-frameworks/cuda.section.md).

The files in this directory are added (in some way) to the `cudaPackages` package set by [cuda-packages.nix](../../top-level/cuda-packages.nix).

## Top-level files

Top-level Nix files are included in the initial creation of the `cudaPackages` scope. These are typically required for the creation of the finalized `cudaPackages` scope:

- `backend-stdenv.nix`: Standard environment for CUDA packages.
- `flags.nix`: Flags set, or consumed by, NVCC in order to build packages.
- `gpus.nix`: A list of supported NVIDIA GPUs.
- `nvcc-compatibilities.nix`: NVCC releases and the version range of GCC/Clang they support.

## Top-level directories

- `cuda`: CUDA redistributables. Provides an extension to the `cudaPackages` scope.
- `cudatoolkit`: Monolithic CUDA Toolkit runfile installer. Provides an extension to the `cudaPackages` scope.
- `cudnn`: NVIDIA cuDNN library.
- `cutensor`: NVIDIA cuTENSOR library.
- `generic-builders`:
  - Contains a builder `manifest.nix` which operates on the `Manifest` type defined in `modules/generic/manifests`. Most packages are built using this builder.
  - Contains a builder `multiplex.nix` which leverages the Manifest builder. In short, the Multiplex builder adds multiple versions of a single package to a single instance of the `cudaPackages` package set. It is used primarily for packages like `cudnn` and `cutensor`.
- `modules`: Nixpkgs modules to check the shape and content of CUDA redistributable and feature manifests. These modules additionally use shims provided by some CUDA packages to allow them to re-use the `genericManifestBuilder`, even if they don't have manifest files of their own. `cudnn` and `tensorrt` are examples of packages which provide such shims. These modules are further described in the [Modules](./modules/README.md) documentation.
- `nccl`: NVIDIA NCCL library.
- `nccl-tests`: NVIDIA NCCL tests.
- `saxpy`: Example CMake project that uses CUDA.
- `setup-hooks`: Nixpkgs setup hooks for CUDA.
- `tensorrt`: NVIDIA TensorRT library.
27
pkgs/development/cuda-modules/modules/README.md
Normal file
@ -0,0 +1,27 @@
# Modules

The modules in `modules` exist primarily to check the shape and content of CUDA redistributable and feature manifests. They are ultimately meant to reduce the repetitive nature of repackaging CUDA redistributables.

Most redistributables are built the same way: a manifest lists the packages available at a location, together with their versions and hashes. To avoid creating a builder for each and every derivation, modules let us use a single `genericManifestBuilder` to build all redistributables.

## `generic`

The modules in `generic` are reusable components meant to check the shape and content of NVIDIA's CUDA redistributable manifests, our feature manifests (which are derived from NVIDIA's manifests), or hand-crafted Nix expressions describing available packages. They are used by the `genericManifestBuilder` to build CUDA redistributables.

Generally, each package which relies on manifests or Nix release expressions will create an alias to the relevant generic module. For example, the [module for cuDNN](./cudnn/default.nix) aliases the generic module for release expressions, while the [module for CUDA redistributables](./cuda/default.nix) aliases the generic module for manifests.

Alternatively, additional fields or values may need to be configured to account for the particulars of a package. For example, while the release expressions for [cuDNN](../cudnn/releases.nix) and [TensorRT](../tensorrt/releases.nix) are very similar, they differ slightly in the fields they have. The [module for cuDNN](./cudnn/default.nix) is able to use the generic module for release expressions, while the [module for TensorRT](./tensorrt/default.nix) must add additional fields to the generic module.

### `manifests`

The modules in `generic/manifests` define the structure of NVIDIA's CUDA redistributable manifests and our feature manifests.

NVIDIA's redistributable manifests are retrieved from their web server, while the feature manifests are produced by [`cuda-redist-find-features`](https://github.com/connorbaker/cuda-redist-find-features).

### `releases`

The modules in `generic/releases` define the structure of our hand-crafted Nix expressions containing information necessary to download and repackage CUDA redistributables. These expressions are created when NVIDIA-provided manifests are unavailable or otherwise unusable. For example, though cuDNN has manifests, a bug in NVIDIA's CI/CD causes manifests for different versions of CUDA to use the same name, which leads to the manifests overwriting each other.

### `types`

The modules in `generic/types` define reusable types used in both `generic/manifests` and `generic/releases`.