GNU tar will apparently silently include mtime of files if --mtime is
passed with an unrecognized date format. This led to hash instability
from those mtimes and this fixes it to force all mtimes to epoch
timestamp 0.
Previously, we stored the tarballs from the hosted Git providers directly in the cache. However, as we've seen with `fetchFromGitHub` etc, these files may change subtly.
Given this, this commit repacks the dependencies before storing them in the cache.
A `package-lock.json` file can contain multiple instances of the same dependency, which caused unnecessary downloads and duplicate index entries in the generated cache.
The `sparseCheckout` argument allows the user to specify directories or
patterns of files, which Git uses to filter files it should check-out.
Git expects a multi-line string on stdin ("newline-delimited list", see
`git-sparse-checkout(1)`), but within nixpkgs it is more consistent to
use a list of strings instead. The list elements are joined to a
multi-line string only before passing it to the builder script.
A deprecation warning is emitted if a (multi-line) string is passed to
`sparseCheckout`, but for the time being it is still accepted.
PostScript Printer Description (ppd) files
describe printer features and capabilities.
They are usually evaluated by CUPS to convert
print jobs into a format suitable for a printer.
The conversion is often accomplished by commands
or even short shell scripts inside the ppd files.
ppd files are included in many printer driver packages.
Their scripts sometimes refer to other executables;
some of them are more common (like `perl`),
others are more exotic (like `rastertohp`).
If an executable is called with its name alone,
the effects of the ppd file depend on whether
the executable is in the PATH of CUPS,
and on the executable's version.
If an executable is called with an absolut path
(like `/usr/bin/perl`), it won't work at all in NixOS.
The commit at hand adds a setup hook that uses
the `fixupPhase` to substitute certain executable's
invocations in pdd files with absolute paths.
To use it, add the hook to `nativeBuildInputs` and
provide a list of executable names in `ppdFileCommands`.
Each executable must be available in the
package that is being built, or in `buildInputs`.
The setup hook's script then looks for ppd files in
`share/cups/model` and `share/ppds` in each output,
and replaces executable names with their absolute paths.
If ppd files need to be patched in unorthodox locations or
the setup hook needs to be invoked manually for other reasons,
one may leave the list `ppdFileCommands` empty to
avoid automatic processing of ppd files, then call
the shell function `patchPpdFileCommands` directly.
Details are described in the file `patch-ppd-hook.sh`.
Notes on the motivation for this setup hook:
Most packages in nixpkgs that provide
ppd files do not patch those ppd files at all.
This is not fatal when the executables are just called
with their names since the user can add packages
with the executables to `services.printing.drivers`.
E.g. if the user adds `pkgs.perl`, then all ppd
files that invoke `perl` will work as expected.
Nevertheless, to make these ppd files independent of
their execution environment, command invocations should
be substituted with absolut paths into the nix store.
This is similar to patching shebang lines so scripts can be
called independently of having the interpreter in the PATH.
The hook script in this commit is meant to support new packages
`foomatic-db*` which will generate several thousands of
ppd files referencing a plethora of different executables.
During development of these packages, I realized that
it's quite hard to patch ppd files in a robust way.
While binary names like `rastertokpsl` seem to be sufficiently
unique to be patched with `sed`, names like `date` or `gs`
are hard to patch without producing "false positives",
i.e., coincidental occurences of the executable's name that do
*not* refer to the executable and should not be patched at all.
As this problem also affects other packages,
it seems reasonable to put a robust implementation
in its own setup hook so that other
packages can use it without much effort.
Notes on the implementation:
The ppd file format is far from trivial.
The basic structure are key-value pairs;
keys may occur multiple times.
Only a small subset of keys may contain
executable names or shell scripts in their values.
Some values may span multiple lines;
a linebreak might even occur in the middle of a token.
Some executable names also occur in other keys by accident
where they must not be substituted (e.g. `gs` or `date`).
It is necessary to provide the list of command
names that will be patched for two reasons:
ppd files often contain "tokens" that might look
like commands (e.g. "file" or "host") but aren't;
these would erroneously get patched.
Also, looking for everything that might be a command
would slow down the patching process considerably.
The implementation uses `awk` to detect
keys that might contain executable names;
only their values are treated for substitution.
This avoids most cases of "overzealous" substitutions.
Since values may span multiple lines,
`sed` alone (while faster than `awk`) cannot focus
its substitution capabilities on relevant keys.
An elaborate set of regular expressions further helps
to minimize the probability of "false positives".
Several tricks are employed to speed up `awk`.
Notably, relevant files are identified with
`grep` before `awk` is applied to those files only.
Note that the script probably cannot handle fancy command
names (like spaces or backslashes as part of the name).
Also, there are still edge cases that the script would
mistakenly skip, e.g. if a shell script contains a
line break in the middle of an executable's name;
although ppd files permit such constellations,
I have yet to see one.
ppd files may be gzipped.
The setup hook accepts gzipped ppd files:
It decompresses them, substitutes paths, then recompresses them.
However, Nix cannot detect substituted paths as
runtime dependencies in compressed ppd files.
To ensure substituted paths are propagated as
runtime dependencies, the script adds each substituted
path to the variable `propagatedBuildInputs`.
Since this might not be enough for multi-output packages,
those paths are also written directly to
`nix-support/propagated-build-inputs`.
See the comment in `patch-ppd-hook.sh` for details.
Finally, the setup hook comes with a small test that
probes some edge cases with an artificial ppd file.
References:
* https://www.cups.org/doc/spec-ppd.html
* general ppd file specification
* lists some keys that may contain
executable names or shell scripts
* https://refspecs.linuxfoundation.org/LSB_4.0.0/LSB-Printing/LSB-Printing/ppdext.html
* lists some keys that may contain
executable names or shell scripts
* https://en.wikipedia.org/wiki/PostScript_Printer_Description#CUPS
* lists the usual locations of ppd files
There are two problems: first that we end up splitting on spaces in the
loop. Even when that is fixed, we still would split on spaces in the
`export` inside the loop. We need to guard against both.
Fixes#199298
Confirmed that it fixes the case mentioned in the ticket:
```console
[nix-develop]$ $(nix-build -I nixpkgs=/home/shana/programming/nixpkgs Cargo.nix -A rootCrate.build --no-out-link)/bin/nix-rustc-env-escape-repro
Expecting three words, got: first second third
```
I think this is going to cause a rebuild of every Rust package even if
they were unaffected, not much we can do here.