A tale of /dev/fd

Many versions of Unix provide a /dev/fd directory to work with open file handles as if they were regular files. As usual, the devil is in the details.

Stevens writes:

The /dev/fd feature was developed by Tom Duff and appeared in the 8th Edition of the Research Unix System. It is supported by SVR4 and 4.3+BSD. It is not part of POSIX.1.

To the dispassionate hacker (or a reader of Stevens), it's pretty clear that a syscall like open("/dev/fd/5", O_RDONLY) should be similar to dup(5). However, on Linux that is not the case: the open is checked against the permissions of the underlying file, so you cannot open a /dev/fd entry for a file you have no permission to read, even though you could simply read from the already open file descriptor.
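One observable difference between the two behaviors is the file offset: a descriptor obtained with dup(2) shares its offset with the original, while a freshly opened file gets an offset of its own. Here is a minimal Python sketch of that check, assuming a scratch file from mkstemp; the helper names shares_offset and compare_dup_and_devfd are made up for illustration.

```python
import os
import tempfile


def shares_offset(fd, other):
    """Return True if moving the offset of `other` also moves that of `fd`."""
    start = os.lseek(fd, 0, os.SEEK_CUR)
    os.lseek(other, start + 1, os.SEEK_SET)
    shared = os.lseek(fd, 0, os.SEEK_CUR) == start + 1
    os.lseek(fd, start, os.SEEK_SET)  # put the offset back where it was
    return shared


def compare_dup_and_devfd():
    """Compare dup(fd) with open("/dev/fd/<fd>") on a scratch file.

    Returns (via_dup, via_devfd). via_devfd is None when the /dev/fd
    entry cannot be opened at all on this system.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")

        duped = os.dup(fd)
        try:
            via_dup = shares_offset(fd, duped)
        finally:
            os.close(duped)

        try:
            reopened = os.open(f"/dev/fd/{fd}", os.O_RDONLY)
        except OSError:
            via_devfd = None  # no usable /dev/fd entry for this descriptor
        else:
            try:
                via_devfd = shares_offset(fd, reopened)
            finally:
                os.close(reopened)

        return via_dup, via_devfd
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling compare_dup_and_devfd() should give True for the dup case everywhere. For the /dev/fd case I would expect False on Linux (a new open with its own offset) and True on a system that implements the entry with dup semantics.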

Let's verify this quickly:

# uname -sr
Linux 6.4.7_1
# su amber -c "date > /dev/fd/1" >/tmp/out
zsh:1: permission denied: /dev/fd/1

The shell creates /tmp/out as root, then su switches to my user account, then the inner shell tries to open /dev/fd/1, but this now fails.

That is particularly funny as the /dev/fd entry appears to have write permission:

# su amber -c "ls -l /dev/fd/1" >/tmp/out
# cat /tmp/out
l-wx------ 1 amber amber 64 Oct 17 17:39 /dev/fd/1 -> /tmp/out

(But note that ls itself prints fine to standard output, as expected.)
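A similar situation can be provoked without su, by taking away the permission bits of a file after it has been opened. This is only an analogous setup, not the experiment above. It is a rough Python sketch with a hypothetical helper name, and it assumes an unprivileged user, since root bypasses the permission check.

```python
import errno
import os
import tempfile


def reopen_without_permission():
    """Write to an open descriptor, then try to open it again via /dev/fd.

    Returns (bytes_written, error), where error is None if the second
    open succeeded, or the symbolic errno name if it failed.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.chmod(path, 0)  # drop every permission bit; fd itself stays usable
        written = os.write(fd, b"still writable\n")
        try:
            again = os.open(f"/dev/fd/{fd}", os.O_WRONLY)
        except OSError as err:
            return written, errno.errorcode.get(err.errno, str(err.errno))
        os.close(again)
        return written, None
    finally:
        os.close(fd)
        os.unlink(path)
```

The write through the existing descriptor succeeds regardless. On Linux as a regular user I would expect the second element to be "EACCES"; with dup semantics, or when run as root, it should be None.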

Meanwhile, on FreeBSD this just works:

# uname -sr
# su amber -c "date > /dev/fd/1" >/tmp/out
# cat /tmp/out 
Wed Sep 13 18:15:24 UTC 2023

However, FreeBSD by default only has three entries in /dev/fd, with fixed device nodes:

# ls -l /dev/fd/
total 0
crw-rw-rw-  1 root  wheel  0xb Aug 20 14:48 0
crw-rw-rw-  1 root  wheel  0xd Aug 20 14:48 1
crw-rw-rw-  1 root  wheel  0xf Aug 20 14:48 2

If we want a proper /dev/fd, we need to mount the fdescfs(5) pseudo file system:

# mount -t fdescfs fdesc /dev/fd
# ls -l /dev/fd/
total 0
cr-xr-xr-x  1 root  wheel  0x3 Aug 20 14:48 0
cr-xr-xr-x  1 root  wheel  0x4 Aug 20 14:48 1
cr-xr-xr-x  1 root  wheel  0x5 Aug 20 14:48 2
cr-xr-xr-x  1 root  wheel  0x6 Aug 20 14:48 3
cr-xr-xr-x  1 root  wheel  0x7 Aug 20 14:48 4

Now we also see the two file descriptors ls needed to open . (why?) and /dev/fd.
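The fact that the lister's own descriptors show up can be observed from any program. Below is a small Python sketch with made-up helper names. It assumes that os.listdir opens the directory to read it, which is an assumption about the platform rather than a guarantee, and it says nothing about why ls also opens the current directory.

```python
import os


def descriptors_via_devfd():
    """Return the descriptor numbers listed in /dev/fd, sorted."""
    return sorted(int(name) for name in os.listdir("/dev/fd") if name.isdigit())


def split_listing():
    """Split the listing into descriptors that are still open and transient ones.

    A transient descriptor was visible while /dev/fd was being read but
    is closed again afterwards, e.g. the handle used for the listing.
    """
    still_open, transient = [], []
    for number in descriptors_via_devfd():
        try:
            os.fstat(number)
        except OSError:
            transient.append(number)  # gone again: only used while listing
        else:
            still_open.append(number)
    return still_open, transient
```

On a system with a dynamic /dev/fd, the transient list would typically hold the one descriptor used to read the directory itself.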

Reading the manpage for fdescfs(5) we can also find the mount option nodup:

 nodup     For file descriptors referencing vnodes, instead of the dup(2)
           semantic described above, implement re-opening of the
           referenced vnode.

With this option, a Linux-style permission check happens again. Finally, the man page suggests why this may be useful:

 In particular, if the file descriptor was opened with the O_PATH flag,
 then either O_EMPTY_PATH or open() over fdescfs mount with nodup option
 allows one to convert it to a regularly opened file, assuming that the
 current permissions allow the requested mode.

Now, Linux doesn't have O_EMPTY_PATH, at least not yet.
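On Linux, the re-opening behavior of /dev/fd is what provides this conversion today. Here is a hedged Python sketch, assuming Linux with /proc mounted (so that /dev/fd points at /proc/self/fd) and a Python build that exposes os.O_PATH; the function name is hypothetical. On a dup-style /dev/fd, the second open would just hand back another O_PATH descriptor and the read would fail.

```python
import os


def read_via_o_path(path, size=64):
    """Open `path` with O_PATH, then re-open it through /dev/fd for reading.

    Returns (readable_before, data), or None if O_PATH is unavailable.
    """
    o_path = getattr(os, "O_PATH", None)
    if o_path is None:
        return None  # this platform's Python does not expose O_PATH

    pfd = os.open(path, o_path)
    try:
        try:
            os.read(pfd, 1)
            readable_before = True
        except OSError:
            readable_before = False  # an O_PATH descriptor cannot be read

        # Re-open: permissions are checked now, against the requested mode.
        fd = os.open(f"/dev/fd/{pfd}", os.O_RDONLY)
        try:
            data = os.read(fd, size)
        finally:
            os.close(fd)
        return readable_before, data
    finally:
        os.close(pfd)
```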

So overall, this re-opening behavior seems questionable at best to me, and simple dup semantics are easier.

Let's take a quick glance at Solar^WIllumos: it does a simple dup as well.

P.S.: Why do I care about this? I noticed it when I saw that zsh on FreeBSD uses FIFOs by default for process substitution:

# echo <(date)

But this is inherently racy (and lots of fun if nothing opens the FIFO... think about it), so /dev/fd should be used instead to make this robust. However, because ports are not built with fdescfs enabled, zsh's configure script doesn't detect it and falls back to creating FIFOs.

P.P.S.: FreeBSD's ksh93 fails the other way: it finds /dev/fd during configure and assumes it works properly... but in the default configuration only fds 0, 1, and 2 exist there, so process substitution is broken by default.

I have thus recompiled zsh locally and added fdescfs to my fstab(5).