More lessons on /dev/fd

In a previous post, we looked at the /dev/fd directory.

Since then I have found out a few more things about it, including some FreeBSD and Linux special cases.

The dangers of stat

Recently I wanted to check whether standard input was connected to a pipe. I looked in test(1) for something like -t, but I only found -p, which needs a file name, not a file descriptor.

So I tried [ -p /dev/fd/0 ] but it didn't work.

This is because on a FreeBSD setup without fdescfs(5), or with fdescfs(5) and default mount options, the files in /dev/fd are character device nodes:

% ls -l /dev/fd
total 0
cr-xr-xr-x  1 root wheel 0x3 Dec  7  2024 0*
cr-xr-xr-x  1 root wheel 0x4 Dec  7  2024 1*
cr-xr-xr-x  1 root wheel 0x5 Dec  7  2024 2*
cr-xr-xr-x  1 root wheel 0x6 Dec  7  2024 3*
cr-xr-xr-x  1 root wheel 0x7 Dec  7  2024 4*

However, if you mount fdescfs(5) with the option -o nodup, the entries correspond to the underlying vnodes:

crw--w----  1 root tty   0x6d Sep 24 22:23 0
crw--w----  1 root tty   0x6d Sep 24 22:23 1
crw--w----  1 root tty   0x6d Sep 24 22:23 2
drwxr-x---  3 root wheel   17 Sep 24 22:22 3/
dr-xr-xr-x  2 root wheel  512 May 28 18:14 4/

Buuut, while file descriptors that belong to a regular file or a directory show up as a regular file or directory, pipes and FIFOs show up as... character devices.

Let's write some C to analyze this:

#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>

int
main()
{
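        struct stat st;

        /* 1. stat the /dev/fd path */
        stat("/dev/fd/0", &st);
        printf("%o\n", st.st_mode);

        /* 2. fstat the descriptor itself */
        fstat(0, &st);
        printf("%o\n", st.st_mode);

        /* 3. open the /dev/fd path, then fstat the new descriptor */
        int fd = open("/dev/fd/0", O_RDONLY);
        fstat(fd, &st);
        printf("%o\n", st.st_mode);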
}

These are all the possible ways we can stat fd 0.

Now let's check:

% ./a.out </etc/passwd 
100644
100644
100644

% mkfifo /tmp/myfifo
% ./a.out </tmp/myfifo 
20555
10644
10644

% date | ./a.out
20555
10000
10000

So indeed, stat on the /dev/fd path does not give the proper st_mode here; we need to fstat the descriptor or open the /dev/fd entry first. Thus test(1) cannot be used for this on FreeBSD (on Linux it works).

fdescfs(5) also supports the options rdlnk and linrdlnk. These completely break opening a fifo via /dev/fd...

Non-blocking woes

Working with non-blocking file handles is a conundrum that djb already complained about:

> The problem with O_NONBLOCK is that it's too heavy:
> it hits an entire ofile, not just an fd.

Indeed, there's spooky action at a distance when a different process changes your O_NONBLOCK setting:

#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int
main()
{
        int fds[2];
        pipe(fds);

        dprintf(1, "%o\n", fcntl(fds[1], F_GETFL));

        int pid = fork();
        if (pid == 0) {
                fcntl(fds[1], F_SETFL, O_NONBLOCK);
                dprintf(1, "in child: %o\n", fcntl(fds[1], F_GETFL));
                _exit(0);
        }
        waitpid(pid, 0, 0);

        dprintf(1, "%o\n", fcntl(fds[1], F_GETFL));
}

We run it and see:

% ./a.out
2
in child: 6
6

(That's why O_NONBLOCK is set with F_SETFL, which acts on the whole ofile; the flags of F_SETFD are per file descriptor!)

As far as I can tell, there's no way in FreeBSD to get a second ofile on a pipe with a different O_NONBLOCK setting.

However, on Linux it funnily works if you reopen the pipe via /dev/fd:

#define _GNU_SOURCE     /* glibc declares asprintf(3) only with this */
#include <sys/wait.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main()
{
        int fds[2];
        pipe(fds);

        dprintf(1, "%o\n", fcntl(fds[1], F_GETFL));

        int pid = fork();
        if (pid == 0) {
                char *devfd;
                asprintf(&devfd, "/dev/fd/%d", fds[1]);
                int fd = open(devfd, O_WRONLY);
                fcntl(fd, F_SETFL, O_NONBLOCK);
                dprintf(1, "in child: %o\n", fcntl(fd, F_GETFL));
                while (write(fd, "spam", 4) > 0)
                        ;
                dprintf(1, "error=%s\n", strerror(errno));
                _exit(0);
        }
        waitpid(pid, 0, 0);

        dprintf(1, "%o\n", fcntl(fds[1], F_GETFL));
        write(fds[1], "spam", 4);
}

This program will print (on Linux 6.16):

% ./a.out
1
in child: 104001
error=Resource temporarily unavailable
1

and then hangs in the final write, since the child has filled the pipe, which proves that the parent's descriptor is indeed still blocking.