Originally I wanted to write a post on why FIFOs are insufficient for properly implementing process substitution in shells, but then I researched a bit and found a POSIX violation in FreeBSD and an inconsistency in the POSIX standard itself, so fasten your seatbelts...
In a shell, you usually have linear pipelines which connect the stdout of one process to the stdin of the next one. But sometimes, having just one input is not enough: for example, how would you diff the outputs of two programs without using temporary files? This is why some shells such as ksh(1), zsh(1), bash(1) or even rc(1) implement process substitution:
diff <(program1) <(program2)
Since diff operates on file names as arguments, we need to construct a pipe from the standard output of program1 to the first argument of diff, and from the standard output of program2 to the second argument of diff. This requires giving the pipes a file name that can be opened.
Naively, you could think that named pipes are a good solution for that. The rest of this post should convince you they really aren't. This is why all these shells prefer to use /dev/fd together with an (anonymous) pipe, if possible.
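To make the /dev/fd approach concrete, here is a minimal sketch in C. It assumes the system provides /dev/fd; the helper name read_via_dev_fd is made up for illustration, and a real shell would exec the consumer with the path as an argument instead of opening it in the same process.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child writes `msg` into an anonymous pipe, and the
 * parent reads it back *by name* through /dev/fd/N, the way diff would open
 * a file argument. Returns the number of bytes read, or -1 on error. */
static ssize_t read_via_dev_fd(const char *msg, char *buf, size_t len)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: plays the role of program1 and writes to the pipe. */
        close(p[0]);
        ssize_t w = write(p[1], msg, strlen(msg));
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* Parent: keep only the read end and give it a file name. */
    close(p[1]);
    char path[32];
    snprintf(path, sizeof path, "/dev/fd/%d", p[0]);

    /* The pipe is already connected, so this open() cannot hang. */
    int fd = open(path, O_RDONLY);
    ssize_t total = -1;
    if (fd != -1) {
        ssize_t n;
        total = 0;
        while ((size_t)total < len &&
               (n = read(fd, buf + total, len - (size_t)total)) > 0)
            total += n;
        close(fd);
    }
    close(p[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

Calling read_via_dev_fd("hello\n", buf, sizeof buf) should fill buf with the child's output; the point is that the name /dev/fd/N refers to a pipe that is connected from the start.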
FIFOs (first-in first-out), or named pipes, as they are sometimes called, behave like normal pipes made by pipe(2), or at least that's what people want you to believe.
While a connected FIFO really pretty much behaves like a pipe, for a FIFO we must distinguish four different states:
As you can see, the critical issue is state 2. While a pipe created by pipe(2) is already connected between the two file descriptors, opening a FIFO does block, even before we try to read. This can result in all kinds of funky hangups when trying to use FIFOs instead of regular files.
In particular, when trying to use a FIFO for process substitution, it could happen that the FIFO is never actually opened, even though the process on the other end has already started.
When we use /dev/fd instead, the pipe is open on both sides and we are always in state 4 (until the ends are closed).
As suggested above, there is a trick we can do. You can open the FIFO with the flag O_NONBLOCK. This will result in the FIFO being opened immediately, even if there is no writer yet.
However, every subsequent read operation from the unconnected FIFO results in reading zero bytes, that is, the EOF condition.
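A small sketch of this behavior, under the assumption that /tmp is writable; the helper name and the FIFO path are invented for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a fresh FIFO, open it read-only with
 * O_NONBLOCK, and attempt one read while no writer exists.
 * Returns read()'s result, or -2 if the setup fails. */
static ssize_t read_unconnected_fifo(void)
{
    char path[64];
    char buf[8];
    ssize_t r = -2;

    /* Assumed scratch location; any private path would do. */
    snprintf(path, sizeof path, "/tmp/fifo-nonblock-%ld", (long)getpid());
    if (mkfifo(path, 0600) == -1)
        return -2;

    /* Without O_NONBLOCK, this open() would block until a writer appears. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd != -1) {
        r = read(fd, buf, sizeof buf);   /* no writer: end-of-file */
        close(fd);
    }
    unlink(path);
    return r;
}
```

On a system that follows the POSIX rule, the function returns 0: the open succeeds at once and the read reports EOF rather than waiting.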
Now, I thought I had a clever idea: We can just open the FIFO with O_NONBLOCK set, and after opening the FIFO, we fcntl(2) the O_NONBLOCK away again, and then perhaps it behaves like a regular pipe, blocking on read?
It turns out that this works on FreeBSD! Let's look at some code:
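#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Open without waiting for a writer. */
	int fd = open("myfifo", O_RDONLY | O_NONBLOCK);
	printf("fd=%d\n", fd);

	/* Switch the descriptor back to blocking mode. */
	fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) & ~O_NONBLOCK);

	char buf[8];
	int r = read(fd, buf, sizeof buf);
	printf("r=%d\n", r);
}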
We open the FIFO non-blocking, then make the fd blocking, and then we try to read 8 bytes. FreeBSD 13.2 will happily block on reading until a writer connects and writes something.
Unfortunately, this behavior is not POSIX compliant; to cite:
When attempting to read from an empty pipe or FIFO:
- If no process has the pipe open for writing, read() shall return 0 to indicate end-of-file.
- ...
Thus, a POSIX compliant system should return immediately, having read zero bytes.
I tested various kernels, and indeed the following systems behave as specified in POSIX:
Curiously, macOS behaves differently than FreeBSD here. Due to lack of a coherent Darwin history, I could not find out precisely when this behavior changed.
OpenBSD behaves like FreeBSD, and blocks.
Now, you may ask, if a regular read on a half-open FIFO does not block according to the standards, how is one supposed to wait for a write to appear? It turns out that select(2) can be used for this; at least it works on the systems mentioned above.
However, this behavior seems inconsistent with the POSIX specification:
A descriptor shall be considered ready for reading when a call to an input function with O_NONBLOCK clear would not block, whether or not the function would transfer data successfully. (The function might return data, an end-of-file indication, or an error other than one indicating that it is blocked, and in each of these cases the descriptor shall be considered ready for reading.)
As we have verified, calling read(2) does not block (and returns EOF immediately), yet select(2) is blocking on these file descriptors! This seems to be an omission in the standard, as the current behavior is essential.
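The select(2) approach can be sketched as follows. This is an illustration under assumptions, not a portable guarantee: it relies on the observed behavior that select(2) does not report a FIFO without a writer as readable, and the helper name, the path and the timings are all made up.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open a new FIFO non-blocking, wait in select() until
 * a late writer has written something, then read it.
 * Returns the number of bytes read, or -1 on error or timeout. */
static ssize_t wait_for_writer(char *buf, size_t len)
{
    char path[64];
    ssize_t r = -1;

    /* Assumed scratch location. */
    snprintf(path, sizeof path, "/tmp/fifo-select-%ld", (long)getpid());
    if (mkfifo(path, 0600) == -1)
        return -1;

    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd == -1) {
        unlink(path);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the late writer, connecting one second after the reader. */
        sleep(1);
        int w = open(path, O_WRONLY);
        if (w == -1)
            _exit(1);
        _exit(write(w, "hi\n", 3) == 3 ? 0 : 1);
    }

    if (pid > 0) {
        fd_set rfds;
        struct timeval tv = { 5, 0 };   /* give up after five seconds */

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* Relies on select() not reporting the writer-less FIFO as ready. */
        if (select(fd + 1, &rfds, NULL, NULL, &tv) == 1)
            r = read(fd, buf, len);
        waitpid(pid, NULL, 0);
    }
    close(fd);
    unlink(path);
    return r;
}
```

If select(2) followed the quoted wording to the letter, it would return at once and the read would yield EOF; the sketch only works because the tested kernels wait instead.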
I don't know why FIFOs work the way they do; I think it would be more reasonable to never block on opening and just block on I/O as needed. (Feel free to tell me if you have a convincing argument.)
Indeed, many programs that use FIFOs as a control mechanism open both ends to ensure the FIFO is always in the connected state. (Opening with O_RDWR works too, according to my tests.)
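Opening both ends could look like the sketch below. The helper name open_fifo_connected and its interface are invented for illustration; it uses two separate opens rather than O_RDWR, since POSIX leaves O_RDWR on a FIFO undefined.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` as a FIFO if needed and open it so that
 * it is connected right away. Returns the read descriptor and stores a
 * "keep-alive" write descriptor in *keep_wr, or returns -1 on error. */
static int open_fifo_connected(const char *path, int *keep_wr)
{
    if (mkfifo(path, 0600) == -1 && errno != EEXIST)
        return -1;

    /* No writer exists yet, so open the read end without blocking. */
    int rd = open(path, O_RDONLY | O_NONBLOCK);
    if (rd == -1)
        return -1;

    /* A reader (this process) exists now, so this open() returns at once. */
    int wr = open(path, O_WRONLY);
    if (wr == -1) {
        close(rd);
        return -1;
    }

    /* With a writer attached, switch reads back to blocking mode. */
    int fl = fcntl(rd, F_GETFL);
    if (fl == -1 || fcntl(rd, F_SETFL, fl & ~O_NONBLOCK) == -1) {
        close(rd);
        close(wr);
        return -1;
    }

    *keep_wr = wr;
    return rd;
}
```

The trade-off is that the reader never sees EOF while it holds its own write end, so a program using this needs some other way to detect that its clients are done.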