comm: remove double reads, which cause data from named pipes to be skipped
When given two file names, comm opens the input files twice:
1. once to perform its normal operation of comparing the file contents;
2. a second time, in `are_files_identical()`, to check whether the two
files have exactly the same contents, in order to set the
`should_check_order` flag.
When the files are opened in `are_files_identical()`, new file
descriptors are created, and those new file descriptors are read until a
difference is found or until EOF.
When the inputs are regular files, this mechanism is generally not a
problem (with some caveats, see below). However, when the inputs are
named pipes, `are_files_identical()` consumes data that is then no
longer available to the main comparison.
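The effect can be simulated without real pipes. The following is only a
minimal sketch: `precheck_identical()`, `read_full()` and
`bytes_left_after_precheck()` are hypothetical names, and the chunked
comparison is an assumption standing in for `are_files_identical()`, not
its actual code. A byte slice plays the role of a pipe, because reading
from `&[u8]` advances the slice and the bytes cannot be read again.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for an up-front identity check: reads both inputs
/// in 8 KiB chunks and stops at the first difference or at EOF.
fn precheck_identical(a: &mut impl Read, b: &mut impl Read) -> io::Result<bool> {
    let mut buf_a = [0u8; 8192];
    let mut buf_b = [0u8; 8192];
    loop {
        let n_a = read_full(a, &mut buf_a)?;
        let n_b = read_full(b, &mut buf_b)?;
        if buf_a[..n_a] != buf_b[..n_b] {
            return Ok(false); // contents differ
        }
        if n_a == 0 {
            return Ok(true); // both inputs ended without a difference
        }
    }
}

/// Fills `buf` as far as possible and returns the number of bytes read.
fn read_full(r: &mut impl Read, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match r.read(&mut buf[filled..])? {
            0 => break,
            n => filled += n,
        }
    }
    Ok(filled)
}

/// Runs the pre-check over two simulated pipes and reports how many bytes
/// of the second one are still unread afterwards.
fn bytes_left_after_precheck(pipe1: &[u8], pipe2: &[u8]) -> io::Result<usize> {
    let mut p1 = pipe1;
    let mut p2 = pipe2;
    precheck_identical(&mut p1, &mut p2)?;
    Ok(p2.len())
}
```

With an empty first input and a 12006-byte second input, the sketch
leaves 3814 bytes unread: the first 8192 bytes were taken by the
pre-check and would never reach a later comparison pass.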
This problem can be seen with this minimal reproducible example:
```
# create a file larger than BufReader's internal buffer (8 KiB)
$ for i in {00000..2000}; do echo $i; done > f
# run comm with two regular files: this works and reports no errors
$ comm /dev/null f
00000
...
02000
# run comm with two named pipes: the following will expand to something
# like `comm /dev/fd/63 /dev/fd/62`; the output should be the same as
# above, but because `are_files_identical()` consumes some blocks of
# data, the file will appear not in sorted order to `comm` and some
# bytes will be missing
$ comm <(< /dev/null) <(< f)
00000
...
01364
comm: file 2 is not in sorted order
01
comm: input is not in sorted order
```
This commit fixes the problem by removing `are_files_identical()` and
instead tracking whether the files have the same contents with a flag
(`files_differ`) that is updated in the main loop. This implementation
matches the behavior of GNU comm more closely.
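The single-pass idea can be sketched as follows. This is an
illustration under stated assumptions, not the code in this commit:
`CommResult`, `comm_single_pass()` and `out_of_order()` are hypothetical
names, inputs are in-memory lines compared bytewise, and the order check
is assumed to be consulted only once `files_differ` has been set.

```rust
use std::cmp::Ordering;

/// Outcome of the sketch (hypothetical type).
struct CommResult {
    /// (column, line): 1 = only in first, 2 = only in second, 3 = in both.
    lines: Vec<(u8, String)>,
    /// True if an input went out of order after the inputs had diverged.
    order_error: bool,
}

/// True if `next` sorts before `prev`.
fn out_of_order(prev: Option<&str>, next: Option<&str>) -> bool {
    matches!((prev, next), (Some(p), Some(n)) if n < p)
}

/// Merges two line streams in one pass, reading each line exactly once.
fn comm_single_pass<'a>(
    a: impl IntoIterator<Item = &'a str>,
    b: impl IntoIterator<Item = &'a str>,
) -> CommResult {
    let (mut a, mut b) = (a.into_iter(), b.into_iter());
    let (mut cur_a, mut cur_b) = (a.next(), b.next());
    let mut files_differ = false;
    let mut order_error = false;
    let mut lines = Vec::new();

    loop {
        // Decide which column the current line belongs to and which
        // input(s) to advance.
        let (column, line, adv_a, adv_b) = match (cur_a, cur_b) {
            (Some(x), Some(y)) => match x.cmp(y) {
                Ordering::Less => (1, x, true, false),
                Ordering::Greater => (2, y, false, true),
                Ordering::Equal => (3, x, true, true),
            },
            (Some(x), None) => (1, x, true, false),
            (None, Some(y)) => (2, y, false, true),
            (None, None) => break,
        };
        if column != 3 {
            files_differ = true; // a line without a partner was seen
        }
        lines.push((column, line.to_string()));

        if adv_a {
            let next = a.next();
            order_error |= files_differ && out_of_order(cur_a, next);
            cur_a = next;
        }
        if adv_b {
            let next = b.next();
            order_error |= files_differ && out_of_order(cur_b, next);
            cur_b = next;
        }
    }
    CommResult { lines, order_error }
}
```

In this sketch, two inputs that are both `b`, `a` produce no order error
because they never diverge, while `b`, `a` against `c` does, since the
inputs have already diverged when the first one goes out of order.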
It's worth noting that the implementation using `are_files_identical()`
was prone to race conditions and did not fully match the behavior of
GNU comm, which allows two files to be *partially* identical and not
sorted.