Linux File I/O
Gabriel M.
Linux Systems Engineer | IT Infrastructure | Security | Virtualization | Automation | AI | C and Shell Scripting
When dealing with file I/O through system calls, open(), read(), write() and close() are the four functions that perform their namesake operations. These functions use file descriptors to reference open files. A file descriptor can be thought of as a small "handle" used to get to the file, and it is represented by a non-negative integer. The cool thing about the Linux I/O model is that it is universal: a file descriptor can refer to any type of file, including terminals, devices, pipes, sockets and FIFOs, as well as regular files.
The three basic file descriptors, standard input, standard output and standard error, are made available to running programs, which inherit them from the shell that runs them. These descriptors are identified by fixed numbers: stdin is 0, stdout is 1 and stderr is 2.
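As a minimal sketch of how a program can use those numbers directly, the helpers below write a string to a given descriptor with write(). The names put_string and demo_standard_descriptors are hypothetical, made up for this illustration; STDOUT_FILENO and STDERR_FILENO are the symbolic names that unistd.h provides for 1 and 2.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a C string to the given descriptor.
 * Returns the number of bytes written, or -1 on error. */
ssize_t put_string(int fd, const char *text)
{
    return write(fd, text, strlen(text));
}

/* Hypothetical demo: one line to descriptor 1, one line to descriptor 2.
 * Returns 0 on success, -1 if either write fails. */
int demo_standard_descriptors(void)
{
    if (put_string(STDOUT_FILENO, "this goes to descriptor 1\n") == -1)
        return -1;
    if (put_string(STDERR_FILENO, "this goes to descriptor 2\n") == -1)
        return -1;
    return 0;
}
```

Calling demo_standard_descriptors() from main() and running the program with 2> errors.txt should leave only the first line on the terminal, since the second one is written to descriptor 2.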
Because the standard descriptors have fixed numbers, those numbers often appear next to I/O redirection operators in shell scripts. As an example, the standard output of an ls command could be redirected to the file ls_output.txt while, at the same time, the error output of that command is redirected to another file, ls_output_errors.txt, as follows:
ls > ls_output.txt 2> ls_output_errors.txt
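To send both outputs to the same file, the above command could be rewritten as:
ls > ls_output_all.txt 2>&1
Here 2>&1 means "redirect file descriptor 2 to the same place as file descriptor 1". The order matters: the shell processes redirections from left to right, so 2>&1 must come after > ls_output_all.txt. Written the other way around, the errors would still go to the terminal.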
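Inside a program, the same kind of redirection can be sketched with the dup2() system call, which makes one descriptor refer to the same open file as another. The helper name redirect_fd is hypothetical and exists only for this illustration; this is a sketch of the idea, not a description of how any particular shell implements it.

```c
#include <unistd.h>

/* Hypothetical helper: make `fd` refer to the same open file as `target`,
 * in the spirit of the shell's fd>&target syntax.
 * Note the argument order of dup2(): dup2(oldfd, newfd).
 * Returns 0 on success, -1 on error. */
int redirect_fd(int fd, int target)
{
    if (dup2(target, fd) == -1)
        return -1;
    return 0;
}
```

With this helper, redirect_fd(STDERR_FILENO, STDOUT_FILENO) plays the role of 2>&1: from that point on, anything written to descriptor 2 ends up wherever descriptor 1 points.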
To show file descriptors at work inside a program, here is a small C example. The following program simply copies data from input to output, making use of the four system calls described above.
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define COPY_BUFFER_SIZE 4096

int main(int argc, char *argv[])
{
    int in = 0;
    int out = 0;
    int openFlags = 0;
    mode_t filePermissions = 0;
    ssize_t bytesRead = 0;
    char copyBuffer[COPY_BUFFER_SIZE];

    /* Check command line args */
    if (argc != 3 || strcmp(argv[1], "--help") == 0) {
        fprintf(stderr, "Usage: %s src-file dst-file\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    /* Try to open input/output files */
    in = open(argv[1], O_RDONLY);
    if (in == -1) {
        fprintf(stderr, "Error opening %s\n", argv[1]);
        exit(EXIT_FAILURE);
    }

    openFlags = O_CREAT | O_WRONLY | O_TRUNC;
    filePermissions = S_IRUSR | S_IWUSR |  /* user is rw- */
                      S_IRGRP |            /* group is r-- */
                      S_IROTH;             /* others are r-- */

    out = open(argv[2], openFlags, filePermissions);
    if (out == -1) {
        fprintf(stderr, "Error opening %s\n", argv[2]);
        exit(EXIT_FAILURE);
    }

    /* While we have data to read from source (and get no input error) */
    while ((bytesRead = read(in, copyBuffer, COPY_BUFFER_SIZE)) > 0) {
        /* we write to the output and check if all was written */
        if (write(out, copyBuffer, bytesRead) != bytesRead) {
            fprintf(stderr, "Fatal error! Could not write whole buffer\n");
            exit(EXIT_FAILURE);
        }
    }

    /* Did read() fail? */
    if (bytesRead == -1) {
        fprintf(stderr, "Fatal error [read]\n");
        exit(EXIT_FAILURE);
    }

    if (close(in) == -1) {
        fprintf(stderr, "Error closing input!\n");
        exit(EXIT_FAILURE);
    }

    if (close(out) == -1) {
        fprintf(stderr, "Error closing output!\n");
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}
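The program above treats a short write as a fatal error. That is fine for a small example, but write() is allowed to transfer fewer bytes than requested, for instance when the output is a pipe or a socket. A common way to cope with that is to retry until everything has been written. The helper below is a minimal sketch of that idea; the name write_all is hypothetical and not part of any standard library.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `count` bytes
 * from `buffer` have been written to `fd`.
 * Returns 0 on success, -1 on error (errno is left as set by write()). */
int write_all(int fd, const void *buffer, size_t count)
{
    const char *next = buffer;
    size_t remaining = count;

    while (remaining > 0) {
        ssize_t written = write(fd, next, remaining);

        if (written == -1) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, try again */
            return -1;      /* a real error */
        }

        /* Advance past the part that was actually written */
        next += written;
        remaining -= (size_t)written;
    }

    return 0;
}
```

In the copy loop, the check could then become write_all(out, copyBuffer, (size_t)bytesRead) == -1, so that only a genuine error stops the copy.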
The program can be compiled and run as:
gcc copy.c -o copy
./copy copy.c copy2.c
./copy /dev/tty1 copy_of_terminal.log
./copy netinst.iso copy.iso
The interesting thing here is that, because of the universal I/O model, the system calls used inside the program do not care where the data is coming from or going to. They just see "input" and "output", which could be a regular file, as in the first example above, a terminal (/dev/tty1) or some other device or file type. That works because all the details of whether the data comes from or goes to a file system or a device are handled by the kernel, and the programmer does not need to care about them.
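Even though read() and write() do not need to know it, a program can still ask the kernel what kind of file sits behind a descriptor. The sketch below does that with fstat() and the S_IS* macros; the helper name describe_fd and the returned strings are hypothetical, chosen only for this illustration, and the _POSIX_C_SOURCE definition is an assumption made so that all the macros are visible when compiling in a strict standard mode.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: request POSIX.1-2008 definitions */

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return a short description of the kind of file
 * that `fd` refers to, based on the mode reported by fstat(). */
const char *describe_fd(int fd)
{
    struct stat info;

    if (fstat(fd, &info) == -1)
        return "invalid descriptor";

    if (S_ISREG(info.st_mode))  return "regular file";
    if (S_ISDIR(info.st_mode))  return "directory";
    if (S_ISCHR(info.st_mode))  return "character device";
    if (S_ISBLK(info.st_mode))  return "block device";
    if (S_ISFIFO(info.st_mode)) return "FIFO or pipe";
    if (S_ISSOCK(info.st_mode)) return "socket";

    return "other";
}
```

Calling describe_fd(STDIN_FILENO) in a program started from a terminal would typically report a character device, while the same call in a program fed through a shell pipe would report a FIFO or pipe. The read() calls that follow look exactly the same in both cases.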
Happy coding!
-- FIN