What REALLY happens when you type ls -l in the shell
Before you carry on reading: if you're already familiar with the shell and the ls -l command, you can skip the introduction.
If you are not, then the introduction is for you.
***
Overview:
Introduction
What does the shell actually do?
Shell prompt
Read entered line
About getline
Split line to array of words
Types of commands
What is the PATH?
Deeper look/ Stat system call
Command execution/ fork, execve and wait
What is fork() and why do we need it?
What happens after execution?
Time to exit the shell
***
INTRODUCTION
What is the shell?
The shell is a program that takes commands from the keyboard via the terminal and gives them to the operating system to perform.
A terminal is simply a program that opens a window and lets you interact with the shell.
An operating system is the software that runs the computer. It manages the computer's memory and processes, as well as all of its software and hardware.
The kernel is the core of the operating system.
How does this work?
Most people are used to simply using their mouse and clicking on whatever they like. For example, if they want to open a file, they simply double-click on it.
But when you interact with the shell, you can't use your mouse! Instead, you:
1- Open the terminal in order to interact with the shell
2- Type your command to tell your shell what you want to do
3- The shell receives your command and gives it to the operating system to perform
4- Your command is executed
What is ls? (it is "LS", not "is"!)
The ls command is used to list the content of a directory. That means, it lists the names of all files and directories that are in your current directory.
In my example, in my home directory, I typed the command ls and then pressed ENTER. As a result, the directories and files in my home directory were printed out (Bureau, examples.desktop, Modèles, Public, Vidéos, Documents, Images, Musique, Téléchargements).
What does -l refer to?
"-l" is an option of the ls command that means Long Listing Format.
In other words, when we enter "ls -l" in the shell, we get a listing with one file per line, showing each file's permissions, number of links, owner, group, size, modification time and name.
What does the shell actually do?
Shell prompt
First of all, before you type anything, the shell prints a prompt, which normally ends with a $ sign. The prompt, or command line, is where you'll type your command.
Read entered line
Second, after you type your command, the shell reads what you typed using the getline function.
The getline function reads the entered line from the standard input as one string and stores it in a buffer.
About getline:
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
The getline function ultimately reads using the read system call. It receives a file pointer (stream), which is the file to be read; the address of a pointer to a string (lineptr), which will be used to store the read line; and a pointer to a size_t (n), which holds the size of the buffer *lineptr.
If *lineptr is NULL, getline allocates the buffer itself. Also, if the buffer is not large enough to hold the line, getline() resizes it with realloc, updating *lineptr and *n as necessary.
After reading the line and storing it in the buffer, getline returns an ssize_t which is equal to:
1- The number of characters read on success, including the newline character but not the terminating null byte of the string.
OR
2- -1 on failure to read a line (including the end-of-file condition).
N.B. The programmer must free the buffer once it is no longer needed, even if getline failed.
Split line to array of words
Once it has the whole command in a single string, the shell splits the string into an array of words using the function strtok with the delimiters " " (space) and "\n" (the newline character). That means, if we entered "ls -l /tmp", the string becomes this array: {"ls", "-l", "/tmp"}, followed by a terminating NULL.
- First element of array = "ls"
- Second element of array = "-l"
- Third element of array = "/tmp"
- Fourth element of array = NULL
You can see here that the first word in our array is the entered command, which is "ls", and the rest of the words are (generally) the arguments of that same command.
Now, it's easy for the shell to know what command we entered.
Types of commands
Before moving on to the next steps, let's look at the 4 different kinds of commands:
1- An alias. Commands that you can define, built from other commands.
2- A command built into the shell itself. The cd command, for example, is a shell built-in.
3- An executable program like the ls command.
4- A shell function. These are miniature shell scripts incorporated into the environment.
After putting the command as the first element in the array of words, the shell checks the type of the command. If the command is an executable program, then the shell needs to parse the PATH.
What is the PATH?
The PATH is an environment variable that tells the shell which directories to search for executable files. The directories are separated by colons ':'.
You can print all of your environment variables using the command "env" or "printenv".
In the picture below, I entered the command "env" and as you can see, the PATH variable is in the list.
When the command is an executable program, the shell searches the directories in PATH, one after another, in order to find the directory that contains the given command. For example, the directory that contains "ls" is "/bin". So the pathname of "ls" is "/bin/ls".
Deeper look/ Stat system call
In more detail, the shell takes the name of the directory, concatenates it with "/" and the command name, then uses the "stat" system call to check whether the resulting string is a valid pathname.
int stat(const char *pathname, struct stat *statbuf);
For example, the first directory in our PATH variable is /usr/local/sbin. The shell concatenates the 3 strings "/usr/local/sbin" + "/" + "ls", and the resulting string is "/usr/local/sbin/ls". Then, the shell passes the result to the stat system call.
The stat system call returns 0 if the pathname refers to an existing file, or -1 if it does not, for example because the command is not in that directory.
In our example, stat returns -1 because the command ls exists in the directory /bin, NOT in /usr/local/sbin.
The shell keeps going through the PATH directories, testing with stat each time, until it finds the right directory for "ls".
Command execution/ fork, execve and wait
After finding the location/pathname of the command, the shell executes it.
The execution of commands happens using the system calls: fork(), execve() and wait().
The function execve executes the program referred to by the pathname. The pathname must be either a binary executable, or a script starting with a line of the form: #!interpreter [optional-arg]
The function execve receives 3 arguments:
int execve(const char *pathname, char *const argv[], char *const envp[]);
1- The pathname of the program. For example, "/bin/ls"
2- The array of argument strings passed to the program, terminated by NULL. By convention, the first element of the array is the name of the program. For example, {"/bin/ls", "-l", "/tmp", NULL}
3- The array of environment variables, also terminated by NULL
If there's an error, for example if execve is given an invalid pathname, execve returns -1.
On success, execve() does not return: the program that called it is replaced by the new program inside the same process. That means that if the shell called execve directly, the shell itself would be GONE!
What is fork() and why do we need it?
In order to avoid losing the shell after calling execve, the system call fork() comes into play.
d = fork();
The fork() system call creates a new process by duplicating the calling process. The new process is referred to as the child process. The calling process is referred to as the parent process. In the child, fork() returns 0; in the parent, it returns the process ID of the child.
A child process is needed when we want the child to do some task and exit without the parent process exiting.
So before using the execve function, the shell creates a child process using fork(). Then it calls execve only inside the child process. When execve succeeds, the command takes over the child process; when the command finishes, only the child process exits, while the parent process (the shell) keeps on running.
In the meantime, the parent process uses the wait() system call in order to suspend its execution until its child process terminates.
wait(&status);
What happens after execution?
After the command and its arguments have been executed, the shell frees the allocated memory areas and prints the prompt again, waiting for the next command.
The process is repeated over and over till the user exits the shell.
Time to exit the shell
Exiting the shell can happen in 3 ways:
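1- Typing the command "exit" or "exit n" (where n is an integer used as the exit status)
2- Pressing Ctrl + D (which signals EOF, the end of the standard input)
3- Pressing Ctrl + C (which sends the SIGINT signal; a simple shell that does not handle this signal is terminated, whereas shells such as bash catch it and stay at the prompt)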
Written by:
Chokri Inès
Matri Mariem
Software Engineer