How ls -l *.c works in shell
Today we are going to see how the shell works, and along the way we will see how ls -l *.c does its job. We will explain, in detail, how a command you type can make the computer do so many things. By the end, hopefully, you will have a deeper understanding of the shell.
Let us first discuss the shell.
A shell is a program that takes input from the user and passes it to the operating system to be executed. A shell can be a command-line interface, where commands are typed into a terminal, or a graphical interface, the more user-friendly kind of software found in Windows.
How Command-line Runs?
The shell shows a “$”, also called a PROMPT, and waits for you to type something. After each command it shows the prompt again, because it was programmed to run in an infinite loop. We have all had our share of accidental infinite loops in our code, but this one is done on purpose. We can create one using for(;;), while(1), and so on.
Every time a process starts, it has three streams open: standard input (0), standard output (1) and standard error (2). Each stream is identified by a file descriptor, which is the number shown in parentheses. To make sure that the input is coming from a terminal, isatty is used. It checks whether STDIN_FILENO (0) refers to a terminal and returns 1 if it does. If it does not, isatty returns 0 and sets errno to ENOTTY, whose message is “Inappropriate ioctl for device”.
When isatty returns 1, we are in an interactive terminal, so we can use write to print our prompt to STDOUT_FILENO (1). write is one of many system calls. When a system call is made, the work is done by the operating system, and once it finishes, control returns to the program that made the call.
How Command Works?
So you might ask “why isn’t it just printing “$” a million times like my code does?”
The reason is that it is waiting for you to enter a command. Some of us might have used gets, scanf, getchar or the newer getline. They all share the task of waiting for input. I saw how getline was used to create a shell, so I will discuss it a bit. This is the syntax for getline.
ssize_t getline(char **string, size_t *n, FILE *stream);
After getline returns, string holds ls -l *.c with ‘\n’, the newline character, at the end. To process ls -l *.c, the shell needs to separate the command from its arguments. For this we can call the strtok function. strtok takes a big string and splits it into smaller strings wherever it finds a delimiter. The delimiters are chosen by the caller. In the case of ls -l *.c, we can use " \n", that is, space and newline. We will then have 3 strings: ls, -l and *.c.
In the picture we see that each small string between the delimiters is stored in a string called token. Each token is then stored in tok, a NULL-terminated array of strings.
Where does the command go?
A shell command can be one of the 4 types above: built-in, executable, alias or shell script. Built-ins are part of the shell itself, so changing them requires modifying the shell. Executables are external programs that need to be loaded and executed; they are found in the directories listed in PATH. Aliases are shortcut names defined by the user. A shell script is a file containing a series of commands for the shell to execute.
In the above image we see three commands, their command type, and their location. As we can see, ‘cd’ has no location because it is part of the shell; no separate program needs to be called for it to execute.
You might be asking “how can shell call another program to execute when it is already a program itself?”
I will answer that question with a question “did you know that programs can have children that look exactly like them?”
Technically speaking, they aren’t exactly like the parent because they differ in their ID. Yup, processes have IDs. For fun, try to see what the functions getpid and getppid do.
If you read about those, you might have seen a system call named fork. fork creates a child process that is a duplicate of the running process, but with a different ID.
pid holds the return value of fork. fork returns 0 in the child process, the child’s ID (a value greater than 0) in the parent process, and -1 if no child could be created. So, logically speaking, if we have 2 duplicate programs, where does the other program that needs to be executed go? In the child.
The EXECUTION
Which executioners exist in C? There are many, all sharing the first name ‘exec’. Their purpose is to run an external program in place of the program that calls them.
Let us look at one of them.
int execve(const char *pathname, char *const argv[],
char *const envp[]);
execve doesn’t return on success: the new program replaces the calling one, so any code that comes after the execve call is never executed. As mentioned earlier, execution happens in the child process.
While execution happens in the child process, what happens to the parent process?
The parent process waits patiently until the child process terminates, either because it finished executing or because it was interrupted by a signal. Calling wait tells the parent to block until the child’s status is reported before it proceeds.
How does ls -l *.c fit in all of this
So we have seen that ls -l *.c gets broken down into small strings. We have also seen that ls can be an alias, and that the program behind it is stored in /usr/bin/ls.
When a command is given, it is searched for among the built-ins, the aliases and the directories in PATH. When it is found, it gets executed.
typedef struct alias {
    char *name;      /* what the user types */
    char *command;   /* what the shell runs instead */
} alias;

alias aliases[] = {
    {"ls", "ls --color=auto"},
    {NULL, NULL}
};

/* Return the command behind args[0], or NULL if args[0] is not an alias */
char *find_alias(char **args)
{
    int x = 0;

    while (aliases[x].name != NULL)
    {
        if (strcmp(args[0], aliases[x].name) == 0)
            return (aliases[x].command);
        x++;
    }
    return (NULL);
}
Here we create an alias table. If our first argument matches an alias name, the lookup returns the command associated with it, and that is what gets executed.
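typedef struct builtin {
    char *name;          /* command name as typed by the user */
    int (*func)(void);   /* function that implements the command */
} built;

/* cd_function, help_function and exit_function are defined elsewhere in the shell */
built built_in[] = {
    {"cd", cd_function},
    {"help", help_function},
    {"exit", exit_function},
    {NULL, NULL}
};

/* Return the function for args[0], or NULL if args[0] is not a built-in */
int (*find_builtin(char **args))(void)
{
    int x = 0;

    while (built_in[x].name != NULL)
    {
        if (strcmp(args[0], built_in[x].name) == 0)
            return (built_in[x].func);
        x++;
    }
    return (NULL);
}
Here we created a lookup table for the built-ins. If our argument is one of the built-ins, the matching function is called to execute it.
If the command isn’t found among the aliases or built-ins, it will be searched for in the environment, more specifically in the PATH.
This big collection is called the environment, and each entry has the format “name=value”. We saw it earlier as envp in execve. Out of all of these we will need PATH.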
extern char **environ;
int x = 0;

/* each entry looks like "NAME=value", so compare only the "PATH=" prefix */
while (environ[x] != NULL)
{
    if (strncmp(environ[x], "PATH=", 5) == 0)
        return (environ[x]);
    x++;
}
/** ENVIRON IS A POINTER TO AN ARRAY OF STRINGS, SO WE HAVE TO GO THROUGH IT TO FIND PATH. SO ENVIRON[0] = SHELL AND ENVIRON[1] = PWD, ACCORDING TO THE FIGURE ABOVE. **/
environ[x] = "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
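/** THIS IS PATH IN ITS UNTOUCHED FORM. BUT WE ONLY NEED THE INDIVIDUAL DIRECTORIES, SO WE WILL USE STRTOK **/
/* path must be a writable copy of environ[x], because strtok modifies the string it splits */
x = 0;
token = strtok(path, "=");   /* the first token is the name "PATH", which we skip */
token = strtok(NULL, ":");   /* the first directory */
while (token != NULL)
{
    dirs[x] = strdup(token);
    token = strtok(NULL, ":");
    x++;
}
dirs[x] = NULL;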
/**NOW WE HAVE ALL THE DIRECTORIES IN THEIR OWN STRING. SO DIRS[0] = "/usr/local/sbin", AND DIRS[1] = "/usr/local/bin" AND SO ON**/
/* args[0] is "ls", tok[0] of the tokenized string */
char cmd[1024];

x = 0;
while (dirs[x] != NULL)
{
    /* build "directory/command", for example "/usr/bin/ls" */
    snprintf(cmd, sizeof(cmd), "%s/%s", dirs[x], args[0]);
    if (access(cmd, F_OK) == 0)
    {
        args[0] = cmd;
        break;
    }
    x++;
}
/* EACH DIRECTORY IS CONCATENATED WITH A FORWARD SLASH "/" AND THE COMMAND GIVEN, AND THE EXISTENCE OF THE RESULT IS CHECKED. IF IT EXISTS, IT WILL BE EXECUTED BY EXECVE */
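How will “-l” and “*.c” work then? As mentioned earlier, execve passes the command-line arguments on to the new process. “-l” is argv[1]. In the simple shell described here, “*.c” is passed along unchanged as argv[2]. A full shell such as bash first expands “*.c” into the names of the matching files, so ls receives those names as argv[2], argv[3] and so on.
“ls” lists files, “-l” makes that listing long, and “*.c” limits it to the files whose names end in “.c”.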
Acknowledgement
We acknowledge the articles written by Robert Malmstein, Ricardo Hincapie and Carlos Barros, which guided us into the subject.
Authors
Samra Solomon
Zelalem Welelaw