How a Process Dies
When a process dies, it either dies normally by electing to exit, or abnormally as the result of receiving a signal. Processes leave behind a status code (also called a return value or exit status) when they die, in the form of an integer. (Recall the bash shell, which uses the $? variable to store the return value of the previously run command.) When a process exits, all of its resources are freed, except the status code (and some resource utilization accounting information). It is the responsibility of the process’s parent to collect this information and free up the last remaining resources of the dead child. For example, when the bash shell forks and execs the chmod command, it is the parent bash shell’s responsibility to collect the return value from the exited chmod command.
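The fork, exec, and collect sequence described above can be sketched in a few lines of Python. This is only an illustration, assuming a Unix-like system and Python 3; the function name run_and_collect is made up for this example, and the value 127 for a failed exec simply mirrors the shell convention.

```python
import os
import sys


def run_and_collect(argv):
    """Fork, exec argv in the child, and report how the child died."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested command.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; 127 mirrors the shell convention.
            os._exit(127)
    # Parent: block until the child dies, then decode its status word.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))   # normal death
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))    # killed by a signal
    return ("other", status)


if __name__ == "__main__":
    # A child that elects to exit with status 3.
    print(run_and_collect([sys.executable, "-c", "raise SystemExit(3)"]))
```

The two branches after waitpid correspond to the two ways of dying: a normal exit carries an exit status, while an abnormal death carries the number of the signal that killed the process.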
If it is a parent’s responsibility to clean up after their children, what happens if the parent dies before the child does? The child becomes an orphan. One of the special responsibilities of the first process started by the kernel is to “adopt” any orphans. (Notice that in the output of the pstree command, the first process has a disproportionately large number of children. Most of these were adopted as the orphans of other processes).
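Adoption can be observed from inside an orphan by watching its parent PID change. The sketch below assumes a Unix-like system and Python 3; orphan_demo is a hypothetical name. On many modern systems the adopter is not necessarily PID 1 (a session manager may be registered as a "subreaper"), so the code only checks that the parent changed, not what it changed to.

```python
import os
import time


def orphan_demo():
    """Return (pid of the dead parent, parent pid the orphan sees afterwards)."""
    read_end, write_end = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Middle process: start a grandchild, then die immediately.
        os.close(read_end)
        my_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait (at most about 5 s) until it has been adopted.
            for _ in range(500):
                if os.getppid() != my_pid:
                    break
                time.sleep(0.01)
            os.write(write_end, f"{my_pid} {os.getppid()}".encode())
            os._exit(0)
        os._exit(0)
    # Top-level parent: reap the middle process, then read the orphan's report.
    os.close(write_end)
    os.waitpid(middle, 0)
    report = os.read(read_end, 64).decode()
    os.close(read_end)
    old_parent, new_parent = (int(field) for field in report.split())
    return old_parent, new_parent


if __name__ == "__main__":
    print(orphan_demo())
```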
In between the time when a process exits, freeing most of its resources, and the time when its parent collects its return value, freeing the rest of its resources, the child process is in a special state referred to as a Zombie. Every process passes through a transient zombie state. Usually, users need to be looking at just the right time (with the ps command, for example) to witness a zombie. Zombies show up in the list of processes, but they consume no memory and no CPU time; all that remains is an entry in the kernel’s process table holding the status code. They are just the shadow of a former process, waiting for their parent to come and finish them off.
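The transient zombie state can be caught deliberately by delaying the parent's wait. This is a rough sketch, assuming Linux with /proc mounted and Python 3; proc_state and zombie_demo are names invented for the example.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        # The state is the first field after the parenthesised command name.
        return f.read().rsplit(")", 1)[1].split()[0]


def zombie_demo():
    """Return the child's state while unreaped, and whether it is gone afterwards."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # Child exits at once.
    # Poll (for at most about 5 s) until the child has finished exiting.
    state = proc_state(pid)
    for _ in range(500):
        if state == "Z":
            break
        time.sleep(0.01)
        state = proc_state(pid)
    os.waitpid(pid, 0)  # Collect the status; this releases the zombie.
    gone = not os.path.exists(f"/proc/{pid}")
    return state, gone


if __name__ == "__main__":
    print(zombie_demo())
```

Until waitpid is called the child stays listed with state Z; once its status has been collected, its entry disappears from the process list.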
Negligent Parents and Long Lived Zombies
Occasionally, parent processes can be negligent. They start child processes, but then never go back to clean up after them. When this happens (usually because of a programmer’s error), the child can exit, enter the zombie state, and stay there. This is usually the case when users witness zombie processes using, for example, the ps command.
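For contrast, a parent that is not negligent collects its dead children as a matter of routine. One common pattern, sketched here under the assumption of a Unix-like system and Python 3, is a non-blocking loop that can be called from the program's main loop or in response to SIGCHLD. The name reap_finished_children is hypothetical.

```python
import os


def reap_finished_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # This process has no children left at all.
        if pid == 0:
            break  # Children exist, but none of them has exited yet.
        # Record the exit status, or None if the child was killed by a signal.
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        reaped.append((pid, code))
    return reaped
```

Because WNOHANG makes waitpid return immediately, the parent can call this as often as it likes without stalling its own work.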
Getting rid of zombies is perhaps the most misunderstood basic Linux (and Unix) concept. Many people will say that there is no way to get rid of them, except by rebooting the machine. Using the clues discussed in this post, can you figure out how to get rid of long-lived zombies?
The 5 Process States
The previous section discussed how processes are started, and how they die. While processes are alive they are always in one of five process states, which affect how and when they are allowed to have access to the CPU. The following lists each of the five states, along with the conventional letter that is used by the ps, top, and other commands to identify a process’s current state.
Runnable (R)
Processes in the Runnable state are processes that, if given the opportunity to access the CPU, would take it. Multiple processes can be (and often are) in the runnable state, but because each CPU can execute only one process at a time, only one of these processes per CPU will actually be running at any given instant. Because runnable processes are switched in and out of the CPU so quickly, however, the Linux system gives the appearance that all of the processes are running simultaneously.
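A process that always has work to do stays runnable. As an illustration (assuming Linux with /proc mounted and Python 3; proc_state and busy_child_state are made-up names), the sketch below starts a child that spins in an empty loop and reads its state letter.

```python
import os
import signal
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]


def busy_child_state():
    """Start a child that never stops computing and report its state."""
    pid = os.fork()
    if pid == 0:
        while True:
            pass  # Always wants the CPU, so it stays runnable.
    try:
        # Poll (for at most about 5 s) until the child is seen as runnable.
        state = proc_state(pid)
        for _ in range(500):
            if state == "R":
                break
            time.sleep(0.01)
            state = proc_state(pid)
        return state
    finally:
        os.kill(pid, signal.SIGKILL)  # Stop burning CPU.
        os.waitpid(pid, 0)            # Do not leave a zombie behind.


if __name__ == "__main__":
    print(busy_child_state())
```

The letter R covers both cases: the child is reported as runnable whether it is on a CPU at that instant or waiting for its turn.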
Voluntary (Interruptible) Sleep (S)
As the name implies, a process that is in a voluntary sleep elected to be there. Usually, this is a process that has nothing to do until something interesting happens. A classic example is a networking daemon, such as the httpd process that implements a web server. In between requests from a client (web browser), the server has nothing to do, and elects to go to sleep. Another example would be the top command, which refreshes its list of processes every few seconds. While it is waiting for that interval to pass, it drops itself into a voluntary sleep. When something that the process is interested in happens (such as a web client making a request, or the refresh timer expiring), the sleeping process is kicked back into the Runnable state.
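A timed sleep is the simplest way to see this state. The following sketch assumes Linux with /proc mounted and Python 3; the helper names are invented for the example. The child elects to sleep, and the parent reads its state letter.

```python
import os
import signal
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]


def sleeping_child_state():
    """Start a child that waits on a timer and report its state."""
    pid = os.fork()
    if pid == 0:
        time.sleep(30)  # Nothing to do until the timer expires.
        os._exit(0)
    try:
        # Poll (for at most about 5 s) until the child has gone to sleep.
        state = proc_state(pid)
        for _ in range(500):
            if state == "S":
                break
            time.sleep(0.01)
            state = proc_state(pid)
        return state
    finally:
        os.kill(pid, signal.SIGKILL)  # An interruptible sleep can be ended by a signal.
        os.waitpid(pid, 0)


if __name__ == "__main__":
    print(sleeping_child_state())
```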
Involuntary (Non-interruptible) Sleep (D)
Occasionally, two processes try to access the same system resource at the same time. For example, one process attempts to read from a block on a disk while that block is being written to on behalf of another process. In these situations, the kernel forces the process into an involuntary sleep. The process did not elect to sleep; it would prefer to be runnable so it can get things done. When the resource is freed, the kernel will put the process back into the runnable state.
Although processes are constantly dropping into and out of involuntary sleep, they usually do not stay there long. As a result, users do not usually witness processes in an involuntary sleep except on busy systems.
Stopped (Suspended) Processes (T)
Occasionally, users decide to suspend processes. Suspended processes will not perform any actions until they are resumed by the user. In the bash shell, the CTRL-Z key sequence can be used to suspend the foreground process. In programming, debuggers often suspend the programs they are debugging when certain events happen (such as when a breakpoint is reached).
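Suspending and resuming is done with signals. The shell's CTRL-Z delivers SIGTSTP to the foreground process; the sketch below uses the closely related SIGSTOP (which a process cannot catch or ignore) and SIGCONT to resume. It assumes Linux with /proc mounted and Python 3, and the function names are hypothetical.

```python
import os
import signal
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return f.read().rsplit(")", 1)[1].split()[0]


def stop_and_resume():
    """Suspend a child, resume it, and return the state seen at each step."""
    pid = os.fork()
    if pid == 0:
        while True:
            time.sleep(1)  # Idle child that just waits to be signalled.
    try:
        os.kill(pid, signal.SIGSTOP)
        os.waitpid(pid, os.WUNTRACED)  # Returns once the child has stopped.
        stopped = proc_state(pid)
        os.kill(pid, signal.SIGCONT)
        # Poll (for at most about 5 s) until the child has left the stopped state.
        resumed = proc_state(pid)
        for _ in range(500):
            if resumed != "T":
                break
            time.sleep(0.01)
            resumed = proc_state(pid)
        return stopped, resumed
    finally:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)


if __name__ == "__main__":
    print(stop_and_resume())
```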
Zombie Processes (Z)
As mentioned above, every dying process goes through a transient zombie state. Occasionally, however, some get stuck there. Zombie processes have finished executing, and have freed all of their memory and almost all of their other resources. Because all that remains is an entry in the process table, they are usually little more than an annoyance that can show up in process listings.
Viewing Process States
When viewing the output of commands such as ps and top, process states are usually listed under the heading STAT (or simply S). Each process’s state is identified by one of the following letters, which ps may follow with additional modifier characters.
- Runnable – R
- Sleeping – S
- Stopped – T
- Uninterruptible sleep – D
- Zombie – Z
Use the command shown below and monitor the column STAT to view the state of the process:
$ ps -alx
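The same state letters can be read directly from the /proc filesystem, which is where tools like ps get them. The sketch below tallies how many processes are in each state. It assumes Linux with /proc mounted and Python 3, and the function name state_counts is made up for this example.

```python
import os
from collections import Counter


def state_counts():
    """Count processes by state letter using /proc/<pid>/stat (Linux only)."""
    counts = Counter()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # Only numeric directory names are processes.
        try:
            with open(f"/proc/{entry}/stat") as f:
                # The state is the first field after the parenthesised command name.
                state = f.read().rsplit(")", 1)[1].split()[0]
        except OSError:
            continue  # The process vanished or is not readable; skip it.
        counts[state] += 1
    return counts


if __name__ == "__main__":
    for state, count in sorted(state_counts().items()):
        print(state, count)
```

At least one process is always counted under R: the script itself, which is running at the moment it reads its own entry. Depending on the kernel version, letters beyond the five discussed here may also appear.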