New processes are created with the fork system call. After fork completes there are two processes: the parent and the child.
A zombie is a child process that has terminated before its parent and that the parent has not yet reaped.
When a child process terminates, the kernel still keeps some minimal information about it in case the parent asks for it. This information lives in the task_struct, where the pid, exit code, resource usage information, etc. can be found. All other resources (memory, file descriptors, etc.) are released. For details, see the do_exit function in kernel/exit.c in the Linux kernel sources.
The parent is notified of the child's death with the SIGCHLD signal.
When the parent receives SIGCHLD, it can get the information about the child by calling one of the wait system calls (wait, waitpid, ...). In the interval between the child terminating and the parent calling wait, the child is said to be a zombie. Even though it can no longer be in the running or idle state, it still occupies an entry in the process table.
As soon as the parent has received SIGCHLD and collects the information about the dead child by calling wait, that information is passed to it and everything the kernel kept about the child is removed. The details are in the wait_task_zombie function (kernel/exit.c).
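To make the hand-over of the exit status concrete, here is a minimal sketch. The helper name run_child_and_reap is my own and not part of the original programs. It forks a child that exits with a given code, reaps it with waitpid, and decodes the status with the standard WIFEXITED and WEXITSTATUS macros.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, reap it,
 * and return the exit code seen by the parent (-1 on any failure). */
int run_child_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                        /* fork failed */
    if (pid == 0)
        _exit(code);                      /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid)  /* blocks until the child is reaped */
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);       /* low 8 bits of the exit code */
    return -1;                            /* child was killed by a signal */
}
```

A call such as run_child_and_reap(7) returns 7, and once it has returned there is no zombie left behind, because the status has already been collected.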
In fact, the memory held by a zombie is really small, but the zombie still occupies an entry in the process table. Since the process table has a fixed number of entries, it is possible for the system to run out of them.
When the parent terminates without waiting for the child, the zombie is adopted by the init process, which calls wait to clean up after it.
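The adoption step can be observed directly. The sketch below is not from the original programs: orphan_new_parent is a hypothetical helper that forks a child, lets that child fork a grandchild and exit at once, and has the grandchild report its new parent PID through a pipe. Note that it shows the adoption of a live orphan rather than of a zombie, and that on recent Linux kernels the adopter may be a "subreaper" process rather than PID 1.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo: return the PID of the process that adopted an
 * orphaned grandchild, or -1 on failure. */
pid_t orphan_new_parent(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        pid_t original_parent = getpid();
        if (fork() == 0) {
            /* Grandchild: poll until the original parent has exited. */
            struct timespec delay = { 0, 1000 * 1000 };   /* 1 ms */
            pid_t ppid;
            close(fds[0]);
            while ((ppid = getppid()) == original_parent)
                nanosleep(&delay, NULL);
            if (write(fds[1], &ppid, sizeof ppid) != (ssize_t)sizeof ppid)
                _exit(1);
            _exit(0);
        }
        _exit(0);              /* intermediate parent exits without waiting */
    }

    close(fds[1]);
    waitpid(child, NULL, 0);   /* reap the intermediate process ourselves */

    pid_t adopter = -1;
    if (read(fds[0], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
        adopter = -1;
    close(fds[0]);
    return adopter;
}
```

On a traditional setup the returned value is 1, the PID of init. Whoever the adopter is, it becomes responsible for calling wait on the grandchild.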
Let's look at a typical zombie and its code.

    #include <unistd.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        if (fork() == 0) {
            /* child: print its pid and exit immediately */
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        } else {
            /* parent: print its pid and never call wait */
            printf("%d\n", getpid());
            fflush(stdout);
            while (1)
                sleep(10);
        }
        return 0;
    }

The output should be something like:

    $ ./zombie
    4090
    4091

Grepping the output of ps, I got:

    $ ps aux | grep -E "(4090|4091)"
    niam  4090  0.0  0.0  1496  340 pts/2  S+  12:24  0:00 ./zombie
    niam  4091  0.0  0.0     0    0 pts/2  Z+  12:24  0:00 [zombie]

You can see that the child process became a zombie.
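The same check can be done from a program instead of grepping ps. This is a Linux-specific sketch and the helper names process_state and observe_zombie are my own: the first reads the state letter (the third field) from /proc/<pid>/stat, the second forks a child, waits until it shows up as Z, and then reaps it.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from /proc/<pid>/stat
 * ('R', 'S', 'Z', ...), or '?' if it cannot be read. Linux-specific. */
char process_state(pid_t pid)
{
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* Format is "pid (comm) state ..."; comm may itself contain ')',
     * so search for the last closing parenthesis. */
    char *p = strrchr(buf, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}

/* Fork a child that exits at once, poll until it is reported as a
 * zombie (up to about one second), then reap it. Returns the last
 * state observed before reaping. */
char observe_zombie(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(0);

    struct timespec delay = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    char state = '?';
    for (int i = 0; i < 100; i++) {
        state = process_state(pid);
        if (state == 'Z')
            break;
        nanosleep(&delay, NULL);
    }
    waitpid(pid, NULL, 0);     /* reap, so no zombie is left behind */
    return state;
}
```

Here observe_zombie() returns 'Z' as long as /proc is mounted, which matches the Z+ in the STAT column of the ps output.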
Zombies may not be pure evil, but they come very close, so their existence should be prevented.
There are several ways to do that.
First of all, if you care why the child finished, you should call wait. This is the only way I know of to do that.
I know two modes of waiting: blocking and non-blocking. Both are shown below.
- Blocking mode, which suspends the parent until a child terminates.

    #include <unistd.h>
    #include <stdio.h>
    #include <sys/wait.h>   /* declares wait() */

    int main(int argc, char **argv)
    {
        if (fork() == 0) {
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        } else {
            int status;
            wait(&status);   /* blocks until the child exits, then reaps it */
            printf("%d\n", getpid());
            fflush(stdout);
            while (1)
                sleep(10);
        }
        return 0;
    }

And the resulting output for this code was:

    $ ./zombie
    4949
    4950
    $ ps aux | grep -E '(4949|4950)'
    niam  4949  0.0  0.0  1496  340 pts/2  S+  13:12  0:00 ./zombie

- Non-blocking mode, which doesn't put the parent to sleep but requires multiple calls to waitpid.

    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>
    #include <sys/wait.h>

    int main(int argc, char **argv)
    {
        if (fork() == 0) {
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        } else {
            printf("%d\n", getpid());
            fflush(stdout);
            int status;
            while (1) {
                /* WNOHANG: return at once if no child has exited yet */
                waitpid(-1, &status, WNOHANG);
                sleep(10);
            }
        }
        return 0;
    }

The output was:

    $ ./zombie
    4932
    4931
    $ ps aux | grep -E '(4931|4932)'
    niam  4931  0.0  0.0  1496  336 pts/2  S+  13:07  0:00 ./zombie

Another approach is to disregard the child's exit status and detach it.
- Change the SIGCHLD action to set the SA_NOCLDWAIT flag for it.

    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>

    int main(int argc, char **argv)
    {
        struct sigaction sa;

        /* read the current SIGCHLD action and add the flag to it */
        sigaction(SIGCHLD, NULL, &sa);
        sa.sa_flags |= SA_NOCLDWAIT; /* (since POSIX.1-2001 and Linux 2.6 and later) */
        sigaction(SIGCHLD, &sa, NULL);

        if (fork() == 0) {
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        } else {
            printf("%d\n", getpid());
            fflush(stdout);
            while (1)
                sleep(10);
        }
        return 0;
    }

The output should be something like:

    $ ./zombie
    4416
    4417
    $ ps aux | grep -E '(4416|4417)'
    niam  4416  0.0  0.0  1496  340 pts/2  S+  12:41  0:00 ./zombie

- Set the SIGCHLD signal handler to SIG_IGN (ignore this signal).

    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>

    int main(int argc, char **argv)
    {
        struct sigaction sa;

        /* read the current SIGCHLD action and replace its handler */
        sigaction(SIGCHLD, NULL, &sa);
        sa.sa_handler = SIG_IGN;
        sigaction(SIGCHLD, &sa, NULL);

        if (fork() == 0) {
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        } else {
            printf("%d\n", getpid());
            fflush(stdout);
            while (1)
                sleep(10);
        }
        return 0;
    }

This code should produce the following output:

    $ ./zombie
    4458
    4459
    $ ps aux | grep -E '(4459|4458)'
    niam  4458  0.0  0.0  1496  340 pts/2  S+  12:45  0:00 ./zombie

Note that POSIX.1-1990 disallowed setting the action for SIGCHLD to SIG_IGN. POSIX.1-2001 allows this possibility, so that ignoring SIGCHLD can be used to prevent the creation of zombies.
In the kernel, both cases are covered by the same check:

    static int ignoring_children(struct task_struct *parent)
    {
        int ret;
        struct sighand_struct *psig = parent->sighand;
        unsigned long flags;

        spin_lock_irqsave(&psig->siglock, flags);
        ret = (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
               (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT));
        spin_unlock_irqrestore(&psig->siglock, flags);
        return ret;
    }

The kernel checks the signal handler first and the SA_NOCLDWAIT flag second.
Another problem arises when software you did not write produces zombies while it runs. There is a trick with gdb to get rid of a process's zombies: you can attach to the parent process and call wait manually.

    $ ./zombie
    4980
    4981
    $ ps aux | grep -E '(4980|4981)'
    niam  4980  0.0  0.0  1496  336 pts/2  S+  13:19  0:00 ./zombie
    niam  4981  0.0  0.0     0    0 pts/2  Z+  13:19  0:00 [zombie]
    $ gdb -p 4980
    ....
    (gdb) call wait()
    $1 = 4981
    $ ps aux | grep -E '(4980|4981)'
    niam  4980  0.0  0.0  1496  336 pts/2  S+  13:19  0:00 ./zombie
