Tuesday, December 2, 2008

linux: zombies

New processes are created with the fork system call. Once fork completes there are two processes: the parent and the child.

A zombie is a process that has terminated but has not yet been reaped by its parent.

When a child process terminates, the kernel keeps a minimal amount of information about it in case the parent asks for it later. This information lives in the task_struct and includes the pid, the exit code, resource usage statistics, etc. All other resources (memory, file descriptors, etc.) are released. For details, see the do_exit function in kernel/exit.c in the Linux kernel sources.

The parent is notified of the child's death with the SIGCHLD signal.

When the parent receives SIGCHLD it can get the information about the child by calling one of the wait/waitpid/... system calls. In the interval between the child terminating and the parent calling wait, the child is said to be a zombie. Even though it can no longer run, it still occupies a slot in the process table.
As soon as the parent calls wait, the kernel hands it the information about the dead child and then removes everything it still holds about that child. The details are in the wait_task_zombie function (kernel/exit.c).

The memory held by a zombie is really small, but the zombie still occupies a process table entry and a PID. It is never scheduled, since it has nothing left to run, but the number of processes and PIDs is limited, so a parent that keeps producing zombies can eventually prevent the system from creating new processes.

When the parent terminates without waiting for the child, the zombie is adopted by the 'init' process, which calls wait to clean up after it.
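To watch the adoption happen, here is a rough sketch in which a process loses its parent and reports who its new parent is. The function name orphan_new_parent is made up for this example. On a classic setup the adopter is init (PID 1), but on newer kernels it can also be a "subreaper" process, so the code does not assume a particular value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the PID of the process that adopted an orphaned grandchild,
   or -1 on error. */
pid_t orphan_new_parent(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t middle = fork();
    if (middle == -1)
        return -1;

    if (middle == 0) {
        /* Middle process: create the grandchild, then exit without waiting. */
        close(fds[0]);
        pid_t original_parent = getpid();
        if (fork() == 0) {
            /* Grandchild: poll until the kernel has re-parented us. */
            while (getppid() == original_parent)
                usleep(1000);
            pid_t adopter = getppid();
            if (write(fds[1], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
                _exit(1);
            _exit(0);
        }
        _exit(0);
    }

    /* Original process: reap the middle process, then read the report. */
    close(fds[1]);
    waitpid(middle, NULL, 0);
    pid_t adopter = -1;
    if (read(fds[0], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
        adopter = -1;
    close(fds[0]);
    return adopter;
}
```

Printing the return value next to getpid() shows that the orphan ends up under a process other than the one that forked it.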

Let's look at a typical zombie and the code that creates it.

#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
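    if (fork() == 0)
    {
        /* Child: print its pid and exit immediately. */
        printf("%d\n", getpid());
        fflush(stdout);
        _exit(0);
    }
    else
    {
        /* Parent: print its pid and never call wait,
           so the dead child stays a zombie. */
        printf("%d\n", getpid());
        fflush(stdout);
        while (1)
            sleep(10);
    }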
    
    return 0;
}
The output should be something like
$./zombie 
4090
4091
Grepping ps's output I got
$ps aux | grep -E "(4090|4091)"
niam      4090  0.0  0.0   1496   340 pts/2    S+   12:24   0:00 ./zombie
niam      4091  0.0  0.0      0     0 pts/2    Z+   12:24   0:00 [zombie] 
You can see that the child process became a zombie (state Z).

Zombies may not be pure evil, but they come very close, so their existence should be prevented.

There are several ways to do that.

First of all, if you care why the child finished, you should call wait. This is the only way I know to do that.
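What "why the child finished" means in practice is decoding the status value that wait fills in. A minimal sketch, using the standard WIFEXITED/WEXITSTATUS/WIFSIGNALED/WTERMSIG macros from sys/wait.h. The function name run_child_and_get_exit_code is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with `code`, waits for it and returns the
   exit status seen by the parent. Returns -1 if fork or waitpid fails
   or if the child did not exit normally. */
int run_child_and_get_exit_code(int code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    if (WIFEXITED(status))
        return WEXITSTATUS(status);   /* normal exit: low 8 bits of code */
    if (WIFSIGNALED(status))
        fprintf(stderr, "child %ld killed by signal %d\n",
                (long)pid, WTERMSIG(status));
    return -1;
}
```

For instance, run_child_and_get_exit_code(7) returns 7 in the parent, and no zombie is left behind because the child has been waited for.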

I know two modes of wait: blocking and non-blocking. Both are shown below.
  • Blocking: wait suspends the parent until a child terminates.
    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>
    #include <sys/wait.h>
    
    int main(int argc, char **argv)
    {
        if (fork() == 0)
        {   
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        }   
        else
        {   
            int status;
            wait(&status);
            printf("%d\n", getpid());
            fflush(stdout);
            while (1) 
                sleep(10);
        }   
        
        return 0;
    }
    And the resulting output for this code was
    $./zombie 
    4949
    4950
    
    $ps aux | grep -E '(4949|4950)'
    niam      4949  0.0  0.0   1496   340 pts/2    S+   13:12   0:00 ./zombie
  • Non-blocking: waitpid with WNOHANG doesn't put the parent to sleep, but it has to be called repeatedly until the child has been reaped.
    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>
    #include <sys/wait.h>
    
    int main(int argc, char **argv)
    {
        if (fork() == 0)
        {   
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        }   
        else
        {   
            printf("%d\n", getpid());
            fflush(stdout);
            int status;
            while (1) 
            {
                waitpid(-1, &status, WNOHANG);
                sleep(10);
            }
        }   
        
        return 0;
    }
    The output was
    $./zombie 
    4932
    4931
    
    $ps aux | grep -E '(4931|4932)'
    niam      4931  0.0  0.0   1496   336 pts/2    S+   13:07   0:00 ./zombie

Another approach is to disregard the child's exit status and detach it.
  • Change the SIGCHLD signal action to set the SA_NOCLDWAIT flag on it.
    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>
    
    int main(int argc, char **argv)
    {
        struct sigaction sa;
        sigaction(SIGCHLD, NULL, &sa);
        sa.sa_flags |= SA_NOCLDWAIT; /* POSIX.1-2001; Linux 2.6 and later */
        sigaction(SIGCHLD, &sa, NULL);
    
        if (fork() == 0)
        {   
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        }   
        else
        {   
            printf("%d\n", getpid());
            fflush(stdout);
            while (1) 
                sleep(10);
        }   
        
        return 0;
    }
    The output should be something like
    $./zombie 
    4416
    4417
    
    $ps aux | grep -E '(4416|4417)'
    niam      4416  0.0  0.0   1496   340 pts/2    S+   12:41   0:00 ./zombie
  • Set the SIGCHLD signal handler to SIG_IGN (ignore this signal).
    #include <unistd.h>
    #include <stdio.h>
    #include <signal.h>
    
    int main(int argc, char **argv)
    {
        struct sigaction sa;
        sigaction(SIGCHLD, NULL, &sa);
        sa.sa_handler = SIG_IGN;
        sigaction(SIGCHLD, &sa, NULL);
    
        if (fork() == 0)
        {   
            printf("%d\n", getpid());
            fflush(stdout);
            _exit(0);
        }   
        else
        {   
            printf("%d\n", getpid());
            fflush(stdout);
            while (1) 
                sleep(10);
        }   
        
        return 0;
    }
    This code should produce the following output.
    $./zombie 
    4458
    4459
    
    $ps aux | grep -E '(4459|4458)'
    niam      4458  0.0  0.0   1496   340 pts/2    S+   12:45   0:00 ./zombie
    Note that POSIX.1-1990 disallowed setting the action for SIGCHLD to SIG_IGN. POSIX.1-2001 allows it, so ignoring SIGCHLD can be used to prevent the creation of zombies.
According to the linux-2.6.27 sources, setting the signal handler to SIG_IGN might give a small performance benefit. Here is a piece of code from kernel/exit.c:
static int ignoring_children(struct task_struct *parent)     
{                               
    int ret;                                                        
    struct sighand_struct *psig = parent->sighand;     
    unsigned long flags;        
    spin_lock_irqsave(&psig->siglock, flags);            
    ret = (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||     
           (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT));     
    spin_unlock_irqrestore(&psig->siglock, flags);     
    return ret;                
}
The kernel checks the signal handler first, so with SIG_IGN the || short-circuits and the flags test is skipped.

A different problem arises when software you didn't write produces zombies while it runs. There is a gdb trick for getting rid of a process's zombies: attach to the parent process and call wait manually.
$./zombie 
4980
4981

$ps aux | grep -E '(4980|4981)'
niam      4980  0.0  0.0   1496   336 pts/2    S+   13:19   0:00 ./zombie
niam      4981  0.0  0.0      0     0 pts/2    Z+   13:19   0:00 [zombie] 

$gdb -p 4980
....
(gdb) call wait()
$1 = 4981

$ps aux | grep -E '(4980|4981)'
niam      4980  0.0  0.0   1496   336 pts/2    S+   13:19   0:00 ./zombie
