Linux Process Waiting

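💦 Why process waiting is necessary

As discussed earlier, when a child process exits and its parent ignores it, the child can linger as a "zombie process", which in turn causes a memory leak (the zombie's PCB stays allocated in the kernel). Worse, once a process is a zombie it is untouchable: even the ruthless kill -9 cannot help, because nobody can kill a process that is already dead. Finally, the parent needs to know how the task it handed to the child turned out, for example whether the child exited normally and whether its result was correct. By waiting on the child, the parent reclaims the child's resources and obtains its exit information.

Reap zombie processes to avoid the memory leak.
Obtain the child's termination status and result.

Obtaining the termination status and result is optional. Note that the two are different things: the termination status says how the process ended, while the result says what it produced.
Try to make the parent exit after its children, so that resources are reclaimed in an orderly way.

In the code we write from here on, all the real work is handed to child processes; when a child is done, the parent reaps it. This is a coding convention rather than a system-level requirement.

In fact, once we have covered signals we will see a scheme that lets the parent skip waiting for its children and still leak nothing.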

💦 Methods of process waiting

1. The wait function

#include<sys/types.h>
#include<sys/wait.h>

pid_t wait(int* status);

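Return value:
	On success, returns the pid of the child that was waited for; on failure, returns -1.
Parameter:
	status is an output parameter that receives the child's exit status; pass NULL if you do not care about it. The int* status parameter is covered in detail under waitpid below.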

✔ Test case 1:

The parent waits until the child exits, and wait returns the child's pid.

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id < 0)
    {
        perror("fork");
        return 1;
    }
    else if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            sleep(1);
        }                                                                                     
        printf("child quit...\n");
        exit(0);
    }
    else
    {
        printf("father is waiting...\n");
        pid_t ret = wait(NULL);
        printf("father is wait done, ret: %d\n", ret);
    }
    return 0;
}

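💨 Output:

✔ Test case 2:

Compared with test case 1, this makes the wait easier to observe: you can watch the child come into existence, sit as a zombie while the parent sleeps, and disappear once it is reaped.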

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id < 0)
    {
        perror("fork");
        return 1;
    }
    else if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            sleep(1);
        }                                                                                     
        printf("child quit...\n");
        exit(0);
    }
    else
    {
        printf("father is waiting...\n");
        sleep(10);
        pid_t ret = wait(NULL);
        printf("father is wait done, ret: %d\n", ret);
        sleep(3);
        printf("father quit...\n");
    }
    return 0;
}

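💨 Output:

Monitoring script (it assumes the executable is named process): while :; do ps ajx | head -1 && ps ajx | grep process | grep -v grep; sleep 1; echo "####################"; done

✔ Test case 3:

The parent forks 5 children, then waits for them one by one and reaps the zombies.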

#include<stdio.h> 
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    int i = 0;
    while(i < 5)
    {    
        pid_t id = fork();
        if(id < 0)
        {
            perror("fork");
            return 1;
        } 
        if(id == 0)
        {
            int count = 5;
            while(count)
            {
                printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
                sleep(1);
            }                                                                                     
            printf("child quit...\n");
            exit(0);        
        }
        i++;
    }
    printf("father is waiting...\n");
    sleep(10); // all five children exit during this sleep and become zombies
    for(i = 0; i < 5; i++)
    {
        pid_t ret = wait(NULL); // each call reaps one zombie child
        printf("father is wait done, ret: %d\n", ret);
    }
    sleep(3);
    printf("father quit...\n");
    return 0;
}

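💨 Output:

What if the children are zombies and the parent has exited too ❓

Whether ps ajx still shows the zombies at that point is not deterministic. When the parent exits, its children are adopted by the operating system (on Linux, by init, pid 1, or another designated reaper process), and the adopter becomes responsible for reaping them. How soon that happens depends on the adopter's policy and on the current state of the system. In practice it is usually almost immediate, so the zombies may already be gone by the time ps runs.

How does the parent learn the child's exit result ❓

2. The waitpid function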

#include<sys/types.h>
#include<sys/wait.h>

pid_t waitpid(pid_t pid, int* status, int options);

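Return value:
	On success, waitpid returns the process ID of the child it collected;
	If WNOHANG is set and no matching child has exited yet, it returns 0;
	On error, it returns -1 and sets errno to indicate what went wrong.
Parameters:
	pid
		pid = -1: wait for any child, same as wait;
		pid > 0: wait for the child whose process ID equals pid;
		fork returns the child's pid to the parent, so the parent can wait for one specific child. Waiting is essentially a form of management;
	status
		An output parameter: we pass in the address of an int, and the kernel writes the result through that pointer for us to read afterwards. By contrast, an argument passed by value to a parameter is an input parameter;
		WIFEXITED(status): true if the child exited normally, false otherwise;
		WEXITSTATUS(status): the child's exit code; only meaningful when WIFEXITED(status) is true;
		WIFSIGNALED(status): true if the child was terminated by a signal;
		WTERMSIG(status): the number of the signal that terminated the child; only meaningful when WIFSIGNALED(status) is true;
	options
		WNOHANG: if the child specified by pid has not finished, waitpid() returns 0 immediately without waiting, and we have to call it again later; if the child has finished, it returns that child's ID; on failure it returns a value less than 0.
		0: blocking wait, same as wait. The parent waits until the child has exited and been reaped;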

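What is status ❓

Both wait and waitpid take a status parameter. It is an output parameter filled in by the operating system.
If you pass NULL, you are saying you do not care about the child's exit status.
Otherwise, the operating system uses it to report the child's exit information to the parent.
status should not be treated as a plain integer; treat it as a bitmap. Only the low 16 bits matter here: on Linux, bits 8 to 15 hold the exit code, bit 7 is the core dump flag, and bits 0 to 6 hold the number of the terminating signal (0 if the child exited normally).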

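Blocking and non-blocking ❓

This is the first time we meet these concepts and we will not go deep here; they come up constantly later when we study files and networking. If the options argument of waitpid is WNOHANG, the wait is non-blocking; if it is 0, the wait is blocking.

Suppose your grades are poor, so you phone Zhang San, the top student who lives upstairs: "Zhang San, come down, I'll buy you dinner and then you can help me review." Zhang San says: "Sure, but I'm writing code, give me half an hour." Good students are always scarce, so you worry that the moment you hang up someone else will call him for help and you will not get your review in time. So you add: "Don't hang up. Just put the phone next to you; I like watching you write code." Then you do nothing at all except wait until Zhang San comes down. This is rare in real life, but it does happen, usually in urgent situations, for example when your dad calls, asks you to do something, and tells you to stay on the line. Keeping the call open until Zhang San comes down is like calling a function and not getting control back until it finishes; this way of waiting is called blocking. Every function we have called so far, whether written by you, provided by a library, or provided by the system, has been a blocking function. Its typical pattern is call ➡ execute ➡ return ➡ done, and the caller does nothing but wait the whole time.

Now a different scenario. You tell Zhang San: "There's an exam tomorrow. Let's have dinner in a bit and then go to the study room so you can help me review." He says: "No problem, but I'm writing code, you'll have to wait." You say: "Fine, I'll wait for you in the canteen," and hang up. Two minutes later you call: "Zhang San, are you coming?" He says: "I need a bit longer." You say "OK" and hang up. Two minutes later you call again, and so on. This is common in real life: when we are pushing someone who keeps not moving, we call over and over. The point of each call is not the conversation; it is to check Zhang San's state, that is, whether he has reached the state you are waiting for (finished the code, started down the stairs, and so on), and he will not necessarily be ready on any given check. Because each call only checks progress, it never ties you up. Each call-and-hang-up is a non-blocking wait: if the state is not ready, we return immediately. Waiting by repeating non-blocking calls like this is called non-blocking polling.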

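Why is non-blocking polling usually preferred in the real world?

Because it is more efficient, mainly for the caller. If Zhang San needs 10 minutes, it takes 10 minutes no matter how often you call, and a computer is the same: nagging does not make the work finish sooner. So instead of waiting idly, the caller does other things in the meantime, and waiting no longer stops it from getting work done.

Then why is most of the code we write blocking?

The root cause is that our programs have a single flow of execution, so blocking calls are simply easier to write.

Why the name WNOHANG?

When a server is about to run out of resources and gets stuck, we say the server "hangs", which can then lead to a crash. In the macro name, W stands for wait, NO for no, and HANG for getting stuck, so it means: wait without hanging.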

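How should we understand the "waiting" when a parent waits for a child?

Waiting does not mean the parent sits on the CPU and spins while it waits. Parent and child both start out in the run queue, waiting for CPU time. When the parent calls wait or waitpid and the child has not exited yet, the operating system changes the parent's state from R to a non-running (!R) state and moves it to a wait queue, where it stays without being scheduled. When the child finishes, the operating system changes the parent's state back to R and puts it back on the run queue so the CPU can run it again. This is called waking up the waiting process.

How does the operating system know which parent to wake up when a child exits?

wait and waitpid are system calls provided by the operating system. The parent ended up waiting because it called into the operating system's own code, so the operating system naturally knows whom to wake up when the child exits.

For now it is enough to understand that waiting means putting the current process on a wait queue and setting its state to !R. This also explains things we see in everyday computer use. When an application appears frozen, the cause is either that there are too many processes and it is not getting scheduled, or that it has been put on a wait queue and will not be scheduled for a long time. We have all written buggy code in VS that made VS unresponsive for a while when we ran it. "Unresponsive" here means the program caused a problem the system had to deal with: while handling it, the operating system put the VS process in a !R state, and woke it up again once it was done.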

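Is the exit result really kept in the PCB after the child becomes a zombie?

In the Linux 2.6.32 source, task_struct does contain fields for the exit code and the exit signal (exit_code and exit_signal).

✔ Test case 1:

Same as test case 2 for wait.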

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h> 
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            sleep(1);
        }                                                                                     
        printf("child quit...\n");
        exit(0);
    }
    // father
    sleep(8);
    pid_t ret = waitpid(id, NULL, 0); // with a single child this is equivalent to waitpid(-1, NULL, 0)
    printf("father wait done, ret: %d\n", ret);
    sleep(3);

    return 0;
}

💨 Output:

✔ Test case 2:

The parent forks a child to do some work, and learns from status how well the child did.

#include<stdio.h>
#include<stdlib.h> 
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            sleep(1);
        }                                                                                     
        printf("child quit...\n");
        exit(123);
    }
    //father
    int status = 0;
    pid_t ret = waitpid(-1, &status, 0);
    int code = (status >> 8) & 0xFF;
    printf("%d\n", status);
    printf("father wait done, ret: %d, exit code: %d\n", ret, code);
    if(code == 0)
    {
        printf("job done\n");
    }
    else
    {
        printf("job failed\n");
    }

    return 0;
}

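💨 Output:

The raw status printed is 31488:

(31488)₁₀ = (0111 1011 0000 0000)₂

0111 1011 0000 0000 >> 8 = 0111 1011

(0111 1011)₂ = (123)₁₀

The child has already exited, so where is its exit code kept ❓

In other words, where does waitpid get the child's exit code from, given that the child has already exited? The child has finished, but it is a zombie, which means its kernel data structures have not been fully released. When the child exits, its exit code is written into its task_struct, and waitpid fills in status from there.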

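✔ Test case 3:

In test case 2 the parent only wants to know the result of the child's work. Could the child record its result in a global variable instead, and let the parent read it from there?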

#include<stdio.h> 
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int code = 0;

int main() {
    pid_t id = fork();
    if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            sleep(1);
        }                                                                                     
        printf("child quit...\n");
        code = 123;
        exit(0);
    }
    //father
    int status = 0;
    pid_t ret = waitpid(-1, &status, 0);
    printf("father wait done, ret: %d, exit code: %d\n", ret, code);
    if(code == 0)
    {
        printf("job done\n");
    }
    else
    {
        printf("job failed\n");
    }

    return 0;
}

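💨 Output:

Clearly not. Writing to the global variable triggers copy-on-write. As we said when discussing process address spaces, parent and child are independent: the variable has the same name and the same virtual address in both, but neither process can see what the other writes, so the parent cannot get the child's result this way.

✔ Test case 4:

Simulating abnormal termination with a wild pointer.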

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            sleep(1);
            // deliberately write through an invalid (wild) pointer to trigger SIGSEGV
            int* p = (int*)0x12345;
            *p = 100;
        }                         
        printf("child quit...\n");
        exit(123);
    }              
    //father 
    int status = 0;                     
    pid_t ret = waitpid(-1, &status, 0);
    int code = (status >> 8) & 0xFF;                                              
    int sig = status & 0x7F;//0111 1111 
    printf("father wait done, ret: %d, exit code: %d, sig: %d\n", ret, code, sig);
             
    return 0;       
}

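💨 Output:

The child crashes, exits immediately, and becomes a zombie, without affecting the parent; this is what process independence means. The parent's wait succeeds whether the child exited normally or not, and the child is then reaped. In this case the exit code is meaningless. Because the child terminated abnormally, what the parent gets is the signal that killed it: signal 11 (SIGSEGV), which usually indicates a segmentation fault.

✔ Test case 5:

Simulating abnormal termination by killing the child ourselves with kill -9.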

#include<stdio.h> 
#include<stdlib.h> 
#include<unistd.h> 
#include<sys/types.h> 
#include<sys/wait.h> 
  
int main() {  
    pid_t id = fork();  
    if(id == 0)  
    {  
        int count = 50;                                                                       
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            sleep(1);
        }                                                                                     
        printf("child quit...\n");
        exit(123);
    }
    //father 
    int status = 0;  
    pid_t ret = waitpid(-1, &status, 0);  
    int code = (status >> 8) & 0xFF;      
    int sig = status & 0x7F;//0111 1111 
    printf("father wait done, ret: %d, exit code: %d, sig: %d\n", ret, code, sig);  
                                                                                    
    return 0;                                                                       
}

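💨 Output:

When we kill the running child by hand, the parent reaps it right away. The exit code no longer matters; the parent sees signal 9 (SIGKILL), which tells us the child did not even finish running its code and exited because someone killed it.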

✔ Test case 6:

The complete process of a parent waiting for a child.

#include<stdio.h> 
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            //err
            //int* p = 0x12345;
            //*p = 100;
            sleep(1);
        }
        printf("child quit...\n");
        exit(123);
    }
    //father
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if(ret > 0)
    {
        printf("wait success!\n");
        if((status & 0x7F) == 0)
        {
            printf("process quit normal!\n");
            printf("exit code: %d\n", (status >> 8) & 0xFF);
        }
        else
        {
            printf("process quit error!\n");
            printf("sig: %d\n", status & 0x7F);
        }
    }
    
    return 0;
}

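💨 Output:

Normal exit:

Abnormal exit:

✔ Test case 7:

As the previous test case shows, the raw status has to be processed by hand to get the exit code and the signal, which is tedious, and normally we do not do it ourselves. The system provides macros for this. We mainly use three of them: WIFEXITED(status), WEXITSTATUS(status) and WTERMSIG(status); see the waitpid manual page for details.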

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id == 0)
    {
        int count = 5;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            //err
            //int* p = 0x12345; 
            //*p = 100;
            sleep(1);
        }
        printf("child quit...\n");
        exit(123);
    }
    //father
    int status = 0;
    pid_t ret = waitpid(id, &status, 0);
    if(ret > 0)
    {
        printf("wait success!\n");
        if(WIFEXITED(status))
        {
            printf("normal quit!\n");
            printf("quit code: %d\n", WEXITSTATUS(status));
        }
        else
        {
            printf("process quit error!\n");
            printf("sig: %d\n", WTERMSIG(status));
        }
    }
    
    return 0;
}

💨 Output:

Normal exit:

Abnormal exit:

✔ Test case 8:

Non-blocking wait.

#include<stdio.h> 
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>

int main() {
    pid_t id = fork();
    if(id == 0)
    {
        int count = 3;
        while(count)
        {
            printf("child is running: %d, ppid: %d, pid: %d\n", count--, getppid(), getpid());
            //err
            //int* p = 0x12345;
            //*p = 100;
            sleep(1);
        }
        printf("child quit...\n");
        exit(123);
    }

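    // father: non-blocking wait (polling)
    int status = 0;
    while(1)
    {
        pid_t ret = waitpid(id, &status, WNOHANG);
        if(ret == 0)
        {
            // child has not exited yet; do something else, then poll again
            printf("wait next!\n");
            printf("father do other thing!\n");
            sleep(1); // avoid spinning at full speed between polls
        }
        else if(ret > 0)
        {
            printf("wait success, ret: %d, exit code: %d\n", ret, WEXITSTATUS(status));
            break;
        }
        else
        {
            printf("wait failed!\n");
            break;
        }
    }

    // father: blocking wait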
    //int status = 0;
    //pid_t ret = waitpid(id, &status, 0);
    //if(ret > 0)
    //{
    // printf("wait success!\n");
    // if(WIFEXITED(status))
    // {
    // printf("normal quit!\n");
    // printf("quit code: %d\n", WEXITSTATUS(status));
    // }
    // else
    // {
    // printf("process quit error!\n");
    // printf("sig: %d\n", WTERMSIG(status));
    // }
    //}

    return 0;
}

💨 Output:
