Help needed, urgent: this is already affecting production [crash] worker[WenJuanTest:1651545] exit with status 139

东村二狗

Error output:
worker[WenJuanTest:1651545] exit with status 139
worker[WenJuanTest:1651545] exit with status 139
worker[WenJuanTest:1651545] exit with status 139

Error

Some information

PHP extensions

2 answers

walkor

https://www.workerman.net/search?keyword=exit+with+status+139
This is a PHP bug, a bug in one of the PHP extensions, or some unusual usage that triggers a PHP bug.
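
For readers unsure what the number 139 means, here is a minimal sketch that decodes it with the standard wait-status helpers. It assumes the value in the log is a raw wait status as returned by a wait call; that is an assumption about how the log line is produced, not something stated in this thread. If it is instead a shell-style exit code, 139 = 128 + 11 points at the same signal. The function name `describe_wait_status` is made up for illustration.

```python
import os
import signal

def describe_wait_status(status):
    """Interpret a raw wait status the way the wait(2) macros do (Unix only)."""
    if os.WIFSIGNALED(status):
        # The low 7 bits hold the terminating signal; bit 0x80 marks a core dump.
        name = signal.Signals(os.WTERMSIG(status)).name
        core = " (core dumped)" if os.WCOREDUMP(status) else ""
        return f"killed by {name}{core}"
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"unrecognized status {status}"

print(describe_wait_status(139))
```

On Linux this reports a SIGSEGV with a core dump, which matches the `WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)` line in the master-process trace further down the thread.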

  • 东村二狗 2022-05-23

    I read all of those related questions before posting. I'm currently on PHP 8.0.8. Do I need to switch to a different version?

东村二狗

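Once it happens, the only fix is a restart.

The places that use Laravel's ORM are the most likely to throw this error.

When it happens, the business pages return 502 from nginx.

Could someone help analyze this?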

  • walkor 2022-05-23

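    Run strace -ttp <process pid> and leave it attached until the process core-dumps and exits. That should roughly show where execution was when the bug was triggered.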

  • 东村二狗 2022-05-23

    [{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], WSTOPPED, NULL) = 1705592
    15:32:24.671550 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=1705592, si_uid=0, si_status=SIGSEGV, si_utime=3, si_stime=0} ---
    15:32:24.671702 getpid() = 1697111
    15:32:24.671791 openat(AT_FDCWD, "/www/fastwenjuan_test/runtime/logs/workerman.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 7
    15:32:24.671895 fstat(7, {st_mode=S_IFREG|0600, st_size=29562389, ...}) = 0
    15:32:24.671974 lseek(7, 0, SEEK_CUR) = 0
    15:32:24.672044 lseek(7, 0, SEEK_CUR) = 0
    15:32:24.672138 flock(7, LOCK_EX) = 0
    15:32:24.672214 write(7, "2022-05-23 15:32:24 pid:1697111 "..., 81) = 81
    15:32:24.672304 close(7) = 0
    15:32:24.672393 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd610346410) = 1709712
    15:32:24.673907 wait4(-1,

  • 东村二狗 2022-05-23

    Boss, the output above is what I got.

  • walkor 2022-05-23

    That is the master process. You need to strace a child process: strace a WenJuanTest process.

  • 东村二狗 2022-05-23

    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address    Foreign Address  State   PID/Program name
    tcp   0      0      127.0.0.1:6379   0.0.0.0:*        LISTEN  2694/redis-server 1
    tcp   0      0      0.0.0.0:80       0.0.0.0:*        LISTEN  1980/nginx: master
    tcp   0      0      0.0.0.0:8786     0.0.0.0:*        LISTEN  2720/WorkerMan: mas
    tcp   0      0      0.0.0.0:8787     0.0.0.0:*        LISTEN  2700/WorkerMan: mas
    tcp   0      0      0.0.0.0:22       0.0.0.0:*        LISTEN  1845/sshd
    tcp   0      0      0.0.0.0:443      0.0.0.0:*        LISTEN  1980/nginx: master
    tcp6  0      0      :::3306          :::*             LISTEN  2538/mysqld
    udp   0      0      127.0.0.1:323    0.0.0.0:*                857/chronyd
    udp6  0      0      ::1:323          :::*                     857/chronyd

  • walkor 2022-05-23

    Run php start.php status to see the child process PIDs.

  • 东村二狗 2022-05-23

    At first I ran strace on 2720. Should I be stracing 2720's child processes instead? There seem to be a lot of them.

  • 东村二狗 2022-05-23

    pstree -p

    ─php(2720)─┬─php(2721)
    │ ├─php(2722)
    │ ├─php(2723)
    │ ├─php(2724)
    │ ├─php(2725)
    │ ├─php(2726)
    │ ├─php(2727)
    │ ├─php(2729)
    │ ├─php(2730)
    │ ├─php(2732)
    │ ├─php(2734)
    │ ├─php(2735)
    │ ├─php(2736)
    │ ├─php(2737)
    │ ├─php(2738)
    │ ├─php(2739)
    │ └─php(2740)

  • 东村二狗 2022-05-23

    Should I run just one worker process and then strace that child?

  • walkor 2022-05-23

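    Run php start.php status to list all child PIDs and pick the PID of a WenJuanTest process. Alternatively, open several terminals, run strace against the PID of every WenJuanTest process, and wait for one to exit.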
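
As a rough illustration of attaching strace to every child, the sketch below pulls the child PIDs out of `pstree -p` text like the output pasted above and prints one strace command per PID. Note the assumptions: `pstree` does not show worker names, so this covers every child of the master rather than only the WenJuanTest workers; the helper names and the `strace.<pid>.log` output file names are hypothetical; `-tt`, `-p` and `-o` are standard strace options.

```python
import re

def worker_pids(pstree_output):
    """Return the child PIDs from `pstree -p` text; the first PID is the master."""
    pids = [int(p) for p in re.findall(r"\((\d+)\)", pstree_output)]
    return pids[1:]

def strace_commands(pids):
    """Build one strace command per PID, each writing to its own log file."""
    return [f"strace -tt -p {pid} -o strace.{pid}.log" for pid in pids]

# A shortened copy of the tree posted in this thread.
sample = """─php(2720)─┬─php(2721)
│ ├─php(2722)
│ └─php(2740)"""

for command in strace_commands(worker_pids(sample)):
    print(command)
```

Each printed command can then be run in its own terminal and left attached until the worker exits.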

  • walkor 2022-05-23

    Running just one process would be even better.

  • 东村二狗 2022-05-23

    0x1a10850, 32, 23269) = -1 EINTR (Interrupted system call)
    16:17:02.242368 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=24412, si_uid=0} ---
    16:17:02.242426 write(9, "\n", 1) = 1
    16:17:02.242534 rt_sigreturn({mask=[]}) = -1 EINTR (Interrupted system call)
    16:17:02.242626 epoll_wait(7, [{EPOLLIN, {u32=8, u64=8}}], 32, 19165) = 1
    16:17:02.242709 read(8, "\n", 1024) = 1
    16:17:02.242779 read(8, 0x7f9818604340, 1024) = -1 EAGAIN (Resource temporarily unavailable)
    16:17:02.242892 getpid() = 24413
    16:17:02.242985 getpid() = 24413
    16:17:02.243061 epoll_ctl(7, EPOLL_CTL_DEL, 6, 0x7ffdc55dd06c) = 0
    16:17:02.243142 close(6) = 0
    16:17:02.243336 close(15) = 0
    16:17:02.243617 openat(AT_FDCWD, "/www/log/php.log", O_WRONLY|O_CREAT|O_APPEND, 0644) = 6
    16:17:02.243723 write(6, "[23-May-2022 16:17:02 Asia/Shang"..., 218) = 218
    16:17:02.243814 close(6) = 0
    16:17:02.243895 write(1, "\nFatal error: Uncaught TypeError"..., 177) = 177
    16:17:02.244021 close(11) = 0
    16:17:02.244136 close(2) = 0
    16:17:02.244205 close(1) = 0
    16:17:02.244285 close(0) = 0
    16:17:02.244363 rt_sigaction(SIGINT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.244508 rt_sigaction(SIGQUIT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.244613 rt_sigaction(SIGHUP, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.244726 rt_sigaction(SIGTSTP, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.244916 rt_sigaction(SIGUSR1, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.245038 rt_sigaction(SIGUSR2, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.245123 rt_sigaction(SIGABRT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.245244 rt_sigaction(SIGIO, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f981c6c5400}, NULL, 8) = 0
    16:17:02.245401 sendto(12, "\1\0\0\0\1", 5, MSG_DONTWAIT, NULL, 0) = 5
    16:17:02.245539 close(12) = 0
    16:17:02.245707 sendto(10, "\1\0\0\0\1", 5, MSG_DONTWAIT, NULL, 0) = 5
    16:17:02.245866 close(10) = 0
    16:17:02.245969 epoll_ctl(7, EPOLL_CTL_DEL, 8, 0x7ffdc55dea4c) = 0
    16:17:02.246041 close(8) = 0
    16:17:02.246112 close(9) = 0
    16:17:02.246177 close(7) = 0
    16:17:02.246247 sendto(13, "\1\0\0\0\1", 5, MSG_DONTWAIT, NULL, 0) = 5
    16:17:02.246397 close(13) = 0
    16:17:02.246614 fcntl(3, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
    16:17:02.246780 munmap(0x7f9816c00000, 2097152) = 0
    16:17:02.247095 munmap(0x7f9818121000, 2794632) = 0
    16:17:02.247390 munmap(0x7f9818a30000, 2274632) = 0
    16:17:02.247521 munmap(0x7f9818829000, 2121744) = 0
    16:17:02.247619 munmap(0x7f9818605000, 2240904) = 0
    16:17:02.247696 munmap(0x7f98183cc000, 2328408) = 0
    16:17:02.248599 munmap(0x7f982309d000, 196608) = 0
    16:17:02.249752 close(5) = 0
    16:17:02.249870 exit_group(1) = ?
    16:17:02.252169 +++ exited with 1 +++

  • 东村二狗 2022-05-23

    I ran only one process. It exited on its own, and the child PID changed automatically.

  • walkor 2022-05-23

    16:17:02.243617 openat(AT_FDCWD, "/www/log/php.log", O_WRONLY|O_CREAT|O_APPEND, 0644) = 6
    16:17:02.243723 write(6, "[23-May-2022 16:17:02 Asia/Shang"..., 218) = 218
    16:17:02.243814 close(6) = 0
    16:17:02.243895 write(1, "\nFatal error: Uncaught TypeError"..., 177) = 177

    PHP wrote to /www/log/php.log, and it looks like a Fatal error. Check that log.
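
Spotting the log file by eye gets harder in a long trace. The following is a small sketch, not part of the original advice, that lists the files a traced process opened for appending, which is how log writers typically open their files. The regular expression assumes the `openat(AT_FDCWD, "path", flags, mode)` layout seen in the traces in this thread, and the function name is hypothetical.

```python
import re

# Matches openat() calls that open a file with O_APPEND among the flags.
APPEND_OPEN = re.compile(r'openat\(AT_FDCWD, "([^"]+)", [^)]*O_APPEND')

def appended_files(trace):
    """Return the distinct paths opened with O_APPEND, in first-seen order."""
    paths = []
    for path in APPEND_OPEN.findall(trace):
        if path not in paths:
            paths.append(path)
    return paths

# Two lines quoted from the trace in this thread.
sample = r'''
16:17:02.243617 openat(AT_FDCWD, "/www/log/php.log", O_WRONLY|O_CREAT|O_APPEND, 0644) = 6
16:17:02.243895 write(1, "\nFatal error: Uncaught TypeError"..., 177) = 177
'''

print(appended_files(sample))
```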

  • walkor 2022-05-23

    Try disabling the swoole extension first.

  • 东村二狗 2022-05-23

    I disabled swoole and so far the problem hasn't come back. I'll watch it for a few days.

  • liziyu 2022-05-23

    Hahaha...

  • 真的是你呀 2022-05-23

    Not going to send the author a red envelope as thanks?

  • ersic 2022-05-24

    This really is step-by-step, hands-on teaching. Send him a donation as thanks.

  • 码龍 2022-05-24

    I second sending a red envelope as thanks, whether or not he accepts it. Teaching like this is hard to come by.

  • ab0029 2022-05-24

    6666

  • iot.workerman.net物联网平台 2022-05-25

    No red envelope after all this? I can't stand to watch.

  • walkor 2022-05-25

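    Donations are voluntary. You're all making this sound like a forced transaction 😂. Why should one coder give another a hard time? If anyone should donate, ask the boss.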

  • 东村二狗 2022-05-26

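    Actually it still isn't solved, and I'm at a loss. My current workaround is to restart automatically whenever the error appears and notify myself by email. I don't know what else to do.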
    https://test2022-5-23.oss-cn-hangzhou.aliyuncs.com/111.png
    https://test2022-5-23.oss-cn-hangzhou.aliyuncs.com/222.jpg

  • 东村二狗 2022-05-26

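    I also found a new problem: if the page hasn't been visited for a long time, the first request takes nearly 20 seconds. Once that first request succeeds, the second and third are fast. At first I thought it was the network or the browser, but switching both made no difference, and a local curl on the server behaves the same way.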

  • walkor 2022-05-26

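    If it isn't solved, keep stracing the WenJuanTest process, and keep the swoole extension disabled for now to rule out interference from (null)(). Your last strace result looks like a reload was executed, so it ended before exit 139 occurred. Strace the WenJuanTest process until that process exits with the error.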
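
To tell whether a saved trace actually caught the crash, one option is to look at how it ends. This sketch classifies the final `+++ ... +++` marker that strace prints when the traced process terminates. The `exited with 1` line is taken from the trace in this thread; the `killed by SIGSEGV (core dumped)` line is an illustrative example of strace's usual wording for a signal death, not output from this thread, and the function name is hypothetical.

```python
import re

END_MARKER = re.compile(
    r"\+\+\+ (exited with (\d+)|killed by (\w+)(?: \(core dumped\))?) \+\+\+"
)

def trace_ending(trace):
    """Describe how the traced process terminated, based on the last marker."""
    last = None
    for match in END_MARKER.finditer(trace):
        last = match
    if last is None:
        return "no termination marker (still running or truncated)"
    if last.group(2) is not None:
        return f"normal exit, code {last.group(2)}"
    return f"killed by {last.group(3)}"

print(trace_ending("16:17:02.252169 +++ exited with 1 +++"))
print(trace_ending("+++ killed by SIGSEGV (core dumped) +++"))
```

A trace that ends in a normal exit, like the one posted earlier, has not captured the segfault, so the process needs to be traced again until a signal death shows up.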
