Re: SIGFPE in s6-rc-init 3.0.0 etc,...

From: Laurent Bercot <ska-skaware_at_skarnet.org>
Date: Fri, 15 Dec 2017 12:08:56 +0000

>I noticed in the kernel log that the module was in fact correctly
>loaded, but s6 was stuck waiting for something (I noticed this process
>when doing a "ps"):
>
>1213 root s6-ipcserverd -1 -- s6-ipcserver-access -v0 -E -l0 -i
>data/rules -- s6-sudod -t 2000 -- /usr/libexec/s6-rc-oneshot-run -l
>../.. --

  The existence of this process is normal: it's the daemon that spawns
the oneshot processes (so they're always run with a reproducible
environment). For every oneshot, s6-rc spawns an s6-sudo process that
connects to the daemon and tells it to run the appropriate up (or down)
script, which can be found in the compiled database.


>Around 1/10 boots, it got stuck here, the rest of them booted up just
>fine.

  What probably happened is that you hit either the 2-second timeout or
a race condition that was present in s6-rc-0.2.0.1. Both of these are
fixed in later versions, such as 0.3.0.1 (the latest).


>When buildroot 2017.11 was released, I was thrilled to see new versions
>of s6 in it (s6-rc 0.3.0.0, skalibs 2.6.0.1 and s6 2.6.1.1), but when
>booting up I immediately got a SIGFPE in s6-rc-init,... I ran it
>through strace and the result can be seen here:
>http://steffe.net/s6-sigfpe.txt
>
>It crashes at the exact same place if I re-run it, and it crashes every
>time.

  Thanks for the report - it's very interesting. As Rasmus said, a
similar error during s6-rc-init happened before, but was never
reproducible. I've gone over the suspect code several times, but
was unable to find a bug. :/
  The SIGFPE sounds very weird because s6-rc doesn't use floating point
anywhere, but if it's hitting undefined behaviour then everything is
possible.
  Can you somehow run your s6-rc-init binary under valgrind or gdb? I
understand it's on an embedded system, which can make it hard, but on my
development machines I just cannot reproduce the problem :(
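
  One thing worth keeping in mind while debugging: despite its name,
SIGFPE is not limited to floating point. On x86 Linux, an integer
division (or modulo) by zero also raises it, so integer arithmetic is a
possible source. This is only a guess, not a diagnosis of this crash.
Below is a minimal C sketch, not taken from s6-rc, that forks a child,
divides an int by zero in it, and reports which signal killed the child.
The function name is made up for the illustration. Division by zero is
undefined behaviour in C, and some architectures do not trap on it at
all, so the result may differ on other machines.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustration only: this function is hypothetical and not part of s6-rc.
   Forks a child that divides an int by zero, then returns the number of
   the signal that killed the child, 0 if the child exited normally, or
   -1 if fork() or waitpid() failed. */
int signal_from_int_div_by_zero(void)
{
  pid_t pid = fork();
  if (pid < 0) return -1;

  if (pid == 0)
  {
    /* volatile keeps the compiler from folding or dropping the division */
    volatile int zero = 0;
    volatile int result = 1 / zero;  /* undefined behaviour; traps on x86 */
    (void)result;
    _exit(0);                        /* reached only if nothing trapped */
  }

  int status;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

  On an x86-64 Linux machine I'd expect this to return SIGFPE. Running
the crashing binary under gdb should show the faulting instruction and
tell whether something similar is going on.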

  Short of that, if the latest versions are not working for you (see
below on how to successfully build them), s6-rc-0.4.0.0 is coming out
shortly, probably early next week, and it significantly refactors the
s6-rc-init code, which would:
  - make it possible that it magically works for you, if the buggy
code was replaced
  - make it easier for me to debug, if the problem still occurs.
I'm sorry I don't have a better solution for now.


>src/daemontools-extras/s6-applyuidgid.c:57:29: error: implicit
>declaration of function 'setgroups_and_gid'
>[-Werror=implicit-function-declaration]
> if (gidn != (size_t)-1 && setgroups_and_gid(gid ? gid : getegid(),
>gidn, gids) < 0)
> ^~~~~~~~~~~~~~~~~
>cc1: some warnings being treated as errors

  The sysdep that guards this function was added in skalibs-2.6.0.0,
so if you're using a Buildroot version made for skalibs-2.5.1.1, it's
lacking the sysdep and it makes sense that the build fails.
  Try building the latest versions of skalibs/execline/s6/s6-rc with
the latest version of Buildroot, which should define all the relevant
sysdeps. I can't tell whether all the issues you encountered will be
fixed, but chances are the build will work, at least.

--
  Laurent
Received on Fri Dec 15 2017 - 12:08:56 UTC
