Re: process supervisor - considerations for docker

From: Gorka Lertxundi <glertxundi_at_gmail.com>
Date: Fri, 27 Feb 2015 11:19:53 +0100

Dreamcat4, pull requests are always welcome!

2015-02-27 0:40 GMT+01:00 Laurent Bercot <ska-supervision_at_skarnet.org>:

> On 26/02/2015 21:53, John Regan wrote:
>
>> Besides, the whole idea here is to make an image that follows best
>> practices, and best practices state we should be using a process
>> supervisor that cleans up orphaned processes and stuff. You should be
>> encouraging people to run their programs, interactively or not, under
>> a supervision tree like s6.
>>
>
> The distinction between "process" and "service" is key here, and I
> agree with John.
>
> <long design rant>
> There's a lot of software out there that seems built on the assumption
> that a program should do everything within a single executable, and
> that a program failing to address certain issues is incomplete and
> needs to be patched.
>
> Under Unix, this assumption is incorrect. Unix is mostly defined by its
> simple and efficient interprocess communication, so a Unix program is best
> designed as a *set* of processes, with the right communication channels
> between them, and the right control flow between those processes. Using
> Unix primitives the right way allows you to accomplish a task with minimal
> effort by delegating a lot to the operating system.
>
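The composition point is easy to see with ordinary tools. None of this is in the original mail; it is just a generic sh illustration of a "program" built as a set of cooperating processes, each doing one small job while the OS handles the plumbing:

```shell
# Find the most frequent log level in a stream, using four tiny
# processes glued together by pipes instead of one monolithic program.
printf 'warn\nerror\nwarn\ninfo\nwarn\n' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 1
# prints the most frequent line with its count: 3 warn
```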
> This is how I design and write software: to take advantage of the design
> of Unix as much as I can, to perform tasks with the lowest possible amount
> of code.
> This requires isolating basic building blocks, and providing those
> building blocks as binaries, with the right interface so users can
> glue them together on the command line.
>
> Take the "syslogd" service. The "rsyslogd" way is to have one executable,
> rsyslogd, that provides the syslogd functionality. The s6 way is to combine
> several tools to implement syslogd; the functionality already exists, even
> if it's not immediately apparent. This command line should do:
>
> pipeline \
>     s6-ipcserver-socketbinder /dev/log \
>     s6-envuidgid nobody \
>     s6-applyuidgid -Uz \
>     s6-ipcserverd ucspilogd "" \
>   s6-envuidgid syslog \
>   s6-applyuidgid -Uz \
>   s6-log /var/log/syslogd
>
>
I love puzzles.


> Yes, that's one unique command line. The syslogd implementation will take
> the form of two long-running processes, one listening on /dev/log (the
> syslogd socket) as user nobody, and spawning a short-lived ucspilogd
> process for every connection to syslog; and the other writing the
> logs to the /var/log/syslogd directory as user syslog and performing
> automatic rotation.
> (You can configure how and where things are logged by writing a real s6-log
> script at the end of the command line.)
>
> Of course, in the real world, you wouldn't write that. First, because s6
> provides some shortcuts for common operations so the real command lines
> would be a tad shorter, and second, because you'd want the long-running
> processes to be supervised, so you'd use the supervision infrastructure
> and write two short run scripts instead.
>
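The "two short run scripts" could look something like this. This is only a sketch, not from the mail: the directory name (/tmp/sv-sketch/syslogd) is invented, and the pipe that `pipeline` provided is elided here — under s6, a service directory's ./log subdirectory gets its stdin connected to the service's stdout by the supervision tree.

```shell
# Sketch: the quoted one-liner split into a supervised service
# (run) plus its logger (log/run). Paths and names are illustrative.
d=/tmp/sv-sketch/syslogd
mkdir -p "$d/log"

cat > "$d/run" <<'EOF'
#!/bin/sh
# Long-running process 1: bind /dev/log, drop privileges to nobody,
# accept connections, spawn a short-lived ucspilogd per client.
exec s6-ipcserver-socketbinder /dev/log \
  s6-envuidgid nobody \
  s6-applyuidgid -Uz \
  s6-ipcserverd ucspilogd ""
EOF

cat > "$d/log/run" <<'EOF'
#!/bin/sh
# Long-running process 2: write and rotate logs as user syslog.
exec s6-envuidgid syslog \
  s6-applyuidgid -Uz \
  s6-log /var/log/syslogd
EOF

chmod +x "$d/run" "$d/log/run"
ls "$d"
```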
> (And so, to provide syslogd functionality to one client, you'd really have
> 1 s6-svscan process, 2 s6-supervise processes, 1 s6-ipcserverd process,
> 1 ucspilogd process and 1 s6-log process. Yes, 6 processes. This is not as
> insane as it sounds. Processes are not a scarce resource on Unix; the
> scarce resources are RAM and CPU. The s6 processes have been designed to
> take *very* little of those, so the total amount of RAM and CPU they all
> use is still smaller than the amount used by a single rsyslogd process.)
>
> There are good reasons to program this way. Mostly, it amounts to writing
> as little code as possible. If you look at the source code for every
> single command that appears on the insane command line above, you'll
> find that it's pretty short, and short means maintainable - which is
> the most important quality to have in a codebase, especially when
> there's just one guy maintaining it.
> Using high-level languages also reduces the source code's size, but it
> adds the interpreter's or run-time system's overhead, and a forest of
> dependencies. What is then run on the machine is not lightweight by any
> measure. (Plus, most of those languages are total crap.)
>
> Anyway, my point is that it often takes several processes to provide a
> service, and that it's a good thing. This practice should be encouraged.
> So, yes, running a service under a process supervisor is the right design,
> and I'm happy that John, Gorka, Les and other people have figured it out.
>
> s6 itself provides the "process supervision" service not as a single
> executable, but as a set of tools. s6-svscan doesn't do it all, and it's
> by design. It's just another basic building block. Sure, it's a bit special
> because it can run as process 1 and is the root of the supervision tree,
> but that doesn't mean it's a turnkey program - the key lies in how it's
> used together with other s6 and Unix tools.
> That's why starting s6-svscan directly as the entrypoint isn't such a
> good idea. It's much more flexible to run a script as the entrypoint
> that performs a few basic initialization steps then execs into s6-svscan.
> Just like you'd do for a real init. :)
> </long design rant>
>
>
>
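The "script that performs a few initialization steps then execs into s6-svscan" idea can be sketched in plain sh. Everything here is an illustrative assumption (file names, paths); the echo at the end stands in for s6-svscan just to show the exec handoff:

```shell
# Sketch of an entrypoint: do basic initialization, then exec into the
# supervisor so it replaces the script as the container's top process.
cat > /tmp/demo-entrypoint <<'EOF'
#!/bin/sh
# 1. basic initialization (e.g. create the scan directory)
mkdir -p /tmp/demo-scandir
# 2. replace this shell with the supervisor; in a real image this
#    would be: exec s6-svscan /path/to/scandir
exec "$@"
EOF
chmod +x /tmp/demo-entrypoint

# Stand-in for s6-svscan, just to demonstrate the handoff:
/tmp/demo-entrypoint echo "supervisor runs here"
# prints: supervisor runs here
```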
>> Heck, most people don't *care* about this kind of thing because they
>> don't even know. So if you just make /init the ENTRYPOINT, 99% of
>> people will probably never even realize what's happening. If they can
>> run `docker run -ti imagename /bin/sh` and get a working, interactive
>> shell, and the container exits when they type "exit", then they're
>> good to go! Most won't even question what the image is up to, they'll
>> just continue on getting the benefits of s6 without even realizing it.
>>
>
> Ideally, that's what would happen. We must ensure that the abstraction
> holds steadily, though - there's nothing worse than a leaky abstraction.
>
>
>>> The main thing I'm concerned about is preserving proper shell
>>> quoting, because sometimes args can be like --flag='some thing'.
>>>
>
> This is a solved problem.
> The entrypoint we're talking about is trivial to write in execline,
> and I'll support Gorka, or anyone else, who does that. Since the
> container will already have execline, using it for the entrypoint
> costs nothing, and it makes command line handling and transmission
> utterly trivial: it's exactly what I wrote it for.
>
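The quoting concern is indeed solvable without ceremony. execline handles argument transmission natively; this plain-sh stand-in (the wrapper name is invented) shows the "$@" idiom that keeps argument boundaries intact, including args like --flag='some thing':

```shell
# Sketch: a wrapper that forwards its arguments unmangled via "$@".
cat > /tmp/demo-args <<'EOF'
#!/bin/sh
printf 'argc=%d\n' "$#"
for a in "$@"; do printf '[%s]\n' "$a"; done
EOF
chmod +x /tmp/demo-args

/tmp/demo-args --flag='some thing' plain
# prints:
#   argc=2
#   [--flag=some thing]
#   [plain]
```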
>
I'd really appreciate your help! Using execline as the default scripting
language will facilitate conversion to other base images like busybox.


> --
> Laurent
>
>