Re: emergency IPC with SysV message queues

From: Jeff <sysinit_at_yandex.com>
Date: Sun, 19 May 2019 19:54:50 +0200

On Thu, May 16, 2019 at 09:25:09PM +0000, Laurent Bercot wrote:
> Oh? And the other complaints haven't given you a clue?
> We are a friendly community, and that includes choosing to follow
> widely adopted threading conventions in order to make your readers
> comfortable, instead of breaking them because you happen not to like
> them. Please behave accordingly and don't be a jerk.

breaking long threads that went in a direction that has nothing to do
with the original thread topic is in no way unfriendly or offensive,
nor does that make me a "jerk".

> Okay, so your IPC mechanism isn't just message queues, it's a mix
> of two different channels: message queues *plus* signals.

well, no. the mechanism is SysV msg queues and the protocol for
clients to use to communicate includes - among other things - notifying
the daemon (its PID is well known) by sending a signal to wake it up and
have it process the request input queue.
you do not use just fifos (the mechanism), there is also a protocol
involved that clients and server use.

> Signals for notification, message queues for data transmission. Yes,
> it can work, but it's more complex than it has to be, using two Unix
> facilities instead of one.

indeed, this is more complex than - say - just sockets. on the other
hand it does not involve any locking to protect against concurrently
accessing the resource as it would have done with a fifo.

and again: it is just an emergency backup solution, the preferred way
is (Linux: abstract) unix sockets of course. such complicated ipc is
not even necessary in my case, but for more complex and integrated
inits it is. that is why i suggested it: to make their ipc
independent of rw fs access.

and of course one can tell a reliable init by the way it does ipc.

> You basically need a small library for the client side. Meh.

right, the client has to know the protocol.
first try via the socket, then try to reach init via the msg queue.
for little things like shutdown requests signaling suffices.
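
the client side order of attempts could look roughly like this. it is
only a sketch under assumptions: the socket is a Linux abstract unix
socket, the queue is found via a well known key, and the names
init_request, try_socket, try_queue and send_request are made up for
illustration. the wakeup signal that would follow a queued request is
left out to keep it short.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/socket.h>
#include <sys/un.h>

/* hypothetical request layout; msgsnd() requires mtype > 0 */
struct init_request {
    long mtype;
    char text[64];
};

/* first choice: connect to a Linux abstract unix socket and send the
 * request text. returns 0 on success, -1 on failure. */
static int try_socket(const char *name, const char *text)
{
    struct sockaddr_un addr;
    size_t nlen = strlen(name);
    size_t tlen = strlen(text);
    socklen_t alen;
    int fd, ok;

    if (nlen == 0 || nlen > sizeof addr.sun_path - 1)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, nlen);  /* sun_path[0] stays '\0' */
    alen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + nlen);
    ok = connect(fd, (struct sockaddr *)&addr, alen) == 0
         && send(fd, text, tlen, MSG_NOSIGNAL) == (ssize_t)tlen;
    close(fd);
    return ok ? 0 : -1;
}

/* fallback: the queue must already exist (created by the daemon) */
static int try_queue(key_t key, const char *text)
{
    struct init_request req;
    int qid = msgget(key, 0);

    if (qid == -1)
        return -1;
    memset(&req, 0, sizeof req);
    req.mtype = 1;
    strncpy(req.text, text, sizeof req.text - 1);
    return msgsnd(qid, &req, sizeof req.text, IPC_NOWAIT);
}

/* returns 0 if sent via the socket, 1 if sent via the queue,
 * -1 if both attempts failed */
int send_request(const char *sock_name, key_t queue_key, const char *text)
{
    if (try_socket(sock_name, text) == 0)
        return 0;
    if (try_queue(queue_key, text) == 0)
        return 1;
    return -1;
}
```

the return value tells the caller which channel was used, so it knows
whether it still has to signal the daemon.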

> A fifo or a socket works as both a notification mechanism and a
> data transmission mechanism,

true, but the protocol used by requests has to be designed as well.
and in the case of fifos: they have to be guarded against concurrent
writing by clients via locking (which requires rw fs access).

> and it's as simple as it gets.

the code used for the msg queueing is not complicated either.

> Yes, they can... but on Linux, they are implemented via a virtual
> filesystem, mqueue. And your goal, in using message queues, was to
> avoid having to mount a read-write filesystem to perform IPC with
> process 1 - so that eliminates them from contention, since mounting
> a mqueue is just as heavy a requirement as mounting a tmpfs.

indeed, they usually live in /dev/mqueue while posix shared memory
lives in /dev/shm.

that was the reason i did not mention them in the first place
(i dunno if OpenBSD has them as they usually lag behind the other
unices when it comes to posix conformance).

i just mentioned them to point out that you can be notified about
events involving the posix successors of SysV ipc.
i never used them in any way since they require a tmpfs for this.

> Also, it is not clear from the documentation, and I haven't
> performed tests, but it's even possible that the Linux implementation
> of SysV message queues requires a mqueue mount just like POSIX ones,
> in which case this whole discussion would be moot anyway.

which in fact is not the case, try it with "ipcmk -Q", same for the
other SysV ipc mechanisms like shared memory and semaphores.
you can see that easily when running firefox. it uses shared memory
without semaphores akin to "epoch" (btw: if anyone uses "epoch" init
it would be interesting to see what ipcs(1) outputs).
this is in fact a very fast ipc mechanism (the fastest ??), though
a clever protocol must be used to avoid deadlocks, concurrent accesses
and such. the msg queues have the advantage that messages are already
separated and sorted in order of arrival.
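
a small sketch to illustrate both points: the queue is created with
IPC_PRIVATE, so no pathname or mount is involved, and reading with a
message type of 0 returns the messages one by one in the order they
were sent. the names one_char_msg and queue_order_demo are made up for
this example.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* hypothetical one byte message; msgsnd() requires mtype > 0 */
struct one_char_msg {
    long mtype;
    char text[1];
};

/* sends every character of in[] as a separate message, then reads the
 * messages back into out[] (NUL terminated). returns the number of
 * messages read back, or -1 if the queue could not be created. */
int queue_order_demo(const char *in, char *out, size_t outlen)
{
    struct one_char_msg m;
    size_t i, n = 0;
    int qid;

    if (outlen == 0)
        return -1;
    qid = msgget(IPC_PRIVATE, 0600);    /* no pathname involved */
    if (qid == -1)
        return -1;
    for (i = 0; in[i] != '\0'; i++) {
        m.mtype = 1;
        m.text[0] = in[i];
        if (msgsnd(qid, &m, sizeof m.text, IPC_NOWAIT) == -1)
            break;
    }
    /* msgtyp 0: take the first message in the queue, whatever its type */
    while (n + 1 < outlen &&
           msgrcv(qid, &m, sizeof m.text, 0, IPC_NOWAIT) != -1)
        out[n++] = m.text[0];
    out[n] = '\0';
    msgctl(qid, IPC_RMID, NULL);        /* remove the queue again */
    return (int)n;
}
```

calling queue_order_demo("abc", buf, sizeof buf) should leave "abc" in
buf, each character having travelled as its own message.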

> You've lost me there. Why do you want several methods of IPCs in
> your init system? Why don't you pick one and stick to it?

because SysV msg queues are a quite portable ipc mechanism that does
not need any rw fs access. so they make for a reliable emergency
backup ipc method.

> Sockets are available on every Unix system.

these days (IoT comes to mind). but i guess SysV init (Linux) does
not use them since there might have been kernels in use without
socket support (?? dunno, just a guess).
on the BSDs this should be true since it was said that they implement
piping via socketpairs.

> So are FIFOs.

i do not like to use them at all, especially since they need rw
(is that true everywhere ??).

> If you're going to use sockets by default, then use sockets,
> and you'll never need to fall back on SysV IPC, because sockets work.

true for abstract sockets (where available), dunno what access rights
are needed to use unix sockets residing on a fs.
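
for the abstract case (Linux only) a listener can be sketched as
follows. the name lives in a kernel namespace instead of on a fs, so
no directory has to be writable. abstract_addr and abstract_listen are
hypothetical helper names, and the backlog of 8 is an arbitrary choice.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* fills *addr with a Linux abstract address for name.
 * returns the address length, or 0 if the name does not fit. */
socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t nlen = strlen(name);

    if (nlen == 0 || nlen > sizeof addr->sun_path - 1)
        return 0;
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    /* a leading '\0' in sun_path marks the address as abstract */
    memcpy(addr->sun_path + 1, name, nlen);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + nlen);
}

/* returns a listening socket bound to the abstract name, or -1 */
int abstract_listen(const char *name)
{
    struct sockaddr_un addr;
    socklen_t len = abstract_addr(&addr, name);
    int fd;

    if (len == 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, len) == -1 ||
        listen(fd, 8) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

a client builds the same address with abstract_addr() and passes it to
connect(). the name disappears when the last descriptor referring to
the socket is closed, so there is nothing to unlink.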

> Uh, yes, I'm writing an init system for 2019, not for 1992.
> And *even* in 1992, there was a writable filesystem: /dev.
> Now I'm not saying that creating fifos in /dev is good design, but
> I am saying that if you need a writable place to create a fifo,
> you always have one. Especially nowadays with /dev being a tmpfs,
> so even if you're reluctant to mount an additional tmpfs at boot
> time, you can always do stuff in /dev!

is /dev in any case always writable ?
what about platforms that have a static /dev residing on disc
(maybe in the root fs or as separate partition) ?
although chances are it is writable, that is why people place unrelated
things into it.

> Needing a writable filesystem to create a fifo or a socket has
> never been a serious limitation,

really NEVER ??

> and nowadays it is even less of a
> limitation than before. The "must not use the filesystem at all"
> constraint is artificial and puts a much greater burden on your
> design than needing a rw fs does.

we were discussing correct/safe/reliable behaviour in preferably all
possible situations that might arise.

> It's really not limiting, and the *only* correct behaviour.

hu ? any proof here ?

> The need to have a rw fs does not even come from the daemontools-like
> architecture with a writable scandir. It comes from the need to store
> init's logs.

when the console device is not enough ...
in fact "working" with read-only access might not be very pleasant,
but it might be the situation in some scenarios.

> Storing logs from an init system is not easy to do. Some systems,
> including sysvinit, choose to not even attempt it, and keep writing
> to either /dev/console (which is very transient and not accessible
> remotely) or /dev/null. Some systems do unholy things with
> temporary logging daemons.

bootlogd ? to be honest:
i agree with SysV init, when was logging process #1's duty ?
it is nice to have though but IMO not a requirement per se,
maybe you could enlighten me a bit in case you disagree, as you
already did in the case of respawning subprocesses.
placing such functionality into process #1 looks indeed safer and
exploits process #1 being protected against signaling by the kernel.

> Making a tmpfs is *easy*.

nowadays.

> Oh, so it's not a problem to need a writable filesystem on BSD then?
> So, why all the contortions to avoid it on other systems? If you're
> fine with Unix domain sockets, then you're fine with Unix domain
> sockets and that's it. And there's nothing wrong with that. And
> you don't need a "portable backup/emergency method" - that just
> bloats your init system for zero benefit. Z-e-r-o. Zero.

> It is true that I didn't back my claim on this page that SysV IPCs
> have terrible interfaces. At the time of writing, I had tried to
> use them for an unrelated project and found them unusable;

i have not advocated their general use.

> Since then, I have had to work with
> *one* project using SysV message queues, and my initial impressions
> were confirmed. I managed to make it work, but it was really
> convoluted, and a lot more complex than it needed to be; it's
> *definitely* not an IPC I would choose for a notification mechanism
> for a supervision suite.

that is true, but i recommended them just for process #1 and even there
as emergency backup.

i personally would not use fifos for anything either.
IMO their usage is convoluted and complex as well, except that they
offer the advantage of notification.

> I don't know, something about only being usable for data transmission
> and needing another IPC mechanism to the side for notification makes
> me think it wouldn't be a good mechanism to use for notification.

unix sockets (Linux: abstract in case one does not want them to reside on
a fs) are the solution i prefer for ipc between unrelated processes.
but in situations where one wants to bypass them with a faster mechanism
i would suggest SysV shared memory (in fact its usage is not so uncommon).
of course protecting the memory area against concurrent accesses has to
be ensured by a clever protocol ... :-/

(btw: firefox and epoch init seem not to use SysV semaphores for that
purpose)
Received on Sun May 19 2019 - 17:54:50 UTC
