On 4/21/2015 2:56 PM, Buck Evan wrote:
> My understanding of s6 socket activation is that services should open
> and hold onto their listening socket while they're up, and s6 relies
> on the OS for swapping out inactive services. It's not socket
> activation in the usual sense.
> http://skarnet.org/software/s6/socket-activation.html
>
I apologize, I was a bit hasty and I think I need more sleep.  I was 
confusing socket activation with some other s6 feature - perhaps with 
how s6-notifywhenup is used: 
http://skarnet.org/software/s6/s6-notifywhenup.html
> So I wonder what the "full guarantee" provided by s6 that you 
> mentioned looks like.
> It seems like in such a world all services would race and the 
> determinism of the race would depend on each service's implementation.
This I do understand, having gone through it with supervision-scripts.  
The basic problem is that a running service is not necessarily a ready 
service; running only means it's "up".
Dependency handling "with guarantee" means there is some means by which 
the child service itself signals "I'm fully up and running", vs. "I'm 
started but not ready".  Because there is no polling going on, this 
allows the start-up of the parent daemon to sleep until it either is 
notified or times out.  And you get a "clean" start-up of the parent 
because the children have directly signaled that "we're all ready".
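To make the contrast concrete, here is a minimal C sketch of the 
"child signals readiness" side - the s6 readiness protocol boils down 
to the daemon writing a newline to a notification fd once its 
initialization is done.  The NOTIFY_FD environment variable and the 
fallback fd number are assumptions for this sketch, not anything s6 
itself specifies:

  /* Minimal sketch: the daemon writes a newline to a notification fd
     once initialization is complete.  How the daemon learns the fd
     number is an assumption here (a hypothetical NOTIFY_FD environment
     variable); real setups hardcode it or pass it as an option. */
  #include <stdlib.h>
  #include <unistd.h>

  static void notify_ready(void)
  {
    const char *s = getenv("NOTIFY_FD");  /* hypothetical convention */
    int fd = s ? atoi(s) : 1;
    write(fd, "\n", 1);   /* "I'm fully up and running" */
    close(fd);            /* nothing more to say on this channel */
  }

  int main(void)
  {
    /* ... bind sockets, open databases, load config ... */
    notify_ready();  /* signal only after everything is actually usable */
    /* ... enter the main service loop ... */
    for (;;) pause();
  }

The important property is that the signal comes from inside the child, 
after its own setup, so the parent never has to guess.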
Dependency handling "without guarantee" is what my project does as an 
optional feature - it brings up the child process and then calls the 
child's ./check script to see if everything is OK, which is polling the 
child (and wasting CPU cycles).  This is fine for "light" use because 
most child processes will start quickly and the parent won't time out 
while waiting.  There are trade-offs for using this feature.  First, 
./check scripts may have unintended bugs, behaviors, or issues that you 
can't see or resolve, unlike the child directly signalling that it is 
ready for use.  Second, the polling approach adds to CPU overhead, 
making it less than ideal for mobile computing - it will draw more power 
over time.  Third, there are edge cases where it can make a bad 
situation worse - picture a heavily loaded system that takes 20+ minutes 
to start a child process, and the result being the parent spawn-loops 
repeatedly, which just adds even more load.  That's just the three I can 
think off off the top of my head - I'm sure there's more.  It's also why 
it's not enabled by default.
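For comparison, a hedged C sketch of that polling approach - the 
./check path and the 30-second budget are illustrative only, not 
supervision-scripts' actual interface:

  /* Start-up gate "without guarantee": run the child's ./check script
     once a second until it exits 0 (ready) or the time budget runs
     out.  Every iteration forks a shell plus a script - this is the
     CPU (and battery) cost described above. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <sys/wait.h>

  static int poll_check(const char *check, int budget_secs)
  {
    for (int elapsed = 0; elapsed < budget_secs; elapsed++)
    {
      int status = system(check);  /* run the child's ./check */
      if (status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;                  /* child says it's ready */
      sleep(1);
    }
    return -1;                     /* give up; caller decides what next */
  }

  int main(void)
  {
    if (poll_check("./check", 30) != 0)
    {
      fprintf(stderr, "child not ready within 30s\n");
      return 111;  /* on a loaded system, this is the spawn-loop risk */
    }
    /* ... child is ready, proceed with the parent's start-up ... */
    return 0;
  }

Note the failure path: when the child is slow, the parent exits and 
gets restarted, which is exactly the spawn-loop scenario above.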