Author: Simon Hobson Date: To: dng@lists.dyne.org Subject: Re: [DNG] Apparently Jessie has runit
Rainer Weikusat <rainerweikusat@???> wrote:
>> The preceding paragraph makes a good point: If I make a daemon
>> requiring network connectivity, I should quickly determine whether such
>> connectivity exists, and exit if it doesn't.
>
> 'network connectivity' is not a static property: It may be available
> now, no longer available in two seconds and again available half a second
> after that. This means the following scenario is possible
>
> 1. Daemon check succeeds.
> [internet break down]
> 2. Daemon tries to do something. What now?
>
> or
>
> 1. Daemon check fails
> [internet back up]
> 2. Daemon exits.
>
> Cursing admin then starts it manually.
And the answer is ?
I think reality is that you cannot avoid that possibility.
Is there a risk that, in striving for (some approximation of) perfection, we end up in an endless loop of "not good enough" ? Regardless of what you do or how you do it, there is *always* the possibility that connectivity state will change between "decision" and "running" - whether "connected" is determined by looking at interfaces, by some state file, by internal messaging systems, or whatever ...
I'd argue that the sysvinit method of starting things in order and waiting for them to complete is "good enough" for most people most of the time - and has the advantage of a deterministic (subject to external influences such as network availability) service start order, and of being "fixable" when it does go wrong. So what if "network starts but then goes down before ntp starts" ? Does it really make any difference whether ntp was started by sysvinit after network returned "success", or by systemd when network reports connected (or whatever it does), or by runit just firing everything and ntp checking for itself ? The end result is a TOCTOU (time-of-check to time-of-use) failure whatever init system is running the show.
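To put the two quoted scenarios in code, here is a minimal sketch in Python. The `FlakyLink` class and `start_daemon` function are made-up names standing in for a real network and a real daemon; nothing here talks to real interfaces or to any init system. It only shows that the check and the use are two separate moments, so whoever does the check can be wrong by the time the daemon acts.

```python
class FlakyLink:
    """Hypothetical stand-in for a network whose state changes over time.

    Each call to is_up() returns the next value from a scripted sequence,
    so a state change between two calls can be reproduced exactly.
    """

    def __init__(self, states):
        self._states = iter(states)

    def is_up(self):
        return next(self._states)


def start_daemon(link):
    """Check connectivity once, then try to use the network."""
    # Time of check: this could equally be done by the init system.
    if not link.is_up():
        return "exited: no network at check time"

    # Time of use: the link may have changed since the check above.
    if not link.is_up():
        return "started: first request failed"
    return "started: first request ok"
```

Feeding it `FlakyLink([True, False])` gives the first scenario (check succeeds, first request fails), and `FlakyLink([False, True])` gives the second (the daemon exits although the link came back straight afterwards).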
FWIW, I do agree with you that a better system would be to "just start everything and have them wait". But while you criticise init scripts for complexity, my feeling is that the result would be that a higher level of complexity gets put into every daemon - instead of having a bit of cruft where it's easy to see, we get the same effect as systemd, a lot of complexity obfuscated in C.
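As a rough idea of what "have them wait" means inside each daemon, here is a sketch in Python of a retry loop with a doubling delay. All the names (`wait_until_ready`, `probe`, `attempts`, `delay`) are assumptions for illustration, not taken from ntp, runit or anything else; `probe` is whatever the daemon decides counts as "the network is good enough for me".

```python
import time


def wait_until_ready(probe, attempts=5, delay=1.0, sleep=time.sleep):
    """Call probe() until it returns True or attempts run out.

    Returns the number of the attempt that succeeded, or None if every
    attempt failed. The delay between attempts doubles each time.
    The sleep function is a parameter only so it can be replaced in tests.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)   # wait before the next try
            delay *= 2     # back off: 1s, 2s, 4s, ...
    return None
```

Even this small loop raises questions every daemon would have to answer for itself: how many attempts, how long to wait, and what to log when it gives up. That is the sort of logic that currently sits in the init script.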
Sticking with the "ntp and network" scenario, what if ntp has to work out for itself when the network is up. How does it do that ? How does the admin tell it what the network should look like ? Does that mean defining the network twice - once to the network system and once to ntp ? Or does it mean that "network" publishes status somehow - perhaps a system wide messaging "bus" ? And if the latter - does that just get us back to some of the things people here are saying are bad ?
And of course, suppose we do find a nice system that fires everything up, with the complexity hidden in each program that has to determine whether the system has got far enough for it to start: how do you debug it when it breaks ?