Author: Timo Buhrmester
Date:
To: Laurent Bercot
CC: dng
Subject: Re: [DNG] OpenRC: was s6-rc, a s6-based service manager for Unix systems
On Tue, Sep 29, 2015 at 06:45:25PM +0200, Laurent Bercot wrote:
> On 29/09/2015 17:34, Timo Buhrmester wrote:
> >>It can't respawn
> >Probably because people don't want this behavior. Auto-respawn only
> >makes sense when you're "relying" on buggy software you already expect
> >to blow up, *and* are unwilling to debug it. "Try turning it off
> >and on again", "A restart will fix it" is the Windows-way...
>
> In an ideal world, process supervision may not be necessary, but we
> don't live in an ideal world. Software crashes happen.

The question is how often they happen. Even in the real world,
we have mature software that does *not* blow up out of the blue.
And, in the real world, occasionally we debug our software, if it
does blow up.

> Process supervision is *not*, and should not be, a crutch to help
> buggy software run. Pretending that it is its goal is a straw man
> argument.

I didn't even mean to start an argument.

> if an attacker can crash the service, what is better:
> that the attacker can trivially DoS your service with one attack,
> or that he has to try again and again in order to DoS you?

I forgot that it's 2015 and the only thing attackers do is DoS anymore.

If an attacker can crash the service, he might deliver a payload in the
process of crashing it. What is better, an attacker having one, or
an attacker having unlimited attempts to do this?

> >Or it could have crashed because there's an environmental problem
> >that isn't directly under the program's control, in which case
> >restarting it would just be pointless, because it likely can't start
> >at all.
>
> You don't know that in advance.

Hence the words "could" and "likely". And yes, I do think there is a
correlation between "cannot run" and "cannot start", but YMMV.

> Other failures are temporary, and that's where process supervision is
> a good thing to have.

Yes, in that case, it's a good thing to have. But as you said, you
can't know this in advance, and in other situations the service
being restarted automatically can have severe security implications,
in which case it would definitely *not* be a good thing to have.
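
To make the trade-off concrete, here is a rough sketch of a middle
ground: respawn, but only a bounded number of times within a time window,
then stay down and leave it to the admin. This is purely illustrative and
is not how OpenRC, s6, or any other tool actually does it; the names
RestartBudget, should_restart, max_restarts and window are made up for
this example.

```python
import time
from collections import deque


class RestartBudget:
    """Allow at most max_restarts restarts within a sliding window (seconds).

    Hypothetical helper for illustration only; not taken from any real
    supervisor.
    """

    def __init__(self, max_restarts=3, window=60.0, clock=time.monotonic):
        self.max_restarts = max_restarts
        self.window = window
        self.clock = clock      # injectable so the policy can be tested
        self.history = deque()  # timestamps of recent restarts

    def should_restart(self):
        now = self.clock()
        # Forget restarts that have fallen out of the window.
        while self.history and now - self.history[0] > self.window:
            self.history.popleft()
        if len(self.history) >= self.max_restarts:
            # Budget exhausted: stay down instead of retrying forever.
            return False
        self.history.append(now)
        return True
```

A supervisor loop would ask should_restart() after each crash and stop
respawning once it returns False. That caps both the number of attempts an
attacker gets and the number of core dumps and log lines a crash loop can
generate, at the price of the service staying down until someone looks.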


> >Bonus points if the logs of the initial problem get rotated away due to
> >excessive retrying, or the core dump of the initial crash gets
> >overwritten...
>
> If your admins did not prepare for this and write correct scripts
> to save the core dumps to a safe place, or save crash logs to a place
> where they won't be rotated away, this is a problem with your admins,
> not with process supervision.

You might not be able to keep every core dump and every log file for a
service that core dumps when starting and gets respawned over and over
again.

> Ultimately, process supervision is a tool, and a good tool. It should
> be a decision for the sysadmin to use it or not to use it; the decision
> should not be enforced by the rc system. As Steve says, it is an
> oversight of OpenRC to not provide the *possibility* of process
> supervision.

Yes, and yes, maybe. In the end, I don't care what OpenRC is doing,
and the purpose of my reply wasn't to argue for or against supervision,
but merely to explain why OpenRC does what it does (or does not).

Anyway, let's not turn this into a full-scale argument on the pros and
cons of supervision and/or OpenRC.