Author: Simon Walter
Date:
To: dng
New-Topics: Re: [DNG] Supervision scripts
Subject: Re: [DNG] Supervision scripts (was Re: OpenRC and Devuan)

On 05/05/2016 03:18 AM, Stephanie Daugherty wrote:
> Process supervision is something I'm very opinionated about. In a
> number of high availability production environments, it's a necessary
> evil.
>
> However, it should *never* be an out of the box default for any
> network-exposed service. Service failures should be extraordinary
> events, and we should strive to keep treating them as such, so that we
> continue to pursue stability. Restarting a service automatically
> doesn't improve the stability of that software; it works around an
> instability rather than addressing the root cause - it's a band-aid
> over a festering wound.
>
> The failure of a service is analogous in my eyes to the tripping of a
> circuit breaker - it happened for a reason, and that underlying reason
> is probably serious. Circuit breakers in houses generally don't reset
> themselves, and neither should network-facing services.
>
> The biggest concern in any service failure is that a failure was
> caused by an exploit attempt - attacks which exploit bad
> memory-management tend to crash whatever they are exploiting, even on
> a failed attempt. In an environment where such an event has been
> reduced to routine, and automatic restarts are the norm, that attacker
> gets as many attempts as they need, reducing one of the first signs of
> an intrusion to barely a blip on the radar if the systems are even
> being monitored at all.
>
>
> The second reason is that it will reduce the number of high-quality
> bug reports developers receive - if failure is part of the routine, it
> tends not to get investigated very thoroughly, if at all.
>
> A third reason is convention and expectation. We've lived without
> process supervision in the *nix world for almost 4 decades now, and
> admins with decades of experience generally expect to be able to kill
> off a process and have it stay down.
>
> Please consider these factors in any implementation of process
> supervision - while it's certainly a needed improvement for many
> organizations, it's not something that should just be on by default.
>
>
I couldn't agree more. Some systems I've administered had monitoring
daemons, but they would only warn the admin via email and not act
automatically.
When you are working with many servers, you want to have your own
monitoring, such as Icinga. I think warning notifications by
default are a good thing.
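
A minimal sketch of that warn-only behaviour, in Python and POSIX only.
It is not how Icinga or any particular monitoring daemon works; the
names (is_running, build_warning, check) and the monitor@localhost
sender address are made up for illustration. The point is only that the
check composes a notification and never restarts anything.

```python
import os
from email.message import EmailMessage


def is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def build_warning(service: str, pid: int, admin: str) -> EmailMessage:
    """Compose the warning mail. Nothing here restarts the service."""
    msg = EmailMessage()
    msg["Subject"] = f"[monitor] {service} (pid {pid}) is not running"
    msg["From"] = "monitor@localhost"  # hypothetical sender address
    msg["To"] = admin
    msg.set_content(
        f"{service} (pid {pid}) is no longer running.\n"
        "It was NOT restarted automatically; please investigate.\n"
    )
    return msg


def check(service: str, pid: int, admin: str, alive=is_running):
    """Return a warning message if the service is down, otherwise None."""
    if alive(pid):
        return None
    return build_warning(service, pid, admin)
```

Assuming a local MTA is listening, the returned message could be handed
to smtplib.SMTP("localhost").send_message(msg). Whether to bring the
service back up stays with the admin.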
On 05/05/2016 05:45 AM, Rainer Weikusat wrote:
> It greatly reduces the number of "low-quality" (or rather, "no quality")
> bug reports I receive as I don't (usually) get frantic phone calls at
> 3am UK time because a server in Texas terminated itself for some
> reason. Instead, I can collect the core file as soon as I get around to
> that and fix the bug.
>
> NB: I deal with appliances (as developer) and not with servers (as
> sysadmin).
So, for example, would something like daemontools be what you use with
your field-deployed software?
I tend to think that automatic restarts are the exception
rather than the rule, and so no default support needs to be provided.
I would not like to, for example, install Apache and mod_php and have it
restart after it has crashed due to a crappy PHP application. I am of
the opinion that this is a big security risk. I am sure much thought has been
spent on the subject of sane defaults for a server.