Stephanie Daugherty <sdaugherty@???> writes:
> Process supervision is something I'm very opinionated about. In a number of
> high availability production environments, it's a necessary evil.
>
> However, it should *never* be an out of the box default for any
> network-exposed service. Service failures should be extraordinary events,
> and we should strive to keep treating them as such,
That's based on a particular assumption about how 'automatic restarts'
will be used, namely, as a substitute for fixing server errors rather
than as a complement to it. I treat 'server failures' as 'extraordinary
events', but users don't (and shouldn't): they should experience as
little downtime as technically possible.
[...]
> The second reason is that it will reduce the number of high-quality bug
> reports developers receive - if failure is part of the routine, it tends
> not to get investigated very thoroughly, if at all.
It greatly reduces the number of "low-quality" (or rather, "no quality")
bug reports I receive, as I don't (usually) get frantic phone calls at
3am UK time because a server in Texas terminated itself for some
reason. Instead, I can collect the core file whenever I get around to
it and fix the bug.
NB: I deal with appliances (as a developer) and not with servers (as a
sysadmin).
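
For illustration only, the "restart as a complement to fixing" idea
above can be sketched roughly as follows. This is a minimal sketch, not
the supervisor actually used on those appliances: the function name
supervise and its parameters max_restarts and delay are made up for the
example, and it assumes the supervised program signals failure through
a non-zero exit status. Every abnormal exit is recorded so it can be
investigated later, and the service is started again in the meantime.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.0):
    """Run cmd, restarting it after each abnormal exit.

    Returns the list of abnormal exit statuses seen, so that each
    failure is kept for later investigation instead of being lost.
    (Hypothetical helper, written for this example only.)
    """
    failures = []
    while True:
        status = subprocess.call(cmd)
        if status == 0:
            break  # clean exit: nothing to restart
        # On POSIX, a negative status means the child died from a signal,
        # which is the case where a core file may have been written.
        failures.append(status)
        if len(failures) > max_restarts:
            break  # give up: this many failures needs a human
        time.sleep(delay)
    return failures


if __name__ == "__main__":
    # A stand-in "server" that always fails with exit status 3.
    crashing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(crashing, max_restarts=2))
```

The restart limit is the part that keeps failures "extraordinary": the
service comes back for the users, but a program that keeps dying still
ends up stopped, with a record of every exit.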