Author: Jude Nelson
Date:  
To: devuan.kn
CC: dng@lists.dyne.org
Subject: Re: [Dng] Why daemontools is so cool
Okay, one technical correction (on myself):

> What are you talking about?
>
> root@t510:/home/jude# runlevel
> N 2
> root@t510:/home/jude# fgrep -r "sleep" /etc/rc2.d /etc/rcS.d
> root@t510:/home/jude#
>
> You do *not* need timeouts to boot the system sequentially.


I should have typed "fgrep -r "sleep" /etc/rc2.d/* /etc/rcS.d/*", since it
slipped my mind that "fgrep -r" does not follow symlinks unless they are
named on the command line.
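
A minimal sketch of that symlink behavior, assuming GNU grep 2.12 or later
(older or non-GNU greps may differ). The directory layout and the script
name "food" are made-up stand-ins for /etc/init.d and /etc/rc2.d, not real
system files:

```shell
#!/bin/sh
# Sketch only: reproduce the rc-directory layout in a throwaway directory.
demo=$(mktemp -d)
mkdir "$demo/init.d" "$demo/rc2.d"

# Hypothetical init script that sleeps only on the restart path.
cat > "$demo/init.d/food" <<'EOF'
#!/bin/sh
case "$1" in
  restart) "$0" stop; sleep 1; "$0" start ;;
esac
EOF

# Like /etc/rc2.d, this directory holds only symlinks to the real scripts.
ln -s ../init.d/food "$demo/rc2.d/S01food"

# Recursing into the directory: with GNU grep >= 2.12 (assumed), symlinks
# met during the walk are skipped, so this is expected to find nothing.
recursive_hits=$(grep -rF "sleep" "$demo/rc2.d" | wc -l | tr -d ' ')

# Naming each symlink on the command line: grep opens the file it points to.
glob_hits=$(grep -F "sleep" "$demo/rc2.d"/* | wc -l | tr -d ' ')

echo "recursive: $recursive_hits, glob: $glob_hits"
rm -rf "$demo"
```

With GNU grep, "-R" (capital) is another way to get the second behavior,
since it dereferences every symlink it meets while recursing.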

While this command yields many instances of "sleep", a cursory inspection
of when they occur indicates that the init scripts almost always execute
"sleep" as part of a restart, reload, upgrade, or failsafe. They do not
"sleep" on the start/stop critical paths. Thus, I stand by my earlier
point that timeouts aren't an expected part of the boot process.

-Jude

On Mon, Mar 30, 2015 at 10:55 PM, Jude Nelson <judecn@???> wrote:

> > If you need to generically support a wide range of setups and include
> > the fun stuff listed above into the mix, then your init system will
> > need to do a lot more than what is necessary to bring up one
> > individual system. Debian did run all of the above during its boot
> > sequence (if the package was installed), just in case somebody
> > actually went ahead and used it.
>
> Except that it doesn't. Debian will only do the "extra fun stuff" if it's
> installed and usable. Go read the init scripts in /etc/rcS.d if you don't
> believe me.
>
> > Dependencies enable you to have a robust boot sequence even with
> > hardware like this.
>
> No. No they do not. Let me say it again: if the device with your root
> filesystem is unavailable, the boot process *will* wait for it to become
> available. There is no alternative, besides panicking. There is no
> parallelization before root gets mounted, because nothing else *can* start.
>
> > If you want to support a wide range of combinations of all the goodies
> > Linux comes with, incl. software raid, logical volume management, disk
> > encryption and any combination thereof, then you can not do so in a
> > generic way using a strictly sequential init system. You can configure
> > all of these for each individual system, but you can not have generic
> > support for all of them in any valid combination in your distribution.
>
> First, the extra packages are not used unless you install them, and will
> only get invoked if they are needed. Second, Debian has initialized them
> all sequentially, in the right order, without fail for the better part of
> its existence. In fact, if you go look at the Debian bug tracker, the
> introduction of init script parallelization strategies has led to
> hard-to-reproduce boot bugs due to race conditions.
>
> > You will also need to introduce timeouts (which are either too short
> > or too long, depending on the system) into each and every step along
> > the way.
>
> What are you talking about?
>
> root@t510:/home/jude# runlevel
> N 2
> root@t510:/home/jude# fgrep -r "sleep" /etc/rc2.d /etc/rcS.d
> root@t510:/home/jude#
>
> You do *not* need timeouts to boot the system sequentially.
>
> > My point is "An init system should be robust to work with any valid
> > configuration I can put the system in" and I understand your reply to
> > be "Complexity makes programs less robust". I agree with both
> > statements and see no contradiction whatsoever.
>
> Um, no. That has *never*, *EVER* been the case. If you break something,
> it stays broken until you fix it. The OS does exactly what you tell it
> to do, as it should. It's not the OS's responsibility to clean up your
> mistakes if you put it into a state where it can't boot.
>
> > A strictly sequential boot is not able to do that in a generic way.
> > You can configure *your* system to do that by tweaking the sequence,
> > but it is impossible for a distribution to ship a sequential init
> > system that can robustly set up any filesystem that you can mount
> > manually.
>
> Then how, pray tell, has Debian been able to boot sequentially all these
> years, with all these features?
>
> Okay. I hate that it's come to this, but at this point in reading your
> reply I can no longer tell whether or not you're a troll. You talk about
> the low-level details of userspace like you think you understand it, but
> it's pretty clear that you do not (or, you do a very bad job at
> communicating it). Either way, this thread has gone wildly off-topic, so I
> will no longer reply to it.
>
> -Jude
>
> On Mon, Mar 30, 2015 at 7:06 PM, <devuan.kn@???> wrote:
>
>> Hi Jude, hello Isaac,
>>
>> I did express myself poorly when I spoke of hardware detection. You
>> both are fully correct to call me out on that. Yes, hardware detection
>> happens in the kernel, and indeed some of it is done in parallel
>> there.
>>
>> I was thinking about all the user-space tools that scan drives and
>> create new devices based on their findings when writing this mail.
>> These little things like btrfs scan, lvm --forgot-the-parameter,
>> cryptsetup open, and the others that allow you to do cool things with
>> your filesystems.
>>
>> Sorry for the confusion this caused and my poor choice of words.
>>
>> On Tue, Mar 31, 2015 at 12:05 AM, Jude Nelson - judecn@???
>> <devuan.kn.ae5676beef.judecn#gmail.com@???> wrote:
>> <snip>
>>
>> > If your boot sequence is taking too long because it's loading
>> > unnecessary drivers, then your boot sequence is misconfigured.
>>
>> If you need to generically support a wide range of setups and include
>> the fun stuff listed above into the mix, then your init system will
>> need to do a lot more than what is necessary to bring up one
>> individual system. Debian did run all of the above during its boot
>> sequence (if the package was installed), just in case somebody
>> actually went ahead and used it.
>>
>> >> And some hardware can take a very
>> >> long time to register, so you need to be generous with those
>> >> time-outs. On the other hand anybody without the hardware will be stuck
>> >> for the entire timeout, so you need to keep them as short as possible.
>> >
>> > If your hardware is taking a long time to spin up, it's due to a bug in
>> > either the hardware or its driver, not the boot sequence.
>>
>> Dependencies enable you to have a robust boot sequence even with
>> hardware like this.
>>
>> > Again, if you're talking about waiting for the device with your root
>> > filesystem, the boot process is *supposed* to block until it's ready.
>> > Boot cannot proceed until the root device is found and mounted; otherwise
>> > you obviously cannot load programs. This process cannot be sped up
>> > through parallelization (Amdahl's Law and all that). Same goes for
>> > programs that cannot be started until /usr is mounted (if you have a
>> > separate /usr).
>>
>> I think I poorly worded my original mail.
>>
>> Please let me try again:
>>
>> If you want to support a wide range of combinations of all the goodies
>> Linux comes with, incl. software raid, logical volume management, disk
>> encryption and any combination thereof, then you can not do so in a
>> generic way using a strictly sequential init system. You can configure
>> all of these for each individual system, but you can not have generic
>> support for all of them in any valid combination in your distribution.
>>
>> You will also need to introduce timeouts (which are either too short
>> or too long, depending on the system) into each and every step along
>> the way.
>>
>> This is all about file system detection. Once everything is mounted,
>> it is still rather easy to run into trouble with services not being
>> fully up even though their start script finished, network not being
>> there yet and whatnot. But that is an entirely different issue.
>>
>> <snip>
>>
>> >> The killer argument for parallel startup with dependency handling is
>> >> robustness, not speed.
>> >
>> > No, the opposite is true. Programs with multiple instances of execution
>> > (processes, threads, coroutines) in practice tend to be much more
>> > error-prone, because they are much harder to reason about. This is
>> > because the number of states such a program can be in increases with the
>> > *factorial* of the number of instances of execution it has. This is such
>> > a problem that determinism is often a design requirement for
>> > mission-critical software whose failure will result in huge costs and/or
>> > loss of life.
>>
>> I think I missed something here;-)
>>
>> My point is "An init system should be robust to work with any valid
>> configuration I can put the system in" and I understand your reply to
>> be "Complexity makes programs less robust". I agree with both
>> statements and see no contradiction whatsoever.
>>
>> >> Maybe it is my tendency to mess around with cryptsetup and co. that
>> >> gets me into trouble, but I did have unbootable systems with sysv-init
>> >> due to "unexpected setup" problems. Nothing I could not fix, but still
>> >> an annoyance that I would be happy to get rid of.
>> >
>> > Parallel boot won't fix misconfigurations you introduced by messing
>> > around with it.
>>
>> If I can mount a filesystem manually, then it is not misconfigured. I
>> do expect my init system to also be able to bring up that filesystem
>> -- provided I hand it all the necessary data (keys, etc.) and that it
>> can handle all the individual pieces of software that I used when
>> setting up that filesystem.
>>
>> A strictly sequential boot is not able to do that in a generic way.
>> You can configure *your* system to do that by tweaking the sequence,
>> but it is impossible for a distribution to ship a sequential init
>> system that can robustly set up any filesystem that you can mount
>> manually.
>>
>> That is a severe limitation of sequential init systems and one that a
>> distribution needs to be aware of.
>>
>> >> The whole consecutive boot thing hinges on timeouts and that is
>> >> neither generic nor robust.
>> >
>> > The boot sequence does *not* hinge on timeouts. If anything, timeouts
>> > are a fallback mechanism for working around other programs not making
>> > forward progress (i.e. due to bugs, a down network, or faulty hardware).
>> > If your boot sequence is encountering timeouts, then something's wrong
>> > with your boot sequence.
>>
>> A *generic* consecutive boot sequence does. You can tweak the setup
>> for your system so that it does not (or at least does less), but you
>> can not ship one system that supports all (or at least most) valid
>> combinations of filesystem setups and does not heavily rely on
>> timeouts.
>>
>> BR
>> Karl