Author: Roger Leigh
Date:  
To: dng
Subject: Re: [DNG] /usr to merge or not to merge... that is the question
On 21/11/2018 16:11, Alessandro Selli wrote:
> On 21/11/18 at 13:17, Roger Leigh wrote:
>> Hi folks,
>>
>> I've been following the discussion with interest.
>
>
>   No, you definitely have not followed it.  In fact you are disregarding
> all the points that were expressed against the merge.


Let me begin by stating that I found your reply (and others) to be rude,
unnecessarily aggressive, and lacking in well-reasoned objective
argument. It's poor communication like this which caused me to
unsubscribe from the Debian lists, and also from this list a good while
back (I only read the digest summary on occasion, and rarely
participate). I find it fosters an unfriendly, unpleasant and
unproductive environment which I don't enjoy working in. When you're
doing this type of work as a part-time volunteer, it's extremely
demotivating and disheartening to be treated this way. It would be
unacceptable in a professional setting, and it's equally unacceptable
here. Please do think about what you have written before sending it; it
costs nothing to be nice, even when you are in disagreement with someone.


Before I follow up on any of the points you (and others) made in
response, let me begin with some history you may be unaware of. It
actually predates systemd, and is largely unrelated to systemd.


6-7 years ago, back when I was one of the Debian sysvinit maintainers,
we had a problem. The problem was that an increasing number of systems
could no longer be booted up successfully. The reason for this was that
the boot process was becoming increasingly complex. The process could
be summarised like this with an initramfs:

- mount / in initramfs
- [early boot]
- mount /usr and other filesystems
- [late boot]

or directly, without an initramfs:

- mount /
- [early boot]
- mount /usr and other filesystems
- [late boot]

The problems arose from the mounting of /usr part way through the boot
process. An increasing number of system configurations required tools
and libraries from /usr, before it was mounted. These could be NSS
modules, LDAP stuff, dependencies of network filesystem tools, or
others. In some cases, this was solved by moving individual tools and
libraries from /usr to /[s]bin and /lib. But this became increasingly
untenable as the requirements became more and more complex. Not only
did we have to move individual tools and libraries and datafiles from
/usr to the root filesystem, we also had to move every dependency as
well for them to be functional. Also, service dependencies wanted
services starting before /usr was mounted, moving chunks of the
dependency graph into the early boot stage. This was a losing battle,
since we couldn't move /everything/ to the root filesystem. Or could we?
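
To make the dependency problem concrete, here is a minimal Python sketch
of why moving one tool to the root filesystem drags its whole dependency
tree along with it. The dependency graph and every name in it are
invented for illustration; they do not describe real Debian packages.

```python
# Hypothetical dependency graph: all names are made up for illustration.
DEPENDS = {
    "network-fs-tool": ["libnet", "name-service-module"],
    "name-service-module": ["libdirectory"],
    "libdirectory": ["libcrypto-ish", "libc"],
    "libnet": ["libc"],
    "libcrypto-ish": ["libc"],
    "libc": [],
}


def closure(roots, depends):
    """Return everything reachable from roots in the dependency graph.

    This is the set that would have to live on / for the roots to work
    before /usr is mounted.
    """
    needed = set()
    stack = list(roots)
    while stack:
        item = stack.pop()
        if item in needed:
            continue
        needed.add(item)
        stack.extend(depends.get(item, []))
    return needed


must_move = closure(["network-fs-tool"], DEPENDS)
print(sorted(must_move))
```

In this toy graph, needing a single tool in early boot pulls in every
other entry. That is the losing battle described above.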

It was due to logistical challenges like this that we first considered
the merge of / and /usr. Once low-level tools start requiring
interpreters like Perl or Python, or libraries like libstdc++, or
datafiles like timezone data, it was clear we needed a more general
solution which would solve the problems for the long term.

The question arose of how we might handle the migration, or avoid the
need for a migration entirely. As the sysvinit maintainer, I did most
of the investigation and implementation of this work, and you're likely
using that solution right now yourself. The solution I chose was one
which would allow for making /usr available in early boot without the
need for physically merging the filesystems, so that it wouldn't break
any of the installed systems on upgrade. We would add to the initramfs
the ability to mount additional filesystems other than the rootfs,
directly from the rootfs fstab. And we would cater for local and NFS
filesystems just as we do for the rootfs. This was one of the more
costly solutions (in terms of implementation complexity and testing
needed), but it retained the flexibility some people required. This was
implemented 5 years back, and the result is this with an initramfs:

- mount / and /usr in initramfs
- [early boot]
- mount other filesystems
- [late boot]

or directly, without an initramfs:

- mount /
- [early boot]
- mount other filesystems
- [late boot]

Thus we could guarantee the availability of files in /usr from the
moment the system starts, and this guarantee is independent of the init
system in use.
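
The idea of picking additional early mounts out of the rootfs fstab can
be sketched in a few lines of Python. This is only an illustration under
stated assumptions: the real initramfs-tools logic is shell and is not
reproduced here, and the sample fstab, the function names and the
`wanted` parameter are all hypothetical.

```python
# Illustrative only: not the initramfs-tools implementation.
SAMPLE_FSTAB = """\
# <file system>  <mount point>  <type>  <options>          <dump> <pass>
/dev/sda1        /              ext4    errors=remount-ro  0      1
/dev/sda2        /usr           ext4    ro,nodev           0      2
/dev/sda3        /home          ext4    defaults           0      2
server:/export   /srv           nfs     ro                 0      0
"""


def parse_fstab(text):
    """Return one dict per non-comment, non-blank fstab line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line; skip it
        entries.append({
            "device": fields[0],
            "mountpoint": fields[1],
            "fstype": fields[2],
            "options": fields[3].split(","),
        })
    return entries


def boot_plan(entries, wanted=("/usr",)):
    """Split mount points into those mounted early and those left for later."""
    early = ["/"] + [e["mountpoint"] for e in entries
                     if e["mountpoint"] in wanted]
    late = [e["mountpoint"] for e in entries
            if e["mountpoint"] not in early]
    return early, late


entries = parse_fstab(SAMPLE_FSTAB)
early, late = boot_plan(entries)
print("mounted in initramfs:", early)
print("mounted later:       ", late)
```

Passing a longer `wanted` tuple, for example `("/usr", "/etc")`, models
extending the same mechanism to other directories.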

The tradeoff is that we no longer supported direct booting of a system
with a separate /usr; you had to use an initramfs. You could still boot
directly, but / and /usr had to be on the same filesystem to guarantee
the availability of /usr. But with this solution in place, all stages
of the boot could rely on tools, libraries and datafiles present in /usr.

This has been in production use since wheezy, and because it was so
transparent, very few people would even realise that the filesystems had
been (effectively) unified since then, because I took great care to
support (and test) every possible combination of filesystem types, with
the one exception mentioned above. / and/or /usr could be local or
remote, and we supported the default initramfs case as well as a number
of other less common cases.

The point of all this work was to achieve the *practical effect* of
unification without actually requiring any disruptive or breaking
changes. It had to be transparent and robust. And it succeeded in
these goals.

This work was not the end, it was the key prerequisite for any future
usrmerge transition. We had solved the availability of files in /usr in
early boot by means of a clever but complex bandaid in the initramfs,
but the long term solution would be to deprecate use of a separate /usr,
while still supporting it via the initramfs for compatibility. The
usrmerge itself would be a breaking change for some non-standard setups,
so would have to be deferred for a later release.

This marks the point I left the Debian project, so I have had no further
involvement with usrmerge. I was one of the original people testing
dpkg for bugs in its merged /usr operation, but never did any serious
implementation work.


The last point I'll make here before moving onto your points is one
about the tradeoffs of complexity and flexibility of the system boot.
As many people have pointed out, it's *possible* to set up the system to
boot in a number of diverse and interesting ways. While ultimate
flexibility and choice is nice, it also has a serious cost. Someone
(me) had to manually test every single one of these variations to ensure
that they never broke across upgrades of the initscripts for both
routine upgrades and whole distribution upgrades. That was extremely
expensive in time, effort and resources. There are good reasons why
some distributions only allow booting from an initramfs, and it is
largely down to minimising the testing and support burden by having a
single tested and reliable means of booting. The efforts we went to in
Debian went well above and beyond what companies like Red Hat do, but
even we had limits upon what we could test and support. We're only
human, and we don't have endless time and enthusiasm for supporting
esoteric boot strategies, particularly when only a handful of people use
them. At some point, we have to suggest they use the recommended way,
or have them do the work to support it.

This is one of the main reasons why I would question the use of
esoteric methods of partitioning and booting the system. The "what"
might be interesting. But the "why" needs justifying. In supporting
the different methods of booting, we have concrete use cases and
workflows for all the common and not-so-common scenarios. "Just
because" can be neat, but it's not sustainable or reasonable to expect
that to be supported well, if at all.

I just want to make the point that the argument that you should be able
to boot the system any way you like is not cost free. It's a nice
ideal. But. Someone needed to personally spend many man hours to make
that work, and in a large part that was me. I spent many, many evenings
doing nothing but testing all this stuff. Local, NFS, Local+NFS, with
and without initramfs, multiple architectures, a huge test matrix to run
through for *every change*. Again and again over the course of years.
You have me to thank that many of the esoteric options work *at all*.
Because instead of spending combined man months of effort, I could have
just blown that all off and told everyone to use an initramfs. But I
didn't. I wanted to ensure that we would never knowingly break a system
on upgrade, even the esoteric custom ones, unless there was no way to
avoid it (as above).

As an aside, in retrospect, I would probably choose to only support boot
from an initramfs if given the choice today. It's the default, used by
almost every installation. The cost/benefit of the more esoteric
methods just isn't there.

>>   It's certainly not a new discussion, since I remember debating it a
>> good few years back, but there are still the same opinions and
>> thoughts on the topic that I remember from back then.
>>
>> Some general points to consider:
>>
>> 1) A separate /usr serves no practical purpose on a Debian/Devuan system
>
>   Yes it does, and they were already listed:
>
> 1) mounting /usr with different mount options (like barrier, ro, nodev etc);


Could you describe the specific goals of the separation?

In particular, could you take a step back, and think about the specific
problems which you are really trying to solve, rather than tying the
solution to this specific mountpoint. Knowing more about the specifics
of your use case would help.

As an example of my own thinking. Most of / should be ro+nodev just
like for /usr. So one deeper question is which bits of / shouldn't be
ro/nodev? /etc? /var? Maybe the separation shouldn't be between / and
/usr. Maybe it should be between / and /etc and /var? Others?

I should point out that I wrote a set of patches for mounting /etc in
the initramfs as well as /usr, specifically so that you could do this.
Including over NFS. You could have separate mount options, encryption,
whatever, for a separate /etc filesystem. The approach is extensible to
other directories as well, should you wish. These patches were never
integrated into initramfs-tools, but can be resurrected or redone with
relatively little effort; the /usr support required adding the generic
infrastructure for mounting arbitrary filesystems.

> 2) having /usr mounted over the network keeping / local;


The supported use cases here are:

- having / mounted locally
- having / mounted over the network (including /usr) -- the recommended
setup for NFS

While the initramfs does support both local / and remote /usr (by *my
own design and intent*), this is purely to avoid breakage on upgrades.
It's not a recommended setup.

All the cluster nodes I've set up in the past used NFS / + local
writable overlay and worked well.

The supported use cases will not be impacted by usrmerge (again, by my
own design since I did the groundwork for it), but the local / and
remote /usr will be affected.

> 3) having a /usr partition shared by several local installs that are
> booted on different / filesystems;


It's important to point out here that this has never, *ever*, been a
supported or recommended way of running a Debian system. It's clearly
(and obviously if you think about it) broken by design.

The content of that /usr filesystem is under the control of *one* dpkg
package database on a single system. If *any* of the other systems
install or remove any package touching /usr (which is all of them!),
they will be corrupting the installation. Files are going to be added
which aren't tracked. Files which are tracked are going to be removed
when they shouldn't. And the maintainer scripts which handle
migrations, generate data, and generally futz with specific bits of the
filesystem tree, are going to do very undesirable things. They also add
and remove system users which won't be added or removed on the other
systems.
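
The bookkeeping failure can be shown with a toy model in Python. It is
not how dpkg is implemented: the `System` class, its methods and the
package names are hypothetical, and the model ignores maintainer scripts
and system users entirely. It only shows two private package databases
drifting out of step with one shared /usr.

```python
# Toy model of two installs sharing one /usr; not dpkg's real behaviour.
class System:
    def __init__(self, name, shared_usr):
        self.name = name
        self.usr = shared_usr  # the single shared /usr, held by reference
        self.db = {}           # this system's private database: package -> files

    def install(self, package, files):
        self.db[package] = set(files)
        self.usr.update(files)

    def remove(self, package):
        for path in self.db.pop(package):
            self.usr.discard(path)

    def tracked(self):
        return set().union(*self.db.values())

    def untracked(self):
        """Files present in /usr that this system's database knows nothing about."""
        return self.usr - self.tracked()

    def missing(self):
        """Files this system's database expects but which are gone from /usr."""
        return self.tracked() - self.usr


shared_usr = set()
a = System("a", shared_usr)
b = System("b", shared_usr)

a.install("foo", ["/usr/bin/foo"])
b.install("foo", ["/usr/bin/foo"])
b.install("bar", ["/usr/bin/bar"])   # system a never hears about this
print("untracked on a:", a.untracked())

b.remove("foo")                      # deletes a file system a still owns
print("missing on a:  ", a.missing())
```

After these few operations, system a has a file it does not track and is
missing a file it does track, without having done anything itself.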

Even mounting it read-only is giving you an incomplete view of the
installed packages' contents.

This is not a sensible or robust strategy (and I'm *severely
understating* the severity of the problem when I put it this way).

It would be possible to share the ports tree on a FreeBSD system, since
it's mostly self-contained, so long as it's read-only (it has unshared
data in /var including the package database, so can't be read-write).
But this is not reasonable with dpkg, by design. The packages are
putting data in / and /usr, as well as /var. You cannot just export
/usr without getting into an inconsistent and incoherent mess.

> 4) having the smallest possible / filesystem to ease recovery of a
> botched system.


I'd personally go for a dedicated rescue disk/USB stick/live CD for this
specific purpose. As I mentioned right at the top, / and /usr have been
inseparable for booting for the last 5 years. Even with an emergency
boot, /usr is going to be mounted early. And you might well need the
tools and libraries on it to effect a repair.

If you can emergency boot a current system with a separate /usr, that is
a lucky happenstance. It's unsupported and untested.

>>    Historically, /usr was separately mountable, shareable over NFS.
>> With a package manager like dpkg, / and /usr are an integrated,
>> managed whole.  Sharing over NFS is not practical since the managed
>> files span both parts, and you can't split the package database
>> between separate systems.  Modern disk sizes make partitioning a
>> separate /usr unnecessary and undesirable.  (Both are of course
>> /possible/, but there is precious little to gain by doing so.)
>
>   This too was debated and rejected.  It doesn't matter if some idea
> originated out of specific restraints or circumstances, of it it was a
> bad idea at the time.  Time proved it to be a good filesystem layout,
> and it's for this reason is has survived to this day.


Every part of the system design and implementation has numerous
tradeoffs, which need to be considered. I hope that the explanations
I've given above give you just a little bit of insight into some of the
problems which I have thought about quite deeply over the course of
several years, and spent significant time investigating, implementing
and testing.

Thinking about these problems requires being rational, and objective.
We need to break down problems into parts, turn all the differing
requirements into a set of sensible use cases, and consider all the
concrete facts at our disposal when considering the different strategies
we could employ.

The detailed work on / and /usr mounting in the initramfs, and the
deprecation of booting with a separate /usr didn't just happen on a
whim. It happened after years of discussion, and months of
investigation and evaluation. It considered all the tradeoffs for
different use cases, and worked to minimise disruption to the maximum
possible extent. It was accompanied by several months of testing. I do
think of this as one of the most elegant and transparent migrations
we've done to date, and one of the pieces of work in Debian I have the
most pride in as a result (along with the /run migration which happened
at the same time). I had zero reports of broken systems, and considered
it a great success.

The ultimate merging of / and /usr is a related but separate question.
Now that the two trees are intimately tied together from first boot, we need
to consider the cost/benefit of keeping them separate, and the
cost/benefit of merging them. In some sense, we already did the
work--everything but moving the files into the unified locations.
Whether the final step is strictly needed is open to debate; we already
solved the primary problems the merge was intended to solve originally.

For yourself and all the other posters in the thread, I would have this
suggestion:

To look at this dispassionately and objectively, firstly step back and
begin by looking not at the solution of a separate /usr, but rather at
the actual problems you have which you want to solve. Look into all the
ways you might approach it; there may be several, with different
tradeoffs. It might be that a separate /usr is the optimal solution.
But it's also quite possible that while it's /a/ solution, there are
other solutions which might be better.

/usr originated because of certain constraints on early Unix systems,
and it has persisted since then out of tradition and entrenched usage
patterns and designs long after those constraints were lifted. It is
not unreasonable or heretical to step back and re-evaluate whether the
modern-day purposes and uses of /usr are still applicable and
appropriate. For many years I used a separate /usr on LVM; today I use
ZFS and do without it entirely. ZFS allows for even more finely-grained
and flexible partitioning. Times and needs can change, and I would
rather we were all able to evaluate and articulate our requirements
objectively, rather than being aggressive and rude about it.

>>    Other systems, like the BSDs, have the effective split between base
>> (/) and ports (/usr/local).  / and /usr are still effectively a
>> managed whole containing the "base system".
>
>   So what?  How does this imply that the possibility of not having a
> "managed whole" in Linux is a bad idea?


You're using a system which is managed by dpkg.

By using dpkg to keep track of all the files on the system, permitting
installation, upgrade and removal of packaged software, you are *by
definition* a user of a "managed system". dpkg is managing the files on
the system *on your behalf*. You can't meaningfully separate / and /usr
when dpkg is managing the *whole collection* of files as a *complete
operating system*. This has been true since we first created
distributions using integrated package managers.

Being able to separate /usr in this manner is a legacy of sharing e.g.
System V installations loaded from tape. It's no longer desirable or
practical if using a package manager. This is the consequence of
package management. If you do want this, then you should be looking at
a different distribution, one without a package manager.


Regards,
Roger