I cloned the grub git repo from
https://salsa.debian.org/grub-team/grub.git
- and then, using the grub command-line environment and "set debug=all",
I narrowed the problem down to the file grub-core/loader/i386/linux.c,
somewhere between lines 798 and 891. ("debug=all" produces a lot of
messages showing what's happening and whereabouts in the code it is:
a very useful feature. I tried using the "linux" command to load the v2
kernel (crashed) and then a v4 kernel image (loaded successfully).)
However, in spite of various git diffs, I wasn't getting anywhere fast -
so I decided to try a line of less resistance. Working on the assumption
that whatever the problem was, someone by now had probably spotted and
fixed it, I downloaded the packages for grub 2.06-2 via
https://pkginfo.devuan.org/cgi-bin/package-query.html
- and installed them on the chimaera system, over 2.04-20.
That fixed the problem! All three systems - chimaera, win7 and CentOS -
now boot successfully.
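
For anyone wanting to repeat the package step, here is a minimal sketch of
installing a set of downloaded .deb files in a single dpkg run. It is not a
record of the exact commands I typed. Assumptions: the files all sit in one
directory and follow the usual <package>_<version>_<arch>.deb naming, and
install_debs is a made-up helper name.

```shell
# Sketch only. ASSUMPTION: install_debs is a hypothetical helper, and the
# downloaded files are named <package>_<version>_<arch>.deb.
#
# Usage: install_debs DIRECTORY VERSION
# With DRY_RUN=1 the dpkg command is printed instead of being run.
install_debs() {
    dir=$1
    version=$2

    # Collect every .deb in the directory whose version field matches.
    set -- "$dir"/*_"$version"_*.deb

    # If the glob matched nothing, $1 is still the literal pattern.
    [ -e "$1" ] || { echo "no packages found for $version" >&2; return 1; }

    # Hand all files to one dpkg call so that the packages, which depend
    # on each other, are unpacked and configured together.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo dpkg -i "$@"
    else
        dpkg -i "$@"
    fi
}
```

Setting DRY_RUN=1 first and calling, say, install_debs /tmp/grub-debs 2.06-2
(the directory is only an example) shows what would be installed before
anything is touched. The real run needs root.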
I had to make one change to /etc/default/grub - I added the line:
GRUB_DISABLE_OS_PROBER="false"
(that seems to have been introduced since 2.04-20, and apparently
defaults to true).
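
A small sketch of making that change from a script rather than by hand, in
case it is useful. Assumptions: set_os_prober is a made-up helper name, and
the file path is passed in explicitly so it can be tried on a copy of
/etc/default/grub first.

```shell
# Sketch only. ASSUMPTION: set_os_prober is a hypothetical helper.
#
# Usage: set_os_prober FILE
# Ensures FILE contains exactly the setting GRUB_DISABLE_OS_PROBER="false".
set_os_prober() {
    file=$1
    if grep -q '^GRUB_DISABLE_OS_PROBER=' "$file"; then
        # An uncommented setting already exists: rewrite it via a temp
        # file, then copy back so the original file's permissions stay.
        tmp=$(mktemp) || return 1
        sed 's/^GRUB_DISABLE_OS_PROBER=.*/GRUB_DISABLE_OS_PROBER="false"/' \
            "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        # No setting yet: append it as a new line.
        printf '%s\n' 'GRUB_DISABLE_OS_PROBER="false"' >> "$file"
    fi
}
```

After changing the real file, update-grub has to be run as root so that
grub.cfg is regenerated with the other systems' menu entries.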
So - something got broken and was then fixed. It would be nice to know
what. But in the meantime - this definitely feels like progress :)
On Sun, 2022-02-13 at 21:05 +0000, Peter Duffy wrote:
> I've got an old box running CentOS 6.2 and Windows 7. Without going into
> details, this box is vital and I use it every day. Finally I decided
> that I had to bite the bullet and upgrade the linux system, and I
> decided to go for chimaera.
>
> Built a new box from scratch and cloned all the disks, using dd, to
> fresh HDDs (there are several big data disks in the box). Made another
> clone of the first disk just for safety's sake, then installed chimaera
> on free space on the first disk - successful; chimaera and windows 7
> both booted fine. But CentOS 6.2 wouldn't boot - sometimes automatic
> reboot, sometimes blank screen and hung box.
>
> Switched back to the latest clone disk, which fortunately booted
> successfully, made a fresh clone of the working disk, then tried again:
> this time, installed beowulf. Install was successful - and this time
> devuan, windows 7 and CentOS 6.2 all booted successfully.
>
> Took another safety clone of the first disk (I'm beginning to wonder if
> I've exhausted the world's stock of 2T HDDs) and then upgraded beowulf
> to chimaera. Upgrade successful. Again, the CentOS 6.2 system wouldn't
> boot.
>
> CentOS 6.2 uses kernel 2.6 - it's possible to upgrade to a later one,
> but this is frowned upon. I suppose it's based on RedHat and
> derivatives' policy of setting a base version per distro and then
> retrofitting updates. (I did once try upgrading to a v4 kernel, and the
> system became completely unstable.)
>
> Removed the primary disk, put in the clone with beowulf installed, and
> verified that all was still working. Then put the disk with chimaera in
> another box with identical hardware, and started digging into the
> problem. Grub on chimaera = 2.04-20; on beowulf =
> 2.02+dfsg1-20+deb10u4. Booted into chimaera and downloaded the packages for the
> beowulf grub release (grub2, grub2-common, grub-common, grub-pc, and
> grub-pc-bin), and used them to downgrade grub on the chimaera system -
> successful. Rebooted - CentOS 6.2 now boots. Tried going into the grub
> command line environment on each box, and using the "linux" command to
> load the 2.6 kernel image: in grub 2.02 it works fine, and in 2.04
> the box reboots at that point.
>
> So current conclusion is that something has happened between grub 2.02
> and 2.04 which prevents the latter from loading linux v2 kernels. The
> challenge now is to find out what, and if it's possible to work around
> it in grub 2.04. (I should say that I originally assumed that the
> problem was down to moving a disk (or a clone of it) from a non-UEFI
> environment to a UEFI one - but setting everything in the firmware to
> "legacy only" didn't have any effect.)
>
> Just wondered if anyone had any thoughts and comments (other than why
> the hell am I still running CentOS 6.2 on this box), before I start
> rummaging through the grub changelogs. Apologies for the length of this
> and also if I've missed something obvious. The above is a heavily
> boiled-down summary of about a fortnight of stress and lost sleep.