Hi,
On Thu, 21 Sept 2023 at 03:56, Bob Proulx via Dng <dng@???> wrote:
>
> wirelessduck--- wrote:
> > I'm having trouble running virsh under runit-init.
> >
> > It was working previously under sysvinit-core, but after I install
> > runit-init and reboot I can no longer access the hypervisor.
>
> I don't know much detail but I know how I would start to debug this.
> On a sysvinit booted system I would look for running processes related
> to libvirt.
>
> root@calamity:~# ps -efH | grep libvirt
> root 1657 1 0 Sep09 ? 00:05:18 /usr/sbin/libvirtd -d
> libvirt+ 3651 1 38 Sep09 ? 4-01:18:14 qemu-system-x86_64 -enable-kvm -name guest=...
>
> On a working system "libvirtd -d" is running in daemon mode.
On my working system with sysvinit-core installed, the output is
slightly different: there is no qemu process, but everything still
works and I can create/run VMs in virt-manager and run virsh without
error. I get the same output when I have runit-init installed.
# ps -efH | grep libvirt
root 1823 1 0 Sep25 ? 00:00:01 /usr/sbin/libvirtd -d
nobody 2049 1 0 Sep25 ? 00:00:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper
root 2050 2049 0 Sep25 ? 00:00:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelper
> root@calamity:~# lsof -p 1657 | grep -i -e sock -e stream
> libvirtd 1657 root mem REG 8,18 97464 135073016 /usr/lib/x86_64-linux-gnu/libboost_iostreams.so.1.62.0
> libvirtd 1657 root 11u unix 0xffff922018c85400 0t0 18513 /var/run/libvirt/libvirt-sock type=STREAM
> libvirtd 1657 root 12u unix 0xffff922017582400 0t0 18515 /var/run/libvirt/libvirt-sock-ro type=STREAM
> libvirtd 1657 root 13u unix 0xffff922015ee6400 0t0 18517 /var/run/libvirt/libvirt-admin-sock type=STREAM
> libvirtd 1657 root 15u unix 0xffff922018e0f000 0t0 18550 type=STREAM
> libvirtd 1657 root 16u sock 0,8 0t0 16620 protocol: TCP
> libvirtd 1657 root 22u unix 0xffff921fcfd63c00 0t0 29737 /var/run/libvirt/libvirt-sock type=STREAM
> libvirtd 1657 root 26u unix 0xffff922017582800 0t0 30981 type=STREAM
Once again, slightly different under sysvinit-core for me:
# lsof -p 1823 | grep -i -e sock -e stream
lsof: WARNING: can't stat() fuse.portal file system /run/user/1001/doc
Output information may be incomplete.
libvirtd 1823 root 9u sock 0,8 0t0 23356 protocol: NETLINK
libvirtd 1823 root 10u unix 0x00000000430f3988 0t0 23357 /run/libvirt/libvirt-sock type=STREAM (LISTEN)
libvirtd 1823 root 11u unix 0x0000000028d785f9 0t0 23358 /run/libvirt/libvirt-sock-ro type=STREAM (LISTEN)
libvirtd 1823 root 12u unix 0x000000000d0f7fb5 0t0 23359 /run/libvirt/libvirt-admin-sock type=STREAM (LISTEN)
libvirtd 1823 root 15u unix 0x000000008f5b86d7 0t0 23368 type=STREAM (CONNECTED)
When I check again under runit-init, I am missing the last output line:
# lsof -p 1856 | grep -i -e sock -e stream
lsof: WARNING: can't stat() fuse.portal file system /run/user/1001/doc
Output information may be incomplete.
libvirtd 1856 root 9u sock 0,8 0t0 14760 protocol: NETLINK
libvirtd 1856 root 10u unix 0x0000000021f3f3ac 0t0 14761 /run/libvirt/libvirt-sock type=STREAM (LISTEN)
libvirtd 1856 root 11u unix 0x00000000460ca424 0t0 14762 /run/libvirt/libvirt-sock-ro type=STREAM (LISTEN)
libvirtd 1856 root 12u unix 0x000000004055b5e6 0t0 14763 /run/libvirt/libvirt-admin-sock type=STREAM (LISTEN)
Perhaps this is telling me that something socket-related didn't
connect/activate properly under runit-init? But I have no problems
running `virsh list` as root under runit-init. The hypervisor connect
error is only appearing when running as my local user. Hmmm....
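For anyone repeating this comparison, the two lsof listings could be
compared mechanically rather than by eye. This is only a rough sketch:
the helper name lsof_names and the file names are made up for
illustration, and it assumes each lsof entry sits on a single line with
the usual nine-column layout.

```shell
# lsof_names: reduce lsof lines to "FD TYPE NAME" so that PIDs, kernel
# addresses and inode numbers (which differ on every boot) drop out.
# Hypothetical helper; assumes the columns are
# COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME.
lsof_names() {
    awk 'NF >= 9 && $1 != "lsof:" {
        out = $4 " " $5
        for (i = 9; i <= NF; i++) out = out " " $i
        print out
    }' "$@"
}

# Usage (file names are assumptions):
#   lsof -p 1823 | grep -i -e sock -e stream > lsof-sysv.txt    # sysvinit-core
#   lsof -p 1856 | grep -i -e sock -e stream > lsof-runit.txt   # runit-init
#   lsof_names lsof-sysv.txt  > sysv.names
#   lsof_names lsof-runit.txt > runit.names
#   diff sysv.names runit.names
```

With the listings above, the only line diff should report is the
CONNECTED stream on fd 15u.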
>
> root@calamity:~# strace -v virsh list 2>&1 | grep sock
> socket(AF_UNIX, SOCK_STREAM, 0) = 5
> connect(5, {sa_family=AF_UNIX, sun_path="/var/run/libvirt/libvirt-sock"}, 110) = 0
> getsockname(5, {sa_family=AF_UNIX}, [128->2]) = 0
Running as root under runit-init:
# strace -v virsh list 2>&1 | grep sock
access("/var/run/libvirt/libvirt-sock", F_OK) = 0
socket(AF_UNIX, SOCK_STREAM, 0) = 5
connect(5, {sa_family=AF_UNIX, sun_path="/var/run/libvirt/libvirt-sock"}, 110) = 0
getsockname(5, {sa_family=AF_UNIX}, [128 => 2]) = 0
Running as my local user, it first checks for a different socket that
doesn't exist, then goes on to connect to the same libvirt-sock:
$ strace -v virsh list 2>&1 | grep sock
access("/var/run/libvirt/virtqemud-sock", F_OK) = -1 ENOENT (No such file or directory)
access("/var/run/libvirt/libvirt-sock", F_OK) = 0
socket(AF_UNIX, SOCK_STREAM, 0) = 5
connect(5, {sa_family=AF_UNIX, sun_path="/var/run/libvirt/libvirt-sock"}, 110) = 0
getsockname(5, {sa_family=AF_UNIX}, [128 => 2]) = 0
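To pick the failing calls out of a longer trace, a filter along these
lines might help. It is only a sketch: failed_calls is a name I made
up, and it relies on strace printing failures as "= -1 ERRNO (text)"
with each call on one line.

```shell
# failed_calls: keep only strace lines whose return value is -1
# followed by an errno name. Hypothetical helper; reads strace text
# on stdin.
failed_calls() {
    grep -E '= -1 E[A-Z0-9]+'
}

# Usage:
#   strace -v virsh list 2>&1 | failed_calls
```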
>
> root@calamity:~# ll /var/run/libvirt/libvirt-sock
> srwxrwxrwx 1 root root 0 Sep 9 19:48 /var/run/libvirt/libvirt-sock
# ls -l /var/run/libvirt/libvirt*
srwx------ 1 root root 0 Sep 27 10:16 /var/run/libvirt/libvirt-admin-sock
srwxrwxrwx 1 root root 0 Sep 27 10:16 /var/run/libvirt/libvirt-sock
srwxrwxrwx 1 root root 0 Sep 27 10:16 /var/run/libvirt/libvirt-sock-ro
>
> It uses /var/run/libvirt/libvirt-sock for communications. That's the
> area where things are failing for you. The client virsh can't connect
> to the running libvirtd daemon.
>
> root@calamity:~# dpkg -S /etc/init.d/libvirtd
> libvirt-daemon-system: /etc/init.d/libvirtd
>
> root@calamity:~# dpkg -L libvirt-daemon-system | grep /etc/ | grep /etc/init.d/
> /etc/init.d/libvirt-guests
> /etc/init.d/libvirtd
> /etc/init.d/virtlogd
>
> root@calamity:~# dpkg -L libvirt-daemon-system | grep /etc/ | less
>
> In the case of libvirtd AFAIK there are only sysvinit scripts. Which
> means runit will be running them in legacy compatibility mode. That
> seems to be the likely part where things are not working.
On daedalus, the init script was split out into the
libvirt-daemon-system-sysv package. I also have
libvirt-daemon-driver-qemu installed.
Given that it works under root but not a regular user account, I'm
guessing there is some sort of permission problem somewhere.
The /etc/libvirt/*.conf files mention polkit so I'm not sure if that
is related here.
> I would boot runit and then look to see how things are different. Is
> the libvirtd running? Are there any error messages at boot time
> around the running of those scripts? Then to further debug, run those
> scripts manually. Trace them.
>
> root@calamity:~# ll /etc/init.d/libvirtd
> -rwxr-xr-x 1 root root 5600 Oct 1 2020 /etc/init.d/libvirtd
>
> root@calamity:~# less /etc/init.d/libvirtd
>
> root@calamity:~# service libvirtd status
> Checking status of libvirt management daemon: libvirtd not running.
>
> root@calamity:~# service libvirtd start
>
> One should normally use "service" to start and stop init scripts
> because it cleans the environment, changes directory, and so forth.
> All needed things. Of course runit also has a native interface. For
> more debugging I would run the script directly tracing it.
>
> root@calamity:~# sh -x /etc/init.d/libvirtd
>
> That will produce a lot of shell tracing output. I can't really
> speculate more without more detail. Hopefully by looking at what it
> is doing the reason for it not doing it under runit will become
> apparent. It might be a problem with the required cgroups setup. It
> might be a different problem. Check the syslog to see what errors are
> logged there.
Thanks for the suggestion to run the init script with `sh -x`. It
appears to show an error mounting cgroups in the init script, but
perhaps that is ok because I already have the cgroupfs-mount package
installed for docker and that is already mounting the cgroups.
Sorry for the jumbled responses, but I'm testing as I write the reply here...
Hmmmm.. I noticed the output line of:
+ check_mount_cgroup_options
+ [ ! = yes ]
+ return 1
which looked a bit suspicious. So I checked the init script and it is
reading $mount_cgroups from /etc/default/libvirtd. The value is
commented out by default so I uncommented and set:
mount_cgroups=no
because I am already mounting cgroups via the cgroupfs-mount package.
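A quick way to confirm that a setting like this is really active (and
not still commented out) is sketched below. The helper name
active_setting is hypothetical, and it only understands plain
NAME=value lines, not anything fancier a defaults file might contain.

```shell
# active_setting FILE NAME: print the value of the last uncommented
# "NAME=value" line in FILE, or nothing if there is none.
# Hypothetical helper; lines starting with "#" do not match.
active_setting() {
    sed -n "s/^[[:space:]]*$2=//p" "$1" | tail -n 1
}

# Usage:
#   active_setting /etc/default/libvirtd mount_cgroups
```

No output means the variable is unset as far as the init script is
concerned.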
I then stopped/restarted the libvirtd service and got some better output.
+ check_mount_cgroup_options
+ [ ! no = yes ]
+ return 1
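One thing worth noting about the two traces is that `[` sees a
different number of arguments in each. The sketch below reproduces
only that comparison, not the actual init script: check is a made-up
name, and I am assuming the variable is expanded unquoted because that
is what the trace looks like. With the variable empty, `[` receives
just `!`, `=` and `yes`, which POSIX test treats as a string comparison
of "!" with "yes" rather than as a negation.

```shell
# check VALUE: evaluate the traced expression with an unquoted variable
# (left unquoted on purpose to show the effect). Hypothetical helper;
# returns the exit status of the [ ... ] test itself.
check() {
    mount_cgroups=$1
    # shellcheck disable=SC2086
    [ ! $mount_cgroups = yes ]
}

if check "";  then echo "empty: true"; else echo "empty: false"; fi
if check no;  then echo "no: true";    else echo "no: false";    fi
if check yes; then echo "yes: true";   else echo "yes: false";   fi
```

In bash and dash the empty and "yes" cases come out false and only
"no" comes out true, so an unset value is not equivalent to "no" in an
expression written this way.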
> Hope that helps!
Yes! That last step of shell-tracing the init script definitely helped.
After setting mount_cgroups=no in /etc/default/libvirtd and restarting
libvirtd service, I can now successfully run `virsh list` and
`virt-manager` from my regular user account!
>
> Bob
Thanks for your assistance on this weird problem!
Tom