Author: g4sra
Date:
To: dng@lists.dyne.org
Subject: Re: [DNG] NFS Stale file handle for regular user, not root
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, February 24, 2021 10:03 PM, Jackman via Dng <dng@???> wrote:
> I don't even know how to Google for this.
>
> Hosts involved:
> storage0: NFS/KRB5 server
> dorito: NFS client
>
> As root, I can mount, list files, view files, and so on. When I try to list files on the mount as a regular user, I get a stale file handle error.
>
> ➜ ~ mount /mnt/backup && ls -hal /mnt/backup
> ls: cannot open directory '/mnt/backup': Stale file handle
> ➜ ~ ls /mnt -hal
> ls: cannot access '/mnt/backup': Permission denied
> total 24K
> drwxr-xr-x 7 root root 4.0K Feb 24 12:10 .
> drwxr-xr-x 22 root root 4.0K Feb 23 15:56 ..
> d????????? ? ? ? ? ? backup
>
> Note the difference with root:
>
> dorito :: ~ » mount /mnt/backup && ls -hal /mnt/backup
> total 57K
> drwxr-xr-x 12 root root 12 Feb 23 03:36 .
> drwxr-xr-x 7 root root 4.0K Feb 24 12:10 ..
> drwxr-xr-x 3 jackman jackman 3 Feb 11 04:21 backup_dorito_20210211
> drwxr-xr-x 4 jackman jackman 4 Feb 23 06:16 dorito_20210223-0336
> dorito :: ~ » ls -hal /mnt
> total 26K
> drwxr-xr-x 7 root root 4.0K Feb 24 12:10 .
> drwxr-xr-x 22 root root 4.0K Feb 23 15:56 ..
> drwxr-xr-x 12 root root 12 Feb 23 03:36 backup
>
> Cute, eh?
>
> This is not a problem on the storage server itself:
>
> storage0 :: /srv » mount /mnt/backup && ls -hal /mnt/backup
> total 57K
> drwxr-xr-x 12 root root 12 Feb 23 03:36 .
> drwxr-xr-x 8 root root 4.0K Feb 24 14:16 ..
> drwxr-xr-x 3 jackman jackman 3 Feb 11 04:21 backup_dorito_20210211
> drwxr-xr-x 4 jackman jackman 4 Feb 23 06:16 dorito_20210223-0336
>
> Here are some relevant things:
>
> ➜ ~ mount | grep nfs
> storage0:/srv/backup on /mnt/backup type nfs4 (rw,nosuid,nodev,noexec,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=10.1.0.110,local_lock=none,addr=10.1.0.100,user=jackman)
>
> ➜ ~ cat /etc/fstab | grep backup
> storage0:/srv/backup /mnt/backup nfs4 noauto,rw,noexec,user,sec=krb5 0 0
>
> storage0 :: /srv » cat /etc/exports | grep backup
> /srv/backup 10.0.0.0/8(rw,no_subtree_check,sec=krb5)
>
> storage0 :: /srv » cat /etc/default/nfs-kernel-server
> RPCNFSDCOUNT=32
> RPCNFSDPRIORITY=0
> RPCMOUNTDOPTS="--manage-gids"
> NEED_SVCGSSD="yes"
> RPCSVCGSSDOPTS=""
> storage0 :: /srv » cat /etc/default/nfs-common
> NEED_STATD=
> STATDOPTS=
> NEED_IDMAPD="yes"
> NEED_GSSD="yes"
>
> Networking is static:
>
> storage0 :: /var/log/kerberos » cat /etc/hosts
> 127.0.0.1 localhost
> ::1 localhost ip6-localhost ip6-loopback
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> 10.1.0.100 storage0.jackman.local storage0
> 10.1.0.110 dorito.jackman.local dorito
>
> The mount appears to execute cleanly:
>
> ➜ ~ mount -v /mnt/backup
> mount.nfs4: timeout set for Wed Feb 24 14:27:58 2021
> mount.nfs4: trying text-based options 'sec=krb5,vers=4.2,addr=10.1.0.100,clientaddr=10.1.0.110'
>
> I don't see anything in the system logs on either machine that looks at all relevant.
>
> Nothing odd (IMHO) is happening in the KRB5 logs, just successful grants.
>
> I thought this problem was exclusive to dorito (the client), so I nuked and re-installed Devuan. I have since installed Xubuntu on my desktop and it has the same issue now, too, but I'm politely setting that machine aside as it's not a Devuan machine.
>
> BTW, I've tried with 'noac', with no apparent change in behavior.
>
> I'm happy to RTFM if I knew what I was looking for.
>
> As always, any helpful souls are welcome to solicit on behalf of their beer fund.
>
> Andrew Jackman
> kd7nyq@???

The unmount/remount technique won't work if some process still has files open on the share.
Use 'lsof' as root to check the share on both the server and the client.
If you find something, kill it and try the unmount/remount again.
NB: this advice is worth exactly what you paid for it :)
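
For what it's worth, the check described above might look roughly like this. It is only a sketch: the paths are the ones from the original post, the function name show_holders is made up for illustration, and it assumes lsof is installed on both machines.

```shell
#!/bin/sh
# Sketch only: show_holders is a hypothetical helper name; paths are taken
# from the original post and may differ on other setups.

# Print the processes that have files open on the filesystem mounted at $1.
# When given a mount point, lsof lists every open file on that filesystem.
# lsof exits non-zero when it finds nothing (or on error).
show_holders() {
    if [ -z "$1" ]; then
        echo "usage: show_holders MOUNTPOINT" >&2
        return 2
    fi
    lsof "$1"
}

# On the client (dorito), as root:
#   show_holders /mnt/backup
# On the server (storage0), as root, scanning the exported directory tree:
#   lsof +D /srv/backup
# If something turns up, stop or kill it (the PID is in lsof's second column),
# then retry on the client:
#   umount /mnt/backup && mount /mnt/backup
```

The commands are left as comments so nothing gets unmounted by accident if the snippet is sourced; run them by hand once you have looked at the lsof output.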