Re: [DNG] Bug in procps "kill"?
Author: Steffen Dettmer
Date:
To: Nick
CC: dng
Subject: Re: [DNG] Bug in procps "kill"?
Hi,

Thank you for your quick response.

On Thu, Nov 21, 2024 at 2:22 PM Nick via Dng <dng@???> wrote:

> On 21-11-2024 13:02, Steffen Dettmer via Dng wrote:
>
>
>
> On Thu, Nov 21, 2024 at 12:57 PM Steffen Dettmer <
> steffen.dettmer@???> wrote:
>
>>
>>   sdettmer@RefVm6:~/work/p1/src (master *+$|u=) $ strace -e kill
>> /bin/kill -7659 -7660
>>   kill(-7, SIGTERM)                       = -1 ESRCH (No such process)
>>   +++ exited with 0 +++
>>   sdettmer@RefVm6:~/work/p1/src (master *+$|u=) $

>>
>>
> ps: As a workaround, I can use "kill -- -9876" .
>
> I do not know what you want to achieve
>

(In the bigger picture, I am trying to terminate all background jobs of a
cascade of bash shell scripts that should work interactively,
non-interactively, and on a range of systems, such as Devuan and Yocto.)


> but as you can see kill act only on SIGTERM and ignores -7659.
>

It does not ignore it. As the gdb output shows, it evaluates it (but only
the first digit), stores the result in the variable "pid", and then negates
it (so pid is -7). signo was initialized with SIGTERM (=15) and is
unchanged, which results in the call shown by strace ("kill(-7, SIGTERM)").
Additionally, this branch of the code suppresses the "No such process"
error and returns a zero exit status instead of an error (which cost me
quite some time to track down).

> You use kill with a PID and if this does not act as wanted you do a kill
> -15 PID (aka SIGTERM) or a kill -9 PID (aka SIGKILL).
>

Indeed, kill -9 (single digit) is not interpreted as process group id, but
as a signal (SIGKILL has signo 9):

  $ strace -e kill /bin/kill -9 -7654
  kill(-7, SIGKILL)                       = -1 ESRCH (No such process)
  +++ exited with 0 +++


(In particular, "kill -9" alone prints the usage message, because the
mandatory PID is missing on the command line, but "kill -9876" does not.)

So with multiple digits, the argument is not interpreted as a signal (the
default signal SIGTERM, signo=15, is used instead):

  $ strace -e kill /bin/kill -9876
  kill(-9, SIGTERM)                       = -1 ESRCH (No such process)
  +++ exited with 0 +++


and

  $ strace -e kill /bin/kill -7654
  kill(-7, SIGTERM)                       = -1 ESRCH (No such process)
  +++ exited with 0 +++


(On my system signo=7 is SIGBUS, but that is not used. The signal sent is
SIGTERM, which has signo=15.)
I think this proves that only the first digit is used as the pid (the first
argument of kill(2)). It also shows that the kill syscall returned an error,
but the kill(1) tool exited with zero (i.e. no error).

> I guess your workaround signals the -9 and ignores 876.
>

On my Devuan bash 5.2.15(1)-release the workaround works correctly:

  $ strace -e kill /bin/kill -- -7654
  kill(-7654, SIGTERM)                    = -1 ESRCH (No such process)
  /bin/kill: (-7654): No such process
  +++ exited with 1 +++


We see that the PID is correctly used as a process group ID (i.e. passed
as a negative value) with the default signal SIGTERM (signo=15).
This is because a different branch of the kill.c code is executed: "--"
makes getopt() stop processing command line options. It also prints an
appropriate error message and terminates with EXIT_FAILURE (1), as
expected.
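
For scripts, the workaround could be wrapped roughly as follows. This is
only a minimal sketch under assumptions: kill_group is a hypothetical helper
name (not part of procps), and it assumes a POSIX-style shell whose kill
accepts "-s" and "--". Naming the signal explicitly and ending option
parsing with "--" means the negative number can never be taken for an
option or a signal number:

```shell
# Hypothetical helper: send a signal to a whole process group without
# relying on the ambiguous "kill -<number>" form.
# Usage: kill_group PGID [SIGNAL]   (SIGNAL defaults to TERM)
kill_group() {
    pgid=$1
    sig=${2:-TERM}

    # Accept only a plain string of digits (no sign, no blanks).
    case $pgid in
        ''|*[!0-9]*)
            echo "kill_group: invalid process group id: '$pgid'" >&2
            return 2
            ;;
    esac

    # Refuse 0 and 1, which are not ordinary job process groups.
    if [ "$pgid" -le 1 ]; then
        echo "kill_group: refusing process group id $pgid" >&2
        return 2
    fi

    # "-s" names the signal explicitly; "--" ends option parsing, so
    # "-$pgid" is passed through as a (negative) pid argument.
    kill -s "$sig" -- "-$pgid"
}
```

With this, "kill_group 7654" runs the same command as
"kill -s TERM -- -7654", and the exit status of kill is passed back to the
caller.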

Steffen