:: Re: [DNG] Max Load Average
Author: Martin Steigerwald
Date:
To: dng
Subject: Re: [DNG] Max Load Average
Hi!

nisp1953 via Dng - 07.07.24, 01:18:59 CEST:
> Another question here. What is the max load average I can run on my
> laptop? I am using Devuan 5.0 on a Lenovo T480S Thinkpad.
>
> Right now htop is showing 4.95 for the 5 minute average.


Oh my, there is so much to say about this question! Let me attempt to
clear things up:

Load Average without any further information is a largely meaningless
metric.

It can only give an indication of whether there are delays. I.e. if load
average exceeds the number of CPU cores (including hyper threading, i.e.
as Linux counts them) you have delays.

But it says nothing about the *cause* of the delays. These delays need not
have anything to do with CPU utilization *at all*. They can also be caused
by processes that wait for I/O, for example¹.
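As a rough illustration of that comparison, here is a minimal Python
sketch that puts the load averages next to the number of CPUs as Linux
counts them. The helper name load_per_cpu is made up for this example. A
ratio above 1 only hints at delays; it says nothing about their cause.

```python
import os


def load_per_cpu(load, cpus):
    """Return the load average divided by the number of logical CPUs."""
    return load / cpus


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute values (Unix only).
    one, five, fifteen = os.getloadavg()
    # Logical CPUs, i.e. including hyper threading; fall back to 1 if unknown.
    cpus = os.cpu_count() or 1
    for label, value in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        ratio = load_per_cpu(value, cpus)
        hint = "delays likely" if ratio > 1 else "no sign of delays"
        print(f"{label}: load {value:.2f} on {cpus} CPUs -> {ratio:.2f} ({hint})")
```

If the laptop in question has 8 logical CPUs (an assumption, I do not know
the exact model variant), a load of 4.95 gives a ratio of about 0.62.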

It is not even a true 1, 5 and 15 minute average, but an exponentially
damped moving average², and so forth and so on.
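To make that point concrete, here is a small Python sketch of an
exponentially damped moving average of the kind the kernel uses: a sample
of the number of active tasks every 5 seconds, blended into the old value
with a decay factor of exp(-5/60) for the "1 minute" figure. It uses
floating point instead of the kernel's fixed-point arithmetic, and the
function names are made up for this example.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples of the active task count


def decay(window_seconds):
    """Decay factor applied to the old value at every sample."""
    return math.exp(-SAMPLE_INTERVAL / window_seconds)


def simulate(active, seconds, window_seconds=60.0, start=0.0):
    """Feed a constant number of active tasks for the given duration."""
    load = start
    factor = decay(window_seconds)
    for _ in range(int(seconds / SAMPLE_INTERVAL)):
        # The old value fades out, the new sample is blended in.
        load = load * factor + active * (1.0 - factor)
    return load


if __name__ == "__main__":
    # One task running flat out on an otherwise idle machine.
    for seconds in (60, 120, 300):
        print(f"after {seconds:3d} s: {simulate(1, seconds):.3f}")
```

After 60 seconds of one busy task the "1 minute" value is only about 0.63,
where a real one-minute average would already show 1.0.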

I could go on like this for an hour or so, as I researched this topic
deeply for my Linux performance analysis & tuning courses. Nothing about
load average is what many people think it is. Actually I am tempted to
write: This metric is (almost) complete nonsense. (Wow, now I did it :)

Even CPU utilization is not an exact metric³.

Now to answer your question:

Unless you are messing around with the fans using thinkfan or zcfan, and
probably also with CPU and GPU frequency scaling, in ways that make your
laptop heat up more than it can cool, my experience is that ThinkPad T
models can simply run under any workload for hours on end. That holds
whether the workload is CPU (and memory) bound or GPU bound. Actually,
scrubbing 1.5 TiB on a Samsung 990 Pro 4 TB NVMe SSD at PCI Express 4
speeds on a ThinkPad T14 AMD Gen 5 can generate lots of heat as well.

That is, unless the fan of your laptop is so clogged with dust that it can
no longer be effective. In that case you can open the laptop and use
compressed air from a can: hold the fan with a finger so that it does not
spin (!!!) and generate current, then blow the air from the outside towards
the center of the laptop to remove the dust. That is the short version of
it. Search the internet, I bet there are guides for this. But discern good
guides from bad ones! Or have it serviced for dust removal by a competent
computer hardware shop. This short version of a dust removal guide comes
*absolutely* without warranty! I did it several times and it worked. The
fan then worked like new instead of being almost always on as before. That
is all I can write. Have it serviced instead if you have any doubt about
doing this yourself!

Also, in case of serious overheating the machine should shut itself down
immediately. But I do not recommend relying on that!

All of this is without any warranty, and the main point I wanted to get
across is: Load average is largely meaningless. That it means what people
think it means is a myth that is hard to break, but with this post I have
done another share towards breaking it. Hopefully.

Want a better metric? Use Pressure Stall Information (PSI)[4] in addition
to utilization metrics. It is available in Atop. Recent Htop versions also
have it, but AFAIR you need to enable it as it's not on by default.
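If you want to look at PSI without Atop or Htop, the kernel exposes it as
plain text under /proc/pressure/ (cpu, memory and io), as described in [4].
Here is a minimal Python sketch that parses that format; it assumes a
kernel with PSI enabled, and the function name parse_psi is made up for
this example.

```python
def parse_psi(text):
    """Parse the contents of a /proc/pressure/* file into nested dicts."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        # First word is "some" or "full", the rest are key=value pairs.
        kind, fields = parts[0], parts[1:]
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


if __name__ == "__main__":
    for resource in ("cpu", "memory", "io"):
        try:
            with open(f"/proc/pressure/{resource}") as handle:
                psi = parse_psi(handle.read())
        except OSError:
            # File missing or unreadable: PSI not available on this kernel.
            print(f"{resource}: PSI not available")
            continue
        some = psi.get("some", {})
        # avg10 is the percentage of the last 10 seconds in which at
        # least some tasks were stalled waiting for this resource.
        print(f"{resource}: some avg10={some.get('avg10', 0.0):.2f}%")
```

Unlike load average, these numbers are per resource, so they tell you
whether tasks are waiting for CPU, memory or I/O.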

And if you take anything home from this post, let it be this: Do not base
any decision on metrics you do not understand. Especially do not freak out
about high load averages! Your load average of 5 is not even high! By now
it is almost comical for me to see people freaking out about that every
now and then.

Or to make it shorter: High load averages will not break your hardware.
Period. If it's good hardware, at least.

Especially as delays do not mean that your machine is doing more work than
it would do at 100% utilization. Learn to differentiate between utilization
and saturation[5]! So for a 4-core system a CPU based (!) load average of
about 4 is the same regarding utilization as a CPU based (!) load
average of 100. The delays happen because there is more work to do
than the machine can handle, but that does not mean that the machine goes
like "oh, oh, oh, I need to do all that work, so… I overstress myself!".
It just means that *you* wait longer. :) Sadly it is mostly human beings
who overload themselves with stress. I can only recommend: Don't. It's not
healthy.
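The arithmetic behind that comparison, as a deliberately simplified Python
sketch. It assumes the load is purely CPU based (no tasks waiting for I/O)
and the function name split_load is made up for this example.

```python
def split_load(cpu_load, cpus):
    """Split a purely CPU based load into utilization and saturation.

    Returns (utilization, waiting): utilization is the busy fraction of
    the CPUs (at most 1.0), waiting is the number of tasks left queueing.
    """
    busy = min(cpu_load, cpus)
    utilization = busy / cpus
    waiting = max(0.0, cpu_load - cpus)
    return utilization, waiting


if __name__ == "__main__":
    for load in (4, 100):
        utilization, waiting = split_load(load, 4)
        print(f"load {load:3d} on 4 cores: "
              f"{utilization:.0%} utilized, {waiting:.0f} tasks waiting")
```

Both cases come out at 100% utilization. The only difference is the length
of the queue, i.e. how long you wait.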

ThinkPad T models are hard to break. And they certainly do not break from
a (even quite low) load average of 5.

But again, all without any warranty or liability.

But before you go "but… but… but…" really take in and review my
information. I did my research.

[1] https://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html

[2] Neil J. Gunther, UNIX Load Average Part 1: How It Works:
https://www.perfdynamics.com/Papers/la1.pdf

[3] https://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html

[4] https://www.kernel.org/doc/html/latest/accounting/psi.html

[5] https://www.brendangregg.com/usemethod.html

Best,
--
Martin