:: Re: [DNG] Irony
Author: Steve Litt
Date:  
To: dng
Subject: Re: [DNG] Irony
onefang said on Wed, 8 Nov 2023 14:43:19 +1000

>On 2023-11-07 22:37:47, Martin Steigerwald wrote:
>> Nothing! It just continues to run. If I like broken memory detection
>> and some action on broken memory, totally fine, but then I have that
>> as a separate watchdog kind of service that I install where I need
>> it.
>
>I have 256 GB of RAM, and 64 cores / 128 threads in this super desktop
>of mine. Every now and then I get a segfault in some random thing. I
>had upgraded to the 6.1 kernel so I can get reports about which core
>just segfaulted, it's random each time. So I suspect it's RAM.
>
>Is this a specific broken memory watchdog thing you are talking about?
>If so, what is it?


I don't know what Martin was speaking of, but I use memtest86, a
bootable CD that runs as its own operating system, built specifically
to test RAM.
With 256 GB RAM I'd guess the test will take over 24 hours. I wouldn't
set it to use all cores because of overheating.

>I'd prefer something that can just map out the
>broken byte/s, I have plenty.


"Mapping out" bad RAM isn't a good idea. See the documentation for
memtest86. It's better to replace the bad RAM. In my case, my RAM went
bad after a year, so I just took out one stick and now I have 48 GB of
RAM instead of 64 GB. Pretty soon this kind of RAM will be cheap enough
to buy four sticks of *high quality* RAM.

>Reporting would be good too, see if it's
>still random and I have to look at some other part of my system.


If your RAM tests good, you can move on to other diagnostic tests. If
it tests bad (error, not warning), then whether it's the root cause of
your problem or not, you need to remove or replace the bad sticks, or
else some other maddening intermittent problem will happen down the
road.

SteveT

Steve Litt

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21