Wed, 08 Nov 2023 09:03:07 +0100 - Martin Steigerwald <martin@???>:
> onefang - 08.11.23, 05:43:19 CET:
> > On 2023-11-07 22:37:47, Martin Steigerwald wrote:
> > > Nothing! It just continues to run. If I like broken memory detection
> > > and some action on broken memory, totally fine, but then I have that
> > > as a separate watchdog kind of service that I install where I need
> > > it.
> > I have 256 GB of RAM, and 64 cores / 128 threads in this super desktop
> > of mine. Every now and then I get a segfault in some random thing. I
> > had upgraded to the 6.1 kernel so I can get reports about which core
> > just segfaulted; it's random each time. So I suspect it's RAM.
> >
> > Is this a specific broken memory watchdog thing you are talking about?
> > If so, what is it? I'd prefer something that can just map out the
> > broken byte/s, I have plenty. Reporting would be good too, to see if it's
> > still random and I have to look at some other part of my system.
>
> No. But if I would like something to basically halt my machine on
> suspicion of broken RAM, I'd like it to be something I install by myself
> and not something that is forced upon me by Systemd policy. Luckily I have
> Devuan.
>
> I am not sure whether there is some kind of broken memory handling daemon
> available already. However, I strongly suggest finding out which RAM modules
> are affected and replacing them. Not sure how to do that, but I'd start with
> memtest86+ which recently became available with UEFI support, in case you
> use UEFI.
>
> Best,
This type of problem could also be related to the power supply: make sure
it is stable and adequate for the load.
Regards
alexus
--
Property is theft! (P-J Proudhon) -- True today more than ever.