:: Re: [DNG] Uptime issue
Author: nick
Date:  
To: dng
Subject: Re: [DNG] Uptime issue


It's definitely a dream system. I would still suspect it though. My reasoning would be somewhat like this:

1. Random lockups are not normal and shouldn't be happening.

2. The cause has gotta be either hardware or software.

3. If it's hardware, it's gotta be one (or more) specific component that is failing. By that I mean: if I replaced that component with an identical unit (same manufacturer and model), the problem would go away.

4. If it's software, it's gotta be a subtle bug or driver incompatibility. Latent bugs can sometimes be triggered by unusual combinations. Just for the sake of example, let us say the driver for your AMD graphics card fails when there is 64 GB or more of RAM.

5. It could also be a matter of settings or configuration, e.g. if your BIOS has configured the RAM for a higher clock than it is specced for, although in this era of autoconfiguration this would probably count as a driver bug.



What I would do as a starting point is pull out the GPU and half the RAM and use the machine for a few weeks to see if the problem goes away. Does it have internal graphics, or do you have an older GPU to use temporarily? If the problem recurs, swap the RAM for the other half and re-test. You can also try the GPU or RAM in another system to see if the problem moves with it. If it turns out to be the GPU then it could be a driver issue, as drivers are very complex these days. You could try an earlier driver or an earlier kernel (as you are already doing), but such an approach is fraught. Once you narrow down the issue to a specific part or driver, it's better to take it out of service until a new part or fix is available.
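
One way to keep track of that elimination process is to write down each trial and narrow the suspect list mechanically. The Python below is only a sketch, assuming a single faulty part; the part names and the `Trial` and `remaining_suspects` helpers are made up for illustration. Note that a "gpu" suspect here really means the card plus its driver, and a quiet few weeks is evidence rather than proof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    removed: frozenset  # parts pulled out of the machine for this trial
    locked_up: bool     # did a lockup occur during the trial window?


def remaining_suspects(parts, trials):
    """Narrow the suspect list, assuming exactly one part is at fault."""
    suspects = set(parts)
    for trial in trials:
        if trial.locked_up:
            # The fault showed up without these parts, so they are cleared.
            suspects -= trial.removed
        else:
            # The fault vanished, so the culprit is among the removed parts.
            suspects &= trial.removed
    return suspects


# Hypothetical part list; adjust it to the actual machine.
PARTS = {"gpu", "ram_half_a", "ram_half_b", "psu", "cpu", "mobo"}

trials = [
    # GPU and half the RAM pulled, no lockups for a few weeks.
    Trial(removed=frozenset({"gpu", "ram_half_b"}), locked_up=False),
    # GPU back in, same RAM half still out, and the lockup recurs.
    Trial(removed=frozenset({"ram_half_b"}), locked_up=True),
]

print(sorted(remaining_suspects(PARTS, trials)))
```

If more than one part is marginal, the single-fault assumption breaks and the set can end up empty, which is itself a useful hint.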



In principle you can use the same approach to diagnose a CPU or mobo issue, but you would need identical spares, which could get costly. If buying spares for testing, I would highly recommend getting a PSU first. I haven't been into system building for many years, but I have heard that the PSU is responsible for a large proportion of faults in modern rigs, given how demanding they are on the PSU.



I am sure you can solve this. The nightmare is when it happens on a laptop, where you really have no option but to try earlier kernels, remove drivers, or take the laptop out of service (this has happened to me). On a PC it is much easier. Oh yeah, another thought: you might try running the dreaded Windows on it for a while. If it still locks up, you have eliminated software, except possibly for common code in the AMD display drivers.



Kind regards, Nick








>
> On 29 Sept 2024 at 3:27 am, dng-request <dng-request@???> wrote:
>
>
> Today's Topics:
>
> 1. Re: Critical CVE? (Marin Ivanov)
> 2. Re: unresponsive machine (Hendrik Boom)
> 3. Re: oscilloscope; was Dng Digest, Vol 120, Issue 53
> (peter@???)
> 4. Re (2): unresponsive machine (peter@???)
> 5. Re: unresponsive machine (o1bigtenor)
> 6. Re: oscilloscope; was Dng Digest, Vol 120, Issue 53 (o1bigtenor)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 28 Sep 2024 00:11:32 +0300
> From: Marin Ivanov <metala@???>
> To: dng@???
> Subject: Re: [DNG] Critical CVE?
> Message-ID: <d7727f76-bba0-4714-83ee-a8fb07541077@???>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Hi Rob,
>
> The daemon needs root for port < 1024 binding and maybe some opened files.
> However, I don't see a reason why the daemon should keep and not drop
> the root privileges after that.
>
> Maybe they did not want to fix it upstream and was left for the distros
> to patch.
>
> Kind Regards,
> Marin
>
> On 27/09/2024 13:40, Rob van der Putten via Dng wrote:
> > Hi
> >
> >
> > On 27/09/2024 11:43, Didier Kryn wrote:
> >
> >> On 26/09/2024 at 23:05, Nick via Dng wrote:
> >>> On 26-09-2024 22:55, Peter Duffy wrote:
> >>>> These have appeared in the last hour or so:
> >>>>
> >>>> https://gist.github.com/stong/c8847ef27910ae344a7b5408d9840ee1
> >>>>
> >>>> https://www.evilsocket.net/2024/09/26/Attacking-UNIX-systems-via-CUPS-Part-I/
> >>>>
> >>>>
> >>>> CUPS? (specifically cups-browsed)
> >>>>
> >>>> Personally, I'm waiting for a few analyses of the above before I do
> >>>> anything drastic.
> >>>>
> >>>> On Thu, 2024-09-26 at 14:33 -0500, golinux via Dng wrote:
> >>>>> On 2024-09-26 13:53, Martin Steigerwald wrote:
> >>>>>> Hi.
> >>>>>>
> >>>>>> Peter Duffy - 26.09.24, 20:21:15 CEST:
> >>>>>>
> >>>>>> Or on The Register. And its past 20:00 UTC already.
> >>>>>>
> >>>>> Nope . . .
> >>>>>
> >>>>> https://time.is/UTC says it is now 19:31 UTC which is important
> >>>>> because
> >>>>> today's meet is at 20:30.
> >>>>>
> >>>>> golinux
> >>> It looks pretty serious, although I wonder why you would have an open
> >>> CUPS port on the WAN interface. On the distros I know, CUPS is not
> >>> installed by default, and it defaults to 127.0.0.1 if installed.
> >>
> >> This is a risk for hosts running Cups in an untrusted LAN;
> >> certainly not at home. I don't know for you guys, but it would take
> >> me some config work on my internet box to map some incoming port to
> >> port 631 of the host running Cups; and why would I do this for? In
> >> addition this requires to have a private WAN IP address for the box.
> >>
> >> But, in an untrusted LAN the risk may be made even bigger by
> >> Cups design: if a host is connected to two networks, its Cups server
> >> allows by default to hop from one LAN to the other for printing.
> >
> > Apart from remote access, can someone explain to me why cups-browsed
> > runs as root?
> > It is the only network daemon that I know of, that does so.
> >
> >
> > Regards,
> > Rob
> >
> >
> > _______________________________________________
> > Dng mailing list
> > Dng@???
> > https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng
>
>
> ------------------------------
>
> Message: 2
> Date: Sat, 28 Sep 2024 11:51:38 -0400
> From: Hendrik Boom <hendrik@???>
> To: dng@???
> Subject: Re: [DNG] unresponsive machine
> Message-ID: <ZvgmCnXL5Fbu6wlD@???>
> Content-Type: text/plain; charset=us-ascii
>
> On Sat, Sep 28, 2024 at 10:40:58PM +1000, nick wrote:
> ...
> ...
>
> > the OOM killer fills me with dread,
>
> I would so love a feature that would allow me to tell
> the OOM killer that whenever it is invoked it should
> just kill any process running firefox-esr. It should
> consider other processes only if firefox-esr is not
> running.
>
> Apparently something like that was once proposed, but
> it is not now available.
>
> Other mechanisms for nominating processes ahead of time
> exist, but they require that one provide the process id
> number, which is not available at startup time.
>
> -- hendrik
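
On the PID not being known at startup: as far as I know, a process's oom_score_adj is inherited by its children and kept across exec, so a small launcher can mark itself first and then become the browser. The Python below is a sketch along those lines, not a tested recipe; `set_oom_score_adj` and `run_as_preferred_victim` are names invented here, and the only kernel interface used is /proc/self/oom_score_adj with its documented range of -1000 to 1000. Raising the value should not need root. It strongly biases the OOM killer towards the marked process tree, which is close to, but not exactly, "kill firefox-esr first, always".

```python
import os

OOM_ADJ_MIN, OOM_ADJ_MAX = -1000, 1000


def set_oom_score_adj(score, path="/proc/self/oom_score_adj"):
    """Write an OOM adjustment for the current process.

    The path parameter exists only so the function can be tried
    against an ordinary file; the default is the real kernel file.
    """
    if not OOM_ADJ_MIN <= score <= OOM_ADJ_MAX:
        raise ValueError(f"score must be within {OOM_ADJ_MIN}..{OOM_ADJ_MAX}")
    with open(path, "w") as handle:
        handle.write(str(score))


def run_as_preferred_victim(argv, score=OOM_ADJ_MAX):
    """Mark this process as a preferred OOM victim, then exec the command.

    The adjustment is set before exec, so no PID lookup is needed:
    the command replaces this process and keeps the setting.
    """
    set_oom_score_adj(score)
    os.execvp(argv[0], argv)  # does not return on success
```

A launcher script would then call something like `run_as_preferred_victim(["firefox-esr"])` in place of starting the browser directly.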
>
>
> ------------------------------
>
> Message: 3
> Date: 28 Sep 2024 09:20:46 -0700
> From: peter@???
> To: dng@???
> Subject: Re: [DNG] oscilloscope; was Dng Digest, Vol 120, Issue 53
> Message-ID: <mailman.4665.1727544415.6705.dng@???>
>
> From: nick <nick@???>
> Date: Sat, 28 Sep 2024 23:25:46 +1000
> > I am considering getting a standalone model.
>
> The OWON HDS2202S seems a good choice at present but new capabilities
> will continue to appear. https://www.youtube.com/watch?v=1UaankSg1YI
>
> > So anyway now I have two of these setups.
>
> Pass on to another motivated student or sell on eBay?
>
> > ... dont really have time atm ...
>
> We need 48 hour days. =8~)
>
> The real problems are unnecessary complexity and ongoing unnecessary
> changes. =8~\
>
> Regards, ... P.
>
> --
> VoIP: +1 604 670 0140
> work: https://en.wikibooks.org/wiki/User:PeterEasthope
>
>
>
> ------------------------------
>
> Message: 4
> Date: 28 Sep 2024 09:59:46 -0700
> From: peter@???
> To: dng@???
> Subject: [DNG] Re (2): unresponsive machine
> Message-ID: <mailman.4666.1727544415.6705.dng@???>
>
> From: Hendrik Boom <hendrik@???>
> Date: Sat, 28 Sep 2024 11:51:38 -0400
> > ... any process running firefox-esr.
>
> Treat the symptoms?
>
> The fundamental and correct solution is a system of standards of good
> practice for safety and security. How are catastrophic failures of
> bridges prevented? Not by having weak points patched up with ad hoc
> welding and reinforcing components. Rather, standards of good
> engineering practice are applied. Computer engineering also needs good
> practice enforced by laws and professional authorities. The present
> situation is disgraceful.
>
> Regards, ... P.
>
>
> --
> VoIP: +1 604 670 0140
> work: https://en.wikibooks.org/wiki/User:PeterEasthope
>
>
>
> ------------------------------
>
> Message: 5
> Date: Sat, 28 Sep 2024 12:22:26 -0500
> From: o1bigtenor <o1bigtenor@???>
> Cc: dng <dng@???>
> Subject: Re: [DNG] unresponsive machine
> Message-ID:
> <CAPpdf58_ZtZ4wL0RVCMiwQKsF-Cyh+gFSY+XvX6v-yvpKD86ew@???>
> Content-Type: text/plain; charset="utf-8"
>
> On Sat, Sep 28, 2024 at 7:41 AM nick <nick@???> wrote:
>
> > Well there could be many causes why you do not have the uptime you should
> > have.
> >
> > What about hardware? I have a desktop that crashes from time to time. It
> > used to be reliable but I had been coming to the conclusion that CPU, mobo
> > or one of the memory sticks had gone bad. This idea was supported by the
> > fact the machine sometimes fails POST. But lately I was in the BIOS setup
> > and I saw a warning that the 3.3V is low (only 2.9V). So when i have time
> > for troubleshooting ill check it with a multimeter and see about repairing
> > or replacing the PS. Not anxious to do that as its a pricey 1000W corsair
> > that i bought for a tidy sum on ebay but needs must. Anyway, I recommend
> > you run on a different but similar HW platform and see if your issue goes
> > away.
> >
> > Another thing that would concern me if running lots of heavy apps is
> > memory overcommit. Not wanting to start any flame wars but i distrust
> > present memory management strategies and the OOM killer fills me with
> > dread, even though i only encounter it very rarely and usually because I
> > did something silly like coding an unintentional fork bomb or whatnot. And
> > maybe if you ensure your system has adequate swap it might help the issue?
> >
> > Just throwing a few things out there. I am sure you can resolve the issue
> > eventually!!
> >
> >
> Not a bad idea - - - except I really don't think so.
>
> I'm running a:
> 1. AMD Ryzen 7 5800X 8-core proc
> 2. AMD Ellesmere 570X graphics card (it has 5 outputs!!!)
> 3. 64 GB of ram
> 4. NVME drive for the system
> 5. SSDs for /usr, /var, /usr/local, swap, /home
> 6. power supply was rated at 1500 or 1600 VA (which is total bs but that's
> another story!!)
>
> System is only somewhat over 2 years old.
>
> So - - - dunno if it's a hardware insufficiency although maybe - - - I might
> be taxing the graphics card too heavily (it's a 7640 x 3000 pixel screen on X11).
>
> back over to you
>
> ------------------------------
>
> Message: 6
> Date: Sat, 28 Sep 2024 12:26:17 -0500
> From: o1bigtenor <o1bigtenor@???>
> Cc: dng@???
> Subject: Re: [DNG] oscilloscope; was Dng Digest, Vol 120, Issue 53
> Message-ID:
> <CAPpdf59pG8QkGvmDGE7ROszsJKneNKua4mGz5nmrHf+beTeYYw@???>
> Content-Type: text/plain; charset="utf-8"
>
> On Sat, Sep 28, 2024 at 11:20 AM Peter via Dng <dng@???> wrote:
>
> > From: nick <nick@???>
> > Date: Sat, 28 Sep 2024 23:25:46 +1000
> > > I am considering getting a standalone model.
> >
> > The OWON HDS2202S seems a good choice at present but new capabilities
> > will continue to appear. https://www.youtube.com/watch?v=1UaankSg1YI
> >
> > > So anyway now I have two of these setups.
> >
> > Pass on to another motivated student or sell on eBay?
> >
> > > ... dont really have time atm ...
> >
> > We need 48 hour days. =8~)
> >
>
> Sounds about right - - - I would like 48 hr days with 7.5 hrs for sleep and
> a max of 2 to 3 hours not sharp out of the rest - - - - please(!?!?!?!???!!).
>
> >
> > The real problems are unnecessary complexity and ongoing unnecessary
> > changes. =8~\
> >
> Like the line goes - - - eschew unnecessary complexity - - - - (LOL)
>
> (Thanks for the chuckles!!)
>
> Over and out
>
> ------------------------------
>
> End of Dng Digest, Vol 120, Issue 55
> ************************************
>