I’ve been having an issue with my computer locking up roughly 10-30 minutes after it’s turned on. When it freezes, it doesn’t register any input: Caps Lock doesn’t change the lights on my keyboard, the fans keep running, and the sound just stops instead of rattling my headphones with a stuck tone like a regular BSOD.

I’m not entirely sure what could be causing it, other than that it’s a hardware issue. PC parts list below.

https://pcpartpicker.com/list/f2hPRV

The things I have tried are as follows:

Booting with both my old Windows 10 install and Linux Mint. Windows didn’t seem to record anything major in Event Viewer.

Swapping out my RAM sticks one at a time and running a check on the sticks themselves. Neither stick appears to be the issue.

Checked my CPU seat for any damaged pins or debris.

Made sure all plugs are properly installed and all drivers are installed.

Flashed the BIOS to both the recommended version and the most up-to-date version.

Set the CPU, GPU, and internal SSD to Gen 4 power in the BIOS.

I also ran a log check on Mint and the only error that it spat out was this:

[Dec 7 13:03] mce: [Hardware Error]: Machine check events logged
[ +0.000006] [Hardware Error]: Corrected error, no action required.
[ +0.000004] [Hardware Error]: CPU:0 (1a:44:0) MC14_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000602010b
[ +0.000008] [Hardware Error]: Error Addr: 0x000000000008bdc0
[ +0.000001] [Hardware Error]: IPID: 0x000700b020347000, Syndrome: 0x000000262a1f2603
[ +0.000003] [Hardware Error]: L3 Cache Ext. Error Code: 2
[ +0.000001] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: GEN

Looking this up seems to point to a fault in the GPU, but this is way beyond my skill level.

Thank you for any advice or tips for troubleshooting. I’m still within the return and replace period for my parts, so if this indicates anything needs to be replaced, that’s always an option.

Thank you all very much!

Edited for formatting

  • brucethemoose@lemmy.world
    8 days ago

    L3 MCE sounds like a CPU L3 cache error to me.
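For anyone who wants to check that reading against the raw number, here is a rough Python sketch that unpacks the MC14_STATUS value from the log. The bit positions are the ones I remember from the Linux kernel's MCA definitions for AMD, so treat them as an assumption rather than gospel, and `decode_mca_status` is just a name I made up for illustration.

```python
# Bit positions for AMD MCA_STATUS as I recall them from the Linux kernel
# headers (assumption: verify against your kernel or AMD's docs).
STATUS_BITS = {
    "Val": 63, "Over": 62, "UC": 61, "En": 60, "MiscV": 59, "AddrV": 58,
    "PCC": 57, "TCC": 55, "SyndV": 53, "CECC": 46, "UECC": 45,
    "Deferred": 44, "Poison": 43, "Scrub": 40,
}


def decode_mca_status(status: int) -> dict:
    """Split a 64-bit MCA_STATUS value into flags and error codes."""
    flags = [name for name, bit in STATUS_BITS.items() if (status >> bit) & 1]
    if "UC" in flags:
        severity = "uncorrected"
    elif "Deferred" in flags:
        severity = "deferred"
    else:
        severity = "corrected"
    return {
        "flags": flags,
        "severity": severity,
        # Bits 21:16 hold the extended error code, bits 15:0 the error code.
        "ext_error_code": (status >> 16) & 0x3F,
        "error_code": status & 0xFFFF,
    }


result = decode_mca_status(0xDC2040000602010B)
print(result["severity"], result["flags"], result["ext_error_code"])
```

With the value from the post this reports a corrected error with the Over, MiscV, AddrV, SyndV and CECC flags set and an extended error code of 2, which lines up with what the kernel printed. The Over flag suggests more than one error landed in that bank before it was read.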

    First thing I’d do is update your BIOS, if you haven’t already. It’s possible that newer versions include fixes for newer CPUs.

    Second, disable XMP/EXPO in the BIOS and try your memory at stock settings for a while. The BIOS automatically changes a bunch of other voltages “by default” when using fast RAM, sometimes for the worse. My ASRock B650E, for instance, sets a VSOC that’s just too high.

    You could try tweaking values yourself. It’s a deep rabbit hole. I’ve gone down it for my 7800X3D and can help… But let’s cross that bridge when we get to it.

    But, unfortunately, you might just have a bad X3D CPU :(

    But again, I emphasize, mess with BIOS settings first.


    I don’t know why you think it’s the 9070, but it does have L3 cache too.

    If you suspect it, download OCCT and run its graphics stress test at variable loads. It has an error checking feature that’s perfect for problems like this.

    • Sewerking@sh.itjust.worksOP
      8 days ago

      In my most embarrassing moment here, the GPU being bad was how a redditor interpreted the error. Reading this language is far above my pay grade, and I trusted reddit for that part.

      However that’s why I’m here too, to get more thoughts from a community that I’ve known to be more tech savvy than not. It seems that it’s pointing towards a CPU issue more than a GPU issue, but I also have a lot more tests to run thanks to the advice I’ve been given here. I thought I was at a dead end really but I should be able to get to a better conclusion after using these troubleshooting methods.

      • brucethemoose@lemmy.world
        8 days ago

        Yeah, you can absolutely narrow it down.

        If memtest86 (or similar) doesn’t error out, that is kind of interesting. Memory tests tend to show CPU errors too.


        I do emphatically recommend OCCT though. It has an error-checking CPU benchmark you can customize to (for example) hit L3 hard.


        One other project I would recommend is CoreCycler:

        https://github.com/sp00n/CoreCycler

        You can customize it to cycle through each core, and like OCCT, leave a log of what core it was testing if your system crashes. But I like it for its auto curve optimizer functionality, where it will basically auto undervolt each of your cores individually and find the threshold they don’t crash at. Similarly, it might just so happen that you have a “dud” core, and it’d be very good at finding it.
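If you end up collecting kernel logs across a few test runs on Mint, a small script can tally which CPU and bank keep showing up. This is only a sketch: the regex assumes lines shaped like the MC14_STATUS line in the original post, `tally_mce` is a made-up helper name, and the CPU number is just whatever the kernel logged, so it may not pinpoint a single core for a shared cache.

```python
import re
from collections import Counter

# Assumed line shape, copied from the post:
# "... CPU:0 (1a:44:0) MC14_STATUS[Over|CE|MiscV|...]: 0xdc2040000602010b"
MCE_LINE = re.compile(
    r"CPU:(?P<cpu>\d+) \([0-9a-f:]+\) MC(?P<bank>\d+)_STATUS\[(?P<flags>[^\]]*)\]"
)


def tally_mce(log_text: str) -> Counter:
    """Count machine check lines per (cpu, bank, severity) tuple."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MCE_LINE.search(line)
        if match is None:
            continue
        fields = match["flags"].split("|")
        # The second field is the severity: "CE", "UE" or "DE".
        severity = fields[1] if len(fields) > 1 else "?"
        counts[(int(match["cpu"]), int(match["bank"]), severity)] += 1
    return counts


sample = (
    "[ +0.000004] [Hardware Error]: CPU:0 (1a:44:0) "
    "MC14_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000602010b\n"
    "[ +0.000003] [Hardware Error]: L3 Cache Ext. Error Code: 2\n"
)
for key, count in tally_mce(sample).most_common():
    print(key, count)
```

Feed it saved kernel log text, for example the output of `journalctl -k` or `dmesg`. If the same bank keeps reporting corrected errors no matter which core is under load, that would point more toward the cache or the memory settings than toward one weak core.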