About a month ago, I built a new computer (specs below), buying new parts for everything but re-using my existing video card. After setup, the computer worked perfectly fine for about two weeks, at which point it started intermittently losing power and restarting. After each reboot, the Windows event log would show Error 41 with an error code value of zero, seemingly indicating that the system did, in fact, lose power. As we'll see below, these restarts seem to be mostly random.
At that point, I began a long list of hardware and software changes and tests in an attempt to isolate the problem, so far with no success (see below). So, I'm putting the question to you: what's going on here? What should I try next?
PCPartPicker Part List
CPU: AMD Ryzen 5 3600X 3.8 GHz 6-Core Processor ($234.99 @ B&H)
Motherboard: Gigabyte B550 AORUS PRO AC ATX AM4 Motherboard ($179.00 @ Amazon)
Memory: G.Skill Ripjaws V Series 32 GB (2 x 16 GB) DDR4-3200 CL16 Memory ($109.99 @ Newegg)
Storage: Western Digital Blue 1 TB M.2-2280 Solid State Drive ($99.99 @ Newegg)
Video Card: MSI GeForce GTX 970 4 GB Twin Frozr V Video Card (this is the existing video card, known to work fine in the previous system)
Case: NZXT H510 ATX Mid Tower Case ($69.98 @ Amazon)
Power Supply: Corsair RM (2019) 750 W 80+ Gold Certified Fully Modular ATX Power Supply ($124.99 @ Corsair)
Prices include shipping, taxes, and discounts when available
Generated by PCPartPicker 2020-10-24 17:11 EDT-0400
Initially, a fresh install of Windows 10 Professional was running on the system.
Here are the steps I have taken to diagnose the problem so far:
1. Thinking this was related to an RGB light I added to the case recently, I unplugged that from the motherboard. Reboots persisted.
2. Wondering if this was caused by overheating, I a) Installed OpenHardwareMonitor and observed that no temperatures in the system were very high, and b) turned the system off for a while; on restarting, the issue occurred relatively soon and under relatively low load. Conclusion: not caused by overheating.
3. To rule out memory being the issue, I removed one of the memory sticks from the system and observed the issue still happening. Then I swapped the memory sticks (so now the system was only using the other memory stick) and again observed the issue still happening. Conclusion: not caused by memory.
4. I wanted to rule out software issues before replacing parts, so at this point I reinstalled Windows 10 from scratch. I could not complete the installation without the system constantly restarting. From this point forward, the system did not have a working Windows installation, and my benchmark for "things work now" has basically become whether I can finish installing Windows. Note that the restarts are still more-or-less random -- they don't happen at the same point in the installation process every time.
5. I have three HDs in the system: the NVMe above, a 250GB SSD, and a 1TB spinner. I removed the latter two from the system, just to see. No improvement.
6. Again just to see, I switched from one surge protector to a different one. No improvement.
7. I tried reseating all the connectors on the motherboard, with a focus on the power connectors (including the power supply side of those cables). No improvement.
At this point, I strongly suspect a faulty power supply as the issue -- after all, the system is losing power. So, I order a replacement: a new copy of the exact same power supply (CORSAIR RMx Series RM750x). While I'm waiting, I try one more thing:
8. I remove the NVMe SSD and plug in just the SATA SSD, and again try to install onto that drive. No improvement.
9. The new power supply arrives. I replace the power supply, leaving the existing cables in place (because I don't want to have to re-cable all that!). I try to install Windows again, and it is still restarting.
10. Maybe it is the power supply cables? I replace all the cables with the cables from the new power supply. No improvement.
11. At this point, I realize there's one more software-ish thing I haven't done yet. I download the most recent BIOS version for the motherboard, and flash the BIOS to this version. No improvement.
At this point, I believe I have ruled out the PSU, RAM, HDDs, and any software causes. Remaining hardware components: the motherboard, the CPU, the video card. I feel like of those components, the motherboard is by far the most likely, so I order a new motherboard to replace the existing one: MSI MPG B550 GAMING CARBON WIFI AM4 AMD B550. At the same time, I order a new, higher-quality surge protector, because I'm wondering whether the crappy one I have somehow caused a problem.
12. The new motherboard arrives. I do all the work to swap in the new motherboard, install the CPU, heatsink, and all the other components. I'm feeling really good at this point that it's going to work, so I install all three HDDs, connect it to the new surge protector, and completely seal up the case. I boot up again, and it's the exact same thing! The system loses power and restarts partway through the Windows install process.
13. Now I'm wondering if the power really is unstable in my house, or something (this seems unlikely -- my old computer worked fine!). So I try the system on a new outlet, and then on an outlet in another room (and on another breaker). Both times it has the same issue.
At this point, the remaining theories I have: A) the CPU; B) the video card; C) the power is unstable somehow in a way that step 13 didn't cover; D) some of the miscellaneous peripherals in the system are somehow causing a problem (the case power cable? the USB mouse? this all sounds ridiculous to me); E) there's something about the version of Windows I was running / installing causing this. Most of these don't seem very likely to me.
14. I haul my old computer up from the basement and install the video card and my SATA SSD back into it -- it's now a full set of hardware. I install Windows from the same USB stick, with the computer plugged into the same surge protector and power outlet. It works fine! I think this mostly rules out B), C) and E) from above.
And that's where I'm at. I believe I need to replace the CPU now, but I've already replaced two parts without luck and I really don't want to return a third part that's actually working just fine.
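The elimination so far can be written down as a small Python sketch. This is only an illustration under assumptions: the component labels and the `remaining_suspects` helper are made up for this post, and it treats every swap or cross-test that didn't change the behaviour as fully clearing that part, which real hardware doesn't guarantee.

```python
# Hypothetical labels for the parts in this build; not tied to any real tool.
COMPONENTS = {
    "psu", "psu_cables", "ram", "storage", "motherboard",
    "cpu", "gpu", "wall_power", "os_image",
}

# Each entry: (step number from the list above, parts that step appears to clear).
# Assumption: a swap or cross-test that did not change the behaviour clears the part.
CLEARED_BY_STEP = [
    (3, {"ram"}),          # each stick tested alone, still restarting
    (5, {"storage"}),      # extra drives removed
    (8, {"storage"}),      # NVMe swapped for the SATA SSD
    (9, {"psu"}),          # new power supply
    (10, {"psu_cables"}),  # new power supply cables
    (12, {"motherboard"}), # new motherboard
    (13, {"wall_power"}),  # other outlets and another breaker
    (14, {"gpu", "wall_power", "os_image"}),  # old PC fine with same GPU, outlet, USB stick
]


def remaining_suspects(components, cleared_by_step):
    """Return the parts not cleared by any step, sorted for stable output."""
    cleared = set()
    for _step, parts in cleared_by_step:
        cleared |= parts
    return sorted(components - cleared)


if __name__ == "__main__":
    print(remaining_suspects(COMPONENTS, CLEARED_BY_STEP))  # prints ['cpu']
```

With the steps as recorded, only the CPU is left. The weakest entry is probably the RAM one; dropping step 3 from the list would leave both `cpu` and `ram` as suspects.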
The other random idea I have is to try the _new_ power supply on the _old_ computer, and see what happens, but that is a lot of work to wire up. I also want to try testing the RAM some more, I guess.
If anyone has had the patience to read this long post, can you see anything I'm missing? Is there something else I should be trying here? Any insight would be much appreciated.
|# ? Oct 24, 2020 22:01|
|# ? Nov 24, 2020 10:22|
So my guess here is that it's some combo of things that aren't co-operating. I'd say let's simplify things.
Do you have another GPU you can beg/borrow/steal to use? That's one of my complaints about the Ryzen X line, usually no onboard GPU to test with. My guess here is that maybe something with your GPU is not playing well with your MB. Does it restart while in BIOS? Maybe leave it there for an hour or so and see if it's stable.
If you can get another GPU, try with 1 RAM stick, and just 1 SSD with the new GPU. Then find a young priest and an old priest I guess.
|# ? Oct 27, 2020 01:43|
Just to follow up on this, replacing the CPU seems to have 100% resolved the issue.
|# ? Oct 31, 2020 14:48|
I had the exact same issue with a 4-month-old build, and for me the cause was that the BIOS version I was using was unstable with XMP on. XMP off worked fine with no sudden power losses, and eventually an updated BIOS came out that solved the issue.
|# ? Nov 11, 2020 19:54|