  • Locked thread
PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
Problem description: Attempting to boot into Windows 10 results in one of two errors: WHEA UNCORRECTABLE ERROR or MACHINE CHECK EXCEPTION

Attempted fixes: -Disconnected all non essential components to minimize PSU load; no change, voltages are normal
-Checked heat
-Disconnect GPU and use only onboard graphics; no change
-Attempted to boot into the Linux-based Intel Processor Diagnostic Tool from a USB drive - failed to boot (alternates between MACHINE CHECK error and FAILURE TO SYNC)
-Tested each stick of RAM individually; no change, failure to boot
-Attempt to enter recovery mode in Windows 10 to run CHKDSK - fails with same error as listed in problem description
-Reset BIOS, disabled any overclocking settings

-(Plan to attempt tonight) Boot into my system drive from another Windows machine to see if it loads, further ruling out the OS

Recent changes: New Windows Insider Preview update half a week ago; an old USB hub stopped working (it would play the disconnect sound repeatedly) but everything else was fine

Operating system: Windows 10 64-bit


System specs: i7 6700k, Gigabyte Motherboard (z170XP-SLI), 16gb RAM, GTX 970, Intel SSD as my boot drive, 3TB HD for media storage, 500W PSU Corsair CX500M

Location: Canada

I have Googled and read the FAQ: Yes.


How certain can I be this is a faulty CPU? Is there something else it could be?

PoizenJam fucked around with this message at 01:08 on Apr 18, 2016

Kazinsal
Dec 13, 2011



Machine check exceptions are usually nasty CPU problems, though it can also be a voltage issue. How old is your power supply? Was it a pack-in with your case?

Unfortunately, MCEs are one of the hardest things to diagnose because the codes for them are inconsistent between platforms at best and completely undocumented at worst, and they're symptoms of something going wrong inside the processor with no real indication as to the cause.

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
Whole system is new, purchased all at once back in the fall.

I have a Corsair CX500M PSU and a Define R5 case.

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
I just flashed my BIOS to newest version, no change from that either.

Edit: Welp, doesn't matter now. I removed the CPU to get the info I needed for a warranty replacement, and I broke a pin in the socket. So my MOBO definitely has to be replaced now...

PoizenJam fucked around with this message at 22:03 on Apr 15, 2016

Zogo
Jul 29, 2003

edit: ^^ :doh:

JVNO posted:

-Reset BIOS, disabled any overclocking settings

Did you clear CMOS using the motherboard jumper?

If you did I'd probably try booting into a Linux OS through USB with all the HDs disconnected and see if that worked.

Also, try onboard video if your motherboard has it.


If none of that works I'd put the motherboard on a nonconductive surface and use a paperclip/key to bridge the motherboard power pins to turn it on. That'd eliminate the case/power button as an issue.


JVNO posted:

How certain can I be this is a faulty CPU? Is there something else it could be?

Unlikely if the computer has been running okay for a few weeks or more.

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
The Intel Processor Diagnostic Tool runs from a Linux bootable USB. I tried using it, and it also threw a Machine Check Exception error. And I mentioned in the OP I physically removed my graphics card and all non-essential parts, switched to on board defaults, and still encountered the error.

So I popped my SSD out and I'm going to use my mother's laptop to try and boot from it, just to check the system installation. I suspect there's nothing wrong with the windows install, however, since the Machine Check Exception happens with bootable media too.

And yeah I don't know how I hosed that up... $180 motherboard gone... Been building systems for over a decade and this is the first time I hosed something up physically. gently caress. Not exactly something I can RMA.

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
Further updates:

I booted into the hard drive from another computer, using an adapter. This resulted in an automatic CHKDSK that found literally thousands and thousands of small errors, followed by a smooth boot with everything working fine.

I'm left wondering if these errors accrued during the night it spent blue screening and restarting repeatedly before I discovered the error, or if these errors were the source of the original problem. If it's the latter I've unfortunately broken a motherboard for literally no reason. On the plus side, I can be up and running again as soon as I replace the MOBO.

If instead the CPU was the source of the problem, would I risk damaging the new MOBO by testing it out?

Zogo
Jul 29, 2003

JVNO posted:

If instead the CPU was the source of the problem, would I risk damaging the new MOBO by testing it out?

No, using a dying PSU is usually the thing that will destroy motherboards and other components.

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
My voltages, temps, and load are pretty good/stable so I don't think PSU is a worry. Thanks.

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
I replaced my broken motherboard. I am still getting WHEA Uncorrectable Error and Machine Check Exception when I try to boot into: (1) my boot drive (that works fine on other systems), (2) a Windows bootable USB (that also works fine on other systems), and (3) the bootable Linux-based Intel Processor Diagnostic Tool on USB

Note this is just with the 2 different Motherboards, a single stick of RAM, the onboard video card, and only the bare hookups, with the default and updated BIOS on each, at base clock...

The only other thing it could possibly be at this point is the power supply; however, using the bare-bones system my voltages seem healthy: VCORE 1.272 V, VCCSA 1.068 V, DRAM 1.212 V, 3.3V 3.304 V, 5V 5.100 V, 12V 12.0124 V. Temperature is normal for system and CPU (high 30s to low 40s Celsius). I even tried raising voltage at base clock speed; no change.
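For anyone wanting to sanity-check rail readings like these, here is a rough sketch in Python. It assumes the usual ATX tolerance of plus or minus 5% on the +3.3V, +5V and +12V rails, and it assumes the garbled 12V reading in the post is 12.0124 V. The function name `check_rails` is made up for illustration. Keep in mind that readings in spec only mean the reported numbers look fine; they say nothing about whether the sensor itself is accurate or whether the rail sags under transient load.

```python
# Nominal rail voltage -> measured value, taken from the readings in the post.
# The 12V figure is an assumption (the post's value was garbled).
READINGS = {3.3: 3.304, 5.0: 5.100, 12.0: 12.0124}

# Assumed tolerance: ATX allows +/-5% on the positive main rails.
TOLERANCE = 0.05


def check_rails(readings, tolerance=TOLERANCE):
    """Return {nominal: (relative deviation, within tolerance?)} per rail."""
    results = {}
    for nominal, measured in readings.items():
        deviation = (measured - nominal) / nominal
        results[nominal] = (deviation, abs(deviation) <= tolerance)
    return results


if __name__ == "__main__":
    for nominal, (deviation, ok) in check_rails(READINGS).items():
        status = "OK" if ok else "OUT OF SPEC"
        print(f"{nominal:>4}V rail: {deviation:+.2%} {status}")
```

With the numbers above, all three rails land well inside the 5% window, which matches the "voltages seem healthy" read but does not by itself clear the PSU.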

I'm ready to concede it must be the processor. I cannot think of anything I haven't tried, but I welcome suggestions

PoizenJam fucked around with this message at 01:07 on Apr 18, 2016

Zogo
Jul 29, 2003

JVNO posted:

I'm ready to concede it must be the processor. I cannot think of anything I haven't tried, but I welcome suggestions

I'd assume PSU/RAM/(even a case power button/interference kind of issue) before CPU but you could have one of the really rare CPUs that fails after working initially.

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
If it were RAM, it would have to be both sticks going bad simultaneously (I have 2x8GB - and have tried booting individually with both) which seems exceedingly unlikely even for a matched pair. As for a PSU- if it won't boot under minimal load, and there's nothing wonky about the voltages or temperatures, what other signs would there be?

Either way, Intel is sending a replacement so I'll find out soon.

Zogo
Jul 29, 2003

JVNO posted:

If it were RAM, it would have to be both sticks going bad simultaneously (I have 2x8GB - and have tried booting individually with both) which seems exceedingly unlikely even for a matched pair.

Yes, it's unlikely but totally possible.

JVNO posted:

As for a PSU- if it won't boot under minimal load, and there's nothing wonky about the voltages or temperatures, what other signs would there be?

Are you using a multimeter or getting those readings from the BIOS? Gigabyte is the #1 offender for displaying inaccuracies in the BIOS.

There are many components inside the PSU that can fail (bulging electrolytic capacitors, fan failure which could cause damage to the output rectifier and/or switching transistors.) When something like this happens the PSU will likely start sending out the wrong electric current/voltage.

One of the big problems in trying to perfectly diagnose a failing/dying PSU is that sometimes issues appear and disappear sporadically. I've seen individual wires fail and then start working again. e.g. PSUs working enough to keep a system running but working only a fraction of the time on POST etc.


PS the Corsair CX line doesn't have the greatest reputation. It should work but I'd never recommend using a CX500M with a newer GPU.

Zogo fucked around with this message at 06:28 on Apr 19, 2016

PoizenJam
Dec 2, 2006

Damn!!!
It's PoizenJam!!!
Good news- I was indeed one of those rare people who have a processor that spontaneously fails. Intel expedited an RMA yesterday, and I received the replacement an hour ago. Popped it in and all my problems were solved!
