ToiletDuckie
Feb 18, 2006
Problem description: Since building this PC in January 2014, it's been experiencing sporadic crashes. Initially, these were completely random BSODs that would happen while gaming, browsing, or sometimes just by leaving the PC on overnight. Updating all drivers (particularly chipset) seemed to fix those crashes, but I still experience random crashes in games. Adobe Flash Player will also randomly crash out while viewing Twitch.tv. Neither of these issues cripples the PC. The games will crash to desktop but the PC seems stable otherwise. Flash Player crashes can be resolved by just refreshing the page. The PC will stay up and running (from an OS/browsing/text editing/etc. perspective) for however long it takes for the next arbitrary Windows Update related reboot to occur, at least. However, it's getting frustrating that I never know how long a game will stay up and running, so I figure I should ask for help. For reference, here are the kinds of behavior I'm seeing:

I can play a game like Sanctum 2 for an hour before it hangs. When it hangs, oddly, the game doesn't crash. I just lose video and the game refuses to respond for a minute before coming back. I can play another ~30 minutes to an hour before it happens again.

Battlefield 4 will sometimes decide it wants to crash on "Joining Game" every time I try. That might be Origin related, though. Other times, it'll load just fine and I can play for 1-5+ hours without crashing. When it crashes, the game dies but my PC keeps running and I can just rejoin.

When I was playing Wolfenstein: New Order, it'd crash to desktop after 15-30 minutes every single time. I bumped up to the latest beta ATI drivers and that dropped the crashes down to once every 2-3 hours.

To the best of my knowledge, there's a subset of games that haven't really crashed on me or maybe crashed once in an entire playthrough: Darkness II, Saints Row IV, Rogue Legacy, Space Run, and Killer is Dead are the more recent ones.
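One way to look for a pattern in behaviour like this is to log each session and compare time-to-crash per game and driver version. The sketch below is only an illustration: the numbers are rough endpoints of the ranges described above, not measurements, and the field layout and the function name `median_minutes_to_crash` are made up for the example.

```python
from collections import defaultdict
from statistics import median

# (game, driver label, minutes played before the crash or hang)
# Illustrative figures only, taken from the rough ranges in the post.
sessions = [
    ("Sanctum 2", "unspecified", 60),
    ("Sanctum 2", "unspecified", 30),
    ("Wolfenstein", "previous", 15),
    ("Wolfenstein", "previous", 30),
    ("Wolfenstein", "beta", 120),
    ("Wolfenstein", "beta", 180),
]


def median_minutes_to_crash(log):
    """Group sessions by (game, driver) and return the median minutes to crash."""
    grouped = defaultdict(list)
    for game, driver, minutes in log:
        grouped[(game, driver)].append(minutes)
    return {key: median(values) for key, values in grouped.items()}


if __name__ == "__main__":
    for (game, driver), minutes in sorted(median_minutes_to_crash(sessions).items()):
        print(f"{game} ({driver} driver): median {minutes} min to crash")
```

A log like this won't identify the cause on its own, but it makes it easier to tell whether a change (a driver update, a BIOS setting) actually moved the numbers or whether a quiet evening was just luck.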

At any rate, I'm stumped. It doesn't appear to be heat related because I've never had an overheat warning and the times I've monitored temperatures they're still stable under 100% load. It could be driver related since updating those initially solved the majority of the major problems I've had, but I've got everything at its latest version now. I almost want to be bitter and say it's just certain games' Windows 8.1 compatibility (or my lack of familiarity with tweaking it), but that'd be unfair.

Attempted fixes: Updated all motherboard, chipset, and device drivers to the latest available. Updated the motherboard BIOS version. Removed the factory OC on my GPU. Removed the arbitrary OC on my CPU that MSI was attempting to do automatically.

I've tried running Memtest86 overnight before. That produced no errors. I've tried running Prime95 overnight. That produced no errors. Today, I got annoyed and tried the Furmark burn in test, which was 15 minutes @99% GPU/94% Fan/~87 Celsius maximum temperature. I repeated the test with Dynamic Camera / PostFX for kicks, same results.

Recent changes: The only major change made since building the PC is updating the motherboard chipset drivers and BIOS. That resolved the BSODs I was getting. Now if it crashes it just crashes to desktop.

--

Operating system: Windows 8.1 Pro 64 bit

System specs: Home built
CPU: Intel Core i5-4670K Haswell Quad-Core 3.4GHz LGA 1150 84W Desktop Processor
RAM: CORSAIR Vengeance 16GB (2 x 8GB) 240-Pin DDR3 SDRAM DDR3 1600 (PC3 12800) Desktop Memory Model CMZ16GX3M2A1600C10B
PSU: CORSAIR HX series HX650 650W ATX12V v2.2 / EPS12V 2.91 SLI 80 PLUS GOLD Certified Modular Active PFC Power Supply
Mobo: ASUS Z87-PRO (V EDITION) LGA 1150 Intel Z87 HDMI SATA 6Gb/s USB 3.0 ATX Intel Motherboard
HD: SAMSUNG 840 EVO MZ-7TE500BW 2.5" 500GB SATA III TLC Internal Solid State Drive (SSD)
GPU: XFX Double D FX-795A-TDJC Radeon HD 7950 3GB 384-Bit GDDR5 PCI Express 3.0

One thing to note: The GPU is a leftover from my previous PC. It never gave me any trouble in that one, though.

Location: US

I have Googled and read the FAQ: Yes

ToiletDuckie fucked around with this message at 02:59 on Aug 19, 2014

Factory Factory
Mar 19, 2010

This is what
Arcane Velocity was like.
Oddball and inconsistent problems like these are the worst. Do you still have the power supply from your old system? Can you test the new hardware with it? Failing power supplies can do weird things.

Zarc
Jul 25, 2014
Was the previous system the graphics card was in running Windows 8.1 as well? If not, then my first guess would be sucky graphics card drivers. When I used to use an ATI card, I had a lot of problems with horrible drivers that had the same kind of symptoms. (ATI forums were filled with very upset people at the time... haven't checked recently.)

ToiletDuckie
Feb 18, 2006
Thanks for the replies.

Factory Factory posted:

Oddball and inconsistent problems like these are the worst. Do you still have the power supply from your old system? Can you test the new hardware with it? Failing power supplies can do weird things.

I don't think so. It was a Corsair TX650 that never gave me any trouble. I think I recycled it a month or two after building this PC since, well, I assumed the crashing was memory or video card related and the PSU was ~5 years old. Since the HX650 was new when I purchased it, I'd assume that it wouldn't be failing. Even if it was, I'd expect more crashes over time as its performance degrades.

What I can try doing is removing all miscellaneous peripherals (2-3 game controllers, extra USB/Ethernet adapter, etc.) to see if a lowered power draw changes things. I might try that for a week and see how things go.

Zarc posted:

Was the previous system the graphics card was in running Windows 8.1 as well? If not, then my first guess would be sucky graphics card drivers. When I used to use an ATI card, I had a lot of problems with horrible drivers that had the same kind of symptoms. (ATI forums were filled with very upset people at the time... haven't checked recently.)

Well, I was using Windows 7 Pro in my previous system. I still had the same 7950 for at least half a year in that machine (probably an entire year). It ran in there with the factory OC no problem. The limiting factor in my old PC was the i5 ...750? Whatever the first gen i5 processor was.

As for drivers, I updated yesterday to the latest beta drivers for the card. I played BF4 for ~2 hours without a crash, but that's not particularly unusual. I'll give it some time and see how things go.

Edit: Well, unplugging my extra controllers didn't stop a random crash to desktop while playing Deponia of all things (after approximately an hour of playing). Deponia. 2D point and click adventures can cause crashes? It's clearly not GPU stress related.

ToiletDuckie fucked around with this message at 05:02 on Aug 19, 2014

Zarc
Jul 25, 2014

ToiletDuckie posted:

Edit: Well, unplugging my extra controllers didn't stop a random crash to desktop while playing Deponia of all things (after approximately an hour of playing). Deponia. 2D point and click adventures can cause crashes? It's clearly not GPU stress related.

My guess would still be the drivers. The last time I had the issue, it was in a 3D game (City of Heroes), though the screen was mostly static, so the graphics card wasn't under heavy load. In my experience, ATI has a bit of a history of boom/bust cycles with their graphics drivers.

Factory Factory
Mar 19, 2010

ToiletDuckie posted:

Since the HX650 was new when I purchased it, I'd assume that it wouldn't be failing. Even if it was, I'd expect more crashes over time as its performance degrades.

The Corsair TX and HX power supplies have a particular really-lovely failure mode in which they quietly hand out power that looks fine on a multimeter but is actually dirty and out-of-spec if you check with a $1,000 oscilloscope nobody has. I had a TX750 once that quietly murdered 11 hard drives in a year before I wised up and replaced it. Rest of the system worked fine, everything seemed right. But some brands of hard drives would last months at most, weeks at the least. Every single hard drive ever hooked up to that PSU kicked the bucket within a year and a half. Change the PSU, and suddenly no more murdered hard drives.

It is worth testing with a different power supply. There is no such thing as "Well, it's brand new and from a good manufacturer, so it can't be failing." poo poo happens. Bad units make it out of the factory.

If you can't swipe a PSU from another machine temporarily, here's what I would do (and did): Buy a new PSU, make the swap. If that fixes the problem, RMA the old unit for a warranty replacement (Corsair is excellent here) and sell the shiny new replacement to recoup costs.

If it doesn't fix the problem, return the new unit to the store and re-install the old unit.

ToiletDuckie
Feb 18, 2006

Factory Factory posted:

It is worth testing with a different power supply. There is no such thing as "Well, it's brand new and from a good manufacturer, so it can't be failing." poo poo happens. Bad units make it out of the factory.

I may have to try this. For kicks, I tried changing settings in BIOS from what I had to "Optimized Defaults". That lasted ~20 minutes on a CPU/GPU/Power stress test with no issues, but legitimately BSOD'd my PC while watching a Twitch stream (with nothing else running) ~2 hours later. "Kernel Floor Error" or some such, but before I fixed the BSODs previously I saw all sorts of random BSOD messages implying IO, USB, etc.

Something's definitely off. Considering a BIOS change made the issue worse (BSOD vs. simple app crash) I have to assume that it's a RAM, Power, or CPU issue. Whatever's broken is clearly not completely defective. It's just defective enough to ruin things sporadically... I might try doing things the hard way by removing one stick of RAM at a time, swapping them on BSOD, etc. to rule that out. Alternatively, I could just take your advice and order a new PSU to see how things go. Maybe swapping what rail is plugged in where could help?

Edit: One other thing I just thought of related to power: I have a ~5-6 year old CyberPower 1350AVR that's lasted through a few blackouts/power surges/etc. Is it possible that it's slightly worn out or otherwise incapable of providing 100% stable power? I guess it wouldn't hurt to try plugging the PC directly into the wall (during sunny weather) to see how things go?

ToiletDuckie fucked around with this message at 06:03 on Aug 23, 2014

Factory Factory
Mar 19, 2010

Power and computers can be weird. But the UPS almost definitely isn't affecting the computer electronics, because the power supply has a "wall side" and a "PC side" that exchange energy via a transformer but do not exchange current. Dirty power from the wall could harm and eventually burn out part of the PSU, but it wouldn't directly affect the PC-side components unless the PSU were already failing at its job.

The rail swap idea isn't a bad one, but it doesn't apply - it's a single-rail design.

So, RAM, PSU, CPU.

CPUs are generally rock solid - if it's not DOA and it's not fried by a power surge, CPUs generally are the single most reliable piece of hardware in a system, so rule that out. Check the others first.

The randomness can indicate either RAM or PSU. The fact that you've gone through memtest and prime95 testing without turning up problems suggests against RAM, because those are exactly the tests for RAM problems. So of those three, that leaves the PSU.

There are some other possible answers - a subtly failing SSD can introduce random weirdness, as well as a driver that is dealing with subtly-failing hardware. Expansion cards, spare hard drives, USB peripherals - all of them could cause really wonky behavior that would clear up immediately if they were removed. Heck, even a bad video cable can cascade into OS instability.

So here's something you can do before you replace the power supply: Strip down the system as far as you can and try it with as few things plugged in as possible. Even try a different keyboard and mouse if you can. See if it still crashes then. If so, I'd edge towards the power supply. If not, start plugging things back in one by one until the system starts crashing again.
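The strip-down-and-reattach procedure above can be sketched as a simple loop. This is only an illustration: `find_culprit`, `BASE_SYSTEM`, and the `is_stable` callback are hypothetical names, and in practice "is it stable?" means a person playing a game for a few hours with that set of parts attached, not a function call.

```python
from typing import Callable, List, Optional, Sequence

# Sentinel returned when even the stripped-down system is unstable.
BASE_SYSTEM = "<stripped-down system>"


def find_culprit(
    components: Sequence[str],
    is_stable: Callable[[List[str]], bool],
) -> Optional[str]:
    """Re-attach components one at a time.

    Returns the first component whose addition makes the system unstable,
    BASE_SYSTEM if it already crashes with nothing extra attached, or
    None if everything can be re-attached without a crash.
    """
    attached: List[str] = []
    if not is_stable(attached):
        # Still crashing when stripped down: suspect core parts such as the PSU.
        return BASE_SYSTEM
    for part in components:
        attached.append(part)
        if not is_stable(attached):
            return part
    return None


if __name__ == "__main__":
    parts = ["game controller", "USB/Ethernet adapter", "second monitor"]
    # Pretend, for the sake of the example, that the adapter is the problem.
    print(find_culprit(parts, lambda a: "USB/Ethernet adapter" not in a))
```

With sporadic crashes, each `is_stable` check needs to run long enough to be convincing; an hour without a crash proves very little when the system sometimes runs for five.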

ToiletDuckie
Feb 18, 2006

Factory Factory posted:

So here's something you can do before you replace the power supply: Strip down the system as far as you can and try it with as few things plugged in as possible. Even try a different keyboard and mouse if you can. See if it still crashes then. If so, I'd edge towards the power supply. If not, start plugging things back in one by one until the system starts crashing again.

Yeah, I'm getting close to being down to this option. While checking to see if I had a spare PSU, I found my old motherboard/RAM/CPU. I tried that RAM (2x4GB) and while the stability test was promising (lasted an hour with no errors @ 100% CPU / RAM usage) I still got an app crash after ~1 hour of playing Just Cause 2.

So, I think the CPU/RAM (either pair) are fine. It's either PSU or drivers at this point I guess. It could be the GPU, too, since the only things crashing are games, but I'd expect some artifacting or repeatable crashing behavior if it was the hardware dying.

Then again, the fact that my OS is seemingly rock solid ("only" games are crashing at this point and even the rare Flash Player crash is fixed with a page refresh) makes me think it's PSU or drivers. I don't know how to interpret the fact that even "low stress" games will crash, though. Is there such a thing as a "GPU only" test? It'd be nice if I could max out the GPU while not using the rest of the system so I could possibly rule out the PSU being the culprit.

Edit: For kicks, I downloaded the Unigine Heaven benchmark and left it running for an hour on my main screen while I had a Twitch stream on my second monitor. Despite hammering the GPU, the temperature was never higher than 75C and nothing crashed. This makes me think: what's the difference between the benchmarks/stress tests that don't appear to fail and games? Input... time to unplug everything.

ToiletDuckie fucked around with this message at 17:28 on Aug 24, 2014

Factory Factory
Mar 19, 2010

Unigine Heaven is good. Also try OCCT (which has multiple tests) and Furmark.

Beware of Furmark for multiple reasons, though. It loads a card INCREDIBLY heavily, and your system will draw more electricity running that than it probably ever has before or ever will after. But because of this, graphics drivers are now recognizing it as a "power virus" and throttling Furmark, so it may not be the kind of heavy-duty stress test you need after all.

But even all those tools aren't sure bets. Different games poke and prod different parts of the GPU. I heard this firsthand from a guy who overclocked his video card to the utmost limits, totally stable in the likes of Metro, Heaven, Furmark, and - generally - DirectX 10 and 11 stuff. But start up Starcraft 2, a DX9 game, and it crashes almost instantly.

Factory Factory fucked around with this message at 20:42 on Aug 24, 2014

ToiletDuckie
Feb 18, 2006
Well, I guess I can rule out peripherals. I tried unplugging everything except the mouse/keyboard. Still crashed in BF4 after ~1.5 hours. Tried swapping my normal mouse/keyboard for the $5 Walmart ones I have for my work laptop: BF4 crashed within ~30 minutes. Well, when I say "crash" I mean the screen freezes while the game keeps running from a processor/RAM utilization perspective and it has to be forced to close...

So, that leaves GPU, PSU, and drivers. I'm going to do a full uninstall/reinstall of the video drivers and see if that helps. Ugh.

Regarding tools: Yeah, I used OCCT on PSU test mode to test the RAM when I installed it earlier. Unigine afterward for an hour to test the GPU. Needless to say, 1 hour of stability in each tool didn't mean squat. Same problems...

ToiletDuckie
Feb 18, 2006
I think I figured it out. I updated BIOS (again) and completely uninstalled and reinstalled the ATI drivers. I also noticed a power setting for letting USB devices power down when unused. I turned that off just to see if it helped. I also noticed that, for whatever reason, I had ~6-7 Audio Devices after reinstalling the ATI drivers. It's almost like their driver cleaner doesn't recognize the HDMI Audio "devices" that it creates even when you tell it not to install the HDMI audio portion of the drivers?

Either way, some combination of the above updates seems to have corrected my problems. I was able to play Sunday afternoon without crashing as well as ~2 hours tonight without problems. I'm leaning towards it being either the wonky ATI driver uninstall/fake audio components or the USB power setting since the BIOS update was just described as "increased performance" vs. the previous one being "increased stability". Either way, I'm going to leave my configuration alone for a day or two and see how things go.

Thanks for the suggestions. If I'd checked power settings/tried reinstalling the drivers earlier I might've avoided some of these headaches.
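For anyone wanting to try the USB power setting mentioned above from a script, here is a rough sketch. It assumes the setting in question is "USB selective suspend" in the active power plan (it may instead have been the per-device "allow the computer to turn off this device" checkbox in Device Manager, which this does not touch). The two GUIDs are the commonly cited identifiers for the USB settings subgroup and the selective suspend setting; treat them as assumptions and confirm them with `powercfg /query` on your own machine. The function names are made up for the example.

```python
import platform
import subprocess
from typing import List

# Assumed GUIDs: verify with `powercfg /query` before relying on them.
USB_SUBGROUP = "2a737441-1930-4402-8d77-b2bebba308a3"
USB_SELECTIVE_SUSPEND = "48e6b7a6-50f5-4782-a5d4-53bb8f07e226"


def selective_suspend_commands(enabled: bool) -> List[List[str]]:
    """Build the powercfg calls that set USB selective suspend on AC power."""
    value = "1" if enabled else "0"
    return [
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT",
         USB_SUBGROUP, USB_SELECTIVE_SUSPEND, value],
        # Re-activate the current scheme so the new value takes effect.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]


def apply(commands: List[List[str]]) -> None:
    """Run the commands; powercfg only exists on Windows."""
    if platform.system() != "Windows":
        raise RuntimeError("powercfg is only available on Windows")
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in selective_suspend_commands(enabled=False):
        print(" ".join(cmd))  # print first; call apply() once you've checked them
```

Building the commands separately from running them means you can read exactly what would be changed before anything touches the power plan.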

Mammalian
Nov 9, 2011

Not just any Jesus Mammalian Jesus

Factory Factory posted:

The Corsair TX and HX power supplies have a particular really-lovely failure mode in which they quietly hand out power that looks fine on a multimeter but is actually dirty and out-of-spec if you check with a $1,000 oscilloscope nobody has. I had a TX750 once that quietly murdered 11 hard drives in a year before I wised up and replaced it. Rest of the system worked fine, everything seemed right. But some brands of hard drives would last months at most, weeks at the least. Every single hard drive ever hooked up to that PSU kicked the bucket within a year and a half. Change the PSU, and suddenly no more murdered hard drives.

I'm using a Corsair TX750 v2, and uh, I have hard drives from four years ago still working. One 500GB Seagate Barracuda failed, and the 1TB drive that was my main OS drive for a while failed, but everything else works fine.

edit: I had the two failed drives when I made my second build back in March 2010, so they lasted over four years.

Factory Factory
Mar 19, 2010

It was an example from experience about the particular way a particular PSU model could go bad in one case, not a blanket condemnation of all units of that model. I was trying to argue against the absolutism of "It's a good model PSU, so it can't be a bad unit."
