BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull
If it was a bad SATA cable, he'd have non-zero UDMA_CRC_Error_Count.
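For anyone wanting to check this on their own drive: `smartctl -A /dev/sdX` prints the attribute table, and a nonzero raw value for UDMA_CRC_Error_Count points at the cable or connector. A minimal sketch of pulling that value out of smartctl's output (the sample table below is made up for illustration; in practice you'd capture real output with subprocess):

```python
# Toy parser for the attribute table printed by `smartctl -A`.
# SAMPLE is a fabricated example of that output format.

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

def raw_attribute(output: str, name: str) -> int:
    """Return the raw value of a named SMART attribute, or raise KeyError."""
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == name:
            return int(fields[9])
    raise KeyError(name)

crc_errors = raw_attribute(SAMPLE, "UDMA_CRC_Error_Count")
print("likely cable problem" if crc_errors > 0 else "cable looks clean")
```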


BobHoward
Feb 13, 2012


thebigcow posted:

The no-name Chinese companies run the factories that make the brand name cables.

This doesn't mean much. No-name Chinese factories can and do make products of wildly varying quality and price tradeoffs, and not always at the buyer's request. Brands which make high quality stuff in China put a lot of effort into quality control because most factories will happily go to great lengths to cut corners in very creative ways.

Also, when you buy a cheap widget, it may well have come off the exact same production line as a more expensive branded widget, but it's often a reject which failed the brand's quality inspection standards. Instead of destroying it, the factory sold it on the gray market. Or it might be literal after-hours production never intended for anything but the gray market, using lower quality materials, the same machines run by less competent operators, and so forth.

BobHoward
Feb 13, 2012

If you're looking to build an ECC server (meaning that you are deciding that data integrity matters to you, and are willing to spend a little extra on it), trying to go super cheap on the motherboard doesn't make a lot of sense to me.

BobHoward
Feb 13, 2012


fookolt posted:

How hosed are these two drives?

Deeply and thoroughly. Both have already failed to read correct data thousands of times, and have insanely high reallocation counts. Do not write anything to them, and try to get data off while they're still alive.

BobHoward
Feb 13, 2012


D. Ebdrup posted:

Spindown and spin-up on modern drives cause so much wear and tear that it isn't worth it unless your drives actually stay spun down for ~12 consecutive hours a day

Sure about that? Almost everything has head parking ramps now; they're supposed to reduce start/stop wear and tear since the heads are never over the platter surfaces until the platters are at full speed. Manufacturers have also raised their lifetime start/stop cycle ratings well above what they were 10-15 years ago.
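As a sanity check, the cycle-rating math is easy to run. The rating used here is an assumed ballpark figure, not a spec from any particular drive:

```python
# Back-of-envelope: time to exhaust a drive's rated load/unload cycles.
# The 300k rating is an assumed ballpark; check your drive's datasheet.

RATED_CYCLES = 300_000
cycles_per_day = 24              # aggressive case: one spindown per hour
years_to_exhaust = RATED_CYCLES / cycles_per_day / 365

print(f"{years_to_exhaust:.1f} years")
```

Even spinning down every hour, an assumed 300k rating takes decades to burn through, which is why the blanket "spindown kills drives" advice deserves some skepticism.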

BobHoward
Feb 13, 2012


Nulldevice posted:

My first concern is whether or not these disks are a ticking time bomb, or is this a false positive. There aren't any pending sectors or reallocation occurring on the disks. I'm going to take a look at the logs on the server and see if there's anything that might coincide with these errors.

I don't think there's even a false positive here. Look at the log of commands which led to the error -- they're SMART READ LOG operations, not ordinary sector reads or writes. The error itself is ABRT, i.e. command aborted, which just means the drive couldn't complete whatever was asked. And if I'm interpreting the relevant ATA standard correctly (based on 5 minutes of :effort: so take with a grain of salt), the "SC" status code of 01h means "Invalid Function Code".

In other words, while attempting to read a log, smartmontools asked the drive to do something the drive doesn't know how to do, and it logged an error. Don't worry about it.

BobHoward
Feb 13, 2012


Nulldevice posted:

Wonder why all the drives didn't log that error. Guess smartmon gave up after two drives. Go figure.

By any chance are all the rest of your drives something other than WD30EFRX firmware version 80.00A80? My guess is that it's an issue specific to that drive model and/or firmware rev. Note that smartctl reported "Not in smartctl database" for both of them. The database is a list of known quirks, capabilities, parameter interpretation methods, and so forth. It's not unheard of for the generic fallback support to have a few minor issues (which is why the database exists).

The problem with SMART is that it's a much bigger and messier interface than the main ATA command set which actually moves user data around. It's harder to implement 100% correctly, and for that matter lots of it is left as manufacturer defined so there is no such thing as 100% correct. Since you don't need 100% debugged SMART for a drive to work well and be useful, this is a recipe for minor SMART related bugs to make it into shipping products surprisingly often.

BobHoward fucked around with this message at 06:27 on Aug 29, 2014

BobHoward
Feb 13, 2012

It's certainly possible that some boards don't have them, but there are no real cost savings to be had by omitting the traces for the extra ECC byte lane.

BobHoward
Feb 13, 2012


Shaocaholica posted:

Found these real cheap:

Seagate 4TB 2.5" single drive 15mm high $159 shipped



http://www.ebay.com/itm/161821659087

Some random seller on ebay but the package I got in the mail was from Best Buy, free shipping no tax. Best buy front?

You can buy it from the actual Best Buy ebay storefront for the same price at the moment. Not sure why BB would operate a second account like that.

http://www.ebay.com/itm/Seagate-Bac...821659087&rt=nc

BobHoward
Feb 13, 2012


Combat Pretzel posted:

Then again, since Seagate still parks the heads on platter, making their drives spin up and down isn't exactly advantageous for longevity.

Citation needed. Pretty sure they have load/unload ramps like everyone else.

BobHoward
Feb 13, 2012


DrDork posted:

I'm not sure what the business side of things are, but with several major players already entrenched in the SSD market, and the margins apparently pretty thin, it may be that WD and Seagate simply missed the boat in the wake of the flooding

It's that they missed the boat, and I don't think the flooding played a significant role either. It's hard to turn a giant multinational company on a dime. These companies have had decades to ossify around designing and building HDDs, did just that, and have no homegrown SSD expertise. From what I've heard, Seagate flailed around a bit trying to develop its own SSD technology, then decided to pursue acquisitions. That's how they ended up buying SandForce. And they're still flailing around a bit, because having a decent SSD controller in-house is one thing and productizing it is another. (And I do have to wonder how much of the SandForce engineering team is left after two acquisitions, and how that's going to affect the pipeline of future controller designs.)

Besides being slow to react, HDD companies have to swim upstream. Here's the rough breakdown of who makes flash memory:

Toshiba/Sandisk ~40%
Samsung ~30%
Micron/Intel ~20%
SK Hynix ~10%

Every one of these companies has in-house SSD controller design teams which get inside info from the flash manufacturing arm, and that does make a difference. The five companies involved in the top 3 flash manufacturing operations all sell the whole product, sometimes under their own name, sometimes under a wholly owned subsidiary (eg OCZ for Toshiba, Crucial for Micron). WD and Seagate have to fight against vertically integrated operations to make headway in the SSD market, and must buy media from those they hope to compete with. It's not a good position to be in.

BobHoward
Feb 13, 2012


sleepy gary posted:

To further elaborate, what I am envisioning is a laptop and a handful of 2.5" USB hard drives, probably in RAIDZ-2. Frequent power loss is probable, so everything will have backup power (the laptop inherently plus something I will probably have to build for the hard drives*). So far this is the best I've come up with that meets the requirements (very small/portable, serviceable, ability to use laptop as a normal computer with the array and disks unmounted).

*Does anyone know of a <=15" laptop with 5+ USB 3.0 ports? :whitewater:

Jesus christ go buy yourself a Drobo Mini already. It's exactly what you want, right down to battery backup for data in its internal RAID controller RAM. A little expensive, but it will actually work, and in a single small box instead of needing a rat's nest of USB cables and poo poo. It has four drive bays and there are 2TB 9.5mm drives out there, so max capacity is 6TB with single redundancy.

BobHoward
Feb 13, 2012

I don't think the Drobo Mini needs the mSATA drive installed for its battery backup to work; the SSD is for making the thing faster by essentially caching some of the data. It sounded completely optional when I read up about it before (never bought one though, because I don't really need it).

To me the odd thing is the combination of all these requirements you've got (frequent travel => highly portable, advanced RAID for high reliability, battery backup for even higher reliability) with near-zero budget. Sure, from one perspective $600ish is a lot for an empty container for four 2.5 inch drives, but that should be peanuts compared to the costs of routine business travel and it's only a one-time cost. And you were talking (joking?) about buying an entire laptop just to get one with a shitload of USB3 ports built in, which would be a few hundred at least (and probably more than $600 if you bought anything of decent quality which would stand up to lots of field use). If you have a need to collect terabytes of data and you're paranoid enough about losing any of it to invest in RAID at all (as opposed to just buying a handful of 2TB or Seagate's new 4TB 2.5" USB drives and using them solo), it's really weird that you're being so cheap. Feels very penny wise, pound foolish.

Oh, and my objection to the medusa setup isn't that it can't work, it's that it's lots more cables to pack/lose, many more connectors to plug/unplug/have break at inopportune times, very clumsy, and so on. Just one box external to your laptop is bad enough, but that is basically asking to have random annoyances and/or failures pop up.

Also the medusa might not let you pull off your laptop battery based backup power idea. Notice how that picture of the laptop with 4 drives has a power brick connected to that USB hub? (Power brick is out of frame, but you can see the cord.) Most laptops probably won't be able to supply enough current to their USB ports to run four or more external 2.5" HDDs.

e: f, b on a bunch of this oh well

BobHoward
Feb 13, 2012


Skandranon posted:

I don't like them. If it shits the bed, there's a chance you don't get your data back, or need a duplicate enclosure to even try recovering data.

He said he'd be running the enclosure in JBOD mode and using software RAID, not relying on the enclosure's RAID volume format. You don't need to worry about getting an exact dupe of the original enclosure to solve enclosure-poo poo-the-bed problems when you do it like that.

A lot of the enclosures in the list he linked don't even have hardware RAID, they're just a USB3-SATA bridge plus a SATA port multiplier. Pick one with UASP support and performance might even be decent.

BobHoward
Feb 13, 2012


salted hash browns posted:

This is helpful, thanks! Didn't know about UASP, will look into this.

UAS aka UASP is a new protocol for tunneling SCSI through USB that was introduced with USB 3.0 to solve the hideous inefficiency and lack of command queuing in the old method, which hadn't been updated much since USB 1.x. It's still not as good as a direct SATA connection, but I've seen SSDs in a 6G SATA to UASP USB 3 enclosure hit over 400MB/s. There's still a moderately large hit to random access performance with SSDs, but it's now fast enough to be close to native performance with HDDs.

Unfortunately not all usb3 to SATA bridge chips implement UASP, so be careful and look for it as a listed feature in enclosure specs. Sometimes they don't say one way or the other, but if they do tell you which bridge chip is in the enclosure you can usually find chip specs online.

BobHoward
Feb 13, 2012


GokieKS posted:

As for CPU... you can probably get away with an ECC-supporting i3, but... I'd spend the extra bit of money and get a proper Xeon E3v3.

E3 silicon is the same as desktop i7 silicon, but with different fuse bits programmed. i3 is a different die, but only because it's dual core. The ECC controller is almost certainly exactly the same as the E3's.

There is nothing but market segmentation games preventing Intel from turning on all the Xeon features on every i3, i5, and i7. If you don't need four cores, an ECC-capable i3 is a great way to go IMO.

BobHoward
Feb 13, 2012

I think you should see someone about your electronics hoarding problem

BobHoward
Feb 13, 2012


CommieGIR posted:

I actually just threw out a bunch of machines, but yes I'm a hoarder.

Mostly server stuff.

You seem to have a few 5.25" half-height and full-height drives. This shames me, as the best I can do is a 3.5" half-height monolith (it is all black and very squared off) that was the last generation of HDD sold by Micropolis before they exited the HDD market.

(the only reason I haven't gotten rid of it tbqh is that it may or may not have data on it that I'd want to erase and I don't think I have anything which speaks SCSI anymore, so I'm pretty much a failure at hoarding properly, although I do honestly have way too many old useless computers)

BobHoward
Feb 13, 2012


redeyes posted:

my new 8TB drive shows up as 7.2TB.. this marketing poo poo is getting annoying

What is annoying is that most operating systems still haven't acknowledged 30+ years of storage industry reality and more than 100 years of convention in other technical and industrial fields. It's such a simple thing to display file sizes and disk capacities using proper SI power-of-10 unit prefixes. OS X is the notable exception, and guess what: it feels good when your 8TB drive shows up as 8TB, because it actually is 8.0 trillion bytes.
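The arithmetic behind the "shrinking" drive is just a unit mismatch, easy to reproduce:

```python
# Why an "8TB" drive shows up as ~7.2TB: it really is 8.0 trillion
# bytes, but many OSes divide by 2**40 and still label the result "TB"
# instead of the binary unit "TiB".

drive_bytes = 8_000_000_000_000    # what the label means: 8 * 10**12

si_tb = drive_bytes / 10**12       # SI terabytes, what the label says
tib = drive_bytes / 2**40          # binary tebibytes, what many OSes show

print(f"{si_tb:.1f} TB (SI)")      # 8.0 TB
print(f"{tib:.2f} TiB (binary)")   # 7.28 TiB
```

No bytes went missing; the OS and the label are counting in different bases.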

BobHoward
Feb 13, 2012


OldSenileGuy posted:

1) The latest version of OSX removed RAID striping capabilities from Disk Utility.
and
2) Even when Disk Utility still had RAID striping capabilities, they were limited to RAID0 or RAID1.

I do still have a laptop running 10.8.5, so if the old Disk Utility DID actually have the capability to set up an enclosure as RAID5, I could use that, but something tells me it doesn't work that way and I'm stuck with RAID0.

Yes, OS X software RAID never did anything but 0/1. I think you could layer them and make 0+1 or 10, but don't quote me on that. The ability to create RAID volumes in Disk Utility has indeed been taken away in 10.11, but you can still create them with the "diskutil" tool from the command line, or use RAIDs created by older versions of OS X. Not sure you'd really want to do that, since the removal from Disk Utility might be a sign they're planning to deprecate the soft RAID driver in future OS X versions. There are signs in the open source parts of OS X that this feature hasn't been actively maintained for a few years, so I can believe they might want to get rid of it.

If you want software RAID 5 on the Mac, there is a third party package called SoftRAID which is highly regarded and IIRC supposedly written by the same engineer who originally wrote Apple's built in RAID 0/1 driver. However, the edition which supports RAID 5 is a bit expensive.

BobHoward
Feb 13, 2012


Mr Shiny Pants posted:

Why not install OpenZFS? I have it running a RAIDZ1 with ARC2 SSD and it performs very well. I was pleasantly surprised.

I didn't know that existed before now, but honestly, after a scan through the forum, there are a couple things that scare me: recent reports of kernel panics, and apparently the integration isn't good enough for Time Machine to back up from a ZFS volume yet.

BobHoward
Feb 13, 2012


Watermelon Daiquiri posted:

Why not just build something in this bad boy? :v: Yuuuge

I hope nobody ITT actually buys one of those. I have seen a server built with one where every drive bay was filled and everyone involved regretted their poor life decisions before it was first booted up.

BobHoward
Feb 13, 2012


Takes No Damage posted:

Just came here to bitch about that. Who's got two thumbs and didn't know what SAS was before tonight? :thumbsup: this guy :thumbsup: I glanced at the backs of the drives and just assumed they were SATA, oops. So now if I still want to get any use out of these things, I'm thinking I'll need something like this RAID adapter as well as a couple of cables to split off the power and data, right? For 6 drives I'm just planning to RAID 0 and throw giant game installs on; that's still a reasonable deal, maybe, I think...

You'd want something like this cable instead:

http://www.amazon.com/dp/B010CMW6S4

But seriously please reconsider before it's too late! This is an intervention. If it actually was going to cost you $0, that would be one thing, you could certainly justify playing around with them. But because you need a SAS controller, it's going to cost you real money (even if you scrounge for a cheaper used controller), and the problem with putting any money into this is the drives you're rescuing from the garbage are, in TYOOL 2016, objectively pieces of poo poo. With capacities like 146GB and 300GB and the SAS interface, they are likely to be 10K or 15K RPM drives. That means they're fast -- but they're fast for HDDs. Next to any modern SSD's F1 race car, they are econoboxes powered by hamsters. And the side effect of being what was once a super fast enterprise HDD is that they're going to be obnoxiously loud and power hungry, to the extent that standard PC case HDD cooling may not cut it (especially if you stack a bunch of them in close quarters). To cap it all off, these drives have done years of service, presumably powered 24/7. Even enterprise grade HDDs wear out. They are not on the good side of the bathtub failure rate curve.

BobHoward fucked around with this message at 11:45 on May 7, 2016

BobHoward
Feb 13, 2012

Many years ago 2.5" HDDs had a major shift in mounting screw hole location for pretty much the same reason, the difference being that everyone converted all their products and the old mounting hole pattern is a distant memory.

BobHoward
Feb 13, 2012


Storm- posted:

Yeah, figured the extra platter is in the way, though it does seem like there is a bit of space in the plastic shell just between the platters and the PCB. Then just adjust the other two holes within the PCB. Though maybe some internals are in the way, what do I know.

Plastic shell? I only see the HDD's chassis, which is cast aluminum with a really thick black anodize.

The conflict is definitely with the platters. Look at that 2TB drive which still has the old pattern, those 2 holes are almost at the platter centerline. If you want to fill the entire thickness of the 3.5" half height form factor with platters, those screw holes have to go.

BobHoward
Feb 13, 2012


DrDork posted:

Your write performance grinds to a screeching halt. SMR is a great for increasing storage density, but any re-write operation is enormously painful because the drive has to up and move a crap-ton of other data to get at the appropriate "shingles." It's not like it can't do it, but it's gonna be slow as gently caress. To give you an idea of what "slow as gently caress" means, these guys did a simple RAID-1 rebuild on a pair of SMR drives and watched them crawl along at <10MB/s average speed, while a pair of HGST drives zipped along on the same task at over 150MB/s.

IIRC there's some work underway on shingle-aware SAS/SATA command set extensions so that the host can query the device about its shingle size and layout, and write whole shingles atomically. This plus support in higher layers would help a great deal with RAID rebuild. There's no reason why that task specifically can't go as fast as a conventional drive, because RAID rebuild is (or can be) very linear. That bad a slowdown has to be the drive being forced into doing many read-modify-writeback passes per shingle, instead of the optimal write once (no read).
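To see why the rebuild numbers crater, here's a toy model of the read-modify-write penalty. The band size and media rate below are assumed illustrative numbers, not specs from any real drive:

```python
# Toy model of the SMR read-modify-write penalty. Band size and media
# rate are assumed illustrative figures, not measured from a real drive.

BAND_MB = 256        # assumed size of one shingled band
STREAM_MBPS = 150    # assumed sequential media rate

def effective_write_mbps(write_mb: float) -> float:
    """Effective throughput when a small in-place write forces the drive
    to read the whole band and then write the whole band back."""
    media_traffic_mb = 2 * BAND_MB            # read pass + write-back pass
    seconds = media_traffic_mb / STREAM_MBPS
    return write_mb / seconds

# A small 4 MB random rewrite crawls:
print(f"{effective_write_mbps(4):.1f} MB/s effective")
# A full, aligned band write needs no read pass and runs at media rate:
print(f"{STREAM_MBPS} MB/s sequential")
```

A shingle-aware host that always writes whole bands sidesteps the read pass entirely, which is the point of the command set extensions.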

It could also be interesting if some of the advanced filesystems get tuned to be shingle-aware. With a SSD for a write cache you could even make random writes reasonably fast.

BobHoward
Feb 13, 2012

Getting in late on ESD chat: The fun thing about ESD damage is that it causes two categories of failure, catastrophic and latent. People intuitively think that because they don't see immediate (catastrophic) failures when they don't bother with ESD safety, well that must mean it's all pointless. But... You know how when you build up a static charge and touch something grounded you can get a visible spark, and it hurts? That much energy is way more than enough to vaporize a metal layer connection in a chip, because those "wires" are so very thin. Discharges you can't even feel are enough to do this. Such events don't always break the connection permanently: some of the metal re-solidifies into a new wire good enough for the circuit to work okay, but these often become very slow blowing fuses rather than normal connections.

This is one of the many fun ways ESD causes latent failures. It can take years for some of these to progress into an unambiguous "poo poo's broken" failure. Sometimes the IC is only mostly fine before it breaks completely, ie there are little glitches now and then.

One reason lots of people get away without taking ESD seriously when assembling PCs from whitebox parts is that it's a much bigger deal before components have been assembled onto a circuit board. Once they are, there are many more paths for charge to dissipate harmlessly in things which can take it, any given injection of charge has a decent chance of splitting between multiple components (decreasing the risk to each), and so on. It doesn't become impossible to harm things, but it is much less likely.

I've been a guest inside a professional PCB assembly and rework factory. You were not allowed to set foot on the factory ESD-safe floor without a shoe grounding strap (or ESD shoes, which basically have a grounding strap built in) and an ESD-safe labcoat, and would get in real trouble for not using a wrist grounding strap while handling stuff at a bench. You don't have to go this crazy when building a PC, but basic measures will get you a long way.

BobHoward
Feb 13, 2012


wooger posted:

Ugh, seeing a FreeBSD committer (and founder) using a Mac for this demo always pains me. FreeBSD will never fix itself if none of the devs dogfood it on their own machines.

You do know where that specific fbsd committer / founder worked for over a decade right

BobHoward
Feb 13, 2012


SamDabbers posted:

No, it actually makes it worse. ZFS makes a copy of a block before changing it. Allocating a bunch of zero blocks just writes a bunch of zero blocks, which will be copied to a non-contiguous area when changed. A traditional file system will modify the block in place.

There is an API (posix_fallocate) which you can call to tell the system "This file will eventually be this large, please preallocate the space right now." This allows preallocation without actually writing zeroed blocks so long as the filesystem driver supports it (if it doesn't, the Linux VFS layer will fall back to writing zeroed blocks). The Linux ext4 and xfs drivers implement this feature, no idea about zfs.

The application needs to support calling posix_fallocate() too, of course.
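For example, on Linux Python exposes this directly as `os.posix_fallocate`; a minimal sketch:

```python
# Preallocate space without writing zeroed blocks. os.posix_fallocate is
# available on Linux; on filesystems that support it (ext4, xfs) the
# extent is reserved without any data actually being streamed out.

import os
import tempfile

size = 16 * 1024 * 1024  # reserve 16 MiB

with tempfile.NamedTemporaryFile() as f:
    os.posix_fallocate(f.fileno(), 0, size)
    st = os.stat(f.name)
    print(st.st_size)    # 16777216: the file is logically this big already
    # st.st_blocks * 512 shows the space is also physically reserved
```

If the filesystem lacks native support, the C library falls back to writing zeros, so the call still succeeds, just without the efficiency win.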

BobHoward
Feb 13, 2012


His Divine Shadow posted:

What's a good external hard drive for backup purposes (will act as a mid point between an offsite NAS)? Minimum 2TB storage. I'm wondering because supposedly USB 3.0 external drives I am looking at have speed ratings of 5mbps, I mean what the heck. I read there's some real cheap chinese poo poo on the market pretending to be USB 3.0 so I guess I ran across that. Hence I'm looking for a quality drive.

If you're worried about buying a cheap piece of poo poo, uh, don't do that? Seagate and Western Digital both sell branded USB 3.0 externals. Buy one of the 2.5" models because they're way better than 3.5" if you're carrying something around between two sites: smaller, lighter, usually don't need a power brick, and are much more resistant to mechanical shock.

BobHoward
Feb 13, 2012


Paul MaudDib posted:

Actually that's the worst thing you could do. External drives typically have the cheapest shittiest drives and if there's ever an issue with a specific model (say, Seagate 3TB) that sucks so bad they can't sell it normally then you bet your rear end they're dumping that poo poo into external drives where nobody can see the model number before they buy.

I feel pretty safe in saying that the case for this is computer nerd mythology. I have never seen actual data showing that external HDDs are consistently less reliable, and in the case of the Seagate/WD 2.5" externals I recommended, the disks inside are usually only usable as externals because the USB3 bridge is integrated into the drive's controller PCB and there's no SATA connector, disproving the idea that they always just shovel random leftovers into a case and call it a day.

Also the guy's use case was moving data from site to site on an external so I presume the external isn't even the permanent long term storage, just used to sneakernet things.

BobHoward
Feb 13, 2012

Getting the magnets out is all the excuse you need imo.

If you ever do this again watch out for glass platters. Some HDDs use them because glass is lighter and can provide a smoother surface finish (allows heads to fly closer, which is important for increasing recording density). The glass they use is pretty tough and it really looks just like metal (because it's been plated with a metal oxide), so if you try to bend it you'll put a lot of force into it with no visible results right up until it shatters and possibly cuts the poo poo out of you while sending shards everywhere.

BobHoward
Feb 13, 2012


SamDabbers posted:

Yes, PCIe is backward compatible from both the card end and the host end. They will negotiate the highest common speed between them.

What I meant was that a lane is a physical path. You don't get the bandwidth back to use elsewhere if you run a slower card in a faster slot.

Well, sort of. What's your definition of faster? Because wider is faster, and you can reallocate lanes between different slots based on what card is plugged in. (So long as the root complex providing those lanes is flexible enough, and also contingent on the motherboard having mux silicon to reroute the lanes.)

For example it's common for socket 11xx motherboards to have a pair of x16 PCIe card edge connectors which can accommodate any of three combinations:

1 card, up to x16
2 cards, up to x8 each
2 cards, cards are x16 capable, but only 8 lanes connected to each


The other way in which things are flexible is due to packet switching. Although it was designed to look just like classic parallel-bus PCI to software, underneath the external appearances PCIe is a packet network in which nodes talk to switches through point-to-point data links.

Any given switch IC has only so many physical lanes, so yeah that's a hard upper limit on the total bandwidth that could possibly pass through that switch. However, when multiple ports on a switch are competing for access to another port, you can get effects where one port slowing down gives something back to another port.

For example, consider a system with 3 nodes, A B and C, attached to a switch through x16 gen3 links. Both A and B are trying as hard as they can to monopolize C's bandwidth. In the absence of active QoS features in the switch, A and B should each get about half of C's link. If you drop A's link down to gen1 speed, however, it can only transmit and receive packets fast enough to use about 25% of C's gen3 link. B will now get at least 75% of C -- maybe more if the switch's fairness algorithms sometimes choose B over A. (The only way A can get all the way to using 25% is if it always wins arbitration.)

(real world PCIe link utilization is never going to sum to 100% because packet overhead etc but you get the point)
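If it helps, the arithmetic behind that 25%/75% split looks like this (per-lane rates are approximate figures after encoding overhead):

```python
# Rough numbers for the A/B/C example above. Per-lane usable rates are
# approximate: gen1 ~0.25 GB/s (8b/10b), gen3 ~0.985 GB/s (128b/130b).

GEN1_LANE = 0.25   # GB/s per lane at 2.5 GT/s with 8b/10b encoding
GEN3_LANE = 0.985  # GB/s per lane at 8 GT/s with 128b/130b encoding

c_link = 16 * GEN3_LANE    # C's x16 gen3 link: ~15.8 GB/s
a_link = 16 * GEN1_LANE    # A dropped to gen1 speed: 4 GB/s

a_share = a_link / c_link  # the most A can possibly claim of C's link
print(f"A can use at most {a_share:.0%} of C's link")
print(f"B gets at least {1 - a_share:.0%}")
```

A's slower link simply can't source or sink packets any faster, so the leftover capacity on C's link is B's to use.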

BobHoward
Feb 13, 2012


evol262 posted:

gpu encoding isn't great on Intel

FYI, the Intel encoder isn't the GPU. It's a special purpose functional block that's only an encoder - it's not fully programmable.

It's actually really good in some applications, e.g. videoconferencing on a laptop running on battery power. Not as good at high quality encoding as they originally tried to sell it, but very power efficient compared to either software or GPU encoding.

BobHoward
Feb 13, 2012


EVIL Gibson posted:

Does anyone else try to order the same size and model drive from different stores so you'll be pulling from different batches?

My biggest worry is if one drive starts going, the other drives might start going as well at the most critical time of resilvering.

Just remember Seagate drives getting the click of death at a certain read/write count for some drat reason.

Assuming you're talking about what I think you're talking about, that last one wasn't a QC or manufacturing defect, it was a firmware bug.

quote:

The firmware issue is that the end boundary of the event log circular buffer (320) was set incorrectly. During Event Log initialization, the boundary condition that defines the end of the Event Log is off by one. During power up, if the Event Log counter is at entry 320, or a multiple of (320 + x*256), and if a particular data pattern (dependent on the type of tester used during the drive manufacturing test process) had been present in the reserved-area system tracks when the drive's reserved-area file system was created during manufacturing, firmware will increment the Event Log pointer past the end of the event log data structure. This error is detected and results in an "Assert Failure", which causes the drive to hang as a failsafe measure. When the drive enters failsafe, further updates to the counter become impossible and the condition will remain through subsequent power cycles. The problem only arises if a power cycle initialization occurs when the Event Log is at 320 or some multiple of 256 thereafter. Once a drive is in this state, there is no path to resolve/recover existing failed drives without Seagate technical intervention. For a drive to be susceptible to this issue, it must have both the firmware that contains the issue and have been tested through the specific manufacturing process.

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

ILikeVoltron posted:

I guess what I don't understand here is how the I/O controller on the PCH chip works. Another thing is, just because an interface supports something doesn't mean you see that in the real world, so I'm a bit hesitant on it. I get what you're saying though, that even with plenty of overhead it's not something you'll bottleneck on, my only concern here is how the controller itself handles contention and splitting up a big block of writes across 10+ disks.

DMI2 and DMI3 are really just 4-lane PCIe Gen2 or Gen3 links. The total raw throughput of these links is therefore 2 GB/s or 4 GB/s before packetization and other overhead. 75% efficiency is achievable: I have measured 1.5 GB/s read throughput from a RAID0 of 4 SATA SSDs connected to an Intel DMI2 PCH.
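As a back-of-the-envelope check (the 75% figure is the efficiency I measured above, not a spec value, and the per-lane rates are the usual post-line-coding numbers):

```python
# Rough DMI throughput estimate. Per-lane rates are after PCIe line coding
# (Gen2 uses 8b/10b, Gen3 uses 128b/130b), i.e. the commonly quoted figures.
RAW_GBPS_PER_LANE = {2: 0.5, 3: 0.985}

def dmi_throughput(gen, lanes=4, efficiency=0.75):
    """Estimated usable throughput in GB/s after packetization overhead."""
    return RAW_GBPS_PER_LANE[gen] * lanes * efficiency

print(dmi_throughput(2))  # DMI2: ~1.5 GB/s usable
print(dmi_throughput(3))  # DMI3: ~3 GB/s usable
```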

A PCH chip is just a collection of PCIe IO controllers, each equivalent to what you might plug in to a PCIe expansion slot, plus a PCIe packet switching fabric so they can all share the one DMI (PCIe) link to the CPU. The CPU has a "root complex" (another switch fabric) to provide connectivity between DMI/PCIe ports and DRAM.

How PCIe devices and switches handle contention is a major chunk of the specification, but suffice it to say that PCIe has a credit based flow control scheme which does a good job of fairly allocating each link's bandwidth between all the traffic flows passing through it.
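Here's a toy sketch of the credit idea. Nothing in it models real PCIe transaction-layer details (credit types, posted vs. non-posted requests, etc.); it only shows the back-pressure mechanism:

```python
from collections import deque

class CreditLink:
    """Toy model of PCIe-style credit flow control: the receiver advertises
    credits; the sender may only transmit while it holds credits, and gets
    them back as the receiver drains its buffer."""
    def __init__(self, credits):
        self.credits = credits
        self.buffer = deque()

    def send(self, packet):
        if self.credits == 0:
            return False          # sender must stall until credits return
        self.credits -= 1
        self.buffer.append(packet)
        return True

    def drain(self):
        if self.buffer:
            self.buffer.popleft()
            self.credits += 1     # credit returned to the sender

link = CreditLink(credits=2)
assert link.send("a") and link.send("b")
assert not link.send("c")   # out of credits: back-pressure
link.drain()
assert link.send("c")       # credit returned, transmission resumes
```

Because a sender never transmits without a credit, a slow receiver can't be overrun, and the switch fabric can share each link's bandwidth between flows without dropping packets.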

Also, the PCH doesn't split writes across disks. It isn't that smart. The OS decides what gets written where and then asks its SATA driver to do writes through an AHCI SATA controller, which in this case happens to be located in the PCH. The PCH SATA controller doesn't truly know whether the disk targeted by a specific I/O operation is part of a RAID or other storage pool, its job is just to perform whatever I/O it's asked to do.

(Real RAID controllers are different, they contain local intelligence to split incoming I/O according to RAID geometry, do parity calculations for RAID levels that need it, and so on. PCH RAID is software RAID behind the curtain, which is honestly just as good or better if all you're doing is RAID0/1/10.)

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

IOwnCalculus posted:

Bottom and side middle holes missing, or just bottom?

Seagate pulled both sets of middle holes on the Ironwolf, so there must be some advantage to doing so.

Middle holes get in the way of using the entire 1" height of the 3.5" drive form factor for platters. Laptop drives went through a similar mounting hole location redesign a while back for the same reason, but a bit more forced because once you go down to 9.5mm height you really have to ditch the middle holes.

The original mechanical design for this stuff was done so long ago that everyone involved expected that of course the entire bottom of the drive would be a PCB packed with electronics forever so why wouldn't you put mounting screw holes wherever the gently caress you felt like, there's no way to fill that space with platters anyways.

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

D. Ebdrup posted:

Unless you're running ZFS, I can almost guarantee that you've silently lost data.
If you're running ZFS, you'll at least know which file is corrupted, assuming it couldn't self-heal it.

Eh. If the raw value of the uncorrectable error count (SMART attribute 187 decimal or BB hex) is still zero, barring firmware bugs in the drive he hasn't lost data. Yet.

eightysixed posted:

code:
RAW_READ_ERROR_RATE                       1048605
REALLOCATED_SECTOR_COUNT                  88
CURRENT_PENDING_SECTOR                    1

LIFETIME(HOURS)                           25379
This obviously needs immediate replacement, right? It's the harddrive I use at work :ohdear:

Whatever SMART reporting tool you got this out of, re-run it and see if you can find the uncorrectable error count I mentioned. Do that ASAP, and then once more after you're done copying all data off. If it's zero and stays zero, it is highly probable that you got everything off without loss. (The SMART uncorrectable error count is supposed to increase by 1 every time the drive is asked to read a sector and can't correct the data, meaning it ended up returning garbage.)

For future reference, don't worry about RAW_READ_ERROR_RATE. It's never straightforward to interpret and frequently will have insanely high looking values in a perfectly healthy drive. Reallocated sector count and current pending sectors are the important ones here.
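For reference, a hedged sketch of fishing that raw value out of `smartctl -A`-style text. The sample output below is made up for illustration; real column layout varies by drive and smartctl version:

```python
# Pull the raw value of a SMART attribute (e.g. 187, Reported_Uncorrect)
# out of smartctl -A style output. SAMPLE is fabricated for illustration.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  88
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   1
"""

def raw_value(smart_text, attr_id):
    """Return the raw value of the given attribute ID, or None if absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        if fields and fields[0] == str(attr_id):
            return int(fields[-1])
    return None

print(raw_value(SAMPLE, 187))  # 0 -> no known unrecoverable reads yet
```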

BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

EVIL Gibson posted:

That's strange. That's not how rsync works and if a program using rsync is not doing hash blocks then it's not using the best reasons rsync is a great tool.

gently caress unison then.

How do you think rsync computes the hashes for each block of a file? It has to read the whole file in, and do the number crunching.

Whenever a program reads data from a file, by default Linux caches it. Applications can take some care to avoid this caching, but last I checked rsync doesn't do any of that. (I really did check into that; we use rsync to move shitloads of data around at work and had some resource utilization issues as a result.)

So it's normal that rsyncing a shitload of stuff will chew through all your RAM. On the destination, too - both ends have to compute hashes. Rsync's original design goal was to minimize the amount of data exchanged across slow network links rather than the amount of CPU/memory required to do the sync.

Much of the RAM use is relatively benign, in that it's just non-dirty file cache so Linux can drop it on the floor to free up memory as needed. Unfortunately the kernel's memory management isn't always perfect under memory pressure and this kind of thing can lead to needless swapping rather than the desired result.


BobHoward
Feb 13, 2012

The only thing white people deserve is a bullet to their empty skull

EVIL Gibson posted:

I thought how it was worded it was opening and getting metadata out like this doc file was last opened by so-and-so which doesn't even make more sense if it is really doing block by block hashing and not a god drat entire file hash.

If it is transferring the entire file each time it changes, it is not using rsync period.

I think you might be a bit confused about how rsync works? I'm not sure we're talking about the same thing, but in any case I'll try to burble on a bit about rsync.

One of the ways rsync saves network bandwidth is that it hashes blocks, not just whole files. If you're syncing one version of a file to another, this allows rsync to transmit only the changed blocks.

The first step for rsync is to determine whether a file needs any syncing at all, then it has to figure out which blocks of it need syncing. Due to the way hashing algorithms work, you can compute per-block and whole-file hashes at the same time while passing over the data only once. I don't know that this is what rsync does, but it's what would make sense based on first principles. Both sides compute the hashes, then they compare the whole-file hashes, then if the file needs syncing they compare per-block hashes to determine the set of blocks which need to be copied over.
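A sketch of that single-pass idea. This is not rsync's actual wire algorithm (which pairs a weak rolling checksum with MD5 so blocks can match at arbitrary offsets); it just shows per-block plus whole-file hashing in one pass over the data:

```python
import hashlib

def hash_file_blocks(data, block_size=4096):
    """One pass over the data yields both per-block and whole-file hashes."""
    whole = hashlib.md5()
    block_hashes = []
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        whole.update(block)                               # whole-file hash
        block_hashes.append(hashlib.md5(block).hexdigest())  # per-block hash
    return whole.hexdigest(), block_hashes

a = b"x" * 8192
b = b"x" * 4096 + b"y" * 4096   # second block differs
wa, ba = hash_file_blocks(a)
wb, bb = hash_file_blocks(b)
assert wa != wb                             # whole-file hashes differ
assert ba[0] == bb[0] and ba[1] != bb[1]    # only block 1 changed
```

Comparing the per-block lists tells you exactly which blocks to ship over the wire, which is the whole point of the exercise.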

In any case, file contents must be read into memory to compute hashes, and the resulting indirect memory usage (due to Linux caching all file reads) is what makes a system chew through a ton of RAM when rsyncing a large amount of data. This can happen even if the source and destination are sufficiently in sync that the amount of data moved is small. I didn't see anything in the OP to indicate that wossname the tool being talked about wasn't actually using rsync as the back-end.

BobHoward fucked around with this message at 21:28 on Aug 29, 2017
