|
You can allow them to create users and attach those users to an existing group but not let them edit/create any permission themselves
|
# ? Oct 14, 2015 16:01 |
|
That sounds ideal, I will look into it.
|
# ? Oct 14, 2015 17:19 |
|
So I have this so far:

code:
I'd like to neaten up the two users at the end though - they are members of the administrators group and the IAM Policy Simulator showed that ChangePassword was still allowed. Is there a way to evaluate group membership in the policy? Thanks Ants fucked around with this message at 22:32 on Oct 14, 2015 |
# ? Oct 14, 2015 22:14 |
|
i think you're making this a little harder than it needs to be:

code:
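A rough sketch of a policy along the lines suggested at the start of the thread - allow creating users and attaching them to existing groups, while denying anything that edits permissions. The action lists here are an assumption built from IAM API action names, not anyone's actual policy from this thread:

```python
# Hypothetical delegated-admin IAM policy: user lifecycle allowed,
# permission editing explicitly denied. Built as a plain dict so the
# JSON shape is easy to inspect.
import json

delegated_admin_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowUserLifecycle",
            "Effect": "Allow",
            "Action": [
                "iam:CreateUser",
                "iam:AddUserToGroup",
                "iam:ListUsers",
                "iam:ListGroups",
            ],
            "Resource": "*",
        },
        {
            # Explicit Deny wins over any Allow the user picks up elsewhere
            "Sid": "DenyPermissionEditing",
            "Effect": "Deny",
            "Action": [
                "iam:CreatePolicy",
                "iam:PutUserPolicy",
                "iam:PutGroupPolicy",
                "iam:AttachUserPolicy",
                "iam:AttachGroupPolicy",
            ],
            "Resource": "*",
        },
    ],
}

print(json.dumps(delegated_admin_policy, indent=2))
```

The key design point is the explicit Deny statement: in IAM, an explicit Deny overrides any Allow, so even a user who is also in a permissive group can't edit policies through this path.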
|
# ? Oct 17, 2015 09:24 |
|
I'm using AWS. I've created a VPC with public and private subnets. All subnets can access the internet; the private subnets obviously get there via a NAT instance. My problem: I need to create an S3 bucket that is locked down so that an instance, group of instances, or subnet in the private subnets can access it. Things I've tried: opening the bucket to 0.0.0.0/0 works, but locking the bucket down to a specific range (10.0.0.0/8, my VPC is 10.53.x.x) means I can't access it from the private subnet. I've attached a role to the machine that has privileges to do anything to any resource in AWS and even this doesn't work. Does anyone have any suggestions? I've read that S3 endpoints are a solution, but I wanted to see if I could do it the way I figured it would work first. Has anyone else been through this particular problem?
|
# ? Oct 17, 2015 16:00 |
|
You'll need to use an S3 endpoint for this, or lock it down to the public IP addresses (traffic from the private subnet reaches S3 via the NAT's public IP, which is why your 10.x range never matches). Alternatively you could set up a set of app keys with get/put permission and lock it down that way
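If you go the endpoint route, the bucket policy ends up looking something like this sketch - the bucket name and vpce ID below are placeholders, not real resources:

```python
# Hypothetical S3 bucket policy: deny everything unless the request came
# in through one specific VPC endpoint. Built as a plain dict to show the
# JSON shape.
import json

BUCKET = "example-private-bucket"   # placeholder name
VPC_ENDPOINT_ID = "vpce-1a2b3c4d"   # placeholder endpoint ID

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyUnlessThroughEndpoint",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            f"arn:aws:s3:::{BUCKET}",
            f"arn:aws:s3:::{BUCKET}/*",
        ],
        # aws:sourceVpce matches the VPC endpoint the request traversed;
        # requests that went out through the NAT instance won't have it set
        # and are denied.
        "Condition": {"StringNotEquals": {"aws:sourceVpce": VPC_ENDPOINT_ID}},
    }],
}

print(json.dumps(bucket_policy, indent=2))
```

This also answers why the source-IP approach fails: by the time a request from the private subnet hits S3 it carries the NAT's public address, so a 10.x source-IP condition never matches.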
|
# ? Oct 17, 2015 19:22 |
|
incoherent posted:Did they port DFSR to azure yet? I would love to see Microsoft back that with 5 9's. I don't see the problem. Replication errors and empty targets should be pretty easy for them to keep up with, since it's the natural state of the technology.
|
# ? Oct 17, 2015 19:48 |
|
Megaman posted:I'm using AWS I've created a VPC with public and private subnets. All subnets can access the internet, the private subnets obviously get there via NAT instance. why wouldn't you use an s3 endpoint, that's literally their use case
|
# ? Oct 17, 2015 19:49 |
|
if you wanted to run a cassandra ring or a xtradb cluster which three cloud providers would you run it across? I guess basically who are the top 3 where the product is similar enough to get stuff working and not poo poo for some other reason.
|
# ? Nov 8, 2015 16:15 |
|
StabbinHobo posted:if you wanted to run a cassandra ring or a xtradb cluster which three cloud providers would you run it across? I guess basically who are the top 3 where the product is similar enough to get stuff working and not poo poo for some other reason. Why do you need three different cloud providers instead of three regions on the same provider?
|
# ? Nov 8, 2015 18:51 |
|
I guess Amazon, Google, and MS Azure? But I support VC in that trying to run a MySQL cluster across the public internet sounds like a special kind of hell and I'd encourage you not to do that! edit: xtradb cluster is Percona's fork of galera IIRC
|
# ? Nov 8, 2015 19:21 |
|
Docjowles posted:edit: xtradb cluster is Percona's fork of galera IIRC XtraDB Cluster is a MySQL distribution which includes XtraDB and Galera, among other things. Though it includes Galera, you don't need to use Galera.
|
# ? Nov 8, 2015 19:29 |
|
StabbinHobo posted:if you wanted to run a cassandra ring or a xtradb cluster which three cloud providers would you run it across? I guess basically who are the top 3 where the product is similar enough to get stuff working and not poo poo for some other reason. Helion, Rackspace, Softlayer: the OpenStack trifecta comedy option. As mentioned, this is a bad idea. If you want redundancy, run in multiple AZs/regions. Sane tooling across multiple providers is OK with some orchestration tools, but mostly it'll be a headache, especially the little differences between AMIs and the images elsewhere (imported qcows or OVAs or whatever), and that headache just gets worse if you build your own and upload it everywhere.
|
# ? Nov 8, 2015 20:45 |
|
Vulture Culture posted:XtraDB Cluster is a MySQL distribution which includes XtraDB and Galera, among other things. Though it includes Galera, you don't need to use Galera. True, although I don't know why you'd bother to use the cluster version if you weren't going to cluster. Then again it's a weird question so I guess I shouldn't assume anything!
|
# ? Nov 8, 2015 21:10 |
|
Docjowles posted:True, although I don't know why you'd bother to use the cluster version if you weren't going to cluster.
|
# ? Nov 8, 2015 21:40 |
|
Newish to OpenStack (we're using Mirantis Fuel to get it off the ground). I have about a million questions, but let's start with: mainly, what are some fun/valuable things to do with yer cloud once you've got it? Image baking is going to lead to some real poo poo in my company... how do you do it well (hook it to app CI, etc...)? Most of our development is lovely Java webapps. Not accessing each and every VM directly is a shift for us... Log aggregating I get, but do people just not monitor the underlying VMs of their services? Or do I just accept that I'm going to attach a floating IP to each VM? We're missing some key services... DNSaaS mostly, but the LBaaS also isn't 'production-ready' (the haproxy objects only run on 1 controller and don't fail over)... do I just cry til these get added in what I assume will be 3 years?
|
# ? Nov 9, 2015 13:48 |
|
Ryaath posted:Image baking is going to lead to some real poo poo in my company... how do you do it well (hook it to app ci, etc...)? Most of our development is lovely java webapps. Ryaath posted:Not accessing each and every vm directly is a shift for us.... Log aggregating I get, but do people just not monitor the underlying vms of their services? Or do I just accept that I'm going to attach a floating ip to each vm? Ryaath posted:We're missing some key services... dnsaas mostly, but the lbaas also isn't 'production-ready' (the haproxy objects only run on 1 controller and don't fail over)... do I just cry til these get added in what I assume will be 3 years? If you're okay talking to the OpenStack compute API, it's really, really easy to automate HAProxy or an F5 BigIP or whatever your preferred load balancing technology is from something as simple as a Python script (I'm told there's a PowerShell API client now, thanks to Rackspace). Don't rely on DNS for dynamic services. Vulture Culture fucked around with this message at 15:58 on Nov 9, 2015 |
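The "automate HAProxy from a Python script" idea is roughly this sketch: render a backend stanza from whatever the compute API reports. The instance list is hardcoded here as a stand-in; the actual compute API call is not shown:

```python
# Toy HAProxy config generator: turn a list of (hostname, ip) pairs into
# a backend stanza. In real use the list would come from the OpenStack
# compute API instead of being hardcoded.
def render_backend(name, instances, port=8080):
    """Render an HAProxy backend stanza from (hostname, ip) pairs."""
    lines = [f"backend {name}", "    balance roundrobin"]
    for host, ip in instances:
        # 'check' enables HAProxy's own health checking per server
        lines.append(f"    server {host} {ip}:{port} check")
    return "\n".join(lines) + "\n"

instances = [("web-01", "10.0.0.11"), ("web-02", "10.0.0.12")]  # placeholder data
config = render_backend("java_webapps", instances)
print(config)
```

A cron job or a hook on instance create/delete can regenerate the file and reload HAProxy, which sidesteps the broken LBaaS failover entirely.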
# ? Nov 9, 2015 15:56 |
|
Thanks for the reply VC. We're already using Packer with the Puppet configuration mgmt we had. Creating the images is no problem; managing the life cycle or the build hierarchy is where we're struggling. I'll look into Sensu and that LBaaS article you linked. I found Graylog offers a VM appliance image, so I just shoved that into Glance and I'll let our app teams (hopefully) figure out the application logging from there...
|
# ? Nov 13, 2015 07:26 |
|
I'm 100% anecdotally sure many of the problems we're running into are due to intermittent network failures, but since we're using an Openstack IaaS provider I don't have access to logs at that level. I've got TCP failures and retransmits in my metrics pipeline, but I'm looking for something more compelling. What would be perfect is a small agent app that can constantly monitor links and track failures explicitly, because what I expect is that we're getting frequent jitters rather than hard link failures. Basically I just want to be able to prove if this IaaS is too brittle for production. Ideas?
|
# ? Dec 1, 2015 21:39 |
|
MagnumOpus posted:I'm 100% anecdotally sure many of the problems we're running into are due to intermittent network failures, but since we're using an Openstack IaaS provider I don't have access to logs at that level. I've got TCP failures and retransmits in my metrics pipeline, but I'm looking for something more compelling. What would be perfect is a small agent app that can constantly monitor links and track failures explicitly, because what I expect is that we're getting frequent jitters rather than hard link failures. Basically I just want to be able to prove if this IaaS is too brittle for production. Ideas? Something like PRTG? If you have a Windows host you can install it there and the free version will monitor 100 sensors over SNMP, ping or whatever. It will show almost everything concerning latency, bandwidth, bandwidth usage etc. It is also pretty great for home use.
|
# ? Dec 3, 2015 08:21 |
|
MagnumOpus posted:I'm 100% anecdotally sure many of the problems we're running into are due to intermittent network failures, but since we're using an Openstack IaaS provider I don't have access to logs at that level. I've got TCP failures and retransmits in my metrics pipeline, but I'm looking for something more compelling. What would be perfect is a small agent app that can constantly monitor links and track failures explicitly, because what I expect is that we're getting frequent jitters rather than hard link failures. Basically I just want to be able to prove if this IaaS is too brittle for production. Ideas? This sounds like they may be doing GRE or VXLAN encapsulation with an MTU that's too small, but it's hard to tell just from this post.
|
# ? Dec 3, 2015 16:19 |
|
MagnumOpus posted:<intermittent network failures> Ideas?

1. Asymmetric routing. It's quite common, but if you run mtr and watch packets get dropped somewhere roughly around a 50% duty cycle across a connection and you have two primary network paths available, you're looking at this as a fundamental problem. This is what oftentimes occurs between two different physical networks, like across WANs and BGP where you advertise AS paths and sometimes the other peer does not quite respect your routing prefixes. AWS does respect this unlike many others, so request them, dammit. I used mtr to diagnose this problem live as it happened. Sadly enough, I'm not even a network admin and using that taught our network architects a new tool to use (yeah.... that's not a good sign when your random-rear end contracted devops guy is figuring poo poo out for your supposedly best network guys)

2. As mentioned above, mismatched TCP MTU. Note that AWS VMs use an MTU of 9001 by default, and despite being off by one they can chunk 1500 multiples fine, but having to convert a lot can result in packet fragmentation problems that translate ultimately into retransmits and packet reassembly times going up.

3. Just check your TTLs to make sure that they're not expiring once in a while from a really, really, really complicated network. Had a user that was on a 40+ hop network complaining about how he couldn't get to AWS VMs reliably because it was so slow. Half his packets were dropping from an ancient network (literally almost as old as me) shoehorned onto a random-rear end backbone and so forth, and TTL was just plain running out.

4. If you're using ping to AWS (doubtful, you're with an OpenStack provider), AWS has told me they're supposed to drop somewhere around 10% of ping traffic for performance reasons - check that your provider is not doing traffic shaping or anything to cause this.

5. Our instances (running VMware) are on severely overprovisioned clusters and drop pings randomly from underlying hardware just plain not keeping up, to the point where our software HA solution is more of a liability: it detects 3 ping failures and tries to fail over, and that's about when it fails back, so we get all sorts of inconsistent state problems. Check /var/log/dmesg for kernel messages.

For some general ideas, Brendan Gregg's book and his website have all sorts of solid methodologies for "figure out wtf is going wrong" and "why is poo poo so slow?" problems. To more directly answer your monitoring question, you seem to need event correlation alongside your network monitoring. We're just running Graphite with Sensu grabbing NIC metrics and shoving them onto the AMQP bus, and I map different time series together onto the time domain and look for patterns. A lot of this tends to just plain suck because our infrastructure has a serious case of clock skew - ntpd doesn't even work and half our clocks are off by 4+ minutes - but looking into how your TCP stack behaves with the rest of your system state is handy when running applications. In most cases, threatening to drop your provider because you're having intermittent network problems will almost always get them on the phone and trying to diagnose your issue right away. You can improve your chances of faster resolution by providing network analysis for the vendor's network folks trying to eliminate the above issues (mtr - newer versions support MPLS labels btw - sar, maybe nmap for its peculiar traceroute methods, and TCP statistics from tcpdump, etc.)
|
# ? Dec 5, 2015 05:37 |
|
MagnumOpus posted:I'm 100% anecdotally sure many of the problems we're running into are due to intermittent network failures, but since we're using an Openstack IaaS provider I don't have access to logs at that level. I've got TCP failures and retransmits in my metrics pipeline, but I'm looking for something more compelling. What would be perfect is a small agent app that can constantly monitor links and track failures explicitly, because what I expect is that we're getting frequent jitters rather than hard link failures. Basically I just want to be able to prove if this IaaS is too brittle for production. Ideas? Smokeping?
|
# ? Dec 7, 2015 00:45 |
|
Smokeping is a good start if you think there's a problem at the physical layer, but pings will almost never reveal the kinds of problems you expect them to in production. Hitting a single endpoint probably won't reveal anything about asymmetrically misconfigured link aggregates, because you'll always be taking the same network path. Small ping packet sizes won't reveal anything related to mismatched MTUs along the network causing unexpected fragmentation. Systems that don't take notice of out-of-order packets won't see UDP packets randomly going round-robin and arriving in the wrong sequence to a bad application that doesn't cope with that. Certainly, a ping every second or two will not trigger any meddling QoS policies, and won't reveal anything in particular about links that are saturated under production load.

As a practical example, here's the kind of dumb bullshit you'll run into in some cloud networks, and no quantity of pings will ever detect it for you: https://code.google.com/p/google-compute-engine/issues/detail?id=87

A better option is to come up with some kind of test suite that's representative of your production workload, start a packet capture (tcpdump ring buffer is an awesome option), run it until you see the issue, then inspect the network traffic. Are you seeing packets randomly arriving out of order at your endpoint? Are you receiving fragmented packets that you expect not to be fragmented? Is there significant latency between certain packets leaving the one system and arriving at the other? Is some traffic just plain missing? The best place to start is to just analyze a basic packet capture and see what Wireshark's UI flags in red. Your local network device error counters (and dmesg) are also your friend.

necrobobsledder posted:For some general ideas, Brendan Gregg's book and his website have all sorts of solid methodologies for "figure out wtf is going wrong" and "why is poo poo so slow?" problems. I feel like an old neckbeard, but I've been relying more and more on sar/sysstat recently and less on stuff like collectd and Graphite. It's certainly a lot easier to scale. Vulture Culture fucked around with this message at 05:09 on Dec 7, 2015 |
# ? Dec 7, 2015 05:00 |
|
Yuck. Not only is that an obscure and ugly problem, but the poor handling of it is pretty disheartening. Someone had to publicly shame them before the issue was escalated externally. Reporting to resolution: 6+ months and counting.
|
# ? Dec 7, 2015 05:06 |
|
Thanks for all the input! I'm taking a 3-pronged approach to the problem: 1) Rebuilt some Graphite dashboards to get a better look at network stats across all hosts. 2) Smokeping. I have not used this before but I've got another guy familiar with it, so we should be able to roll it out quickly. 3) I'm going to write a monitoring agent based on hashicorp memberlist. This will hopefully let me differentiate different types of link failures by acting at the time of detection to verify from multiple hosts. If it works out the way I want, this should be capable of answering my primary question of overall system stability.
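A toy version of that kind of probe - just timing TCP connects and recording failures so jitter shows up as gaps and spikes in the series - looks something like this. It's a sketch, not the memberlist-based design described above:

```python
# Minimal link probe: repeatedly time TCP connects to a peer. Each sample
# is a connect latency in seconds, or None for a failed/timed-out attempt,
# so intermittent drops are visible directly in the sample list.
import socket
import time

def probe(host, port, attempts=5, timeout=1.0, interval=0.0):
    """Return a list of connect latencies in seconds; None marks a failure."""
    samples = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                samples.append(time.monotonic() - start)
        except OSError:
            samples.append(None)  # connection refused/timed out: a drop event
        if interval:
            time.sleep(interval)
    return samples
```

Run from several hosts against each other on a short interval and ship the samples to Graphite, and you get a crude mesh view of which links are jittering and when.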
|
# ? Dec 7, 2015 20:01 |
|
Vulture Culture posted:I've almost never found time series to be useful for anything in recent memory -- though I once burned 3,000 IOPS and 400 GB of disk space on Graphite chasing down a single NFS performance regression on an IBM software update (have you run across the mountstats collector for Diamond? That was me, for this problem) -- but I agree completely about Brendan Gregg and his USE method, and the need for correlation. At the most basic level, this means system clocks corrected to within a second or two, and reasonable log aggregation to help determine exactly what's going on in a distributed system. The Graphite bits are super-useful if you find yourself looking at actual NIC errors, but if you're divorced from the physical hardware in a private cloud, you'll see dwindling returns. There's no way I'd have found a lot of problems without tcpdump / Wireshark such as bad NAT configurations or low firewalls and traffic shapers gone mad, and anything else in your usual enterprise network of madness. AWS offering Flow Logs would be great if I could get any drat access for these to be able to use the feature instead of having to do crazypants things like e-mailing LEGAL if I can directly dump and share info off a BGP switch with Amazon support. But our cloud maturity level is pretty bad so I suspect we won't find one thing wrong with an AWS service for 400 things that is our fault. Our poor Amazon account rep Collecting what sort of errors start showing up at what times is helpful when you're trying to at least avoid some obvious problems like cron jobs or vMotion and you want to quickly share different error stats at certain points in time with third parties, including your cloud provider.
|
# ? Dec 8, 2015 06:20 |
|
necrobobsledder posted:(load of 45 on an 8 vCPU box in prod is scary, man) My personal record is 253 on a quad-core running TSM e: wait, it was dual quad-core Vulture Culture fucked around with this message at 07:01 on Dec 8, 2015 |
# ? Dec 8, 2015 06:55 |
|
Vulture Culture posted:My personal record is 253 on a quad-core running TSM Every time we get a significant network event that causes cascading failure our 8-core Logstash server gets slammed and hits around 280 as it tries to process the dramatic upswing in error messages.
|
# ? Dec 8, 2015 18:53 |
|
I have a website I host in AWS. I have a DNS alias record pointing to an ELB, and another ELB on standby. I update the application on one ELB, and then change the DNS record from one to the other. This works perfectly in Firefox, but Chrome doesn't seem to pick up the DNS change, or at least not as fast; in fact it's very slow to pick up the change. I assume this isn't something wrong with the architecture? I assume this is a Chrome problem? If so, what is it and how can I remedy this problem? Or is it that I need to put my ELBs behind something that never changes IPs? If so, how would I go about doing this easily without changing too much architecture?
|
# ? Dec 22, 2015 00:10 |
|
Use Route 53 for your DNS and use an alias entry?
|
# ? Dec 22, 2015 00:17 |
|
Thanks Ants posted:Use Route 53 for your DNS and use an alias entry? I'm already doing that, that's the alias record I change
|
# ? Dec 22, 2015 00:29 |
|
Chrome maintains its own DNS cache, which is why you probably don't see the change picked up instantly. I'm phone posting so this might not be entirely correct, but you can clear it at something like chrome://net-internals/#dns in your browser. Though that obviously doesn't help the general public. Hopefully Chrome at least kind of respects TTL. What TTL do you have set on the record? If quick updates are important you want something like 5 minutes.
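For reference, a sketch of what a low-TTL cutover record looks like as a Route 53 change batch (the request shape boto3's change_resource_record_sets expects). The hostname and ELB DNS name are placeholders; note this uses a plain CNAME rather than an alias, since alias records don't carry their own TTL:

```python
# Hypothetical Route 53 change batch flipping a record to a standby ELB
# with an explicit 5-minute TTL. Built as a plain dict; in real use this
# would be passed to boto3's route53 change_resource_record_sets.
change_batch = {
    "Comment": "cut over to the standby ELB",
    "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "www.example.com.",  # placeholder hostname
            "Type": "CNAME",
            "TTL": 300,                  # 5 minutes, short enough for quick cutover
            "ResourceRecords": [{
                # placeholder ELB DNS name
                "Value": "standby-elb.us-east-1.elb.amazonaws.com",
            }],
        },
    }],
}
```

The tradeoff is the usual one: a short TTL means resolvers re-query often, so cutovers propagate fast, at the cost of more DNS traffic (and, as the thread shows, browsers may still cache beyond the TTL).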
|
# ? Dec 22, 2015 00:43 |
|
Docjowles posted:Chrome maintains its own DNS cache which is why you probably don't see the change picked up instantly. I'm phone posting so this might not be entirely correct but you can clear it at something like chrome://net-internals/#dns in your browser. Though that obviously doesn't help the general public. I need this to affect the general public. I use only alias records so my TTLs should be pretty much instantaneous; the only record that isn't an alias is the SOA, and that's 10 seconds. So I'm really not sure what's going on. Firefox gets the change almost instantly, Chrome is slow or just doesn't get it; I'm not sure what Chrome is doing. Even when I clear Chrome's DNS it doesn't seem to take the change, at least not consistently.
|
# ? Dec 22, 2015 01:13 |
|
Is there an HTTP header you can send to get Chrome to gently caress off with the caching? Phone posting but this seems to be a Chrome thing and not necessarily something that can be resolved in your DNS setup.
|
# ? Dec 22, 2015 01:36 |
|
Thanks Ants posted:Is there an HTTP header you can send to get Chrome to gently caress off with the caching? Phone posting but this seems to be a Chrome thing and not necessarily something that can be resolved in your DNS setup. I have no idea, that's why I'm asking. It appears that Chrome is caching the DNS, and the content can't change until the DNS updates in chrome. A dig shows the machine is getting the right information, but Chrome is not.
|
# ? Dec 22, 2015 01:45 |
|
Extracurricular question, but when it comes to massive web-based SaaS applications like Facebook, Salesforce, or Apple iCloud, what are they using for their directory service? Active Directory doesn't make sense because it's too slow for such an enormous deployment, and being web-centric, Kerberos/NTLM aren't a good fit. I know many will point to Azure AD but all of these services existed before AAD. What do they use?
|
# ? Dec 22, 2015 01:59 |
|
Tab8715 posted:Extracurricular question, but when it comes to massive web-based SaaS applications like Facebook, Salesforce, or Apple iCloud, what are they using for their directory service?
|
# ? Dec 22, 2015 03:00 |
|
Does anyone here have any experience with creating internal EC2 build agents? I want to build code for EC2 but due to legal reasons cannot deploy anything but a binary to EC2. This makes compiling against their kernel headers hard, as what I'm building is a kernel module. e: this process might work... https://forums.aws.amazon.com/thread.jspa?messageID=498214 Winkle-Daddy fucked around with this message at 23:27 on Dec 22, 2015 |
# ? Dec 22, 2015 23:05 |
|
Megaman posted:I use only alias records so my TTLs should be pretty much instantaneous
|
# ? Dec 23, 2015 07:47 |