the talent deficit
Dec 20, 2003

self-deprecation is a very british trait, and problems can arise when the british attempt to do so with a foreign culture





Vanadium posted:

If banking unused capacity happens on a 1:1 basis, I shouldn't have any issues. Averaged over 30 seconds I'm consistently under the provisioned capacity. Getting throttled at that point makes sense to me if the banked capacity is only provided on a best-effort basis, if the underlying hardware has capacity to spare or whatever.

What I can't figure out is why I seem to get throttled below my provisioned capacity on a per-second basis, but maybe I'm actually measuring that wrong and averaging too much there.

I hadn't realized I can get numbers for my remaining capacity, that sounds a lot more useful than the consumed capacity I've looked at. I'm not sure what I get out of throttling myself, though--does getting throttled on the AWS side punish me by consuming even more capacity? Right now I just back off exponentially whenever at least one write in my batch gets throttled, but with how quickly a little bit of capacity refills that doesn't really do much, maybe I need to be more aggressive about backing off.

switch from exponential backoff to a codel queue. you'll get better throughput and lower average latency at the cost of p99 latency. (does not apply if you need ordered writes)
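for reference, a rough sketch of what a codel-style send queue can look like in pure python. TARGET and INTERVAL are the classic codel defaults and would need tuning for your workload; the point is it sheds persistently-late work instead of sleeping, which is where the average-latency win over exponential backoff comes from:

```python
import time
from collections import deque

TARGET = 0.005    # acceptable queue sojourn time, seconds (codel default)
INTERVAL = 0.100  # how long sojourn must stay above TARGET before shedding

class CoDelQueue:
    def __init__(self):
        self.items = deque()      # (enqueue_time, payload)
        self.first_above = None   # deadline after which we start shedding
        self.drop_count = 0

    def push(self, payload, now=None):
        self.items.append((now if now is not None else time.monotonic(), payload))

    def pop(self, now=None):
        """Return the next payload, or None if the queue shed an item."""
        now = now if now is not None else time.monotonic()
        if not self.items:
            self.first_above = None
            return None
        enq, payload = self.items.popleft()
        sojourn = now - enq
        if sojourn < TARGET:
            self.first_above = None   # latency is fine, reset shedding state
            return payload
        if self.first_above is None:
            self.first_above = now + INTERVAL   # start the grace window
            return payload
        if now >= self.first_above:
            # persistently above target: shed, and shorten the next window
            self.drop_count += 1
            self.first_above = now + INTERVAL / (self.drop_count ** 0.5)
            return None
        return payload
```

(your writer pops from this instead of its current batch list; shed items either get dropped or requeued depending on whether the writes are idempotent)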


Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
Anyone have suggestions for the best way to analyze CloudTrail logs? We're getting rate limited on some of our EC2 API calls and it's unclear why at a glance. Happens most in eu-central-1 fwiw.

the talent deficit
Dec 20, 2003

self-deprecation is a very british trait, and problems can arise when the british attempt to do so with a foreign culture





Blinkz0rz posted:

Anyone have suggestions for the best way to analyze CloudTrail logs? We're getting rate limited on some of our EC2 API calls and it's unclear why at a glance. Happens most in eu-central-1 fwiw.

athena if it's a one-time thing, redshift if you want to do it frequently
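for the athena route, something like this should surface who's getting throttled. this assumes you've already created a table over your cloudtrail bucket per the aws docs (the table name "cloudtrail_logs" here is made up; column names follow the standard cloudtrail DDL, where EC2 throttles show up with errorcode Client.RequestLimitExceeded):

```python
def throttle_query(table="cloudtrail_logs"):
    """SQL to count throttled EC2 calls per caller/action in eu-central-1."""
    return f"""
        SELECT useridentity.arn AS caller, eventname, count(*) AS throttles
        FROM {table}
        WHERE eventsource = 'ec2.amazonaws.com'
          AND errorcode = 'Client.RequestLimitExceeded'
          AND awsregion = 'eu-central-1'
        GROUP BY useridentity.arn, eventname
        ORDER BY throttles DESC
    """

if __name__ == "__main__":
    import boto3
    athena = boto3.client("athena", region_name="eu-central-1")
    resp = athena.start_query_execution(
        QueryString=throttle_query(),
        QueryExecutionContext={"Database": "default"},
        ResultConfiguration={"OutputLocation": "s3://your-athena-results/"},
    )
    print(resp["QueryExecutionId"])
```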

Thanks Ants
May 21, 2004

#essereFerrari


Am I correct that I can't control IAM permissions on a per-Route 53 domain basis, just on a per-DNS-zone basis?

E.g. I can deny access to changing the transfer lock status, but I can't have it apply only to specific domains.

Portland Sucks
Dec 21, 2004
༼ つ ◕_◕ ༽つ
I was doing some machine learning classification stuff on my home PC and was sick of having my CPU tied up for days on end so I just jumped into the free tier EC2 without really reading anything about how it worked and was wowed at how slow it was in comparison to my i7. I figure that was the whole burst performance thing that the t2.micro offers working at my disadvantage since I just needed something that could run 100% for as long as I needed. Which EC2 instance types should I be looking at that won't scale back after a few hours of constant threaded CPU?

Volguus
Mar 3, 2009

Portland Sucks posted:

I was doing some machine learning classification stuff on my home PC and was sick of having my CPU tied up for days on end so I just jumped into the free tier EC2 without really reading anything about how it worked and was wowed at how slow it was in comparison to my i7. I figure that was the whole burst performance thing that the t2.micro offers working at my disadvantage since I just needed something that could run 100% for as long as I needed. Which EC2 instance types should I be looking at that won't scale back after a few hours of constant threaded CPU?

Last time I looked at their offerings (quite a few years back) they do have compute intensive VMs to choose from. But they cost a pretty penny.

FamDav
Mar 29, 2008

Vanadium posted:

If banking unused capacity happens on a 1:1 basis, I shouldn't have any issues. Averaged over 30 seconds I'm consistently under the provisioned capacity. Getting throttled at that point makes sense to me if the banked capacity is only provided on a best-effort basis, if the underlying hardware has capacity to spare or whatever.

What I can't figure out is why I seem to get throttled below my provisioned capacity on a per-second basis, but maybe I'm actually measuring that wrong and averaging too much there.

I hadn't realized I can get numbers for my remaining capacity, that sounds a lot more useful than the consumed capacity I've looked at. I'm not sure what I get out of throttling myself, though--does getting throttled on the AWS side punish me by consuming even more capacity? Right now I just back off exponentially whenever at least one write in my batch gets throttled, but with how quickly a little bit of capacity refills that doesn't really do much, maybe I need to be more aggressive about backing off.

so consumed capacity (i don't believe you can get remaining capacity from any API call) is externalizing your throttles. you don't go negative. a few options are

1. reconfigure your backoff to go across multiple seconds (if you're throttled at time x, you're probably going to be throttled again at x+5ms)
2. use DAX as a write-through cache
3. contact support and ask for a dynamodb heatmap. it will show how your reads/writes are being distributed across partitions.

speaking of, do you have an idea what the minimum number of partitions you have is? you can look at the data at http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html#GuidelinesForTables.Partitions to determine how many you could have, but its not guaranteed if you have a lopsided data distribution. also, iirc local secondary indices are interleaved with your data and will increase partition size, whereas global secondary indices are effectively yet another ddb table that is kept in sync.

EDIT: and be mindful of call volume by individuals. just because you have a good partition scheme doesn't mean you don't have one caller banging on that partition, or that your data is split across more partitions than you expected because you have one partition that has a lopsided distribution of partition+sort data

FamDav fucked around with this message at 04:44 on Jun 21, 2017
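option 1 might look something like this: capped full-jitter backoff with a base of a whole second instead of milliseconds, so retries actually land in later seconds. `write_fn` here is a stand-in for whatever wraps your BatchWriteItem call and returns the still-unprocessed items:

```python
import random
import time

BASE = 1.0   # seconds; start at a whole second, not milliseconds
CAP = 30.0   # never wait longer than this

def backoff_delay(attempt, rng=random.random):
    """Full-jitter delay before retry number `attempt` (0-based), capped."""
    return rng() * min(CAP, BASE * 2 ** attempt)

def retry_batch(write_fn, batch, max_attempts=5, sleep=time.sleep):
    """write_fn takes a list of items and returns the ones still throttled."""
    unprocessed = batch
    for attempt in range(max_attempts):
        unprocessed = write_fn(unprocessed)
        if not unprocessed:
            return []
        sleep(backoff_delay(attempt))
    return unprocessed   # give up; caller decides what to do with leftovers
```

the jitter matters if you have multiple writers, otherwise they all wake up and re-throttle each other in the same second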

Vanadium
Jan 8, 2005

Chances are we're going to drop DynamoDB since our use-case not only isn't a good fit but also doesn't justify how much time I've been sinking into it. It's really small amounts of data and I'm starting to think anything else would work better, even just periodically putting a file on S3. I've basically been trying to use DynamoDB as IPC with incidental persistence.

My backoff works out to multiple seconds as I get throttled, but when my throughput is limited by capacity i don't think I can optimize much with backoff here. Fairly sure I have two partitions right now. The partitioning of my data isn't great so I did the thing where you just add random poo poo to your partition key to spread things out more, which I think should sort that out.

I don't think a cache helps either since my writer is already batching updates in 30 second windows and then only updating each key once.

Thanks for all the advice, I feel a bunch better prepared to use DynamoDB now if that starts seeming appropriate for another project.

Skier
Apr 24, 2003

Fuck yeah.
Fan of Britches

Portland Sucks posted:

I was doing some machine learning classification stuff on my home PC and was sick of having my CPU tied up for days on end so I just jumped into the free tier EC2 without really reading anything about how it worked and was wowed at how slow it was in comparison to my i7. I figure that was the whole burst performance thing that the t2.micro offers working at my disadvantage since I just needed something that could run 100% for as long as I needed. Which EC2 instance types should I be looking at that won't scale back after a few hours of constant threaded CPU?

Any instance type that doesn't start with `t` won't do CPU bursting and throttling. Lots of options for instance types. If your work can be interrupted you can use spot instances to run way cheaper than on-demand instances: https://aws.amazon.com/ec2/spot/pricing/. If your workload can't be interrupted EC2 is gonna be pricey for offloading work.
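if you go the spot route, the request is pretty small. a sketch (the AMI id is a placeholder, and c4.xlarge is just an example compute-optimized type; pick whatever fits and set the max price to what you're willing to pay):

```python
AMI = "ami-xxxxxxxx"  # placeholder -- use your own AMI

def spot_request(max_price="0.10", instance_type="c4.xlarge", ami=AMI):
    """Parameters for a one-off spot instance request."""
    return {
        "SpotPrice": max_price,   # max $/hour you're willing to pay
        "InstanceCount": 1,
        "LaunchSpecification": {
            "ImageId": ami,
            "InstanceType": instance_type,
        },
    }

if __name__ == "__main__":
    import boto3
    ec2 = boto3.client("ec2")
    resp = ec2.request_spot_instances(**spot_request())
    print(resp["SpotInstanceRequests"][0]["SpotInstanceRequestId"])
```

just remember spot instances can be reclaimed, so checkpoint your ML job to S3 periodically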

UnfurledSails
Sep 1, 2011

I have a Redshift table that needs to be migrated over to DynamoDB. I've found a lot of resources regarding moving from Dynamo to Redshift, but not much for the opposite. Any ideas on how I can go about this?

Destroyenator
Dec 27, 2004

Don't ask me lady, I live in beer
Redshift "UNLOAD" to S3 which gives you CSV files, then a script to turn them into json and batch write rows to Dynamo?
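Something like this for the script half, assuming the UNLOAD produced pipe-delimited files (the default) and a two-column table — the column names and item shape here are made up, adjust for your actual schema. The 25 is the BatchWriteItem limit:

```python
import csv

def rows_to_batches(fileobj, batch_size=25):
    """Yield lists of <=25 DynamoDB put requests (the BatchWriteItem limit)."""
    reader = csv.reader(fileobj, delimiter="|")
    batch = []
    for row in reader:
        item = {"id": {"S": row[0]}, "payload": {"S": row[1]}}
        batch.append({"PutRequest": {"Item": item}})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

if __name__ == "__main__":
    import boto3
    ddb = boto3.client("dynamodb")
    with open("unload_0000_part_00") as f:  # one of the files pulled from S3
        for batch in rows_to_batches(f):
            resp = ddb.batch_write_item(RequestItems={"my_table": batch})
            # real code should retry resp["UnprocessedItems"] with backoff
```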

the talent deficit
Dec 20, 2003

self-deprecation is a very british trait, and problems can arise when the british attempt to do so with a foreign culture





UnfurledSails posted:

I have a Redshift table that needs to be migrated over to DynamoDB. I've found a lot of resources regarding moving from Dynamo to Redshift, but not much for the opposite. Any ideas on how I can go about this?

you could run a spark job on emr that just reads straight from the db and inserts into dynamo

Lily Catts
Oct 17, 2012

Show me the way to you
(Heavy Metal)
Does AWS have a service that lets you perform geospatial queries? I'm using Cloudsearch right now, tied to DynamoDB, but I don't really relish the setup as I will have to perform the geospatial query in Cloudsearch, then use the result set to query in DynamoDB to get the full data (and to do updates/deletes). The other non-option is DynamoDB's outdated Java-only geospatial add-on that's unsuitable for any kind of non-trivial work.

FamDav
Mar 29, 2008

Schneider Heim posted:

Does AWS have a service that lets you perform geospatial queries? I'm using Cloudsearch right now, tied to DynamoDB, but I don't really relish the setup as I will have to perform the geospatial query in Cloudsearch, then use the result set to query in DynamoDB to get the full data (and to do updates/deletes). The other non-option is DynamoDB's outdated Java-only geospatial add-on that's unsuitable for any kind of non-trivial work.

what makes the geospatial library unsuitable beyond 'holy poo poo this thing is like 6 or so SDK revisions out of date'?

Lily Catts
Oct 17, 2012

Show me the way to you
(Heavy Metal)

FamDav posted:

what makes the geospatial library unsuitable beyond 'holy poo poo this thing is like 6 or so SDK revisions out of date'?

Works only with point data (no polygon support)
Can't update location data, will have to delete/insert
Java-only (we're using Node)
Slow as poo poo
Actually doesn't work out of the box, you'll have to rebuild it to make use of updated Jackson dependencies (since they changed namespaces a while back)
Amazon hasn't updated it for 4 years so they probably don't care about it

foundtomorrow
Feb 10, 2007

Schneider Heim posted:

Does AWS have a service that lets you perform geospatial queries? I'm using Cloudsearch right now, tied to DynamoDB, but I don't really relish the setup as I will have to perform the geospatial query in Cloudsearch, then use the result set to query in DynamoDB to get the full data (and to do updates/deletes). The other non-option is DynamoDB's outdated Java-only geospatial add-on that's unsuitable for any kind of non-trivial work.

Postgres (with PostGIS) running on RDS could possibly work for you. If you share more details on your use-case (what your data looks like and what your query patterns will be), we can point you in a more specific direction.

IAmKale
Jun 7, 2007

やらないか

Fun Shoe
Is there a better thread to ask questions in about Google Cloud? It's such a red-headed stepchild of service platforms but I'm forced to use it due to architectural reqs out of my control.

Thanks Ants
May 21, 2004

#essereFerrari


Has anybody managed to successfully decipher Azure VM sizing?

Looking at their price list:

https://azure.microsoft.com/en-gb/pricing/details/virtual-machines/windows/

An A2 instance has 3.5GB RAM and 60GB disk (the disk being temporary scratch rather than persistent storage provided by a Managed Disk which the OS runs from).

Looking at their documentation:

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-general

An A2 instance has 3.5GB RAM and 135GB disk.

Is this a documentation fuckup, or have I missed something?

Edit: I have missed something. A2 on the price list is Basic tier; A2 Standard is a previous version VM and listed at https://azure.microsoft.com/en-gb/pricing/details/virtual-machines/windows-previous/. They could really do with coming up with a better way of naming them.

Thanks Ants fucked around with this message at 18:21 on Jul 25, 2017

SnatchRabbit
Feb 23, 2006

by sebmojo
I have an older EC2 instance from around 2014, I think in EC2 Classic. I built a complete upgraded server but I can't seem to associate my elastic IP on my new server. I'm sure there's something I'm missing regarding EC2 Classic, but does anyone have a quick tutorial on how to associate an existing elastic IP to a new server? I tried associating it in AWS, but it only allows me to associate with the old server.

edit: figured it out, had to migrate the elastic ip to the VPC scope of the new instance.

SnatchRabbit fucked around with this message at 20:56 on Aug 4, 2017

jiffypop45
Dec 30, 2011

I'm working on brute forcing a password I forgot on a .pdf and don't want to pay $5 for another copy of. I spun up an EC2 instance last night, and for the first several hours it displayed 100% usage on cloudwatch and via top. Now it's still showing 100% usage (or close enough to it) on top but cloudwatch has dropped down massively. Any idea what's going on here? I don't have any throttling, load balancing, or scaling going on, it's just a single ec2 instance I thought I could let cook for a few days and see if I made any headway before admitting defeat and paying for a new copy.

Startyde
Apr 19, 2007

come post with us, forever and ever and ever
If it's a t-class you ran out of CPU credit

jiffypop45
Dec 30, 2011

Startyde posted:

If it's a t-class you ran out of CPU credit

Is that documented somewhere? I didn't see it. That definitely makes sense though.

Edit:

Found it

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html

It's actually not cutting down that much. I'll just let it run as is for a bit and see if anything interesting happens from it.

jiffypop45 fucked around with this message at 18:19 on Aug 29, 2017

JHVH-1
Jun 28, 2002

jiffypop45 posted:

Is that documented somewhere? I didn't see it. That definitely makes sense though.

Edit:

Found it

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html

It's actually not cutting down that much. I'll just let it run as is for a bit and see if anything interesting happens from it.

Was going to point out, but I see it is mentioned on that page as well that you can view CPUCreditUsage and CPUCreditBalance in CloudWatch.
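If it helps, pulling those two metrics out of CloudWatch is a one-liner-ish. A sketch of the request parameters (the 5-minute period matches how often EC2 publishes t2 credit metrics; swap in your own instance id):

```python
from datetime import datetime, timedelta

def credit_balance_params(instance_id, hours=6):
    """get_metric_statistics params for a t2's CPUCreditBalance."""
    now = datetime.utcnow()
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUCreditBalance",  # or "CPUCreditUsage"
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "Period": 300,                     # t2 credit metrics are 5-minute
        "Statistics": ["Average"],
    }

if __name__ == "__main__":
    import boto3
    cw = boto3.client("cloudwatch")
    resp = cw.get_metric_statistics(**credit_balance_params("i-0123456789abcdef0"))
    for point in sorted(resp["Datapoints"], key=lambda d: d["Timestamp"]):
        print(point["Timestamp"], point["Average"])
```

watching the balance trend toward zero tells you exactly when the throttling kicked in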

Thanks Ants
May 21, 2004

#essereFerrari


I'm not understanding the economics of brute-forcing something on AWS to save $5.

jiffypop45
Dec 30, 2011

Thanks Ants posted:

I'm not understanding the economics of brute-forcing something on AWS to save $5.

It was more of a "because I can", though there's also the "as a recent Texas expat I don't want to give them $5".

Vanadium
Jan 8, 2005

Hey, if I'm using lambda functions that sit idle during their invocation for like 5-10 minutes at a time, am I doing it wrong? I wanted to kick off redshift queries in a clever serverless way but I guess I'm not optimally using the pricing structure if my lambda function starts up, dials up redshift, sends a query and then just sits there until the query returns.

On the other hand the call volume is gonna be low so it doesn't really matter either way, probably. Just feels wrong?

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Vanadium posted:

Hey, if I'm using lambda functions that sit idle during their invocation for like 5-10 minutes at a time, am I doing it wrong? I wanted to kick off redshift queries in a clever serverless way but I guess I'm not optimally using the pricing structure if my lambda function starts up, dials up redshift, sends a query and then just sits there until the query returns.

On the other hand the call volume is gonna be low so it doesn't really matter either way, probably. Just feels wrong?

Can you have redshift publish results to an SNS topic? If so, use 2 lambdas, one to kick off the query and the other to process the results when the data is published to the topic.

Thanks Ants
May 21, 2004

#essereFerrari


Isn't that also what SQS is designed to manage?

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Thanks Ants posted:

Isn't that also what SQS is designed to manage?

It depends on what you're trying to do. If you want to persist the data until your Lambda dequeues it then yeah, use SQS. If you want your Lambda to be kicked off with the data available in the event context then you use SNS. It really all depends on what RedShift supports and how you can actually get your data out of it.

It may be that you have to get RedShift to publish results to a S3 bucket and set up bucket events to then publish to an SNS topic which notifies your Lambda. The Lambda will then retrieve the data from the bucket and operate on it.

Either way you choose, from a cost and architecture perspective it's a bad idea to leave a Lambda running for any period of time.
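The receiving end of the bucket-events route would look roughly like this — a Lambda handler unwrapping the standard S3-notification-inside-SNS event shape to find out which result object landed:

```python
import json

def handler(event, context=None):
    """Pull (bucket, key) pairs out of an S3-notification-via-SNS event."""
    processed = []
    for record in event["Records"]:
        # SNS wraps the S3 notification as a JSON string in Message
        s3_event = json.loads(record["Sns"]["Message"])
        for s3_record in s3_event["Records"]:
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]
            processed.append((bucket, key))
            # real code: boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return processed
```

(if you wire the bucket event straight to the Lambda instead of through SNS, you skip the `json.loads` layer and just read `event["Records"][n]["s3"]` directly)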

Vanadium
Jan 8, 2005

I don't know that you can have Redshift run queries without keeping like a postgres session open to the cluster for the entire runtime? I forgot about bucket events, if that works then that seems like the right way to do it.

Destroyenator
Dec 27, 2004

Don't ask me lady, I live in beer
I think there's a hard five minute limit on lambdas too so if the query runs long you're in trouble.

Vanadium
Jan 8, 2005

My pal who actually works with redshift every day says you can't kick off queries asynchronously, so my schemes are probably dead in the water then.

the talent deficit
Dec 20, 2003

self-deprecation is a very british trait, and problems can arise when the british attempt to do so with a foreign culture





i worked with redshift every day and you can't do queries async, and you can't hold open a lambda for more than five minutes. athena is basically redshift but slower, but you can query it async if your queries don't need to be fast. otherwise you probably want to use something like emr: use lambda to kick off a cluster that runs a spark job with redshift or s3 as the backing store, have it write out results to a bucket, and trigger sns on that write
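the async athena pattern is basically fire-and-forget plus a status poll, something like this (output bucket name is made up):

```python
import time

def run_async(athena, sql, output="s3://your-results-bucket/"):
    """Kick off an Athena query; returns immediately with the execution id."""
    resp = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output},
    )
    return resp["QueryExecutionId"]

def wait(athena, qid, delay=1.0, sleep=time.sleep):
    """Poll until the query reaches a terminal state; returns that state."""
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        sleep(delay)

if __name__ == "__main__":
    import boto3
    athena = boto3.client("athena")
    qid = run_async(athena, "SELECT count(*) FROM my_table")
    print(wait(athena, qid))
```

in a lambda setup you'd skip `wait` entirely: one lambda calls `run_async`, and a second lambda (cron'd or triggered off the results landing in the output bucket) picks up the result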

Vanadium
Jan 8, 2005

My data is in redshift already and I don't wanna gently caress with the setup in general too much. I guess it's gonna be a cronjob on some random host to do the queries and post the results to S3 and the lambda then just verifies that everything went ok.

Is there a standard way to hack up ssh or dns so you can ssh to instances by instance id or stuff like that without having to do lookups yourself, by hand?

Thanks Ants
May 21, 2004

#essereFerrari


describe-instances returns the private-dns-name if that is what you meant?

JHVH-1
Jun 28, 2002

Vanadium posted:

My data is in redshift already and I don't wanna gently caress with the setup in general too much. I guess it's gonna be a cronjob on some random host to do the queries and post the results to S3 and the lambda then just verifies that everything went ok.

Is there a standard way to hack up ssh or dns so you can ssh to instances by instance id or stuff like that without having to do lookups yourself, by hand?

If you have elastic IPs attached you can use CNAMEs pointing to the Public DNS name. If it is internal it should still route it privately (Like requests inside your VPC)
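For the instance-id question, a small lookup script gets you most of the way — e.g. save this as `ec2-host` and do `ssh $(ec2-host i-abc123)`, or wire it into ssh_config via ProxyCommand:

```python
import sys

def dns_from_response(resp, prefer_private=True):
    """Pick a hostname out of a describe_instances response."""
    inst = resp["Reservations"][0]["Instances"][0]
    if prefer_private and inst.get("PrivateDnsName"):
        return inst["PrivateDnsName"]
    return inst.get("PublicDnsName") or inst["PrivateDnsName"]

if __name__ == "__main__":
    import boto3
    ec2 = boto3.client("ec2")
    resp = ec2.describe_instances(InstanceIds=[sys.argv[1]])
    print(dns_from_response(resp, prefer_private=False))
```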

Cancelbot
Nov 22, 2006

Canceling spam since 1928

Does anyone know how bad the Developer - Associate cert is? I know the DevOps Professional will kick my arse but I need to get onto associate first. I'm going through the recommended "quest" first and will probably do the practice exam in a couple weeks.

Background: Been doing AWS/DevOps stuff for a large UK online retailer for about 2 years now. Recently finished migrating all of our physical infrastructure to AWS and I come from a strong senior developer background, far stronger than my networking/infrastructure knowledge.

fluppet
Feb 10, 2009

Cancelbot posted:

Does anyone know how bad the Developer - Associate cert is? I know the DevOps Professional will kick my arse but I need to get onto associate first. I'm going through the recommended "quest" first and will probably do the practice exam in a couple weeks.

Background: Been doing AWS/DevOps stuff for a large UK online retailer for about 2 years now. Recently finished migrating all of our physical infrastructure to AWS and I come from a strong senior developer background, far stronger than my networking/infrastructure knowledge.

As long as you're familiar with the basics of ec2/vpc/rds you should be fine with the sysops associate. Not really looked at the dev associate but have the devops pro booked for next month.

putin is a cunt
Apr 5, 2007

BOY DO I SURE ENJOY TRASH. THERE'S NOTHING MORE I LOVE THAN TO SIT DOWN IN FRONT OF THE BIG SCREEN AND EAT A BIIIIG STEAMY BOWL OF SHIT. WARNER BROS CAN COME OVER TO MY HOUSE AND ASSFUCK MY MOM WHILE I WATCH AND I WOULD CERTIFY IT FRESH, NO QUESTION
I'm fairly new to AWS so I apologise for the super basic question, but what service(s) would I use if I wanted to make a website that could compile less into CSS for a user to download? I figure that I should do this in a Node.js Lambda and then send the result to S3 and publish to an SNS to say that the download is ready, which my webpage can then react to. Am I on the right track?


FamDav
Mar 29, 2008

a hot gujju bhabhi posted:

I'm fairly new to AWS so I apologise for the super basic question, but what service(s) would I use if I wanted to make a website that could compile less into CSS for a user to download? I figure that I should do this in a Node.js Lambda and then send the result to S3 and publish to an SNS to say that the download is ready, which my webpage can then react to. Am I on the right track?

so first off with lambda i heartily suggest you take a look at https://serverless.com/, as it will simplify a lot of your development and make it easy enough to write lambda-based services on aws.

not really dealing with less and css all that much, how long does it take for the transformation to occur? depending on how reactive this is (single digit seconds?) you could perform this operation purely as request/reply such that you either reply when the transformation has finished/failed or you just timeout.

if not, you want to introduce some durability guarantees around the async operation. what does it mean when you return back a 200 OK? because what if that lambda holding all the state dies before completing? when you return back a 200 response, you should really be committing to the customer "this is going to happen, or I'm going to be able to confidently tell you it didn't at some point in the future".

To rectify this, I would suggest having your request lambda persist the input less to s3, kick off a step function that will process and persist the css to a different s3 object, and then return that the transformation is "in progress" along with some identifier for the operation. Then your page can call another lambda to poll the status of that particular operation.

I was going to say that serverless doesn't have step functions integration, but it turns out there are plugins for that: https://serverless.com/blog/how-to-manage-your-aws-step-functions-with-serverless/

FamDav fucked around with this message at 03:29 on Oct 13, 2017
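the request half of that design could be sketched like this (bucket name and state machine ARN are made up, and the clients are injected here just to keep the sketch testable — in a real lambda you'd create them at module scope):

```python
import json
import uuid

BUCKET = "my-less-jobs"  # placeholder bucket
STATE_MACHINE = "arn:aws:states:us-east-1:123456789012:stateMachine:compileLess"

def make_job():
    """Mint a job id and the S3 keys the step function will read/write."""
    job_id = str(uuid.uuid4())
    return job_id, {
        "input_key": f"in/{job_id}.less",
        "output_key": f"out/{job_id}.css",
    }

def handler(event, context=None, s3=None, sfn=None):
    job_id, keys = make_job()
    # durably persist the input BEFORE acknowledging anything
    s3.put_object(Bucket=BUCKET, Key=keys["input_key"], Body=event["less"])
    sfn.start_execution(
        stateMachineArn=STATE_MACHINE,
        name=job_id,               # also makes the kickoff idempotent per job
        input=json.dumps(keys),
    )
    return {"status": "in progress", "job_id": job_id}
```

the page then polls a second lambda (or a head_object on the output key) with that job_id until the css shows up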
