log.nikhil.io

Using AWS Without Succumbing to Hype, FOMO, and Over-Engineering

by Daniel Vassallo

This is how I use the good parts of @awscloud, while filtering out all the distracting hype.

My background: I’ve been using AWS for 11 years — since before there was a console. I also worked inside AWS for 8 years (Nov 2010 - Feb 2019).

My experience is in websites/apps/services, from tiny personal projects to commercial apps running on 8,000 servers. If what you do is AI, ML, ETL, HPC, DBs, blockchain, or anything significantly different from web apps, what I’m writing here might not be relevant.

Step 1: Forget that all these things exist: Microservices, Lambda, API Gateway, Containers, Kubernetes, Docker.

Anything whose main value proposition is about “ability to scale” will likely trade off your “ability to be agile & survive”. That’s rarely a good trade-off.

Start with a t3.nano EC2 instance, and do all your testing & staging on it. It only costs $3.80/mo.

Then before you launch, use something bigger for prod, maybe an m5.large (2 vCPU & 8 GB mem). It’s $70/mo and can easily serve 1 million page views per day.

1 million views is a lot. For example, getting on the front page of @newsycombinator will get you ~15-20K views. That’s just 2% of the capacity of an m5.large.
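To put those numbers in perspective, here is a back-of-the-envelope sketch of the arithmetic. It assumes traffic is spread evenly over the day and that the whole front-page spike lands within one day. Real traffic is burstier, so treat the results as rough averages rather than a capacity plan. The function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_requests_per_second(views_per_day: float) -> float:
    """Average page views per second if traffic were spread evenly over a day."""
    return views_per_day / SECONDS_PER_DAY


def share_of_capacity(views: float, capacity_views_per_day: float) -> float:
    """Fraction of the daily capacity that a given number of views represents."""
    return views / capacity_views_per_day


if __name__ == "__main__":
    capacity = 1_000_000  # page views per day, the figure quoted for an m5.large
    print(f"{average_requests_per_second(capacity):.1f} page views/s on average")
    for spike in (15_000, 20_000):  # the front-page range quoted above
        print(f"{spike:,} views = {share_of_capacity(spike, capacity):.1%} of daily capacity")
```

In other words, 1 million views per day averages out to fewer than 12 page views per second, and the front-page spike works out to between 1.5% and 2% of that daily figure.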

It might be tempting to use Lambda & API Gateway to save $70/mo, but then you’re going to have to write your software to fit a new immature abstraction and deal with all sorts of limits and constraints.

Basic stuff such as using a cache, debugging, or collecting telemetry/analytics data becomes significantly harder when you don’t have access to the server. But probably the biggest disadvantage is that it makes local development much harder.

And that’s the last thing you need. I can’t emphasize enough how important it is that you can easily start your entire application on your laptop, with one click.

With Lambda & API Gateway you’re going to be constantly battling your dev environment. Not worth it, IMO.

CloudFormation: Use it. But too much of it can also be a problem. First of all, there are some things that CFN can’t do. But more importantly, some things are best left out of CFN because managing them there can do more harm than good.

The rule of 👍: If something is likely to be static, it’s a good candidate for CFN. Ex: VPCs, load balancers, build & deploy pipelines, IAM roles, etc. If something is likely to be modified over time, then using CFN will likely be a big headache. Ex: Autoscaling settings.

I like having a separate shell script to create things that CFN shouldn’t know about.

And for things that are hard/impossible to script, I just do them manually. Ex: Route 53 zones, ACM cert creation/validation, CloudTrail config, domain registration.

The test for whether your infra-as-code setup is good enough is whether you feel confident that you can tear down your stack & bring it up again in a few minutes without any mistakes. Spending an unbounded amount of time in pursuit of scripting everything is dumb.

Load balancers: You should probably use one even if you only have 1 instance. For $16/mo you get automatic TLS cert management, and that alone makes it worth it IMO. You just set it up once & forget about it. An ALB is probably what you’ll need, but NLB is good too.

Autoscaling: You won’t need it to spin instances up & down based on utilization. Unless your profit margins are as thin as Amazon’s, what you need instead is abundant capacity headroom. Permanently. Then you can sleep well at night — unlike Amazon’s oncall engineers 🤣

But Autoscaling is still useful. Think of it as a tool to help you spin up or replace instances according to a template. If you have a bad host, you can just terminate it and AS will replace it with an identical one (hopefully healthy) in a couple of minutes.
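One way to picture "Autoscaling as a template, not as a scaler" is a group whose minimum, maximum, and desired sizes are all the same number, with no scaling policies attached. The sketch below only builds the parameters. It assumes they would be passed to boto3's `create_auto_scaling_group` call on the `autoscaling` client, and the group name, launch template name, and subnet IDs are hypothetical placeholders.

```python
def fixed_size_group_params(name, launch_template_name, subnet_ids, size=1):
    """Build keyword arguments for a fixed-size Auto Scaling group.

    Min, max, and desired capacity are identical, so the group never scales
    on utilization; it only replaces instances that are terminated or unhealthy.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,  # hypothetical template name
            "Version": "$Latest",
        },
        "MinSize": size,
        "MaxSize": size,
        "DesiredCapacity": size,
        # The API expects the subnet IDs as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "HealthCheckType": "EC2",
    }


if __name__ == "__main__":
    params = fixed_size_group_params(
        "web-prod", "web-prod-template", ["subnet-aaa", "subnet-bbb"], size=2
    )
    print(params)
    # With credentials configured, this would presumably be applied with:
    #   boto3.client("autoscaling").create_auto_scaling_group(**params)
```

With a group like this, terminating a bad host leaves the group one instance short of its desired capacity, so Autoscaling launches a replacement from the same template.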

VPCs, Subnets, & Security Groups: These may look daunting, but they’re not that hard to grasp. You have no option but to use them, so it’s worth spending a day or two learning all there is about them. Learn through the console, but at the end set them up with CFN.

Route 53: Use it. It integrates nicely with the load balancers, and it does everything you need from a DNS service. I create hosted zones manually, but I set up A records via CFN. I also use Route 53 for .com domain registration.

CodeBuild/Deploy/Pipeline: This suite has a lot of rough edges and setup can be frustrating. But once you do set it up, the final result is simple and with few moving parts.

Don’t bother with CodeCommit though. Stick with GitHub.

Sample pipeline: A template for setting up an AWS environment from scratch.

S3: At 2.3 cents per GB/mo, don’t bother looking elsewhere for file storage. You can expect downloads of 90 MB/s per object and about a 50 ms first-byte latency. Use the default standard storage class unless you really know what you’re doing.
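As a rough illustration of what those figures mean in practice, the sketch below turns them into a storage bill and a download-time estimate. It uses only the numbers quoted above and ignores request charges, data transfer charges, and any pricing tiers, so it is an approximation rather than a billing calculator.

```python
STORAGE_USD_PER_GB_MONTH = 0.023  # 2.3 cents per GB per month, as quoted above
DOWNLOAD_MB_PER_S = 90            # expected per-object download throughput
FIRST_BYTE_LATENCY_S = 0.050      # about 50 ms to the first byte


def monthly_storage_cost_usd(gigabytes: float) -> float:
    """Approximate monthly cost of keeping this much data in standard storage."""
    return gigabytes * STORAGE_USD_PER_GB_MONTH


def estimated_download_seconds(megabytes: float) -> float:
    """Approximate time to download one object of the given size."""
    return FIRST_BYTE_LATENCY_S + megabytes / DOWNLOAD_MB_PER_S


if __name__ == "__main__":
    print(f"500 GB stored: ${monthly_storage_cost_usd(500):.2f}/mo")
    print(f"900 MB object: {estimated_download_seconds(900):.2f} s to download")
```

Under these assumptions, 500 GB costs about $11.50 per month to store, and a single 900 MB object takes roughly 10 seconds to download.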

Database: Today, DynamoDB is an option you should consider. If you can live without “joins”, DDB is probably your best option for a database. With per-request pricing it’s both cheap and a truly zero burden solution. Remember to turn on point-in-time backups.
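For reference, per-request pricing and point-in-time backups are both just table settings. The sketch below builds the parameters for the two calls involved, assuming boto3's DynamoDB client (`create_table` and `update_continuous_backups`). The table name and key name are hypothetical, and the single string partition key is only an example schema.

```python
def on_demand_table_params(table_name, partition_key):
    """Keyword arguments for a table billed per request (no provisioned capacity)."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"}  # string key
        ],
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
    }


def point_in_time_recovery_params(table_name):
    """Keyword arguments that switch on point-in-time backups for a table."""
    return {
        "TableName": table_name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


if __name__ == "__main__":
    print(on_demand_table_params("users", "user_id"))
    print(point_in_time_recovery_params("users"))
    # With credentials configured, these would presumably be applied with:
    #   client = boto3.client("dynamodb")
    #   client.create_table(**on_demand_table_params("users", "user_id"))
    #   client.update_continuous_backups(**point_in_time_recovery_params("users"))
```

Point-in-time recovery is a separate call made after the table exists, which is probably why it is easy to forget.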

But if you want the query flexibility of SQL, I’d stick with RDS. Aurora is fascinating tech, and I’m really optimistic about its future, but it hasn’t passed the test of time yet. You’ll end up facing a ton of poorly documented issues with little community support.

CloudFront: I’d usually start without CloudFront. It’s one less thing to configure and worry about. But it’s something worth considering eventually, even just for the DDoS protection, if not for performance.

SQS: You likely won’t need it, and if you do need a message queue I’d consider something in-process first. But if you have a good use case for it, SQS is solid, reliable, and reasonably straightforward to use.
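As an illustration of what "something in-process" could look like, here is a minimal sketch using only the Python standard library: one background worker thread draining a `queue.Queue`. The function name and the handler are hypothetical. The obvious trade-off is that queued messages live in memory and are lost if the process dies, which is one of the situations where a durable queue like SQS would earn its place.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the worker to exit


def run_in_process_queue(messages, handler):
    """Process messages on one background thread and return results in FIFO order."""
    work = queue.Queue()
    results = []

    def worker():
        while True:
            item = work.get()
            try:
                if item is _STOP:
                    return
                results.append(handler(item))
            finally:
                work.task_done()  # runs for every item, including the sentinel

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for message in messages:
        work.put(message)
    work.put(_STOP)
    work.join()    # wait until every queued item has been handled
    thread.join()
    return results


if __name__ == "__main__":
    print(run_in_process_queue(["resize image", "send email"], str.upper))
```

A single worker keeps ordering simple. A real application would more likely keep the worker running for the lifetime of the process instead of starting one per batch.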

Conclusion: I like to separate interesting new tech from tech that has survived the test of time. EC2, S3, RDS, DDB, ELB, EBS, SQS definitely have. If you’re considering alternatives, there should be a strong compelling reason for losing all the benefits accrued over time.