Weekly Update #22: ECS Architecture Discoveries

This is a summary of my sabbatical's development for the week of July 4th, 2020 to July 11th, 2020.

Goals from last week

  • [❓] Begin work on a SaaS template for Paint Creek Software in order to have something to show to potential clients.
  • [❓] Incorporate an authentication system for frontend + backend (login + register + signout).
  • [❓] Incorporate a billing + licensing system for frontend + backend.
  • [❓] Ship SaaS template @ demo.paintcreeksoftware.com using free tier services.
  • [❓] Ship SaaS template @ {api, app}.test.paintcreeksoftware.com in order to showcase migration to AWS services from the VPC layer on down

What I got done this week

  • [❌] Begin work on a SaaS template for Paint Creek Software in order to have something to show to potential clients.
  • [❌] Incorporate an authentication system for frontend + backend (login + register + signout).
  • [❌] Incorporate a billing + licensing system for frontend + backend.
  • [❌] Ship SaaS template @ demo.paintcreeksoftware.com using free tier services.
  • [❌] Ship SaaS template @ {api, app}.test.paintcreeksoftware.com in order to showcase migration to AWS services from the VPC layer on down

This week was pretty demoralizing. I discovered that my devops work for the past few months is rather unusable for getting something out the door, unless I pour in even more effort. It's insane just how much configuration work there is if you want to create a modern devops stack from the VLAN layer up.

Metrics

  • Weeks to launch (primary KPI): ? (1 week after product development begins)
  • Users talked to total: 0

RescueTime statistics

  • 72h 43m (41% productive, 33% decrease from last week)
    • 39h 33m “utilities”
    • 23h 41m “software development”
    • 4h 25m “communication & scheduling”
    • 1h 58m “news & opinion”
    • 1h 36m “uncategorized”

I think the numbers are a bit skewed, because all “google-chrome” usage (including productive usage) is labeled as “highly unproductive”. I've found I needed Chrome's better WebRTC and file upload for email support for some work-related tasks. That's not to say I've been good with Chrome; by and large, I would say these metrics are accurate.

iPhone screen time (assumed all unproductive)

  • Total: 24h 27m
  • Average: 3h 29m
  • Performance: 18% increase from last week

Hourly journal

https://hourly-journal.yingw787.com

Goals for next week

  • [❓] Start applying to jobs and see whether I can buy myself some more financial runway.

Things I've learned this week

  • Turns out my ECS architecture is pretty wrong. I was following this GitHub repository on setting up an ECS cluster.

    Turns out, that setup only works if your application communicates solely with AWS services. You cannot make third-party API requests (like to Stripe) directly, because the service discovery configuration does not support it.

    This is pretty devastating, because I had assumed that the book I had purchased would provide a better alternative view of AWS than open documentation would. Now, if I wanted to avoid paying the $36.00 monthly NAT gateway fee, I would need to use Fargate (where everything is managed and I have no visibility whatsoever into diagnosing issues unless I had application-level logging and CloudWatch logs properly configured), or I would need to re-work the entire stack from the VPC layer on down, which is a lot of ops work. I think the gulf I see between an application developer and an ops person grows larger day by day.

    I'm trying to think of positives in this whole situation. I had that database from TinyDev I wanted to self-deploy, and I wasn't sure how viable that was because I would have needed to pay for a NAT gateway on a private subnet. Welp, that concern goes out the door, because now I will need a private subnet + NAT anyways. Another benefit is having privacy by default, since a private subnet is not accessible from the Internet. This should mean that if I have an older EC2 instance running, it wouldn't be the end of the world if a security vulnerability was detected (though it should be trivial to deploy a fix).

    Internally, this issue corresponds to AWS support ticket #7162207771, and there are resources on migrating the infra stack there.

  • Token-based authentication isn't the be-all and end-all. As it turns out (maybe not too surprisingly), token-based auth isn't as good as session-based auth for sessions. It's really great for issuing and permissioning API requests, but keeping long-lived sessions working and secure against CSRF and XSS attacks may not be its forte. I'm currently storing the tokens in localStorage, which apparently is a big no-no. The security best practices I've read indicate that only the JWT refresh token should be stored (not the access token), that a new access token should be requested from the server for every request, and that you should use an HttpOnly cookie instead of localStorage for long-lived sessions. You can also use an auth server like Auth0 to manage your tokens, so that you don't store any sensitive items yourself.
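
    To make the HttpOnly cookie idea concrete, here is a minimal sketch using only Python's standard library. It is not what my SaaS template does today, just an illustration of the header a server could send. The cookie name `refresh_token`, the path `/api/token/refresh`, and the seven-day lifetime are all assumptions made up for this example.

```python
from http.cookies import SimpleCookie


def build_refresh_cookie(refresh_token: str, max_age_seconds: int = 7 * 24 * 3600) -> str:
    """Return a Set-Cookie header value that carries a refresh token.

    The cookie name, path, and lifetime are hypothetical choices for this
    sketch, not values taken from any real deployment.
    """
    cookie = SimpleCookie()
    cookie["refresh_token"] = refresh_token
    morsel = cookie["refresh_token"]
    morsel["httponly"] = True       # hide the cookie from JavaScript (limits XSS token theft)
    morsel["secure"] = True         # only send the cookie over HTTPS
    morsel["samesite"] = "Strict"   # do not attach the cookie to cross-site requests (helps against CSRF)
    morsel["path"] = "/api/token/refresh"  # hypothetical refresh endpoint
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()


if __name__ == "__main__":
    # Prints the value a server would place in its Set-Cookie response header.
    print(build_refresh_cookie("example-refresh-token"))
```

    The point of the sketch is that the browser stores and sends the refresh token on its own, so frontend code never has to read it or put it in localStorage. The short-lived access token would still need to be handled separately.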

Ultimately, this week has proved full of turmoil because of false assumptions. I can't blame people for not wanting to maintain their own infrastructure stack. It's a lot of work, and none of it really changes your BATNA or develops your intellectual capital (unless you want to be focused on devops work in the future). I also can't believe (but also can believe) some of the concerns raised by the frontend, and what I'll need to do in order to ensure my SaaS template remains generally secure.

The security stuff is pretty annoying too. I'm glad I'm taking security measures into my own hands, but again it's a lot and you never really know if it worked (because bad guys won't tell you things). I would anticipate a lot of work interfacing with something like OWASP and security automation tools like zaproxy and https://securityheaders.com, and a lot of reading into Django security best practices (like presumably adding etags) from something like “Two Scoops of Django”.

In general, I've found it's really hard to know who and what material to trust on the Internet, even for mundane things like DevOps (which I thought would be as simple as checking the weather). It's difficult because I want things to be perfect, or at least good enough (and only “more” is good enough), and so every little thing like this makes it hard to really get started.

I can kind of see where AWS support gets a bad reputation from. I've liked everybody I've interacted with, but sometimes there is a qualitative difference in the advice given, or you have a little doubt as to whether there's a better solution out there. That's partially why I'm very, very reluctant to switch to a higher-order cloud offering from EC2 + VPC (besides containers, because they work so well locally), because so much stuff is cobbled together and shipped.

One friend mentioned the benefit of a single-server setup for this kind of stuff. Honestly, that sounds really appealing to me at this point. If there were a single-server DevOps course, with interactive terminals and a complete Packer ami.json generated at the end, that cost $5 / mo., used open-source software, and was constantly updated, I would pay for it. It may well be better than Packt Publishing's website (which has great resources but a not-great UI).
