TinyDevCRM Update #13: Over the Hump

This is a summary of TinyDevCRM development for the week of May 2nd, 2020 to May 9th, 2020.

Goals from last week

  • [❓] Finish creating a working CloudFormation pipeline for ECR repositories, and get the PostgreSQL + pg_cron images pushed up to ECR as part of that effort
  • [❓] Finish templating over learnings from Docker on AWS for the application side of TinyDevCRM
  • [❓] Finalize CloudFormation ECS setup for the container orchestration layer, with service / task / cluster definitions, and auto-pulling from AWS ECR
  • [❓] Ensure that the rex-ray Docker volume plugin persists PostgreSQL volumes after draining EC2/ECS instances, and that the same EBS volume attached to multiple different EC2 instances keeps the same cron.schedule table
  • [❓] Stand up CI/CD pipelines for test/production deploys with AWS CodeBuild and AWS CodePipeline
  • [❓] Check back with the Basecamp Personal roadmap
  • [❓] Do a write-up on creating a SaaS product from VPC to JS, to deepen my own understanding and to serve as a base for templating future SaaS products

What I got done this week

  • [✔] Finish creating a working CloudFormation pipeline for ECR repositories
  • [👉] Finish templating over learnings from Docker on AWS for the application side of TinyDevCRM
  • [👉] Finalize CloudFormation ECS setup for the container orchestration layer, with service / task / cluster definitions, and auto-pulling from AWS ECR
  • [❌] Ensure that the rex-ray Docker volume plugin persists PostgreSQL volumes after draining EC2/ECS instances, and that the same EBS volume attached to multiple different EC2 instances keeps the same cron.schedule table
  • [❌] Push the PostgreSQL + pg_cron Docker images up to AWS ECR
  • [❌] Stand up CI/CD pipelines for test/production deploys with AWS CodeBuild and AWS CodePipeline
  • [❌] Check back with the Basecamp Personal roadmap
  • [❌] Do a write-up on creating a SaaS product from VPC to JS, to deepen my own understanding and to serve as a base for templating future SaaS products
  • [❌] Add Docker healthchecks for staging release + acceptance + testing environments
  • [❌] Add back security controls (IAM roles and policies) for all CloudFormation resources
  • [❌] Add back logging for all CloudFormation resources (piped to an EC2 instance that the CloudWatch service listens to)

Metrics

  • Weeks to launch (primary KPI): 3 (9 weeks past the originally declared KPI of 1 week)
  • Users talked to total: 1

RescueTime statistics

  • 59h 37m (71% productive)
    • 29h 19m “software development”
    • 11h 1m “utilities”
    • 8h 53m “communication & scheduling”
    • 5h 23m “uncategorized”
    • 1h 26m “news & opinion”

iPhone screen time (assumed all unproductive)

  • Total: 27h 54m
  • Daily average: 3h 59m
  • Performance: 25% decrease from last week

Hourly journal

https://hourly-journal.yingw787.com

Goals for next week

Partial list:

  • [❓] Push the PostgreSQL + pg_cron Docker images up to AWS ECR
  • [❓] Ensure that the rex-ray Docker volume plugin persists PostgreSQL volumes after draining EC2/ECS instances, and that the same EBS volume attached to multiple different EC2 instances keeps the same cron.schedule table
  • [❓] Do a write-up of shipping a Dockerized PostgreSQL instance with PostgreSQL extensions

  • [❓] Get static files properly shipped on both docker-compose and AWS via local EC2 Docker volumes
  • [❓] Get data uploads (CSV files) properly shipped on both local and AWS environments via EFS volumes

  • [❓] Stand up CI/CD pipelines for test/production deploys with AWS CodeBuild and AWS CodePipeline
  • [❓] Update Basecamp roadmap
  • [❓] Add Docker healthchecks for staging release + acceptance + testing environments
  • [❓] Add security group restrictions (policies and security group ingress/egress rules) for all CloudFormation resources
  • [❓] Deploy the database cluster in private subnets behind a NAT, and turn off direct SSH access and public IPv4 address mapping
  • [❓] Add SNS topic for CloudFormation deploys

Things I've learned this week

  • You can't separate the compute layer. I tried to delete an ECS cluster after defining the underlying EC2 instances in a separate compute layer, thinking that ECS would deploy containers on top of them. It turns out the EC2 instances are the ECS container instances themselves, and since they were defined separately, the CloudFormation stack failed to delete properly because the container instances never drained. Deleting an EC2 instance manually doesn't work either, since it's part of an autoscaling group, and the autoscaling group will automatically create a replacement.

    Re-deploying the same ECS cluster while the original still exists (because deletion failed) causes the new CloudFormation stack to fail to create. You need to run aws ecs delete-cluster before re-creating the ECS cluster; a sketch of the manual cleanup follows at the end of this item.

    In addition, load balancing tightly couples the EC2 autoscaling group and the ECS runtime, since you need to determine which ports to expose.

    Don't separate out ECS and EC2 definitions.
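
    As a minimal sketch of that manual cleanup, assuming a hypothetical cluster name of tinydevcrm-ecs (the container instance ARN comes from the first command's output):

      # List the container instances still registered to the stuck cluster.
      aws ecs list-container-instances --cluster tinydevcrm-ecs

      # Force-deregister each one so the cluster can actually drain.
      aws ecs deregister-container-instance \
          --cluster tinydevcrm-ecs \
          --container-instance <container-instance-arn> \
          --force

      # Only then will delete-cluster succeed and re-creation work.
      aws ecs delete-cluster --cluster tinydevcrm-ecs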

  • It's better to use one CloudFormation stack definition than to maintain multiple CloudFormation stacks. I went with multiple stacks in the beginning because cfn-format organizes resources not by resource type but by resource name (alphabetical order), and because I was concerned about component lifecycles (e.g. accidentally deleting my EBS volume while deleting a Docker container instance). However, the amount of work needed to cleanly separate CloudFormation stacks is significant. For example, to use a nested CloudFormation stack, you must define a template, push it to Amazon S3, and make sure that global outputs / exports don't conflict; the package-and-deploy workflow is sketched below.
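
    As a rough sketch of that workflow with the AWS CLI (the bucket, template, and stack names here are hypothetical):

      # Upload nested templates to S3, rewriting local template
      # references into S3 URLs in one step.
      aws cloudformation package \
          --template-file master.yaml \
          --s3-bucket tinydevcrm-cfn-templates \
          --output-template-file packaged.yaml

      # Deploy the packaged template as a single stack.
      aws cloudformation deploy \
          --template-file packaged.yaml \
          --stack-name tinydevcrm-app \
          --capabilities CAPABILITY_NAMED_IAM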

  • Sometimes newer isn't better. I wanted to use the ECS-optimized Amazon Linux 2 AMI, because I've heard that Amazon Linux 1 is approaching end-of-life and doesn't have LTS support. However, if you're using Amazon ECS, you need to configure /opt/aws/bin/cfn-signal and /opt/aws/bin/cfn-init, and there's not as much documentation for Amazon Linux 2 as there is for Amazon Linux 1, which makes it difficult to avoid bugs. I had trouble deleting ECS clusters due to lifecycle management issues (like running services that were still draining). The Amazon Linux 1 AMIs I've used have had no problems, and there are more, better-documented examples out there.
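
    For context, the UserData wiring looks roughly like this (a sketch only; the stack, resource, and region names are hypothetical):

      #!/bin/bash
      # Apply the metadata attached to the launch configuration.
      /opt/aws/bin/cfn-init -v \
          --stack tinydevcrm-ecs \
          --resource ECSLaunchConfiguration \
          --region us-east-1

      # Report cfn-init's exit code back so CloudFormation knows
      # whether the instance came up cleanly.
      /opt/aws/bin/cfn-signal -e $? \
          --stack tinydevcrm-ecs \
          --resource ECSAutoScalingGroup \
          --region us-east-1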

    If you're looking to build a web-scale product, then it might make sense to use Kubernetes and Elastic Kubernetes Service so that you can migrate your infrastructure all at once to another cloud if need be, and keep your learnings. I'm moving in the opposite direction at the moment, with a desire to build a number of deployable binaries that can run anywhere (including on-premises). I'd probably prefer having my own package archive and an AMI builder (like Packer), and ship a complete AMI to AWS EC2 or wherever.

    This is also better for standing up a custom database, since you don't have to deal with ephemeral runtimes and possible data corruption while writes are being persisted. Stand up an EBS or EFS volume, then stand up an EC2 instance based on an AMI and attach the volume; a sketch follows.
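
    A minimal sketch of that pattern with the AWS CLI (the AMI ID, the IDs in angle brackets, the zone, and the instance type are all hypothetical placeholders):

      # Stand up a persistent EBS volume for the database.
      aws ec2 create-volume \
          --availability-zone us-east-1a \
          --size 20 \
          --volume-type gp2

      # Stand up an EC2 instance from a pre-baked AMI.
      aws ec2 run-instances \
          --image-id <ami-id> \
          --instance-type t3.small \
          --count 1

      # Attach the volume so data survives instance replacement.
      aws ec2 attach-volume \
          --volume-id <volume-id> \
          --instance-id <instance-id> \
          --device /dev/xvdf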

  • I can understand why infrastructure is usually insecure by default. You can't monetize security (unless you're a security company), and it definitely slows things down, especially in DevOps, where feedback loops are extremely long. I've personally opened up many security groups in my templates with “allow all” rules and hardcoded passwords where I should be using a secrets manager. That brings up secrets rotation, which relies on custom-defined AWS Lambda functions to sed configuration files of different shapes and sizes, and Lambda (IMHO) remains notoriously difficult to maintain. From talking to AWS Support, AWS RDS runs custom-defined Lambda functions under the hood for automatic secrets rotation, something I'm unlikely to ever get around to building myself.
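
    For what it's worth, wiring that up through AWS Secrets Manager looks roughly like the following (the secret name is hypothetical, and the rotation Lambda still has to be written and maintained separately):

      # Store the database password instead of hardcoding it.
      aws secretsmanager create-secret \
          --name tinydevcrm/postgres-password \
          --secret-string 'REPLACE_ME'

      # Attach a custom rotation Lambda and rotate every 30 days.
      aws secretsmanager rotate-secret \
          --secret-id tinydevcrm/postgres-password \
          --rotation-lambda-arn <rotation-lambda-arn> \
          --rotation-rules AutomaticallyAfterDays=30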

Subscribe to my mailing list