TinyDevCRM Update #15: So close to MVP v0.1 release

This is a summary of TinyDevCRM development for the week of May 16th, 2020 to May 23rd, 2020.

Goals from last week

  • [❓] Add API endpoint for creating SQL tables using form-based POST requests.
  • [❓] Add API endpoint for viewing SQL tables.
  • [❓] Add API endpoint for creating materialized views.
  • [❓] Add API endpoint for viewing materialized views.
  • [❓] Add API endpoint for creating trigger definitions.
  • [❓] Add API endpoint for viewing trigger definitions.
  • [❓] Add API endpoint for creating a scheduled cron job.
  • [❓] Add API endpoint for viewing scheduled cron jobs.
  • [❓] Add WebSocket endpoint for forwarding notifications.
  • [❓] Add dashboard view for viewing SQL tables.
  • [❓] Add dashboard view for creating materialized views.
  • [❓] Add dashboard view for viewing materialized views.
  • [❓] Add dashboard view for creating trigger definitions.
  • [❓] Add dashboard view for viewing trigger definitions.
  • [❓] Add dashboard view for creating a scheduled cron job.
  • [❓] Add dashboard view for viewing scheduled cron jobs.
  • [❓] Add dashboard view for forwarding notifications.
  • [❓] Update landing page with user feedback from Pioneer.app.
  • [❓] Update Basecamp roadmap
  • [❓] Do a write-up of shipping a Dockerized PostgreSQL instance with PostgreSQL extensions
  • [❓] Add API endpoint for updating/deleting SQL tables.
  • [❓] Add API endpoint for updating/deleting materialized views.
  • [❓] Add API endpoint for updating/deleting trigger definitions.
  • [❓] Add API endpoint for updating/deleting scheduled cron jobs.
  • [❓] Create Django custom change password forms, so that passwords do not have write access in the browser, and you can change passwords in a plaintext form.
  • [❓] Get up CI/CD pipelines for test/production deploys with AWS CodeBuild and AWS CodePipeline
  • [❓] Add Docker healthchecks for staging release + acceptance + testing environments.
  • [❓] Add security group restrictions (policies and security group ingress/egress rules) for all CloudFormation resources
  • [❓] Deploy database cluster underneath private subnets + NAT, and turn off direct SSH access and mapping to IPV4 addresses
  • [❓] Add SNS topic for CloudFormation deployments.

What I got done this week

  • [✔] Add API endpoint for creating SQL tables using form-based POST requests.
  • [❌] Add API endpoint for viewing SQL tables.
  • [✔] Add API endpoint for creating materialized views.
  • [❌] Add API endpoint for viewing materialized views.
  • [❌] Add API endpoint for creating trigger definitions.
  • [❌] Add API endpoint for viewing trigger definitions.
  • [✔] Add API endpoint for creating a scheduled cron job.
  • [❌] Add API endpoint for viewing scheduled cron jobs.
  • [❌] Add WebSocket endpoint for forwarding notifications.
  • [❌] Add dashboard view for viewing SQL tables.
  • [❌] Add dashboard view for creating materialized views.
  • [❌] Add dashboard view for viewing materialized views.
  • [❌] Add dashboard view for creating trigger definitions.
  • [❌] Add dashboard view for viewing trigger definitions.
  • [❌] Add dashboard view for creating a scheduled cron job.
  • [❌] Add dashboard view for viewing scheduled cron jobs.
  • [❌] Add dashboard view for forwarding notifications.
  • [❌] Update landing page with user feedback from Pioneer.app.
  • [❌] Update Basecamp roadmap
  • [❌] Do a write-up of shipping a Dockerized PostgreSQL instance with PostgreSQL extensions
  • [❌] Add API endpoint for updating/deleting SQL tables.
  • [❌] Add API endpoint for updating/deleting materialized views.
  • [❌] Add API endpoint for updating/deleting trigger definitions.
  • [❌] Add API endpoint for updating/deleting scheduled cron jobs.
  • [❌] Create Django custom change password forms, so that passwords do not have write access in the browser, and you can change passwords in a plaintext form.
  • [❌] Get up CI/CD pipelines for test/production deploys with AWS CodeBuild and AWS CodePipeline
  • [❌] Add Docker healthchecks for staging release + acceptance + testing environments.
  • [❌] Connect AWS application to AWS database, and add migration task definition.
  • [❌] Add security group restrictions (policies and security group ingress/egress rules) for all CloudFormation resources
  • [❌] Deploy database cluster underneath private subnets + NAT instead of public subnet, and turn off direct SSH access and mapping of EC2 instances to IPV4 addresses
  • [❌] Add SNS topic for CloudFormation deployments.
  • [❌] Add support for “agents” via authentication (“bots”) with JWT fine-grained permissioning.
  • [✔] Update JWT authentication middleware to point to custom User model and make Django models user-aware
  • [✔] Add foreign data wrapper in order to create foreign tables as Apache Parquet files
  • [✔] Update tinydevcrm-frontend with updated devops understanding (Dockerization + Makefile commands)

Metrics

  • Weeks to launch (primary KPI): 2 (11 weeks after declared KPI of 1 week)
  • Users talked to total: 1

RescueTime statistics

  • 77h 31m (54% productive)
    • 26h 27m “software development”
    • 19h 33m “utilities”
    • 11h 1m “entertainment”
    • 9h 8m “communication & scheduling”
    • 3h 32m “design and composition”

iPhone screen time (assumed all unproductive)

  • Total: 26h 45m
  • Average: 3h 49m
  • Performance: 14% decrease from last week

Hourly journal

https://hourly-journal.yingw787.com

Goals for next week

  • [❓] Make all Django models user-aware
  • [❓] Set up channels Django model and be able to successfully curl an HTTP endpoint to get events
  • [❓] Update dashboard to reflect basic creation process
  • [❓] Ship stacks to production
  • [❓] Add API endpoint for viewing SQL tables.
  • [❓] Add API endpoint for viewing materialized views.
  • [❓] Add API endpoint for creating trigger definitions.
  • [❓] Add API endpoint for viewing trigger definitions.
  • [❓] Add API endpoint for viewing scheduled cron jobs.
  • [❓] Add WebSocket endpoint for forwarding notifications.
  • [❓] Add dashboard view for viewing SQL tables.
  • [❓] Add dashboard view for creating materialized views.
  • [❓] Add dashboard view for viewing materialized views.
  • [❓] Add dashboard view for creating trigger definitions.
  • [❓] Add dashboard view for viewing trigger definitions.
  • [❓] Add dashboard view for creating a scheduled cron job.
  • [❓] Add dashboard view for viewing scheduled cron jobs.
  • [❓] Add dashboard view for forwarding notifications.
  • [❓] Update landing page with user feedback from Pioneer.app.
  • [❓] Update Basecamp roadmap
  • [❓] Do a write-up of shipping a Dockerized PostgreSQL instance with PostgreSQL extensions
  • [❓] Add API endpoint for updating/deleting SQL tables.
  • [❓] Add API endpoint for updating/deleting materialized views.
  • [❓] Add API endpoint for updating/deleting trigger definitions.
  • [❓] Add API endpoint for updating/deleting scheduled cron jobs.
  • [❓] Create Django custom change password forms, so that passwords do not have write access in the browser, and you can change passwords in a plaintext form.
  • [❓] Get up CI/CD pipelines for test/production deploys with AWS CodeBuild and AWS CodePipeline
  • [❓] Add Docker healthchecks for staging release + acceptance + testing environments.
  • [❓] Connect AWS application to AWS database, and add migration task definition.
  • [❓] Add security group restrictions (policies and security group ingress/egress rules) for all CloudFormation resources
  • [❓] Deploy database cluster underneath private subnets + NAT instead of public subnet, and turn off direct SSH access and mapping of EC2 instances to IPV4 addresses
  • [❓] Add SNS topic for CloudFormation deployments.
  • [❓] Add support for “agents” via authentication (“bots”) with JWT fine-grained permissioning.

Things I've learned this week

  • I really hate how most popular columnar file formats are tightly coupled to some ecosystem. Parquet is tightly coupled to Apache Arrow, and ORC is tightly coupled to Apache Hive. This isn't a big deal and nothing new, but one pain point is build times for extensions around just the file format isn't something maintainers optimize for.

    I stumbled on this when I found PostgreSQL's COPY FROM to be fairly lacking. I thought it'd be great at parsing CSV files, but apparently it has trouble even with clean CSVs. I don't want to deal with this issue in production, and I've had good experiences with Parquet in the past, so I made an off-the-cuff decision to support Parquet first and add an ETL service later.

    When I was looking up Parquet and PostgreSQL together, I found a foreign data wrapper where you can create foreign tables from a file, thereby bypassing the struggle of creating a table and having a file. Just use the file as the table, and create materialized views on top of that instead.

    This means building from source this project called parquet_fdw, which depends on libarow and libparquet. libarrow and libparquet aren't supported on Alpine, and I need to compile them myself. Hence the compile-time blues.

    I still want to stick with Alpine because I feel safer with it. It's just less stuff that's running, and less stuff means less stuff to go wrong. If that means a one-time deal to have some compile-time pains, that should be worth the tradeoff, right? Not like this project is on schedule anyways.

  • If you're downloading an Apache project to compile from source, go with something mainline for your Docker base image. When downloading apache/arrow for compiling parquet_fdw, I ran into a musl vs. glibc conflict with getting secure_getenv:

    /usr/local/lib # ldd libarrow.so.17
        /lib/ld-musl-x86_64.so.1 (0x7f299d9b5000)
        libz.so.1 => /lib/libz.so.1 (0x7f299c96c000)
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x7f299c7d3000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x7f299c7bf000)
        libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f299d9b5000)
    Error relocating libarrow.so.17: secure_getenv: symbol not found
    /usr/local/lib #
    

    This isn't really solvable, as in there's literally no up-to-date documentation and everybody's telling you to move to some distro that uses glibc. Also, encountering these kinds of errors for a database isn't really my cup of tea, so to postgres:stable and debian it is.

  • Push-based APIs still seem in their infancy. Asynchronous patterns like request/response are easy to architect, because there's really no persist-based obligations at runtime w.r.t. network proxying. When you get into synchronous patterns, like pub/sub or long polling, suddenly you, or the client, or a third party, have to buffer data somewhere. I think this is one major use case of Redis / Kafka / RabbitMQ, is just the memory handling aspect. For TinyDev, I want it to be, well, “tiny”, and I'd rather figure out a way for PostgreSQL to buffer data so as to not increase the number of dependencies.

    I found HTTP/2 w/ Server Sent Events (SSE) and Pushpin to meet my needs for now, because you can just curl an endpoint and listen to events without having to “subscribe” to things, but it does look like there's significant fragmentation in terms of implementation and adoption of various protocols (e.g. WebSockets, which I didn't want to use because of concerns around load balancing / horizontal scaling and server downtime, in addition to server writes through the channel being unnecessary and having to support custom dependencies). It's definitely a different world from the regular world of HTTP and there's a lot of work / opportunity to be done / had in this realm.

  • I think I figured out some comparative tools to what I'm building. Materialize.io looks pretty similar, PopSQL is mildly simliar, in terms of the streaming requirements, the scheduled queries, and the usage of PostgreSQL on the backend.

    Funny thing is, PopSQL's scheduled queries only run at every-hour granularity, so if I can support per-minute query refreshes on a multi-node database deployment, which I think should be possible, I can do better than a product somebody pays for. Theoretically. Add to that CREATE FOREIGN TABLE for backups of tables and views as Parquet files, zipped and emailed, and it might just beat out a good number of new-ish SQL products on the market today. This would be hilarious.

  • Foreign data wrappers are soooooo cool. Like, look. You can create AWS Redshift tables in your PostgreSQL database by adding a JDBC foreign data wrapper. So you can use PostgreSQL to connect to Redshift. And if it supports JDBC, you can use PostgreSQL as a pipe-through entity for any number of JDBC-compatible databases. You can use SAP HANA through PostgreSQL, or export backups to AWS S3 CSVs, maintain one type of database interface to a web-based application, and just pipe through a SQL query whenever it's convenient to you. Literally.

    And it sticks much closer to bare-metal too! No fancy application servers doing everything for you that needs a rewrite every time a language committee wants something new, everything is declarative and data-based.

Subscribe to my mailing list