First post

Posted on January 14, 2021 by 46o60

Tags: make

Traditional start #

Every blog has to have a first post, and for fear of ending up like Joanne Dvorak I had to write my first blog post about my static site generator setup. :-)

Joanne Dvorak story

It took me more than a year to finish everything, which includes:

  • a web application
  • a CLI tool for complete automation
  • a full pipeline

Considering that this is a hobby project done in my free time, that I had very little free time last year for personal reasons, that I am not actually a developer, and that I was learning and doing most of it for the first time, I am super happy that it actually got finished. I definitely had many moments where I thought "damn, another unfinished project"... XD

High level overview #

The application is written in Python and uses Flask as the web framework. It is hosted completely on AWS using serverless technology. It is always easier to start with an architecture diagram, so here it is:

Architecture diagram

The main part is in Lambda, where the web application code is executed. The data is stored in serverless DynamoDB. In front of Lambda there is API Gateway, which together with an S3 bucket for static files is responsible for delivering all pages. In front of it all there is a CloudFront CDN distribution that handles the caching. There is one Lambda@Edge function attached to the distribution to add non-default HTTP headers to the responses. The domain is hosted in Route 53 and the certificate in Certificate Manager. Finally, there is an optional WAF in place, used for now only to whitelist access to the testing environment.
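To make the Lambda@Edge part a bit more concrete, here is a minimal sketch of what such a function could look like in Python when attached to a CloudFront response trigger. The header names and values in `SECURITY_HEADERS` are assumptions picked for illustration; this post does not list which headers the real function adds.

```python
# Hypothetical header set, chosen only for illustration.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=63072000; includeSubDomains",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
}


def lambda_handler(event, context):
    """Add extra HTTP headers to a CloudFront response event."""
    # CloudFront passes the response inside the first record of the event.
    response = event["Records"][0]["cf"]["response"]
    headers = response["headers"]

    # CloudFront expects lowercase header names as keys and a list of
    # {"key": ..., "value": ...} dicts as values.
    for name, value in SECURITY_HEADERS.items():
        headers[name.lower()] = [{"key": name, "value": value}]

    return response
```

The function only mutates the response that CloudFront hands over and returns it, so the origin (API Gateway or S3) does not need to know about these headers at all.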

Design choices #

I started this project by using Zappa but quickly found that it lacked some things that I wanted to customize. Taking the hard path forward like any true developer, I decided to just re-implement everything.
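The central piece that any such re-implementation needs is a bridge between API Gateway and the web application: the Lambda handler has to turn the incoming proxy event into a WSGI request and turn the WSGI response back into the dictionary API Gateway expects. What follows is a simplified sketch of that idea, not the actual code of this blog. It assumes the REST API proxy event format, uses a tiny stand-in WSGI app (`demo_app`) instead of Flask so that it runs without dependencies, and ignores binary (base64) bodies and multi-value headers.

```python
import io
import sys
from urllib.parse import urlencode


def demo_app(environ, start_response):
    """Stand-in WSGI app; a Flask application object is also a WSGI callable."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]


def build_environ(event):
    """Translate an API Gateway REST proxy event into a WSGI environ dict."""
    headers = event.get("headers") or {}
    body = (event.get("body") or "").encode("utf-8")
    environ = {
        "REQUEST_METHOD": event.get("httpMethod", "GET"),
        "SCRIPT_NAME": "",
        "PATH_INFO": event.get("path", "/"),
        "QUERY_STRING": urlencode(event.get("queryStringParameters") or {}),
        "SERVER_NAME": headers.get("Host", "localhost"),
        "SERVER_PORT": "443",
        "SERVER_PROTOCOL": "HTTP/1.1",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "https",
        "wsgi.input": io.BytesIO(body),
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": False,
        "wsgi.run_once": False,
    }
    # Request headers become HTTP_* keys, except the two WSGI special cases.
    for name, value in headers.items():
        key = name.upper().replace("-", "_")
        if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            environ[key] = value
        else:
            environ["HTTP_" + key] = value
    return environ


def make_handler(app):
    """Wrap a WSGI app in a function with the Lambda handler signature."""
    def handler(event, context):
        captured = {}

        def start_response(status, response_headers, exc_info=None):
            captured["status"] = int(status.split(" ", 1)[0])
            captured["headers"] = dict(response_headers)

        chunks = app(build_environ(event), start_response)
        body = b"".join(chunks).decode("utf-8")
        return {
            "statusCode": captured["status"],
            "headers": captured["headers"],
            "body": body,
        }
    return handler


lambda_handler = make_handler(demo_app)
```

Swapping `demo_app` for a Flask application object should be enough to serve real pages through the same handler, which is roughly the kind of glue that Zappa otherwise provides out of the box.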

The code #

The web application is actually the smaller part of the whole project; the majority is infrastructure as code, written as a custom Python script that can create or delete the whole infrastructure. For comparison, the web application has around 1k lines while the CLI tool has around 8k. In addition to that there is a Dockerfile for compiling Bootstrap, pipeline definitions, tests and finally the blog post content. In total, counting only code lines, there are around 13k lines of code.

I am comfortable reading several programming languages, but when it comes to writing code I feel best writing Python. Working in security there are so many opportunities to automate things and Python is best for this, so I had plenty of experience with it. And for a simple website, having Python as the programming language seems like an acceptable choice.

Django and Flask are the two main competitors for web frameworks in Python. I went with Flask's simplicity for simplicity’s sake, not because Django is a complicated framework but just because Flask is simpler.

The CLI tool to create the infrastructure was naturally also written in Python. All communication with AWS is done through the boto module. In addition to creating and destroying infrastructure, it has a few simple capabilities to manage posts locally on disk.
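As a rough idea of the shape of such a tool, here is a minimal sketch of a command line interface with `create` and `destroy` subcommands. Everything in it is hypothetical: the program name `blogctl`, the `--env` option and the function names are made up for illustration, and the functions only return a message where the real tool would talk to AWS through boto.

```python
import argparse

# Environment names follow the three environments used by this blog.
ENVIRONMENTS = ("development", "testing", "production")


def create_infrastructure(env):
    # Placeholder: a real tool would create the AWS resources here.
    return f"would create {env} infrastructure"


def destroy_infrastructure(env):
    # Placeholder: a real tool would delete the AWS resources here.
    return f"would destroy {env} infrastructure"


def build_parser():
    parser = argparse.ArgumentParser(
        prog="blogctl", description="Manage the blog infrastructure (sketch)."
    )
    subparsers = parser.add_subparsers(dest="command", required=True)
    commands = (("create", create_infrastructure),
                ("destroy", destroy_infrastructure))
    for name, func in commands:
        sub = subparsers.add_parser(name, help=f"{name} the whole infrastructure")
        sub.add_argument("--env", choices=ENVIRONMENTS, default="development")
        # Remember which function implements this subcommand.
        sub.set_defaults(func=func)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    message = args.func(args.env)
    print(message)
    return message
```

Calling `main(["create", "--env", "testing"])` dispatches to `create_infrastructure("testing")`. Keeping each subcommand as a plain function makes it easy for a pipeline step to call the same code without going through the command line.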

Git to manage blog post content #

I decided to keep the blog post content in the same repository as the actual code. That way the blog posts also get all the benefits of a version control system for free. DynamoDB is then used only to serve the current version to the public.

Posts written in Markdown #

The content of each blog post is written in Markdown, which made sense as I write my general notes in Markdown, so upgrading a good note to a blog post should be fairly easy. Posts are converted from Markdown to HTML when they are uploaded to DynamoDB.

Markdown and Git then make it super easy to collaborate with someone on a post if needed.

Gitea and Jenkins #

With the blog infrastructure completely dependent on AWS, I wanted a counterbalance and to have everything else independent of any external service. That is why I skipped GitHub and decided to locally host an open source alternative, Gitea.

I never worked as a developer, so I never really had the opportunity to create a true CI/CD pipeline at work. The tool of choice was Jenkins, not because I evaluated a bunch of pros and cons, but just because I wanted to try it out. There are some things I did not like, but in general it did the job.

Jenkins works nicely with Gitea once the Gitea plugin is installed. Not that it is really necessary, but I decided to have three main branches in the Git repository, one each for the development, testing and production environments. On a commit, Gitea notifies Jenkins with a webhook, which in turn runs the pipeline. In addition to those I have one more branch that is used only to write new blog posts.

Costs #

True costs will show themselves in the coming months, as the blog comes alive. The idea was to host the site as cheaply as possible, which is one of the reasons why I went serverless instead of, for example, running an instance. Serverless technologies like Lambda, S3 and DynamoDB have a pretty generous free tier, and in combination with CloudFront caching the cost of the whole blog is expected to be super low. Time will tell if I missed some important consideration here. :-)

Initially the biggest cost during the development of the blog was the hosted zone in Route 53, which costs $0.50 per month. However, I decided to put WAF in front of everything, as it was one of the few ways to whitelist access to an environment per source IP, and also to have the ability, if needed, to defend against random or targeted scanning that could run up the costs of all the AWS services. This increases the monthly base cost to around $7, with higher values depending on the actual traffic. And if by some miracle I get enough volume to break through the free tier, I will gladly pay for it.

Could this be simpler? #

Definitely. However, even though in the first paragraph I labeled this blog a "static site", I plan on eventually adding more logic to it, so I wanted to retain that flexibility.

The AWS infrastructure could probably also be simpler. My motivation was a combination of using cool new technology to learn how it works and being cost-efficient. Serverless seemed like a nice fit for that requirement, with very cool benefits like autoscaling. So I went with the standard CloudFront, API Gateway and Lambda combination.

Also, my current long-term TODO list has over 100 items in categories like look and feel, optimization, costs, functionality, tests, etc.

List of used technologies #

  • Bootstrap for frontend
    • jQuery just because it is a requirement for Bootstrap (can't wait to switch to Bootstrap 5)
  • Python for web application code and CLI
    • Flask as web framework
    • Jinja templating engine for HTML files
    • PynamoDB for the interface towards DynamoDB
  • AWS for the whole hosting infrastructure
    • IAM for roles and privileges
    • Lambda for web application code and Lambda@Edge for custom HTTP headers
    • DynamoDB for blog post data
    • S3 for static files
    • API Gateway for managing access to Lambda
    • CloudFront for caching
    • CloudWatch for logging
    • Certificate Manager for TLS certificate
    • Route 53 for domain name
    • WAF for whitelisting and defense
  • Markdown for writing posts
  • Docker for compiling Bootstrap and running pipeline steps
  • Jenkins as CI/CD tool
  • Git and Gitea for source code version control

Thanks! #

I would like to thank you for visiting my blog and reading the stuff that I wrote. I was thinking of writing a more detailed explanation of the three main components (web application, CLI tool and pipeline) as well as sharing the full source code of everything. Let me know if you are interested in this, either on Twitter or by emailing me; receiving some feedback would be nice motivation to actually do it.
