Use SSH key during docker build without embedding the key via ssh-agent

Imagine working at a company that has a super cool internal module! The module works great, except that it is private, which means you need to clone the source repo and install it from source. That shouldn’t be an issue if you work on your local machine. But for production, this usually means you somehow need to bundle this awesome module into your docker image....

February 6, 2022 · 2 min · Karn Wong

Use pyspark locally with docker

For data that doesn’t fit into memory, Spark is often a recommended solution, since it can utilize map-reduce to work with data in a distributed manner. However, setting up local Spark development from scratch involves multiple steps, and is definitely not for the faint of heart. Thankfully using docker means you can skip a lot of steps 😃 Instructions Install Docker Desktop Create docker-compose.yml in a directory somewhere version: '3....

December 21, 2021 · 3 min · Karn Wong

Reduce docker image size with alpine

Creating scripts is easy. But creating a small docker image is not 😅. Not all Linux flavors are created equal: some are bigger than others. This difference is crucial when it comes to reducing docker image size. A simple bash script docker image Given a Dockerfile (change apk to apt for ubuntu): FROM alpine:3 WORKDIR /app RUN apk update && apk add jq curl COPY water-cut-notify.sh ./ ENTRYPOINT ["sh", "/app/water-cut-notify.sh"] Base image Docker image size alpine 11....

December 19, 2021 · 1 min · Karn Wong

Secrets management with SOPS, AWS SSM and Terraform

At my organization we use sops to check encrypted secrets into git repos. This solves the problem of plaintext credentials in version control. However, say you have 5 repos using the same database credentials: rotating secrets means you have to go into each repo and update the SOPS credentials manually. It is also worth noting that, for GitHub Actions, authenticating to AWS means you have to add repo secrets. This means for all the repos you have CI enabled, you have to populate the repo secrets with AWS credentials....

November 30, 2021 · 4 min · Karn Wong

Run GitHub Actions faster with cache for pipenv and docker build

Update 2021-11-29 Recently we have been creating more PRs, and I noticed that there are a lot of redundant steps (env setup before triggering checks, etc). I found out you can cache steps in GitHub Actions, so I did some research. I got it working, and it turns out I reduced Actions time by at least 60% for a large docker image build (since only the later RUN directives change frequently). For pipenv it shaved off 1 minute 18 seconds....

November 9, 2021 · 1 min · Karn Wong