Work Experience (4)

Oct 2022 - Current
Head of Platform Engineering
Baania
https://www.baania.com
  • Research and implement internal data platform v2, with CI/CD covering integration and unit tests, dependencies, and code changes

  • Use Terraform with Infracost to assess infrastructure cost and prune unused/underutilized resources, saving more than 100 USD / month (FinOps)

Jan 2022 - Sep 2022
Lead Data Engineer | Machine Learning Engineer | SRE
Baania
https://www.baania.com
  • Enable backend team to perform auto-deployment to ECS with Terraform and GitHub Actions

  • Manage repo permissions / secrets / webhooks with Terraform

  • Research BI solution to consolidate fragmented dashboard platforms into a single one

  • Set up Grafana for centralized metrics and logs monitoring

  • Introduce Sourcegraph to help with code search and refactoring

  • Set up GitOps for Terraform to enable collaboration between teams and reduce configuration drift

Jan 2021 - Jan 2022
Senior Data Engineer
Baania
https://www.baania.com
  • Reduce development time for ETL pipelines from a week to a day via workflow redesign and codebase refactoring

  • Mentor data engineers

  • Advise other teams as a platform engineer

  • Set up alerts & monitoring to automatically send notifications on task failures (ChatOps)

  • Create a script to automatically grant PostgreSQL access permissions based on user groups, with an option for special permissions on a per-user basis

  • Optimize a large Spark pipeline that often failed due to OOM, using a divide-and-conquer method so it scales with data size

  • Reduce runtime for PR code quality checks by 90% to shorten the feedback loop

  • Set up secrets management using SOPS/Terraform/AWS SSM, for improved security and secrets rotation

  • Launch Baania Engineering Blog, a platform to showcase how Baania does things behind the scenes

  • Reduce onboarding time per employee by a few days via a setup script that installs the necessary tools, applications and development environment

Apr 2018 - Jan 2021
Data Engineer
Baania
https://www.baania.com
  • Create and maintain data-gathering infrastructure for daily ingestion and processing, with output stored in a data lake (S3)

  • Create and optimize machine learning models to achieve near-realtime performance

  • Create and maintain ETL pipelines via task orchestrator to increase data update frequency from once a month to daily

  • Deploy and maintain ML models via cloud services to reduce ML deployment time from a day to within minutes

  • Mentor data scientists

  • Automate infrastructure and governance using Terraform

  • Package cron services for AWS ECS and invoke them via AWS ECS Task, cutting cost from 50 USD / year to 0.1 USD / year

Projects (6)

Dataframe Frameworks Showdown
Apr 2023 - Apr 2023
https://www.karnwong.me/posts/2023/04/duckdb-vs-polars-vs-spark/
  • duckdb
  • polars
  • spark
  • dataframe
  • data engineering
  • Benchmark the performance of DuckDB, Polars and Spark, reporting RAM usage in addition to runtime.

music-lyrics-tagger
Feb 2023 - Mar 2023
https://github.com/kahnwong/music-lyrics-tagger
  • id3tag
  • ogg
  • flac
  • m4a
  • Add lyrics to flac and m4a files.

nix
Nov 2022 - Current
https://www.karnwong.me/posts/2022/12/cross-platform-package-env-management-with-nix/
  • Nix
  • environment
  • dotfiles
  • package manager
  • A cross-platform setup script that works on both Linux and macOS.

terraform-sops-ssm
Nov 2021 - Nov 2021
https://github.com/kahnwong/terraform-sops-ssm
  • terraform
  • aws
  • github
  • security
  • Create SSM secrets from SOPS-encrypted secrets.

  • Create github-ci+lambda roles and users with access to SSM.

Impute Pipelines
Nov 2019 - Dec 2019
https://www.karnwong.me/posts/2020/05/impute-pipelines/
  • Machine Learning
  • data science
  • hyperparameter tuning
  • Use machine learning to fill in missing data

  • Use hyperparameter tuning to find the optimal parameters

self-hosted
Dec 2017 - Current
https://github.com/kahnwong/self-hosted/
  • docker
  • kubernetes
  • helm
  • terraform
  • Self-hosted alternatives to popular subscription services: Netflix, Spotify, LastPass, Trello, Dropbox, NordVPN, etc.

  • Managed via docker-compose and helm

  • Use Caddy as a reverse proxy

  • Use Terraform to manage DNS via Cloudflare

Education (1)

2015 - 2018
Bachelor
Information and Communication Technology
Rangsit University

Publications

1 Feb 2019

Languages

English

Native speaker

Thai

Native speaker