Research and implement v2 of the internal data platform, with CI/CD covering unit and integration tests, dependencies, and code changes
Use Terraform with Infracost to assess infrastructure costs and prune unused or underutilized resources, saving more than 100 USD / month (FinOps)
Enable backend team to perform auto-deployment to ECS with Terraform and GitHub Actions
Manage repo permissions / secrets / webhooks with Terraform
Research BI solution to consolidate fragmented dashboard platforms into a single one
Set up Grafana for centralized metrics and logs monitoring
Introduce Sourcegraph to help with code search and refactoring
Set up GitOps for Terraform to enable collaboration across teams and reduce configuration drift
Reduce development time for ETL pipelines from one week to one day via workflow redesign and codebase refactoring
Mentor data engineers
Advise other teams as a platform engineer
Set up alerts and monitoring to automatically send notifications on task failures (ChatOps)
Create a script to automatically grant PostgreSQL access permissions based on user groups, with an option for special permissions on a per-user basis
Optimize a large Spark pipeline that often failed with out-of-memory (OOM) errors, using a divide-and-conquer approach so it can scale with data volume
Reduce runtime for PR code quality checks by 90% to shorten the feedback loop
Set up secrets management using SOPS/Terraform/AWS SSM, for improved security and secrets rotation
Launch Baania Engineering Blog, a platform to showcase how Baania does things behind the scenes
Reduce onboarding time by a few days per employee via a setup script that installs the necessary tools, applications, and development environment
Create and maintain data-gathering infrastructure for daily ingestion and processing, with results stored in a data lake (S3)
Create and optimize machine learning models to achieve near-real-time performance
Create and maintain ETL pipelines via a task orchestrator to increase data update frequency from once a month to daily
Deploy and maintain ML models via cloud services, reducing deployment time from a day to minutes
Mentor data scientists
Automate infrastructure and governance using Terraform
Package cron services for AWS ECS and invoke them as ECS tasks, cutting cost from 50 USD / year to 0.1 USD / year
Benchmark DuckDB, Polars, and Spark, reporting RAM usage in addition to runtime.
Add lyrics to FLAC and M4A files.
A cross-platform setup script that works on both Linux and macOS.
Create SSM secrets from SOPS-encrypted secrets.
Create github-ci+lambda roles and users with access to SSM.
Use machine learning to fill in missing data
Use hyperparameter tuning to find optimal parameters
Self-hosted alternatives to popular subscription services: Netflix, Spotify, LastPass, Trello, Dropbox, NordVPN, etc.
Managed via docker-compose and Helm
Use Caddy as a reverse proxy
Use Terraform to manage DNS via Cloudflare