Book highlights - Managing Transitions: Making the Most of Change by William Bridges

It isn’t the changes themselves that the people in these cases resist. It’s the losses and endings that they have experienced and the transition that they are resisting. A large university reassigned one of its vice presidents to a far less important area than the one he had previously headed, and although no one called it a demotion, it was hard to see it as anything else. Everyone knew that he had been ineffective in his previous job, and his new job actually fit his talents far better, but he was deeply hurt by the move....

April 27, 2023 · 2 min · Karn Wong

Use SQL against CSV (or other hard files) without CLI

CSV as a file format is very versatile: almost any program can parse it. The only issue is that you can’t use SQL against CSV files directly. This is a major pain point, since using SQL is so much faster than firing up a Jupyter notebook and wrangling the data in Python, or using Excel and applying transformations until you get the desired results. But the question is how do we use SQL against CSV files in the first place....

April 25, 2023 · 2 min · Karn Wong

DevX starts at your local machine

Platform engineering is all the rage these days. You’ll often hear the term alongside the keyword DevX. How are they related? Imagine you are working on a microservice backend. You are just starting out, so you don’t have many features to work on yet. But as a PoC, you only need to [fetch data] and [return aggregated price]. You can do microservices on Kubernetes, but you are not familiar with DevOps so you turn to a cloud provider - AWS....

April 22, 2023 · 4 min · Karn Wong

The mythical ChatOps in action

Imagine having multiple services running, each with its own logs. Most people don’t read them, and they shouldn’t, because services emit a lot of logs! But we need them, because they are the only way to diagnose and troubleshoot system errors. But you might say “my service is not a system! It’s only doing tiny stuff!” Hate to break it to you, but your small part is one piece of a large network of systems stitched together!...

April 18, 2023 · 3 min · Karn Wong

DuckDB vs Polars vs Spark!

I think everyone who has worked with data, in any role or function, has used pandas 🐼 at some point. I first used pandas in 2017, so it has been 6 years already. Things have come a long way, and so has the size of the data I’m working with! Pandas has its own issues, namely no native support for nested schemas. In addition, it’s very heavy-handed regarding data type inference. That can be a blessing, but it’s a bane for data engineering work, where you have to make sure that your data conforms to an agreed-upon schema (hello data contracts!...

April 7, 2023 · 3 min · Karn Wong