For most people, using SQL to transform data is a no-brainer, seeing it’s a very versatile language, and doesn’t have quite a steep learning curve compared to python. There are some cases where SQL is more suitable for a task, but the reverse can also happen as well.

For instance, given a string conversion problem:

  • if a string occurs only one time, replace it with #
  • if a string occurs multiple times, replace it with &
> one
###

> three
###&&

> Heartbreak hotel
&&&&&#&&&##&#&&#

A solution in python would be:

from collections import Counter

s = "three"
s_counter = Counter(s)

output_str = ''
for char in s:
    if s_counter[char]>1:
        output_str += '&'
    elif s_counter[char]==1:
        output_str += '#'

But a solution in SQL is…(thanks Emily @data-engineering-discord!):

CREATE TABLE data (
    string_value TEXT
);

INSERT INTO data VALUES
('one'),
('three'),
('Heartbreak hotel');


select
    string_value,
    translate(
        lower(string_value),
        string_agg(chr, '' order by chr),
        string_agg(subs, '' order by chr)
    )
from (
        select
            string_value,
            chr,
            case when count = 1 then '#' else '&' end as subs
        from (
            select
                p.*,
                count(*)
            from (
                select
                    string_value,
                    regexp_split_to_table(lower(string_value), '') as chr
                from data
            ) as p
            group by 1, 2
        ) as q
    ) as r
group by 1

😱😱😱😱😱😱😱😱😱