String handling in ClickHouse
rushter.com

We keep a precomputed cityHash64 value for a few columns that we know will be used in aggregations. In my experience, computing it explicitly like this is faster than relying on ClickHouse to hash the values internally.
Especially in a multi-tenant architecture, it helps to calculate the cityHash64 over the combination of tenant ID and another column, which also lowers the overall amount of data scanned.
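
For anyone who wants to try this, here is a minimal sketch of one way it could look. The table name `events`, the columns `tenant_id`, `url`, `ts`, and the derived column `url_hash` are all hypothetical names for illustration, not our actual schema, and the sorting key is only an assumption.

```sql
-- Hypothetical table: every name here is an illustrative assumption.
CREATE TABLE events
(
    tenant_id UInt32,
    url       String,
    ts        DateTime,
    -- Computed once at insert time and stored on disk as a UInt64.
    -- cityHash64 accepts several arguments, so tenant ID and value
    -- are hashed together into a single key.
    url_hash  UInt64 MATERIALIZED cityHash64(tenant_id, url)
)
ENGINE = MergeTree
ORDER BY (tenant_id, ts);

-- Aggregate on the stored 64-bit key instead of the String column.
-- Only url_hash is read, since the tenant is already part of the key.
SELECT url_hash, count() AS hits
FROM events
GROUP BY url_hash
ORDER BY hits DESC
LIMIT 10;
```

One caveat: grouping on a 64-bit hash can, with very small probability, merge two distinct values that collide, so this fits best where an approximate or effectively-unique key is acceptable. If the original string is needed in the result, something like `any(url)` can be added to the SELECT, at the cost of reading the String column again.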