
Designing a schema for ClickHouse involves dozens of decisions, large and small, that determine whether the solution performs well and fits the problem being solved.

The following documents outline various schemas we have at PostHog, examining why they are designed this way, what works well about them, and what mistakes were made.

Schemas


sharded_events

The sharded_events table powers our analytics and is the biggest table we have by orders of magnitude. Its document dissects the state of the table at the time of writing, along with some potential problems and improvements. It covers the schema, including how the table is sharded by sipHash64(distinct_id), and the ORDER BY clause, whose first three parts serve the filters used by most insight queries well.
