NULLs in ClickHouse can hurt performance | Artem Golubin
When coming from relational databases, NULLs are the go-to for optional fields.<br>Using them in ClickHouse can lead to unexpected and often unnoticeable performance degradation.<br>This article explain why.
PostgreSQL
When using null values in PostgreSQL, you rarely notice any difference.<br>In PG, columns are nullable by default and you can index them.
Internally, each row in PostgreSQL has a bitmap that indicates which columns are NULL. It's only present when<br>there are null values in a particular row using a bit flag.
PostgreSQL is a row-oriented database, so when you read a row, you read all the columns together.
ClickHouse
Unlike PostgreSQL, ClickHouse is a columnar database.<br>Instead of storing data by rows, it organizes them by columns.<br>Each column is stored separately as a contiguous block of data.
Let's suppose we have a table that stores HTTP logs.<br>We want to store the visitor's user ID, which can be empty for anonymous users.
If you are coming from OLTP databases, the first idea is to create a table with a nullable user_id column:
CREATE TABLE http_logs<br>timestamp DateTime,<br>path LowCardinality(String),<br>user_id Nullable(UInt32) CODEC(Delta, ZSTD)<br>ENGINE = MergeTree<br>ORDER BY (path, timestamp);
The user_id column is defined as nullable and uses a combination of Delta and ZSTD compression codecs.<br>Since columns are stored as chunks (partitions), compression is a very efficient way to reduce storage cost and improve performance.<br>Decoding compressed data on the fly is usually faster than reading 10x more data.
Since the user_id column is typed, it's stored as a contiguous block of memory of the same type (UInt32 in this case).<br>To avoid extra data manipulations, ClickHouse stores the data in the same format it would be loaded into RAM.<br>You can view this as creating a plain C array (uint32_t user_id[]) and storing it on a disk.
Because of that, we can't store NULL values directly. It's not possible to assign null values to integers.
There are two ways to handle NULL values:
Add metadata to each value to indicate whether it's NULL or not.
Use a separate null map to indicate which values are NULL.
The first approach requires additional storage for each value,<br>breaks CPU vectorization and adds unnecessary padding for data alignment.
The second approach, which is used by ClickHouse, is more efficient,<br>but it still requires additional space and processing to manage the map.<br>Every time a column is filtered, ClickHouse reads the map and checks which values are NULL.<br>Since typical ClickHouse deployments store millions of rows, this leads to significant overhead.
Despite a similar approach to PostgreSQL, this does not work well in ClickHouse.<br>Columnar databases are designed to scan billions of values without any indexes.<br>This is a pretty rare use case for row-oriented databases.<br>That's one of the reasons why columns are not nullable by default, and you can't use NULLs in indexes in ClickHouse.
To avoid null values, the typical solution is to replace them with a special value, such as 0 or -1.<br>In our case, zero works well, since user IDs usually start with 1.
Internals
Internally, null values are stored as a byte map (array of UInt8).<br>The map itself is just a contiguous block of values that acts as a mask.<br>If a column contains 1 million of values, the null map will also contain 1 million of elements (bytes).<br>Such a map can be compressed very well, because it contains only two distinct values (0 and 1).<br>Where most of the data is usually zero.
On the disk and in memory, it looks as follows:
user_id<br>├── values: [42, 17, 0, 3, 0]<br>└── null_map: [ 0, 0, 1, 0, 1]
In the values array, the null variables are set to type's default value (0 for integers).<br>In the null map, a value of 1 indicates that the corresponding value in NULL.<br>In our example, the third and fifth values are NULL.
As you can see, our user_id already contains zeros internally, even when the column is nullable.<br>So using zeros instead of NULLs makes sense.<br>Although, ClickHouse does not guarantee that in the values array, the null values will always be set to a default value.<br>They can contain any number.
In theory, the null map could be stored as a bitmap, where each bit represents the nullability of a value.<br>But ClickHouse developers decided to use a byte map instead.
/// Class that specifies nullable columns. A nullable column represents<br>/// a column, which may have any type, provided with the possibility of<br>/// storing NULL values. For this purpose, a ColumNullable object stores<br>/// an ordinary column along with a special column, namely a byte map,<br>/// whose type is ColumnUInt8. The latter column indicates whether the<br>/// value of a given row is a NULL or not. Such a design is preferred<br>/// over a bitmap because columns are usually stored on disk as compressed<br>/// files. In this regard, using a bitmap instead of a byte map would<br>/// greatly complicate the implementation with little...