Efficient Data Logger Design
Jun 20, 2026
This article describes the data logger design used in ReductStore, a multimodal time-series storage engine. The design is not theoretical: it is battle-tested in a real project and shaped by practical constraints around durability, write throughput, indexing, and recovery.
The main idea is to keep writes simple while still making reads efficient. The same layout should work when data is stored on a local disk, on network filesystems such as NFS, or in remote object storage such as S3. That constraint affects many design choices: metadata should be small and fetchable on its own, data blocks should become immutable, and recovery should not require scanning large objects unless absolutely necessary.
Append-only logger
Let’s start with the simplest design: an append-only log.
Every time we store a new record, we write it to the end of the file. Existing records are never modified in place, which makes writes simple and predictable.
[Figure: an append-only log holding three records ("some data", "some more data", "even more data") with timestamps 2023-01-01 00:00:00, 2023-01-01 00:01:00, and 2023-01-01 00:02:00.]
Notice: The timestamp does not have to be stored as an ISO string. In a real system, it could be a Unix timestamp, a monotonic sequence number, or even a delta from the previous entry. We use ISO strings here only to keep the examples readable.
This design has a few important advantages:
Writes are sequential, which is usually efficient for storage.
Records are never overwritten, which simplifies recovery after crashes.
The format is easy to inspect and reason about.
However, this simplicity comes with limitations. As the log grows, reads become slower because we may need to scan large amounts of data. The file also grows indefinitely unless we introduce partitioning or retention rules.
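As a minimal sketch of both the write path and the scanning cost, assume a hypothetical record layout of an 8-byte timestamp and a 4-byte payload length followed by the payload. The names `append_record` and `find_record` are illustrative and are not taken from ReductStore.

```python
import io
import struct

# Hypothetical header: 8-byte timestamp (microseconds) + 4-byte payload length.
HEADER = struct.Struct("<QI")


def append_record(log, timestamp_us: int, payload: bytes) -> None:
    """Write one record at the end of the log; nothing is modified in place."""
    log.seek(0, io.SEEK_END)
    log.write(HEADER.pack(timestamp_us, len(payload)))
    log.write(payload)


def find_record(log, timestamp_us: int):
    """Linear scan from the start of the log; cost grows with the log size."""
    log.seek(0)
    while True:
        header = log.read(HEADER.size)
        if len(header) < HEADER.size:
            return None  # end of log (or a truncated tail)
        ts, size = HEADER.unpack(header)
        if ts == timestamp_us:
            return log.read(size)
        log.seek(size, io.SEEK_CUR)  # skip this payload and keep scanning


# An in-memory buffer stands in for the log file.
log = io.BytesIO()
append_record(log, 1, b"some data")
append_record(log, 2, b"some more data")
append_record(log, 3, b"even more data")
found = find_record(log, 2)
```

Every lookup walks the headers from the beginning, which is why reads slow down as the log grows.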
Our journey starts with this simple design, but as we will see, we need additional structure to build a resilient and efficient data logger.
Partitioning and pre-allocation
Append-only logs are simple, but a single ever-growing file quickly becomes inconvenient.
There are a few practical issues:
The log grows indefinitely, which makes retention and cleanup harder.
Large files are slower to scan when looking for a specific time range.
A single append position can become a bottleneck when multiple writers append concurrently.
Growing a file gradually may cause allocation overhead and filesystem fragmentation.
To address these issues, we can split the log into fixed-size blocks. Instead of writing everything into one large file, the logger writes records into the current block until it is full, then creates a new one.
[Figure: a date directory containing the blocks 2023-01-01-00:00:00.blk, 2023-01-02-00:00:00.blk, and 2023-01-03-00:00:00.blk.]
We can also pre-allocate each block. This means the logger asks the operating system to reserve a fixed amount of disk space in advance. Once the block exists, writers do not have to compete for a single append position. The logger can hand out byte ranges inside the block, and each writer can fill its assigned range independently.
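Pre-allocation could look like the following sketch. It assumes a POSIX system where `os.posix_fallocate` is available and falls back to `os.ftruncate`, which only sets the file length and may create a sparse file without reserving space. The function name `preallocate_block` and the 4096-byte size are illustrative choices.

```python
import os
import tempfile


def preallocate_block(path: str, size: int) -> None:
    """Create a new block file and reserve `size` bytes for it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        try:
            # Ask the OS to reserve the space up front (not available everywhere).
            os.posix_fallocate(fd, 0, size)
        except (AttributeError, OSError):
            # Fallback: set the length only; the filesystem may keep it sparse.
            os.ftruncate(fd, size)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "2023-01-01.blk")
    preallocate_block(path, 4096)
    allocated_size = os.path.getsize(path)
```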
[Figure: writers N1 and N2 write "some data" and "some more data" in parallel into the block 2023-01-01-00:00:00.blk, for the records timestamped 2023-01-01 00:00:00 and 2023-01-01 00:01:00.]
For example, writer N1 may receive bytes 0..1023, while writer N2 receives bytes 1024..2047. Both writes target the same block, but they write to different positions, so they can proceed in parallel without waiting for one shared append pointer to move.
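The sketch below illustrates this idea under a few assumptions: a hypothetical `RangeAllocator` hands out offsets under a short lock, and each writer then uses positional writes (`os.pwrite`, available on Unix) so no shared file cursor is involved. Only the reservation is serialized; the data writes themselves do not wait for each other.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


class RangeAllocator:
    """Hands out non-overlapping byte ranges inside one pre-allocated block."""

    def __init__(self, block_size: int):
        self._lock = threading.Lock()
        self._next = 0
        self._block_size = block_size

    def reserve(self, size: int):
        with self._lock:  # short critical section: only the bookkeeping
            if self._next + size > self._block_size:
                return None  # block is full
            offset = self._next
            self._next += size
            return offset


def write_record(fd: int, allocator: RangeAllocator, payload: bytes) -> int:
    offset = allocator.reserve(len(payload))
    if offset is None:
        raise RuntimeError("block is full")
    written = 0
    while written < len(payload):  # pwrite may write fewer bytes than asked
        written += os.pwrite(fd, payload[written:], offset + written)
    return offset


def demo():
    """Two writers fill bytes 0..1023 and 1024..2047 of the same block."""
    with tempfile.TemporaryDirectory() as tmp:
        fd = os.open(os.path.join(tmp, "block.blk"), os.O_RDWR | os.O_CREAT, 0o644)
        try:
            os.ftruncate(fd, 2048)
            allocator = RangeAllocator(2048)
            payloads = [b"a" * 1024, b"b" * 1024]
            with ThreadPoolExecutor(max_workers=2) as pool:
                offsets = list(
                    pool.map(lambda p: write_record(fd, allocator, p), payloads)
                )
            data = [os.pread(fd, 1024, off) for off in offsets]
            return offsets, data, payloads
        finally:
            os.close(fd)
```

Which writer receives which range depends on scheduling; what matters is that the ranges never overlap.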
Notice: This approach has an important constraint: the logger must know the record size at the beginning of the write operation. If data arrives as an open-ended stream, the logger first has to receive the full record and only then assign a byte range for it. That means it cannot start writing that record in parallel with other records while the stream is still being received.
When the current block reaches its maximum size, the logger closes it and starts a new block.
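A rough sketch of this rotation logic follows. It assumes that a new block is named after the timestamp of its first record, which the text does not state explicitly, and it tracks only names and offsets rather than touching files. `BlockRotator` and `place` are hypothetical names.

```python
from datetime import datetime

NAME_FORMAT = "%Y-%m-%d-%H:%M:%S.blk"


class BlockRotator:
    """Decides which block a record goes to and at which offset."""

    def __init__(self, max_block_size: int):
        self.max_block_size = max_block_size
        self.current_name = None
        self.used = 0
        self.closed = []  # names of blocks that are full and no longer written

    def place(self, timestamp: datetime, size: int):
        if size > self.max_block_size:
            raise ValueError("record does not fit into a single block")
        if self.current_name is None or self.used + size > self.max_block_size:
            if self.current_name is not None:
                self.closed.append(self.current_name)  # close the full block
            # Assumption: the block is named after its first record.
            self.current_name = timestamp.strftime(NAME_FORMAT)
            self.used = 0
        offset = self.used
        self.used += size
        return self.current_name, offset


rotator = BlockRotator(max_block_size=100)
first = rotator.place(datetime(2023, 1, 1), 60)
second = rotator.place(datetime(2023, 1, 2), 60)  # does not fit, rotates
third = rotator.place(datetime(2023, 1, 2, 0, 1), 30)
```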
Each block has a timestamp in its name, which makes it easier to locate data by time range. For example, if we need records from 2023-01-01 00:30:00, we can start by opening the block that covers that time window instead of scanning the entire log.
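One way to sketch this lookup is a binary search over the block start times parsed from the file names. The name format is taken from the example block names above; the function `find_block` is illustrative, and it assumes a block covers the time from its own timestamp up to the start of the next block.

```python
import bisect
from datetime import datetime

NAME_FORMAT = "%Y-%m-%d-%H:%M:%S.blk"


def find_block(block_names, when: datetime):
    """Return the name of the block whose time window contains `when`."""
    starts = sorted(datetime.strptime(name, NAME_FORMAT) for name in block_names)
    # Index of the last block that starts at or before `when`.
    i = bisect.bisect_right(starts, when) - 1
    if i < 0:
        return None  # `when` is older than the first block
    return starts[i].strftime(NAME_FORMAT)


blocks = [
    "2023-01-01-00:00:00.blk",
    "2023-01-02-00:00:00.blk",
    "2023-01-03-00:00:00.blk",
]
hit = find_block(blocks, datetime(2023, 1, 1, 0, 30))
```

The search only narrows the lookup down to one block; finding the record inside that block is a separate problem.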
This design improves write organization and makes reads more manageable, but it introduces a new constraint: the logger needs to know where each record is located inside the block. That means we need metadata describing record offsets, sizes, and states.
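To make the requirement concrete, here is a hypothetical shape for such per-record metadata. The field names and the three states are assumptions for illustration only, not the format ReductStore uses.

```python
from dataclasses import dataclass
from enum import Enum


class RecordState(Enum):
    # Hypothetical states; a real format may define different ones.
    STARTED = 0   # byte range reserved, write in progress
    FINISHED = 1  # payload fully written
    ERRORED = 2   # write was interrupted or failed


@dataclass
class RecordEntry:
    timestamp_us: int
    offset: int  # where the record starts inside the block
    size: int    # how many bytes it occupies
    state: RecordState = RecordState.STARTED


def readable_range(entry: RecordEntry):
    """Return the (begin, end) byte range, but only for fully written records."""
    if entry.state is not RecordState.FINISHED:
        return None
    return entry.offset, entry.offset + entry.size


entry = RecordEntry(timestamp_us=1_672_531_200_000_000, offset=1024, size=1024)
before = readable_range(entry)  # still being written
entry.state = RecordState.FINISHED
after = readable_range(entry)
```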
Block Descriptor
At this point we have fixed-size blocks, and we can write records efficiently by assigning byte ranges inside a pre-allocated block. However, the block itself is still just a sequence of bytes. If we want to read a specific record later, we need to know where that record starts, how large it is, and whether it was fully written.
Without extra metadata, the reader has only two options:
Scan the whole block and parse records one by one.
Guess offsets based on external information, which is fragile and hard to recover from after a crash.
This...