The latest version of immudb, the open source immutable database, comes with many improvements around instrumentation and observability. For this release we’ve put a lot of effort into optimizing our btree (our internal data structure) implementation and were able to achieve 22% higher indexing performance and a 22% reduction in disk usage.
Working on performance is usually tricky. When we reason about the performance of our code, intuition rarely reveals the true bottlenecks, no matter how well we think we know the codebase. Therefore, it is crucial to have good visibility into what's happening in a system running on real production data.
immudb exposes many internal metrics through its Prometheus metrics endpoint. You can easily query it with curl or in your browser:
$ curl -s http://localhost:9497/metrics | head
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.3355e-05
go_gc_duration_seconds{quantile="0.25"} 1.3615e-05
go_gc_duration_seconds{quantile="0.5"} 1.9991e-05
go_gc_duration_seconds{quantile="0.75"} 3.0348e-05
go_gc_duration_seconds{quantile="1"} 3.3859e-05
go_gc_duration_seconds_sum 0.000151623
go_gc_duration_seconds_count 7
# HELP go_goroutines Number of goroutines that currently exist.
Embedding Prometheus monitoring in a Go application exposes a ton of internal Go runtime metrics out of the box. This alone is a gold mine for performance tuning. However, we've also added dedicated metrics to get true visibility into what's going on inside immudb, especially when observing and tuning btree performance.
Of course, querying metrics with a simple curl command is not a very practical solution. Fortunately, there are very robust tools for visualizing such values in an understandable and visually appealing way.
Starting with immudb 1.2.3, we've added a Grafana dashboard that visualizes some key immudb metrics. The dashboard can be downloaded from https://github.com/codenotary/immudb/blob/master/tools/monitoring/grafana-dashboard.json. We will be updating it with additional metrics in future releases.
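To feed that dashboard, Prometheus first needs to scrape immudb's metrics endpoint. A minimal scrape configuration, assuming the default metrics port 9497 shown in the curl example above, could look like this:

```yaml
# Minimal sketch of a Prometheus scrape job for immudb.
# Adjust the target host/port to match your deployment.
scrape_configs:
  - job_name: "immudb"
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:9497"]
```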
Let’s take a closer look at the available charts.
Database Size Section
This section shows general information about database size and growth over time.
Database Size / Database Growth
These two charts show the amount of disk space used by each database and how it grows over time. They can be used to forecast disk usage and make sure it's not getting out of control.
Stored Entries / Stored Entries Rate
These charts show the number of new DB entries and the insertion rate. The number of new entries is counted since the most recent database restart.
Indexer Statistics
immudb has a unique feature: asynchronous indexing. It can ingest data at very high speed and index it in the background. Not-yet-indexed data cannot be easily queried (it can still be read by specifying an explicit transaction ID), so ensuring that indexing keeps up with the data insertion rate is crucial for keeping the application healthy.
Indexed %
This graph shows the percentage of all transactions in the database that have been indexed so far. In a healthy situation, this chart should remain at or close to 100%. If this value starts dropping, the data ingestion rate is higher than the indexing rate, indicating that additional rate limiting should be applied to database writers.
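Beyond eyeballing the chart, the same health check can be encoded as a Prometheus alert rule. Note that the metric names below are illustrative placeholders only; check your immudb instance's /metrics output for the exact names your version exposes:

```yaml
# Illustrative sketch: the metric names are placeholders, not immudb's
# real metric names. Verify them against your /metrics output.
groups:
  - name: immudb-indexing
    rules:
      - alert: ImmudbIndexerLagging
        # Fires when less than 95% of committed transactions are indexed
        # for a sustained 10-minute window.
        expr: 100 * my_indexed_tx_total / my_committed_tx_total < 95
        for: 10m
```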
Indexing / Commit rate
This chart shows the rate of new transactions added to the database and the rate at which those transactions are indexed. If the indexing rate is lower than the commit rate, the database isn't keeping up with indexing. In applications where only synchronous writes are performed, or where data can be indexed immediately, the indexing rate line (Idx) and the commit rate line (Cmt) will overlap.
TRXs Left to Index
This chart shows the number of transactions waiting to be indexed. This value should be close to zero or trending downward.
TRX Count
This chart shows the total number of transactions in the database.
Btree Cache Size / Btree Cache Hit %
These two charts show internal statistics about the immudb btree cache. To avoid reading large amounts of data on every btree operation, immudb keeps an in-memory cache of recently used btree nodes. The first chart shows the number of nodes in the cache, which is capped at the maximum number of cache entries.
The second chart shows how effective the cache is by presenting the percentage of btree node lookups that were served from the cache. For small databases, this hit ratio will very likely be close to 100%, but it will drop as the amount of data increases. There's no single rule for what value to expect here: in our internal tests, even a 40% cache hit ratio with randomly distributed keys still yielded very good performance. To get higher cache utilization, applications should prefer keys that are close to each other, such as sequentially increasing numbers, where newly inserted data ends up in a part of the btree close to previously accessed entries.
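As a small sketch of what "keys close to each other" means in practice: fixed-width, zero-padded keys make lexicographic (btree) order match insertion order, so consecutive writes land in neighbouring leaf nodes and keep the node cache hot. The key format below is purely illustrative, not an immudb convention:

```go
package main

import (
	"fmt"
	"sort"
)

// makeKey builds a fixed-width, zero-padded key so that lexicographic
// (btree) order matches numeric insertion order. Consecutive inserts
// therefore touch neighbouring leaf nodes, which keeps the cache hot.
// The "event:" prefix is an arbitrary example.
func makeKey(n int) string {
	return fmt.Sprintf("event:%010d", n)
}

func main() {
	keys := make([]string, 0, 5)
	for n := 0; n < 5; n++ {
		keys = append(keys, makeKey(n))
	}
	// The keys are already in btree order: sorting would change nothing.
	fmt.Println(sort.StringsAreSorted(keys)) // prints: true
	fmt.Println(keys[0])                     // prints: event:0000000000
}
```

By contrast, random UUID-style keys scatter writes across the whole tree, which is the access pattern that drove the hit ratio down to ~40% in the tests mentioned above.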
Btree Insights
This section of the dashboard contains a detailed view into btree internals.
Btree Depth
This chart shows the depth of the tree. Since btrees are self-balancing data structures, the depth grows logarithmically with the number of entries. The depth of the tree indicates how many nodes are traversed by each btree operation.
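As a rough back-of-the-envelope illustration (not immudb's actual formula), the depth of a btree with fanout f holding n entries is on the order of log_f(n), which is why the chart flattens out even as the database keeps growing:

```go
package main

import (
	"fmt"
	"math"
)

// expectedDepth estimates btree depth for n entries and a given fanout:
// depth grows with log_fanout(n). This is a textbook approximation,
// not immudb's internal computation.
func expectedDepth(n, fanout float64) float64 {
	return math.Ceil(math.Log(n) / math.Log(fanout))
}

func main() {
	// With a fanout of around 40, even a billion entries
	// stay within a handful of levels.
	for _, n := range []float64{1e3, 1e6, 1e9} {
		fmt.Printf("%.0e entries -> depth ~%v\n", n, expectedDepth(n, 40))
	}
}
```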
Btree Child Node Count Distributions
These graphs show the distribution of the number of child nodes. In a healthy btree like the one below, the child node count should be concentrated around a single value (40 in the example). The child node count should also stay at sane levels: values below 10 or above a few hundred are a good indication that the btree isn't performing well, and the application should consider using shorter, more uniformly sized keys for its data.
Note: these statistics are gathered while traversing the btree; if there's no activity in the database, the distribution will be flat.
Flush Statistics
immudb keeps recent btree changes in memory to reduce the amount of data written to disk. To persist those changes, a btree flush process is triggered once a threshold of new and modified entries is reached.
These metrics are calculated for nodes (both inner and leaf ones) and separately for KV entries in the leaf nodes.
The flush rate shows the number of nodes/entries written per second. It clearly shows where each flush process started and where it ended.
The flush progress metric for each flush cycle starts at zero and climbs to the total number of entries processed during that single flush operation. The next flush repeats the process, starting from zero and reaching its maximum value again. By looking at those maximum values, we can see how much data needs to be written to disk during flush operations. During normal DB operation, this should be steady over time. Unbounded growth of those maximums could indicate that the flush operation is too aggressive and the threshold should be adjusted.
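The flush cycle described above can be sketched as a simple accumulate-and-flush buffer. This is an illustration of the pattern only, not immudb's actual implementation:

```go
package main

import "fmt"

// writeBuffer is a simplified illustration of the flush pattern:
// changes accumulate in memory and are written out in one batch once
// a threshold is reached. It is NOT immudb's real btree code.
type writeBuffer struct {
	pending   []string
	threshold int
	flushes   int // how many flush cycles have completed
}

func (b *writeBuffer) add(entry string) {
	b.pending = append(b.pending, entry)
	if len(b.pending) >= b.threshold {
		b.flush()
	}
}

func (b *writeBuffer) flush() {
	// In immudb this is where modified btree nodes would be appended
	// to the on-disk data stream.
	b.flushes++
	b.pending = b.pending[:0] // progress resets to zero for the next cycle
}

func main() {
	b := &writeBuffer{threshold: 100}
	for i := 0; i < 250; i++ {
		b.add(fmt.Sprintf("entry-%d", i))
	}
	fmt.Println(b.flushes, len(b.pending)) // prints: 2 50
}
```

This is why the progress metric saws between zero and the threshold: each cycle resets and climbs again, and a steadily rising peak means each flush is handling more data than the last.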
Compaction Statistics
Similarly to flush statistics, immudb exposes the same set of values for full compaction.
Note: these values are gathered for full compaction, which completely rewrites the btree structure. immudb 1.2.3 introduced a new online compaction mode that gradually removes unused btree data during the flush operation. This new compaction mode is not included in these charts.
Data Size for Btree Nodes
immudb internally uses append-only files to store data, including btree nodes. Existing data on disk is never replaced; instead, a new, modified version is appended at the end of the data stream. immudb 1.2.3 introduced online compaction of btree data, which deals with the garbage that naturally accumulates at the beginning of that data stream over time. This chart can be used to verify that the amount of data used for btree nodes shrinks whenever a cleanup operation is performed.
S3 Statistics
This section is in its early days, but it already gives some basic insight into data performance when using immudb with an AWS S3-backed data store. The first chart shows a histogram of single S3 object read durations; the second shows the inbound traffic needed for immudb operations. If those values are high, you should consider switching back to local disk or to an alternative remote storage back-end such as EBS volumes.
We hope you enjoyed this tour through the instrumentation and internals of immudb. Join us on our Discord if you have any questions or are deploying immudb and need help. If you're a data geek, a Go lover, or enjoy working on some of the world's most interesting problems, join our team. As always, don't forget to ⭐️ immudb on GitHub!