vortiwin.blogg.se - Postgres json query comparing json fields

If we assume a 200 MiB/sec transfer rate and a 5 ms seek time (approximately a 2022 data center hard drive), going from 8 kiB pages to 64 kiB pages would mean random reads would go from about 5.04 ms to 5.4 ms, or an increase of about 7%, which means this would likely be a performance win for any application with large rows. These benchmarks show that larger pages can be better for large rows, since it would increase the amount of storage before falling off this cliff. A rough rule of thumb is that smaller pages are better for workloads that read and write small values, but larger pages are likely better for queries that scan ranges. I suspect the 8 kiB page may be the wrong default these days. However, Postgres's row length limit is pretty low.

This is a general database problem and not a Postgres problem: MySQL and others have their own performance cliffs for large values. This should reduce the performance penalty for compressed values. I didn't test it, but others have found it to use a bit more space but be signficantly faster ( 1, 2). If you are using Postgres 14 or later, you should use LZ4 compression.

In cases where you need excellent query performance, you may want to consider trying to split large JSON values across multiple rows. Compressed values makes queries take about 2× more time, and queries for values stored in external TOAST tables take about 5× more time. Never use JSON because the performance is terrible. Queries on HSTORE values are slightly faster (~10-20%), so if performance is critical and string key/value pairs are sufficient, it is worth considering. Accessing JSONB values is about 2× slower than accessing a BYTEA column. It stores parsed JSON in a binary format, so queries are efficient. Most applications should use JSONB for schemaless data. My conclusion is that you should expect a 2-10× slower queries once a row gets larger than Postgres's 2 kiB limit. This article contains some quick-and-dirty benchmark results to explore how Postgres's performance changes for the "schemaless" data types when they become large.

The same performance cliff applies to any variable-length types, like TEXT and BYTEA. Unfortunately, the performance of queries of all three gets substantially slower (2-10×) for values larger than about 2 kiB, due to how Postgres stores long variable-length data ( TOAST). Postgres supports three types for "schemaless" data: JSON (added in 9.2), JSONB (added in 9.4), and HSTORE (added in 8.2 as an extension).