Those features include parallel query processing, columnar storage (which significantly reduces the time required to scan and filter query data), and more mature query planning, optimization, and execution, which give you faster performance out of the box with minor tuning. Within the last 18 months, these technical shortcomings have primarily been addressed. The most recent release of PostgreSQL, version 13, included improvements in parallel processing and query planning. PostgreSQL, due to its excellent extensibility and enthusiastic community, now benefits from several extensions, which enhance it with more excellent parallel processing and columnar storage. So technically, we now see PostgreSQL performing on-par with most commercial data warehouse systems, but at 50% to 90% lower annual cost.
Trend three: Goodbye OLAP, hello HTAP
The definition of a data warehouse is changing. Traditionally, an analytic system would copy data from one or more sources, via extract-transform-load (ETL) programs, into a separate data warehouse database. The problems with this approach are a) cost of maintaining a separate database, b) reporting latency, and c) massive effort maintaining ETL scripts. Sometimes minutes, hours or even days pass before the data is ETL’d into the data warehouse, which means people are analyzing data that might be outdated already.
The solution to these problems is hybrid transactional/analytical processing (HTAP). Instead of having separate databases for transactions and analytics, you have a single database, which serves both needs. In the database management system market, we see two main paths to HTAP — you either add some “T” to an OLAP database, or more commonly, we see OLTP databases being enhanced with some “A.”
In PostgreSQL, for instance, HTAP is primarily enabled by columnar indexes. Some of the extensions I mentioned earlier enhance PostgreSQL with columnar indexes on its base tables, which accelerate query performance by significantly reducing I/O. This is much easier for developers, and it gives businesses the ability to analyze fresher data, even in real-time.