EDB Engineering Newsletter #4
Welcome to the 4th edition of the EDB Engineering Newsletter, where we share happenings in the data world that the EDB Engineering team has enjoyed discussing, as well as other news about what the team is up to!
What we're following
New statistics framework for DataFusion
DataFusion has merged a new statistics framework that will eventually lead to improved cardinality estimation and better join ordering.
Maintainer Andrew Lamb mentions “this statistics piece is one of the last ‘missing’ pieces for more sophisticated optimizations.”
https://github.com/apache/datafusion/pull/14699
Introducing deep research
OpenAI has announced GPT-4.5 as an improved version of GPT-4o. It performs better in math, science, multilingual understanding, and coding while also demonstrating higher emotional intelligence and creativity. They also announced that GPT-5 will be released around May 2025.
Moreover, OpenAI has launched a new agentic application called Deep Research, designed for individuals engaged in intensive knowledge work in fields such as finance, science, policy, and engineering. This tool is specifically tailored for those who require thorough, precise, and reliable research. It is said to perform multi-step investigations on the Internet for complex tasks, completing in just tens of minutes what would typically take a human many hours.
GAIA is currently considered the most comprehensive benchmark for agents, and Deep Research scores 67.36% on it, which makes the tool even more impressive.
https://openai.com/index/introducing-deep-research/
QueryGPT – Natural Language to SQL Using Generative AI
The team at Uber built a system for allowing internal users to query business data with natural language.
In the Uber team's words: “With our limited release to some teams in Operations and Support, we are averaging about 300 daily active users, with about 78% saying that the generated queries have reduced the amount of time they would’ve spent writing it from scratch.”
https://www.uber.com/en-IN/blog/query-gpt/
Self-Join Elimination for Postgres
With this optimization (which will be part of Postgres 18 when it comes out), Postgres now “replaces self joins with semantically equivalent single scans”.
[This] optimization reduces the length of the range table list (this especially makes sense for partitioned relations), reduces the number of restriction clauses and, in turn, selectivity estimations, and potentially improves total planner prediction for the query.
Also, it can drastically help to join partitioned tables, removing entries even before their expansion.
https://github.com/postgres/postgres/commit/fc069a3a6319b5bf40d2f0f1efceae1c9b7a68a8
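To make "semantically equivalent single scans" concrete, here is a minimal sketch of the kind of query pair involved. It uses Python's built-in sqlite3 module purely as a stand-in to show that the two queries return the same rows; SQLite is not Postgres, and this does not exercise the Postgres planner. The users table and its columns are hypothetical, and the example assumes the join is on a unique key, so each row can only match itself.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
    INSERT INTO users VALUES (1, 'ada', 1), (2, 'bob', 0), (3, 'cy', 1);
""")

# Self join on a unique key: every row of "a" matches exactly one row
# of "b", namely itself.
self_join = """
    SELECT a.id, b.name
    FROM users a JOIN users b ON a.id = b.id
    WHERE a.active = 1
    ORDER BY a.id
"""

# The single-scan form that a self-join elimination step could
# substitute for the query above.
single_scan = """
    SELECT id, name
    FROM users
    WHERE active = 1
    ORDER BY id
"""

join_rows = conn.execute(self_join).fetchall()
scan_rows = conn.execute(single_scan).fetchall()
print(join_rows == scan_rows)
```

Queries shaped like the first one often come from ORMs or from stacking views, which is why removing the redundant join in the planner can pay off without anyone rewriting SQL by hand.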
Claude 3.7 Sonnet and Claude Code
Anthropic has introduced Claude 3.7 Sonnet, a "hybrid reasoning model" capable of solving complex problems, with improved performance in areas like math and coding. Anthropic has also introduced Claude Code, a command-line tool for agentic coding.
To benchmark its capabilities, Anthropic has been testing Claude 3.7 Sonnet on the classic Game Boy game Pokémon Red and streaming the run live on Twitch. This approach aims to demonstrate the model's advanced reasoning abilities in navigating the game's challenges.
https://www.anthropic.com/news/claude-3-7-sonnet
From the EDB team
The Immutable Future of PostgreSQL Extensions in Kubernetes with CloudNativePG
Postgres 18 may get support for custom extension directories. Gabriele Bartolini describes why this is so useful in the cloud-native world and how it works.
The Path to Vector Database Success
Jeremy Kelway discusses the importance of multi-modal vector solutions for ease of use and enterprise controls.
https://www.enterprisedb.com/blog/path-vector-database-success
Integrating EDB's AI Accelerator with Nvidia NIM
Ian Kinsey demonstrates integrating LLMs with Postgres using aidb.
https://www.enterprisedb.com/blog/integrating-edbs-ai-accelerator-nvidia-nim
Representing graphs in PostgreSQL with SQL/PGQ
Postgres will eventually support graph queries natively, no extension or third-party system needed. John Nevin grabs the current proposed patches and demonstrates how this will work in Postgres.
https://www.enterprisedb.com/blog/representing-graphs-postgresql-sqlpgq
Minimal downtime Postgres major version upgrades with EDB Postgres Distributed
Phil Eaton demonstrates in-place major version upgrades of Postgres in a Postgres Distributed cluster while the cluster remains available for reads and writes.
Postgres in the time of monster hardware
Lætitia Avrot takes a look at how Postgres is able to make use of advances in modern hardware.
https://www.enterprisedb.com/blog/postgres-time-monster-hardware
How about trailing commas in SQL?
Peter Eisentraut considers how different databases try to make queries easier to write by allowing trailing commas in various locations.
https://peter.eisentraut.org/blog/2025/02/11/how-about-trailing-commas-in-sql
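For readers who have not run into the problem, here is a small sketch of what a trailing comma looks like and how a parser that does not allow it reacts. It uses Python's built-in sqlite3 module only because it is easy to run anywhere; the table t and the accepts helper are hypothetical, and other databases may behave differently, which is the subject of the post.

```python
import sqlite3

# In-memory SQLite database; table "t" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")


def accepts(sql: str) -> bool:
    """Return True if this SQLite build parses and runs the statement."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.OperationalError:
        # SQLite reports syntax errors as OperationalError.
        return False


# A conventional select list, and the same list with a trailing comma
# left behind, for example after deleting the last column while editing.
without_comma = accepts("SELECT a, b FROM t")
with_comma = accepts("SELECT a, b, FROM t")

print(without_comma, with_comma)
```

The trailing-comma form is the one people tend to produce when reordering or commenting out columns, which is the ease-of-use argument for accepting it.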
OAuth support for Postgres
Jacob Champion merged support for connecting to a Postgres server via OAuth, completing an effort he first proposed on the mailing list in 2021.
https://github.com/postgres/postgres/commit/b3f0be788afc17d2206e1ae1c731d8aeda1f2f59
Until next time
We hope you enjoyed this edition of the EDB Engineering Newsletter! Consider joining the PostgreSQL Hacker Mentoring Discord to get involved!
The EDB Engineering Team