EDB Engineering Newsletter #9
Welcome to the 9th edition of the EDB Engineering Newsletter, where we share happenings in the data world that the team has enjoyed discussing, as well as other news about what the EDB Engineering team is up to.
Analysis we're following
A distributed systems reliability glossary
Antithesis teamed up with Jepsen to produce this useful distributed systems reliability glossary.
https://antithesis.com/resources/reliability_glossary/
When to Hire a Computer Performance Engineering Team
Giving Benchmarks a Boat
Justin Jaffray writes about common mistakes developers make when benchmarking with TPC-C.
https://buttondown.com/jaffray/archive/giving-benchmarks-a-boat/
Inverse Scaling in Test-Time Compute
Researchers from Anthropic's Fellows Program (and partner institutions) identified five distinct failure modes where extending reasoning length in Large Reasoning Models (LRMs) actually deteriorates performance:
Claude models become increasingly distracted by irrelevant information during extended reasoning
OpenAI o-series models resist distractors but overfit to problem framings
Models shift from reasonable priors to spurious correlations in regression tasks
All models struggle with maintaining focus during complex deductive reasoning
And extended reasoning amplifies concerning behaviors like self-preservation expressions in Claude Sonnet 4.
The research demonstrates that while test-time compute scaling remains promising for general capability improvement, naive scaling can amplify flawed problem-solving strategies rather than refining them. This creates a critical alignment gap between short and extended reasoning, suggesting that current training regimes may inadvertently incentivise models to apply extended reasoning incorrectly, potentially reinforcing problematic patterns rather than improving performance.
https://arxiv.org/pdf/2507.14417
Postgres Replication Slots: Confirmed Flush LSN vs. Restart LSN
Gunnar Morling wrote a great article demonstrating the difference between, and the rationale behind, confirmed_flush_lsn and restart_lsn in Postgres logical replication.
https://www.morling.dev/blog/postgres-replication-slots-confirmed-flush-lsn-vs-restart-lsn/
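To make the two columns a little more concrete, here is a minimal sketch (not taken from the article) of how to measure the distance between them. Postgres prints an LSN as two hexadecimal numbers separated by a slash, which together form a 64-bit byte position in the WAL. The helper names and the sample values below are hypothetical; the values are only shaped like a row from the pg_replication_slots view.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_gap_bytes(restart_lsn: str, confirmed_flush_lsn: str) -> int:
    """Bytes of WAL between a slot's restart_lsn and its confirmed_flush_lsn."""
    return parse_lsn(confirmed_flush_lsn) - parse_lsn(restart_lsn)


# Hypothetical values, not real output from any server.
slot = {"restart_lsn": "0/1A000028", "confirmed_flush_lsn": "0/1A000110"}

gap = slot_gap_bytes(slot["restart_lsn"], slot["confirmed_flush_lsn"])
print(gap)  # prints 232
```

Inside Postgres itself, the built-in pg_wal_lsn_diff function performs the same subtraction, so the Python version is mainly useful for scripts that only see the textual LSNs.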
AccountingBench: Evaluating LLMs on Real Long-Horizon Business Tasks
An experiment exploring whether frontier models can close the books for a real SaaS company.
https://accounting.penrose.com/
News we’re watching
Amazon AI coding agent hacked to inject data wiping commands
Amazon’s popular VS Code extension Q, a generative AI-powered code assistant, was compromised using a clever AI-driven attack: a prompt. The apparent intent was to bring attention to the state of AI security. The incident highlights how novel AI-driven attacks are evolving. Hackers are not just leveraging popular generative AI tools for data exfiltration and system disruption, but also for targeted messaging.
Phil Alger, EDB Staff Security Program Manager, commented on the implications of this incident:
Open-source and AI-assisted coding tools are particularly attractive targets due to their widespread use and extensive impact. As we adopt AI tools into our productivity workflows, we must be mindful of their inherent risks. Keep in mind this incident happened to an Amazon product, an extremely well-funded and mature organization. This situation is even more poignant for many startups in this space, as they are still trying to find product-market fit and often have less rigor or oversight within their program.
New model releases
OpenAI’s new GPT-5 uses a unified architecture that switches between fast replies and deep reasoning via a real-time router. It posts strong results—94.6% AIME (math), 74.9% SWE-bench (code), 84.2% MMMU, 46.2% HealthBench—and GPT-5 Pro scores 88.4% on GPQA. Highlights include better complex app builds, richer creative writing, and more contextual health advice. Access is tiered, with Pro offering extended reasoning.
Cohere has launched Command A Vision, a 112B-parameter dense multimodal model that combines strong visual understanding with advanced text capabilities. Built on top of Command A, this new model excels at enterprise-relevant vision tasks like OCR, document analysis, and risk detection. Benchmarked against GPT-4.1, Llama 4 Maverick, and Pixtral Large, it leads in charts, math reasoning (MathVista 73.5%), and document tasks.
Google launched Google Earth AI, a suite of geospatial AI models and datasets that powers features in Google Search and Maps and delivers insights via Google Earth, Maps Platform, and Google Cloud to support data-driven decision-making on a planetary scale.
Moonshot AI launched Kimi K2, a 1-trillion parameter MoE model with 32B active parameters, specifically optimized for agentic tasks beyond traditional chat applications.
Z.ai released GLM-4.5-Base, GLM-4.5, and GLM-4.5 Air under an MIT license, making them a significant open-weight release. The models favor a deeper architecture over wider networks and were trained on 15T general tokens plus 7T code/reasoning tokens. Independent testing shows GLM-4.5 Air running effectively on consumer hardware (M4 Mac with 128GB RAM), with quantized versions requiring as little as 48GB, making high-quality reasoning models increasingly accessible for local deployment.
Google DeepMind, with academic partners, introduced Aeneas, the first AI model tailored for analysing ancient Latin inscriptions. Aeneas successfully contributed to resolving debates like the dating of Res Gestae Divi Augusti.
Of particular interest to developers, Z.ai's GLM-4.5 Air and OpenAI's GPT-OSS-20B are lightweight models that can run on consumer-grade devices.
A 64GB MacBook Pro M2 has been reported to run the GLM-4.5 Air model well enough to write Space Invaders in JavaScript. https://simonwillison.net/2025/Jul/29/space-invaders
The gpt-oss-20B model can also run on Apple Silicon, including M2 Max devices, though it requires at least 16 GB of total RAM. https://9to5mac.com/2025/08/06/how-to-run-gpt-oss-20b-on-mac
From the EDB team
Tomáš Vojtášek contributed a patch to the semantic-release project that took minutes off an EDB CI job and, according to another commenter on the page, reduced a fetch step in their job from 30 minutes to 90 seconds. Way to go, Tomáš!
https://github.com/semantic-release/semantic-release/pull/3732
Creating a custom container image for CloudNativePG v2.0
Jonathan Gonzalez explains how to use Docker Bake to build custom images for CloudNativePG.
https://cloudnative-pg.io/blog/building-images-bake/
How Do We Test PGD
Bharat Telange and Amruta Deolasee speak about how the team tests EDB Postgres Distributed.
https://www.enterprisedb.com/blog/how-do-we-test-pgd-0
CloudNativePG Contributor Spotlight: Jaime Silvela
Floor Drees continued her series interviewing contributors to CloudNativePG, this time speaking with Jaime Silvela.
https://cloudnative-pg.io/blog/contributor-highlight-jaime-silvela/
EDB Postgres Distributed: Understanding Conflicts and Their Resolution
Vaijayanti Bharadwaj writes about how conflicts work, and how to handle them, in EDB Postgres Distributed.
What is OpenTelemetry and what can it do for me?
Peter Wilson, Craig Ringer, and Dave Lawson teamed up to write about the benefits, challenges, and future of OpenTelemetry, particularly with respect to Postgres observability.
https://www.enterprisedb.com/blog/what-opentelemetry-and-what-can-it-do-me
Stack traces for Postgres errors with backtrace_functions
Phil Eaton writes about how to get stack traces for Postgres ERROR logs with the backtrace_functions GUC.
https://www.enterprisedb.com/blog/stack-traces-postgres-errors-backtracefunctions
CloudNativePG part of LFX Mentorship again - sign up as a mentee!
CloudNativePG joined the LFX Mentorship Program, run by the Cloud Native Computing Foundation (CNCF), for the first time in the June cohort and had a fantastic experience. We’re excited to return for Term 3 starting in September, this time with three proposed projects!
https://cloudnative-pg.io/blog/2025-term3-lfx-cncf-mentorship/
Work with us at EnterpriseDB
Principal Engineer (Remote in India)
Software Engineer II (EMEA)
Senior Database Consultant, Spanish (EMEA, UK)
Associate, Data Analytics & AI (Remote in India)
Or see all openings.
Until next time
We hope you enjoyed this edition of the EDB Engineering Newsletter! Consider joining the PostgreSQL Hacker Mentoring Discord to get involved!
The EDB Engineering Team
The Data Migration and Integration team sharing a meal at their offsite in Madrid earlier this month. ❤️