Sunday, October 15, 2017

amcheck for Postgres 9.4+ now available from PGDG apt and yum repositories

amcheck, a tool for index corruption detection, now has packages available from the community Debian/Ubuntu apt repository, as well as packages from the community Redhat/CentOS/SLES yum repository.

This means that installations built on those community resources can easily install amcheck, even on PostgreSQL versions before PostgreSQL 10, the release that contrib/amcheck actually first appears in.

Full details on installing these packages are available from the README:

It's also possible to install the packages on PostgreSQL 10, because the extension these packages install is actually named "amcheck_next" (not "amcheck"). Currently, it isn't really useful to install "amcheck_next" on PostgreSQL 10, because its functionality is identical to contrib/amcheck. That's expected to change soon, though. I will add a new enhancement to amcheck_next in the coming weeks, allowing verification functions to perform "heap matches index" verification on top of what is already possible.

Many thanks to Christoph Berg and Devrim Gündüz for their help with the packaging.

Tuesday, July 18, 2017

PostgreSQL Index bloat under a microscope

I've posted a snippet query to the PostgreSQL Wiki that "summarizes the keyspace" of a target B-Tree index. This means that it displays which range of indexed values belong on each page, starting from the root. It requires pageinspect. The query recursively performs a breadth-first search. Along the way, it also displays information about the space utilization of each page, and the number of distinct key values that actually exist on the page, allowing you to get a sense of how densely filled each page is relative to what might be expected.

The query is available from:

Tuesday, May 10, 2016

amcheck: Verify the logical consistency of PostgreSQL B-Tree indexes

I've created a project page on Github for amcheck, a tool for verifying the logical consistency of PostgreSQL B-Tree indexes:

The tool is primarily useful for detecting index corruption in production database systems. It can do this with low overhead; most verification requires only a non-disruptive lock on the index as it is verified. The strength of the lock taken on an index as it is verified matches that of simple SELECT statements (unless the highest level of verification is requested). The locking involved will generally not block concurrent reads or writes, and will not prevent VACUUM from running concurrently.

Saturday, November 14, 2015

Suggesting a corrected column name/spelling in the event of a column misspelling

One small PostgreSQL 9.5 feature I worked on is the new hinting mechanism feature, which sometimes hints, based on a score, what you might have meant to type following misspelling a column name in an SQL query. The score heavily weighs Levenshtein distance. A HINT message is sent to the client, which psql and other client tools will display by default.

It's common to not quite recall offhand if a column name is pluralized, or where underscores are used to break up words that make up the name of a column. This feature is targeted at that problem, providing guidance that allows the user to quickly adjust their query without mental context switching. For example:

postgres=# select * from orders where order_id = 5;
ERROR:  42703: column "order_id" does not exist
LINE 1: select * from orders where order_id = 5;
HINT:  Perhaps you meant to reference the column "orders"."orderid".