This is an extensible "object-relational" database system, the most sophisticated and probably the best supported of the free databases. It has been chosen as Linux Journal "Editors' Choice" in 2001 , 2003 , 2004 , 2005 .
At one time, this was the Postgres research RDBMS engine, by Michael Stonebraker (creator of Ingres, one of the first commonly-available relational databases, more recently a "senior technical dude" at Informix , of late teaching at MIT.). People liked Postgres enough that some people decided to make the query language SQL-compliant, at which point, in keeping with other product names at the time, was called PostgreSQL-95 . Subsequent work dropped the "best before" dating.
There was a commercialized version of Postgres that was called Illustra; the company was bought out by Informix, and its object-oriented features have been integrated into Informix Universal Server. People occasionally try to draw out "family inheritance" in the code bases between PostgreSQL, Informix, and Ingres due to Stonebraker's involvement with them all, while there may be some internal similarities, the respective projects were considerably independent from one another, and real similarities aren't likely to be overly conspicuous.
More " free third-party tools" have been created for PostgreSQL than any of the other free databases, and include:
Support for access from numerous languages such as Perl , Python, Ruby , Tcl, Common Lisp , Ada , Eiffel , Mono , and standards-oriented APIs in the forms of ODBC and JDBC support,
This allows applications written in numerous languages to access PostgreSQL-based data.
A project providing a central repository of prepackaged PostgreSQL software for several operating systems.
Unlike most vendors, PostgreSQL's makers are prepared to document what it won't do at this point.
General Bits is a column based on the PostgreSQL mailing list pgsql-general.
There's a Linux Database HOWTO on PostgreSQL. You can find it sold at Amazon as the PostgreSQL HOWTO book. The author waxes more than a little overevangelistic/overenthusiastic about PostgreSQL, vastly overstating its advantages, and happily insulting alternatives as being not even worth thinking about. He makes the same sorts of claims that allow Project Gutenberg proponents to claim that their work is worth "billions of dollars," when the fact that the public values them enough to donate a few tens of thousands of dollars suggests that they may be overvaluing things a mite. He has a HOWTO on "building your own CPU" that has similarly overoptimistic valuation of the merits of designing a "Free CPU" (in the free software sense).
There are also books, including a freely available PostgreSQL book.
And see the Tutorial on Altering PostgreSQL table columns,
PostgreSQL : Referential Integrity Tutorial & Hacking the Referential Integrity tables, and
Other PostgreSQL Technical Documentation, as well as
There is now a PostgreSQL Commercial Web Site at which commercial support for the use of PostgreSQL may be obtained.
Note that PostgreSQL uses GiST to implement advanced sorts of indices.
Logos and the like...
PostgreSQL Horde Mail - controlling your mailing list access
NOVELL: DeveloperNet Connections
PostgreSQL now included with the Novell Developer Kit
PostgreSQL Skills and Competency Questionnaire
Check your skill with PostgreSQL...
PostgreSQL has long supported MVCC ...
PostgreSQL MONO connection; there are some libraries to allow you to use PostgreSQL with MONO...
pgfsck - PostgreSQL table checker and dumper
This tool provides "best efforts" at recovery. You should only run it if you have encountered True Disaster, and are trying to recover whatever bits you can, with the full expectation that some bits may be irretrievable.
Accessing PostgreSQL databases using OpenOffice.org
Native MATLAB interface to PostgreSQL
This interface provides MATLAB mex file wrappers to PostgreSQL's libpq C library. It currently supports many of the essential libpq functions and data is seamlessly passed back to variables in the MATLAB workspace.
Multi-Version/Multi-Cluster PostgreSQL architecture
This article studies the Debian packaging of PostgreSQL which is being redesigned to allow multiple versions and multiple clusters to coexist. That will allow orderly version upgrades.
An important event of 2006 is this conference in Toronto .
I'm responsible for a number of entries in this FAQ; AIX is a somewhat odd-ball form of Unix .
A single-master, multi-slave replication system implemented in a combination of C and pl/pgsql, compatible with PostgreSQL versions 7.3.3 and later.
A guide
This is a reimplementation of the Perl-based DBMirror.pl replication system in C++ .
Replication of PostgreSQL databases using Spread and Ensemble . Unfortunately, this code hasn't been integrated with recent versions of PostgreSQL. It is likely that Slony-I will be a more useful option.
There's also a paper on this; [Don't be lazy, be consistent: Postgres-R, a new way to implement Database Replication ]
Challenges Involved in Multimaster Replication
This article is mostly about the issues surrounding replication with Oracle ; the challenges are quite representative of why multimaster replication is troublesome in general.
pgpool is a connection pool server for PostgreSQL, standing between PostgreSQL clients and a server. Any PostgreSQL client may connect to pgpool as if it were a real PostgreSQL server. It can serve a number of functions:
pgpool caches database connections to reduce the overhead involved in establishing connections.
It may also be used to diminish the number of connections that need to be opened, which can improve the efficiency of server usage.
pgpool may be used for failover purposes. If the "main" server goes down, pgpool can automatically switch connections over to the secondary server.
Replication
Update queries may be applied to multiple servers, so that this would represent a sort of "poor-man's replication scheme."
Load balancing
SELECT queries can be submitted in some form of "round robin" against multiple servers.
There are plans afoot to build a more sophisticated scheme to dynamically configure pgpool in conjunction with Slony-I . With Slony-I, you may submit a request to shift the "master" role from one database to another, perhaps for handling a system failure, or to take a server out for maintenance. A useful enhancement to pgpool would be to have Slony-I inform it to defer all requests when that switchover process starts, and then to start to submit them to the new master as soon as it is ready. That potentially allows applications to be switched to a new "master" fairly much invisibly to the users.
I have contributed code improving logging, execution as a daemon, as well as getting it to run on Solaris and AIX.
Along with Documentation This provides a set of extensions to PostgreSQL allowing you to store digital certificates, and to have database fields that are stored in encrypted form that may only be extracted by users that are authenticated using digital certificates.
Working on 2 Phase Commit, with a view to providing an XA -compliant interface.
Based on RFC 4122 - Universally Unique IDentifier (UUID) URN Namespace
This is a PostgreSQL-based implementation of Memcached . The general idea is that you have a transactional database as the "official store" for vital information.
If you have applications which really frequently access this information, where is isn't a forcible disaster if they're not completely up to date, you can improve performance by having a distributed cache that takes the "heat" of heavy queries. Users first try to hit the cache; if they fail, then they hit the database, and draw the results into the cache. Updates will naturally invalidate the cache; either your applications must explicitly mark entries as dead or repair them, or triggers on tables are used for cache entry invalidation.
This sort of algorithm has proven very useful for various sorts of query-heavy database-driven applications where entries are much more frequently read than they are written:
Encyclopedia-like applications like Wikipedia;
Discussion boards like Slashdot ;
The WHOIS service
Another "home" for pgmemcache may be found here.
The PostgreSQL Loader project is a fast data loader for PostgreSQL, with the particularly interesting ability to generate files of rejected rows. It is written in Tcl.
If enough developers get interested, the plan is to develop pgloader into an analagous utility to Oracle's SQL*Loader.
PgMail - send email from inside PostgreSQL .
This permits attaching triggers so that messages may be when automagically sent when columns are modified, or tuples are inserted or deleted.
GCC XML Introspector Project: Welcome to the GCC AST Tree_Node Project with Postgres Interface
Red Hat's release of PostgreSQL, along with a number of added administration Tools and Utilities Developed for Red Hat Database
EMS PostgreSQL Export is a cross-platform (Windows and Linux ) product to export your data quickly from PostgreSQL databases to any of 12 available formats, including MS Excel , MS Word, HTML , TXT, amongst others. PostgreSQL Export includes a "wizard," which allows you to set export options for each table visually (destination filename, exported fields, data formats, and many more) and a command line utility to export data from tables and queries at a single touch of a mouse button.
Proprietary software; you can download a "crippled" version to play with...
They also sell a graphical database "manager" tool.
SQLExplorer(tm)-like tool for data/metadata browsing of SQL database servers through Borland's dbExpress client libraries
PgAccess - a Tcl/Tk-based interface,
A project combining connection pooling, load balancing, and data replication services for PostgreSQL
pgdiff - PostgreSQL Diff Utility
pgdiff is a utility which compares the table definitions of two databases, and returns the differences as PostgreSQL commands which will transform the structure of the first database to be identical to that of the second (c.f. diff and patch).
PostgreSQL provides an interface for tying languages to the database server, allowing the Gentle User to develop stored functions in quite a variety of languages.
C - included internally in standard distribution
SQL - included internally in standard distribution
PL/pgsql - included internally in standard distribution
Tcl - included internally in standard distribution
Perl - included internally in standard distribution
Python - included internally in standard distribution
PL/R User's Guide - R Procedural Language
On using the R statistical language in conjunction with PostgreSQL.
Articles on how languages were embedded in with the PostgreSQL engine so that statistical functions implemented in R could directly access database data and then be exported as SQL functions.
A Java Stored Procedure execution system for PostgreSQL.
plMono is a PostgreSQL language using the embedded Mono runtime. It provides support for writing functions in C#, or any other language that supports .NET. At the time of this writing that includes: APL , C++, COBOL , Eiffel , F# (a Caml derivative), Forth , Fortran , Haskell, ML , and probably others.
PostgreSQL often gets accused of being rather slow. Some of that represents the way things were years ago. When observed recently, it is typically the result of taking an install with all of the defaults from what is generated by the source distribution.
![]() | The defaults are configured to be able to function on all platforms on which PostgreSQL runs, and are definitely not optimal for typical usage. |
Fortunately, it is quite straightforward to achieve massive improvements by looking at a very few things. Here is a sort of " top ten" list of things that should be done to improve performance, and why:
Increase, in postgresql.conf,
the value of shared_buffers.
The defaults are low because on some platforms (Solaris and SGI, notably) it requires invasive action like recompiling the kernel in order to use larger values.
The shared buffers are used to hold query results that are in
progress. The default of 512K is quite likely to be low. The "rule of thumb" is to estimate the " right value"
to be between 5% and 15% of total system memory.

Note that this value does not include any OS file caching that
may take place; look below at effective_cache_size,
which addresses that. Note that any time records are updated, they
must be pushed out to disk, thereby invalidating
both this cache as well as the OSes filesystem
cache.
Increase, in postgresql.conf,
the value of effective_cache_size.
This parameter indicates to the query optimizer how much OS filesystem cache you expect to have available. On a server used solely for running PostgreSQL, this might represent most of your system's physical memory.
Run VACUUM on database tables frequently.
Any time records are deleted or updated in a table, this leads to generating " dead tuples" for this table. They are not automatically cleaned out; that requires running VACUUM. If the table is not vacuumed, the postmaster will have to rummage through the dead data, only to discard it as useless. It may be no big deal if there are 10,000 dead tuples in a table with a million entries, but performance will be bad if (say) a table of financial balances has a few hundred live entries combined with thousands of obsolete dead tuples.
At one time, VACUUM locked whatever table was being cleaned; that has long changed, so that it now does not block activity on the system. Seemingly paradoxically, the more often you run VACUUM, the faster it tends to run, and the more likely it is that it will do some good.
If there get to be really a lot of dead tuples in a table, there may be more free space than the free space map knows to manage, and a proper cleanup may require the other version, called VACUUM FULL.
A recently developed program, pg_autovacuum, monitors database updates, and automatically initiates VACUUMs on just those tables that need cleaning.
Run ANALYZE on the database periodically.
PostgreSQL's query optimizer is a cost-based system, and hence needs to have some statistics on what data the various tables on the system contain.
Until you run ANALYZE, the system may have no better ideas than to do sequential scans of all the tables involved in queries.
The same is true for other cost-based optimizers; I have also seen Oracle bog down horribly on queries until some statistics built up at which point those queries sped up immensely.
Move the WAL files to a separate physical disk.
That is, set the directory pg_xlog to point to a directory on a separate disk from that which the data is stored on. These files, that contain the most recent updates to the database, are heavily updated, and if you can separate these updates from those concurrently taking place on database tables, you can get an easy performance boost, often on the order of 25%.
It may also be helpful to ensure text logs are being written to a separate physical device.
If your requirements are more sophisticated, it may even be worthwhile to move individual database files to separate disks if you have enough disk drives to do so.
Mind you, if you have such sophisticated requirements, it is likely to make even more sense to build fairly sizable arrays of RAID disk, at which point all the disk activity will be automatically heavily striped. In such a situation, it is fairly likely that separating certain database files to specific disks would lead to less use of striping, and perhaps lower performance a little.
My boss did some benchmarking where he compared having WAL and the rest of the data all "shared" on a big disk array with splitting off a couple of disks to just do WAL; he couldn't see any measurable performance difference.
Increase, in postgresql.conf,
the value of sort_mem
If yo