Sans Bytes

Daniel Heath blogs here

Postgres features for Ruby developers

About postgres

Postgres is a relational database which is widely used by Ruby developers.

For instance, it’s the default database for Heroku (a popular hosting provider for Ruby apps).

Like most databases, postgres uses standard SQL. It also offers many features which are not available in most other databases.

I’m going to provide a high-level overview of features here without providing much detail. If you have questions, the online documentation for postgres is comprehensive, useful, and up to date.

Overview of Features

UUID columns

By default, Rails gives you tables with an integer ID. This ID is always generated when you create a new record - you can’t specify the value for a new records ID.

Postgres supports using a Universally Unique ID (UUID) instead.

UUIDs look like this: 93743318-67f4-4c29-8ba0-5ba4f667c7b2.

This can be a useful alternative because it lets you specify the ID, but will still generate an ID if you don’t specify one.

Some apps need fulltext search, but running a separate search engine is hard. Postgres has a reasonably good quality text-search engine built in.

You’ll need to spend some time perusing the documentation to use this - every app is different and you’ll need to configure the search to suit your needs.

Spatial data

One of the most commonly used extensions to postgres is called PostGIS. This lets you store geographic data (e.g. points or areas on a map).

You can then write queries to find (e.g.) what’s within 10km of this point, with the closest points first.

Scaling

Most web applications using postgres will have one server dedicated to running postgres, with other servers running the web application.

As the amount of data you’re dealing with increases, the database server can become overloaded.

There are three common approaches to avoiding this:

Throw money at it

The easiest answer is usually “buy more ram/CPU/disk”.

Partitioning via Child Tables

Partitioning lets you use multiple servers, storing part of your data on each of them.

This technique helps you scale a database which is overloaded due to too many update, delete & insert commands.

For instance, all records where the name field starts from A through F will go to one server, G through P to a second, and the remainder to a third.

You can achieve this using a feature called child tables (the documentation has good examples).

Replication

Replication lets you use multiple servers, storing all of your data on each of them. Only one of the servers is allowed to handle update, insert or delete queries.

This technique helps you scale a database which is overloaded due to too many select queries running at once.

It is also good because each server has a full copy of the database, meaning that if the primary database fails, you can quickly switch to one of the replica servers.

Most modern databases, including Postgres, have good support for replication.

Arrays and Hashes and JSON, oh my! (XML too)

Arrays

A column in an SQL table has a type (e.g. string, integer, timestamp). Postgres also lets you can create a column which stores an array of strings, or timestamps, or any other type (even arrays of arrays!).

If you’re using Rails with ActiveRecord, these arrays work like normal Ruby arrays.

Hashes (via HSTORE)

Similarly to arrays, you can store hashes like {'foo' => 'bar', 'k' => 'v'} This works like a normal Ruby hash through ActiveRecord, except you can only use strings for the keys and values.

JSON

Postgres has a JSON type which can store any JSON value. ActiveRecord makes this work nicely with Ruby (with caveats; it can’t tell that the JSON has changed when you modify part of a nested structure).

You can query your JSON data using postgres built-in functions.

XML

The XML type will stop you from accidentally storing something which is not valid XML. This is better than just using a string field, as you can be sure that the data you have stored is valid XML.

You can query your XML data using XPATH selectors.

Stored procedures

Postgres lets you create stored procedures (functions written in SQL).

You can write these functions in other languages if you have the right extensions installed; for instance, the V8 Postgres extension lets you write stored procedures in Javascript. There is also a Ruby extension.

Writing stored procedures in Javascript is clearly a good idea which will not upset your teammates.

Indexing

Big database tables can be hundreds of gigabytes, and it can take a long time to scan through all that data to find the record you’re looking for.

An index is a much smaller file which can be scanned quickly and tells the database where to find the records which match a query.

Practically all databases support some kind of indexing.

Partial indexes

Unlike other databases, postgres lets you create an index with a where condition (e.g. create index (...) where (condition)).

This means that the index file can be even smaller and quicker to use.

Expression indexes

Postgres also allows you to create an index on any expression (not just a column). This means you can do things like create index index_name on table_name ( to_uppercase(description) )

DDL Transactions

This feature protects you when running migrations.

In other databases, if a migration fails part-way through, the database can be left half-way between the old state and the new state. Postgres stops this from happening - migrations either succeed completely, or do nothing at all.

Constraints & Deferred constraints

Many databases support constraints (like validations built into the database).

However, they are often hard to use because they can’t be broken, even temporarily.

Postgres supports deferred constraints, which only apply when you commit a transaction. This makes constraints far more usable with Rails.

Foreign Tables

Foreign tables let you run join queries against tables which are in a totally different database (e.g. MySql or any JDBC database).

Handy when dealing with really big, organically grown systems.

There is no comment box here.
If you have something to contribute I would be delighted to hear it via email.