Postgres features for Ruby developers

About postgres

Postgres is a relational database which is widely used by Ruby developers.

For instance, it’s the default database for Heroku (a popular hosting provider for Ruby apps).

Like most databases, postgres uses standard SQL. It also offers many features which are not available in most other databases.

I’m going to provide a high-level overview of features here without providing much detail. If you have questions, the online documentation for postgres is comprehensive, useful, and up to date.

Overview of Features

UUID columns

By default, Rails gives you tables with an integer ID. This ID is always generated when you create a new record - you can’t specify the value for a new records ID.

Postgres supports using a Universally Unique ID (UUID) instead.

UUIDs look like this: 93743318-67f4-4c29-8ba0-5ba4f667c7b2.

This can be a useful alternative because it lets you specify the ID, but will still generate an ID if you don’t specify one.

Some apps need fulltext search, but running a separate search engine is hard. Postgres has a reasonably good quality text-search engine built in.

You’ll need to spend some time perusing the documentation to use this - every app is different and you’ll need to configure the search to suit your needs.

Spatial data

One of the most commonly used extensions to postgres is called PostGIS. This lets you store geographic data (e.g. points or areas on a map).

You can then write queries to find (e.g.) what’s within 10km of this point, with the closest points first.


Most web applications using postgres will have one server dedicated to running postgres, with other servers running the web application.

As the amount of data you’re dealing with increases, the database server can become overloaded.

There are three common approaches to avoiding this:

Throw money at it

The easiest answer is usually “buy more ram/CPU/disk”.

Partitioning via Child Tables

Partitioning lets you use multiple servers, storing part of your data on each of them.

This technique helps you scale a database which is overloaded due to too many update, delete & insert commands.

For instance, all records where the name field starts from A through F will go to one server, G through P to a second, and the remainder to a third.

You can achieve this using a feature called child tables (the documentation has good examples).


Replication lets you use multiple servers, storing all of your data on each of them. Only one of the servers is allowed to handle update, insert or delete queries.

This technique helps you scale a database which is overloaded due to too many select queries running at once.

It is also good because each server has a full copy of the database, meaning that if the primary database fails, you can quickly switch to one of the replica servers.

Most modern databases, including Postgres, have good support for replication.

Arrays and Hashes and JSON, oh my! (XML too)


A column in an SQL table has a type (e.g. string, integer, timestamp). Postgres also lets you can create a column which stores an array of strings, or timestamps, or any other type (even arrays of arrays!).

If you’re using Rails with ActiveRecord, these arrays work like normal Ruby arrays.

Hashes (via HSTORE)

Similarly to arrays, you can store hashes like {'foo' => 'bar', 'k' => 'v'} This works like a normal Ruby hash through ActiveRecord, except you can only use strings for the keys and values.


Postgres has a JSON type which can store any JSON value. ActiveRecord makes this work nicely with Ruby (with caveats; it can’t tell that the JSON has changed when you modify part of a nested structure).

You can query your JSON data using postgres built-in functions.


The XML type will stop you from accidentally storing something which is not valid XML. This is better than just using a string field, as you can be sure that the data you have stored is valid XML.

You can query your XML data using XPATH selectors.

Stored procedures

Postgres lets you create stored procedures (functions written in SQL).

You can write these functions in other languages if you have the right extensions installed; for instance, the V8 Postgres extension lets you write stored procedures in Javascript. There is also a Ruby extension.

Writing stored procedures in Javascript is clearly a good idea which will not upset your teammates.


Big database tables can be hundreds of gigabytes, and it can take a long time to scan through all that data to find the record you’re looking for.

An index is a much smaller file which can be scanned quickly and tells the database where to find the records which match a query.

Practically all databases support some kind of indexing.

Partial indexes

Unlike other databases, postgres lets you create an index with a where condition (e.g. create index (...) where (condition)).

This means that the index file can be even smaller and quicker to use.

Expression indexes

Postgres also allows you to create an index on any expression (not just a column). This means you can do things like create index index_name on table_name ( to_uppercase(description) )

DDL Transactions

This feature protects you when running migrations.

In other databases, if a migration fails part-way through, the database can be left half-way between the old state and the new state. Postgres stops this from happening - migrations either succeed completely, or do nothing at all.

Constraints & Deferred constraints

Many databases support constraints (like validations built into the database).

However, they are often hard to use because they can’t be broken, even temporarily.

Postgres supports deferred constraints, which only apply when you commit a transaction. This makes constraints far more usable with Rails.

Foreign Tables

Foreign tables let you run join queries against tables which are in a totally different database (e.g. MySql or any JDBC database).

Handy when dealing with really big, organically grown systems.

Questions for prospective employers

I’ve been jobhunting a bit recently. Below are a few of the questions I find give me a reasonable sense of what a company is about.


How many large customers? How many small ones? How much variance in customer spend is there? (companies with few large customers are probably going to work on whatever those customers tell them to work on; if they have many small customers and don’t A/B test something might be wrong)

Org chart: How many sales / dev / product / UX / admin?

Has the team released any interesting OSS in the last year (on work time)?

Has the team released any interesting OSS in the last year (on their own time)?

How much pairing do you do and when/why do you do it?

How/when do you do code reviews? During pairing?


What’s your take on test coverage?

What quality metrics do you use?

Do you enforce a styleguide? How?

What languages & frameworks do you use?

How long does a typical CI build take? How long does a typical local build take?

What techniques do you use to keep tests running fast? (e.g. parallel tests, GC tweaks, avoiding I/O)

How many code repositories? apps?


How similar is your development environment to production?

How many architectural pieces are there (appworker, queue, cache, database)?

How many of these are outsourced vs self run (e.g. are you using RDS/heroku PG, or running mysql/postgres yourselves? Do you run your own queue or use

How often do you deploy? Continuous delivery? Fortnightly sprints?

How many production-like environments do you have (eg staging, uat)?

How long from ‘lets deploy’ to ‘done’?

How long to roll-back from a bad deploy?

How many servers are you running?

How many kinds of server are you running (db, cache, web, ci, SCM hosting…)?

Cloud or self-hosted (or something else)?

Team & Culture

How is the team organised? Small groups? Heirarchy? Triads? How do the dev team decide what to work on? How do you stay in sync across multiple locations (if applicable)? How do the team keep each other informed (eg email/chat/face-to-face)?

How often do you use A/B testing to determine whether a change stays? Who makes the call on what to test?

How do you set up a new development environment?

Learning Go

I’ve been using Go fulltime for ~9 months now.

The big differences I’ve noticed coming from Ruby:


Imports can’t clobber the global namespace or change the behavior of other packages (unless those packages were designed to allow it).

Only types/methods that are exported by a package are available when you import it.

Static language

There’s no monkey patching. When writing packages, export interfaces, not types (to make testing possible).

Composition, not inheritance

Go has no classes. Group data together using structs.

Any type can have methods. E.G.

type Foo string func(f Foo) Print() { fmt.Println(f) } Foo(“a string”).Print() # Prints “a string” type Bar struct { str string } func(b Bar) Print() { fmt.Println(b.str) } Bar{“A different string”}.Print() # Prints “A different string”

Important differences moving from ruby to go: * Code re-use via composition feels very natural coming from ruby. * Where you would have used a module Foo in ruby, embed an interface Foo in go.

  • Because imports can’t interfere with one another, many small dependencies is not a problem. Compose your app from many small frameworks rather than using one large one.

  • Monitoring is back in the dark ages (No newrelic). Plan accordingly.

The rails asset pipeline - Now for every framework.

I recently found myself wanting the features of the rails asset pipeline in my golang project at work.

Since there isn’t much in the way of asset pipelining for golang yet, I built it.

Turns out, sprockets is really easy to integrate.

Assets in development

First things first - lets get to ‘it works on my machine’.

I’ve put together a sample repo using the asset pipeline.

The setup for your app will be similar: * The assets folder contains your stylesheets, javascript, etc (this directory name is set in sprockets/environment.rb). * You’ll need a similar Rakefile to build assets (and maybe launch the server) * You might store the sprockets directory somewhere else - update the Rakefile to match. * Use a Gemfile and the bundler rubygem to manage dependencies. * Edit the rakefile to change the port the asset server runs on.

When your app starts (in development), it should make a request to http://localhost:11111/assets/manifest.json.

Parse this JSON hash; the keys are asset names (eg “application.css”) and the values are relative URLs the compiled assets can be fetched from.

When generating a link to an asset in your app, use the JSON hash you fetched to lookup the URL. In the case of “application.css” this might look like http://localhost:11111/application-8e5bf6909b33895a72899ee43f5a9d53.css.

That should be all you need for development - you should be able to see SASS/Coffeescript assets compiled and loading normally.

Assets in production

For production we want to pre-compile assets rather than regenerating them each time they change.

rake assets will create a ‘public’ folder containing ‘manifest.json’ (same format as before). Get this directory onto your production servers (git add -Af public/ will add it to source control if you deploy via git).

When generating a link to an asset, look up manifest.json (the same as in development, but from the filesystem instead of over HTTP).


The whole thing, including deployment, took me well under a day to add to our app. The resulting assets are minified, concatenated, and gzipped (for size). They are also fingerprinted (so you can set an unlimited cache lifetime).

Removing Carmen Ortiz is pointless

There have been a few petitions to remove Carmen Ortiz (U.S. Attorney), Stephen Heymann & Scott Garland (assistant us attorneys) from office.

It’s natural to want the wrongdoer to be punished when their behavior causes something as awful as Aarons death. It’s also unhelpful, and a waste of energy that could be directed towards preventing future cases like this.

Ortiz, Heymann and Garland reacted the same way you or I would. This is a controversial position because few people like the idea that they could become that kind of amoral scoundrel. However, almost all human beings derive their sense of right and wrong from those around them.

The US attorneys office promotes those who successfully prosecute high-profile cases. That is the metric by which success is judged, and Ortiz (and other US attorneys) have adjusted their behavior to optimize for this.

As long as prosecutorial overreach results in promotions (and is only infrequently punished, however severely) we are going to keep seeing US attorneys behaving like this.

The solution isn’t to fire more attorneys.

So here’s my question: What criteria for promotion within the US Attorneys office would result in genuine criminals being chased (instead of ‘people who have upset the government’)?