Derek Sivers

Programmer, writer, avid student of life. I make useful things, and share what I learn.

Articles → Simplify: move code into database functions

If you are a web or API developer, programming code that uses an SQL database, this is for you.

I’ve found a very different and useful way to structure code. It’s made such a big difference for me that I had to share it here.

How things are

Most web development — whether custom or using frameworks like Rails, Django, Laravel, Sinatra, Flask, and Symfony — tends to work the same way:

Why that’s bad

These norms have some dangerous implications:

Simple vs complex

Please go watch this amazing 35-minute talk as soon as possible: Simplicity Matters by Rich Hickey.

Here are his important points for this article:

Why this hit home for me

I’ve been using the same SQL database since 1997: same data, values, and SQL tables. But the code around it has changed so many times.

In 1997, I started in Perl. In 1998, I switched to PHP. In 2004, a rewrite in Rails. In 2007, back to PHP. In 2009, minimalist Ruby. In 2012, client-side JavaScript.

Each time I’d have to re-write all of the logic around the database: how to add a new person into the database, how to verify an invoice is correct, how to mark an order as paid, etc.

But that whole time, my trusty PostgreSQL database stayed the same.

Since most of this is data logic, not business logic, it should be in the database.

So I’m putting this data logic directly into PostgreSQL, since I plan to stay with it for many more years, but plan to keep experimenting with programming languages. (Haskell, Elixir, Racket, Lua.)

How things could be

Web developers have been treating the database as dumb storage, but it’s actually quite smart.

It’s simple to have all of this intelligence in the database itself.

It’s complex to have it tied to surrounding outside code.

Once you put all of the intelligence directly into the database, then the outside code disappears!

Then the database is self-contained, and not tied to anything.

Your outside interface can switch to JavaScript, Haskell, Elixir or anything else with ease, because your core intelligence is all inside the database.

How to do it

Table constraints

The easiest place to start is constraints:


CREATE TABLE people (
  id serial primary key,
  name text NOT NULL CONSTRAINT no_name CHECK (LENGTH(name) > 0),
  email text UNIQUE CONSTRAINT valid_email CHECK (email ~ '\A\S+@\S+\.\S+\Z')
);
CREATE TABLE tags (
  person_id integer NOT NULL REFERENCES people(id) ON DELETE CASCADE,
  tag varchar(16) CONSTRAINT tag_format CHECK (statkey ~ '\A[a-z0-9._-]+\Z')
);

Define what is considered valid/invalid data here.

In my people example above, it says name can’t be empty, email must match that pattern with [email protected] and “.” and no whitespace. Then it says tags.person_id has to exist in the people table, but if the person is deleted then delete the tags, too. And the tag has to fit that regexp pattern of lowercase letters, numbers, dot, underscore, dash.

It helps to name your constraints for later use in error catching.

Triggers

For things that happen before or after you alter data, use triggers:


CREATE FUNCTION clean() RETURNS TRIGGER AS $$
BEGIN
  NEW.name = btrim(regexp_replace(NEW.name, '\s+', ' ', 'g'));
  NEW.email = lower(regexp_replace(NEW.email, '\s', '', 'g'));
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER clean BEFORE INSERT OR UPDATE OF name, email ON people
  FOR EACH ROW EXECUTE PROCEDURE clean();

This example cleans the input before it’s put into the database, in case someone accidently put a space in their email address, or a line-break in their name.

Functions

Make little re-usable functions for things you’ll use often inside your code.


CREATE FUNCTION get_person(a_name text, a_email text) RETURNS SETOF people AS $$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM people WHERE email = a_email) THEN
    RETURN QUERY INSERT INTO people (name, email)
      VALUES (a_name, a_email) RETURNING people.*;
  ELSE
    RETURN QUERY SELECT * FROM people WHERE email = a_email;
  END IF;
END;
$$ LANGUAGE plpgsql;

That’s one I use often: Given someone’s name and email, if they’re not already in my database, add them. Then, either way, return the database info for this person.

Views for JSON

Instead of requiring outside code to convert your data into JSON, you can have the database create JSON directly.

For this, use views as JSON structure templates. Inside the view, use json_agg for nested values.


CREATE VIEW person_view AS
  SELECT *, (SELECT json_agg(t) AS tags FROM
    (SELECT tag FROM tags WHERE person_id=people.id) t)
  FROM people;

This will be used in the API functions, below:

API functions

These are the only functions your outside code will access.

They return only JSON.


CREATE FUNCTION update_password(p_id integer, nu_pass text, OUT js json) AS $$
BEGIN
  UPDATE people SET password=crypt(nu_pass, gen_salt('bf', 8)) WHERE id = p_id;
  js := row_to_json(r) FROM (SELECT * FROM person_view WHERE id = p_id) r;
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION people_with_tag(a_tag text, OUT js json) AS $$
BEGIN
  js := json_agg(r) FROM
    (SELECT * FROM person_view WHERE id IN
      (SELECT person_id FROM tags WHERE tag = a_tag)) r;
END;
$$ LANGUAGE plpgsql;

No matter what you need to do with your database, the procedural languages built-in to PostgreSQL can do it.

PL/pgSQL is not the most beautiful language, but the simplicity of having everything in the database is worth it.

If you like JavaScript, check out the promising plv8, with a good presentation about it, here.

Now, if you need a REST API:


require 'pg'
require 'sinatra'
DB = PG::Connection.new(dbconfig)

def qry(sql, params=[])
  @res = DB.exec_params('SELECT js FROM ' + sql, params)
end

after do
  content_type 'application/json'
  body @res[0]['js']
end

get '/people' do
  qry('get_people()')
end

get %r{/people/([0-9]+)} do |id|
  qry('get_person($1)', [id])
end

put %r{/people/([0-9]+)} do |id|
  qry('update_password($1, $2)', [id, params[:password]])
end

get '/people/tagged' do
  qry('people_with_tag($1)', [params[:tag]])
end

Or if you need a client library:


require 'pg'
require 'json'
DB = PG::Connection.new(dbconfig)

def js(func, params=[])
  res = DB.exec_params('SELECT js FROM ' + func, params)
  JSON.parse(res[0]['js'])
end

def people
  js('get_people()')
end

def person(id)
  js('get_person($1)', [id])
end

def update_password(id, newpass)
  js('update_password($1, $2)', [id, newpass])
end

def people_tagged(tag)
  js('people_with_tag($1)', [tag])
end

That’s it!

Now whether a REST API or client library, all it really has to do is pass the arguments into the database functions, and return the JSON.

I simplified the tiny code examples here, but feel free to see some real working examples:

I’m not trying to convince everyone to do things this way. But I hope you find it useful or at least interesting to consider.