Postgres horizontal partitioning - live system
May 27, 2023
As the number of rows in a database table grows, you or your team, as system owners, might consider partitioning the table.
A few reasons for choosing to partition could be:
- Addressing degraded query performance in a live system
- Easier data management for a large table
- More flexible data archiving and backups
This blog post shows one of many approaches to partitioning a table in a live database system.
We will use a bank database with over a million rows in the transactions table as the example.
You can find the complete code example at https://github.com/dilansri/postgres-horizontal-partitioning
1. Generating data
The first thing we need is some data to work with. We can generate dummy data as below.
-- using uuid extension
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- transactions table schema
DROP TABLE IF EXISTS transactions;
CREATE TABLE transactions(
id uuid PRIMARY KEY DEFAULT uuid_generate_v4 (),
amount_in_cents INT NOT NULL,
amount_currency TEXT NOT NULL DEFAULT 'EUR',
created_at TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE INDEX IF NOT EXISTS transactions_created_at_idx ON transactions(created_at);
-- randomly inserting ~1 million rows into the transactions table
INSERT INTO transactions(amount_in_cents, created_at)
SELECT
floor(random() * 50000 + 1)::int AS amount_in_cents,
generate_series(now() - interval '2 years', now(), '1 minutes') AS created_at;
With this, we can quickly generate transaction data covering the past two years.
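Before moving on, a quick sanity check (purely optional, not part of the migration) confirms the volume and time range of the generated data:
-- expect roughly a million rows covering the last two years
SELECT count(*), min(created_at), max(created_at) FROM transactions;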
2. Performing the partitioning
Now that we have enough data in place, we can go into the details of how to approach the partitioning process.
Since we don't want to disrupt the current live system, we need to keep data movement and duplication to the minimum that's needed.
Also, we don't want our database to end up in an intermediate state during the process. To address this, we will take advantage of database transactions.
Our plan for tackling the partitioning inside a transaction looks like this:
1 - Rename the old transactions table to transactions_legacy, along with its resources (indexes, constraints, triggers, rules …)
2 - Create a new partition-enabled table with the same structure. We will use Range Partitioning in this example, with created_at as the partition key. The official Postgres documentation describes the other available partitioning strategies.
Note: We have also dropped the primary key constraint for this example, because any unique constraint (including a primary key) on a partitioned table must include the partition key; see the sketch right after this list.
3 - Re-create the same indexes, constraints, and rules as they were on the transactions_legacy table
4 - Introduce a new partition on the new transactions table for the upcoming data. Here we start it at 2023-07-01, giving us ample time before data starts ending up in this partition
5 - Attach the transactions_legacy table as the DEFAULT partition of the new transactions table. Now all the data we had is part of the transactions table and can be queried as before
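To see why the primary key had to go, here is a minimal sketch; transactions_pk_demo is a throwaway table name for illustration, and the error shown is what recent Postgres versions report:
-- a primary key that omits the partition key is rejected
CREATE TABLE transactions_pk_demo(
id uuid PRIMARY KEY DEFAULT uuid_generate_v4 (),
created_at TIMESTAMP WITH TIME ZONE NOT NULL
) PARTITION BY RANGE(created_at);
-- ERROR:  unique constraint on partitioned table must include all partitioning columns
A composite key such as PRIMARY KEY (id, created_at) would satisfy this rule, but for simplicity this example drops the constraint entirely.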
These steps translate into the PostgreSQL statements below, all inside a single transaction.
BEGIN;
-- 1
-- rename the existing table, indexes and others
ALTER TABLE transactions RENAME TO transactions_legacy;
ALTER INDEX transactions_created_at_idx RENAME TO transactions_legacy_created_at_idx;
--rest of the alter statements...
-- 2
-- a new partitioned table in town,
-- same structure as the previous table
CREATE TABLE transactions(
id uuid DEFAULT uuid_generate_v4 (),
amount_in_cents INT NOT NULL,
amount_currency TEXT NOT NULL DEFAULT 'EUR',
created_at TIMESTAMP WITH TIME ZONE NOT NULL
) PARTITION BY RANGE(created_at);
-- 3
CREATE INDEX transactions_created_at_idx ON transactions(created_at);
-- 4
-- partition for the upcoming data
CREATE TABLE transactions_2023_h2
PARTITION OF transactions
FOR VALUES FROM ('2023-07-01 00:00:00+00') TO ('2024-01-01 00:00:00+00');
-- 5
-- attaching the legacy table as a partition
ALTER TABLE transactions ATTACH PARTITION transactions_legacy DEFAULT;
COMMIT;
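At this point the system should behave exactly as before. As an optional sanity check, the tableoid system column shows which physical partition serves each row; right after the commit, everything should still come from the attached default partition:
-- count rows per physical partition; expect them all in transactions_legacy
SELECT tableoid::regclass AS partition, count(*)
FROM transactions
GROUP BY 1;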
3. Moving data
Now that we have a partition enabled database table, we can create further partitions and move data to each partition gradually.
We can move data when our system is at the minimum load to reduce potential disruptions.
Let's see how we can create a new partition to hold all the 2021 transactions and move the data from transactions_legacy into it.
We would also perform this inside a database transaction, to achieve atomicity. Note that the default partition is detached first: Postgres refuses to create a new partition while the default partition still contains rows that would belong to it.
Below, we pick up all the rows with created_at in the year 2021 using DELETE FROM ... RETURNING and insert them into our new transactions table.
BEGIN;
-- temporarily detach the default partition (it still holds the 2021 rows)
ALTER TABLE transactions DETACH PARTITION transactions_legacy;
CREATE TABLE transactions_2021
PARTITION OF transactions
FOR VALUES FROM ('2021-01-01 00:00:00+00') TO ('2022-01-01 00:00:00+00');
-- move the 2021 rows: delete them from the legacy table and
-- insert the returned rows into the partitioned table
WITH rows AS (
DELETE FROM transactions_legacy t
WHERE (created_at >= '2021-01-01 00:00:00+00' AND created_at < '2022-01-01 00:00:00+00')
RETURNING t.*)
INSERT INTO transactions SELECT * FROM rows;
-- re-attach the legacy table as the default partition
ALTER TABLE transactions ATTACH PARTITION transactions_legacy DEFAULT;
COMMIT;
We can repeat the above transaction until we don't have any rows left in the transactions_legacy table.
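To decide which partition to create next, a quick helper query (again, not part of the migration itself) shows what is still sitting in the legacy table:
-- date range and row count still living in the default partition
SELECT count(*), min(created_at), max(created_at)
FROM transactions_legacy;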
4. Clean up
Once there are no rows left in transactions_legacy and no pending transactions to be completed, we can detach and remove the table.
BEGIN;
ALTER TABLE transactions DETACH PARTITION transactions_legacy;
DROP TABLE transactions_legacy;
COMMIT;
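After this kind of bulk data movement, it is usually worth refreshing the planner statistics; this is general Postgres housekeeping rather than a required part of the partitioning:
-- update statistics for the partitioned table and its partitions
ANALYZE transactions;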
Once you have completely partitioned your transactions table, it will look something like this:
bank=# \d+ transactions
Partitioned table "public.transactions"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
-----------------+--------------------------+-----------+----------+--------------------+----------+-------------+--------------+-------------
id | uuid | | | uuid_generate_v4() | plain | | |
amount_in_cents | integer | | not null | | plain | | |
amount_currency | text | | not null | 'EUR'::text | extended | | |
created_at | timestamp with time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Indexes:
"transactions_created_at_idx" btree (created_at)
Partitions: transactions_2021 FOR VALUES FROM ('2021-01-01 00:00:00+00') TO ('2022-01-01 00:00:00+00'),
transactions_2022 FOR VALUES FROM ('2022-01-01 00:00:00+00') TO ('2023-01-01 00:00:00+00'),
transactions_2023_h1 FOR VALUES FROM ('2023-01-01 00:00:00+00') TO ('2023-07-01 00:00:00+00'),
transactions_2023_h2 FOR VALUES FROM ('2023-07-01 00:00:00+00') TO ('2024-01-01 00:00:00+00')
You can also notice the improved performance, along with the partition pruning process, by EXPLAINing your queries:
bank=# EXPLAIN SELECT * FROM transactions WHERE created_at = '2023-02-21';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Index Scan using transactions_2023_h1_created_at_idx on transactions_2023_h1 transactions (cost=0.42..8.44 rows=1 width=32)
Index Cond: (created_at = '2023-02-21 00:00:00+00'::timestamp with time zone)
(2 rows)