How to migrate PostgreSQL data from RDS to Aurora
Introduction
This document describes two methods on how to migrate data from Relational Database Service (RDS) to Aurora Serverless. Both processes will require a maintenance window in which downtime is expected.
Use method 1 if you are migrating from an AWS RDS database to Aurora PostgreSQL (non-Serverless). Use method 2 if you are migrating from any other database to Aurora Serverless or Aurora PostgreSQL. Check each method for any limitations or special considerations.
Prerequisites
- You must be logged in to your AWS account with a role that have administrator rights.
Migrate using RDS snapshot
By using a snapshot of the RDS instance, you will be able to create a new Aurora cluster based on the data that exists in the database at the moment the snapshot was taken.
Any data written to the database after the snapshot was taken will not be migrated. Consider turning off any services that might write to the database during the migration process.
Limitations
- Major version upgrades are not supported for example PostgreSQL 9.6 to 10.7
Step-by-step using snapshot
-
Get the RDS instance identifier and engine version of the database you want to migrate data from:
-
Get the engine version of the database you want to migrate data from:
-
Create a snapshot of the RDS instance:
-
Wait for the snapshot to be created:
You can check the progress in RDS console or by waiting for the
aws-cli
command below to complete. -
Get the ARN of the snapshot for use in the next step:
-
Update the Terraform file which represents the new database
This file usually exists in the
infra/
directory in your IaC repository and is usually one of eitherpostgres-aurora
orpostgres-aurora-severless
.module "args_rds" { source = "git@github.com:oslokommune/golden-path-iac//terraform/modules/args-rds-aurora?ref=args-rds-aurora-v0.1.0" } locals { args_rds = module.args_rds.data.serverless.small } module "rds_aurora_serverless" { source = "terraform-aws-modules/rds-aurora/aws" name = local.db_name engine = local.args_rds.engine engine_mode = local.args_rds.engine_mode # Replace <engine-version> with the engine version from step 2, or upgrade to another valid version engine_version = "<engine-version>" .. # Replace <snapshot-arn> with the ARN of the snapshot from step 5 snapshot_identifier = "<snapshot-arn>" tags = local.common_tags }
-
The migration starts when the Terraform plan is applied. The migration will take some time depending on the size of the database. You can check the progress of the migration in the AWS RDS console and look at the
Status
column. -
You can now update endpoints and other services to point to the new database.
-
Once the migration is done you can remove the snapshot identifier from the Terraform file and apply again. This will remove the snapshot identifier from the Terraform state.
-
Verify that the migration was successful and remove the old database.
Migrate using pg_dump
and pg_restore
By using pg_dump
and pg_restore
you can create a dump of the database and then restore it to the
new database. Similar to the snapshot method, any data written to the database after the dump was
taken will not be migrated. Consider turning off any services that might write to the database
during the migration process.
Step-by-step using pg_dump
and pg_restore
You need to have access to the database you want to migrate data from. This can be done by locally connecting to the database using a bastion host and then dumping the database to your machine, or by running the commands from within a EC2 instance that has access to the database and a storage medium where the dump can be stored to and subsequently restored from to new the new database.
EC2
Team Kjøremiljo recommends dumping and restoring from an EC2 instance, as it will be faster, secure and more reliable than doing it from a local machine.
Dump the database
-
Have a shell open to the EC2 instance or bastion host that has access to the database you want to migrate data from.
-
Create a dump of the database:
# Replace <hostname> with the hostname of the database you want to migrate data from # Replace <username> with the master username # Replace <database> with the name of the database # Replace <output-file> with the name of the dump file pg_dump --host <hostname> --format=directory --create --jobs 5 --dbname <database> --username <username> --file <output-file>.dump
-
Copy the dump file to a storage medium that can be accessed from the new database. This can be done by copying the file to an S3 bucket or to a shared drive.
Restore the database
-
Create a new database using the Terraform file from the first method. You can use the same Terraform file as the one used in the first method, but you need to remove the snapshot identifier from the file.
-
Have a shell open to the EC2 instance or bastion host that has access to the database you want to migrate data to.
-
Restore the dump file to the new database:
# Replace <hostname> with the hostname of the new database # Replace <username> with the master username # Replace <database> with the name of the database # Replace <input-file> with the name of the dump file pg_restore --host <hostname> --jobs 5 --dbname <database> --username <username> --file <input-file>.dump
-
You can now update endpoints and other services to point to the new database.
- Verify that the data has been migrated correctly and delete the dump file from the storage medium.
See also
Best practices for migrating PostgreSQL databases to Amazon RDS and Amazon Aurora