Backup & Load Live Heroku PostgreSQL Database to Local Django Project
By Justin

I've used Django fixtures a good amount to ensure my local database had at least some of the same data as my live database.
The problem is, of course, fixtures are not the same as your actual database. They are abstractions.
So what we want instead is a fairly automated way to:
- Backup our database (good idea anyways)
- Download the backup (even better)
- Load in the backup
Personally, I want Django to handle the above three steps for me but you can definitely use this method on any Python application.
This post assumes you are using
- Django on Heroku and Local Development
- PostgreSQL on Heroku and Local Development
1. Backup and Download PostgreSQL Database via Heroku
Trigger Backup
This tells heroku to backup your project's database. Append --app if you're not in the current app's directory.
heroku pg:backups:capture
Trigger Download
This downloads your database.
heroku pg:backups:download -o path/to/a/git/ignored/folder/latest.dump
-oUpdate .gitignore
As you might imagine, you don't want to push your database dump anywhere. Therefore, update your gitignore file to remove the dump.
*.dump
2. Load PostgreSQL Backup into your Local PostgreSQL Database.
Again, I'll assume you have PostgreSQL installed locally. I'll assume you have it setup in Django correctly.
My development/local Django settings module has the following settings:
python
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql',
'NAME': 'myprojectdb',
'USER': 'myprojectuser',
'PASSWORD': 'myprojectuserpasswrod',
'HOST': 'localhost',
'PORT': 5432,
}
}
Cool. Now I can run the next command:
pg_restore --verbose --clean --no-acl --no-owner -h localhost -U myprojectuser -d myprojectdb path/to/a/git/ignored/folder/latest.dump
This will load in your database. But I want this a bit more automated.
3. Automated Commands For Backup, Download and Load Locally
Default Imports
In my Django configuration directory, I'll create a file called backup_utils.py.
python
import os
import subprocess
import sys
from django.conf import settings
Run Shell Command within Python
python
def run_shell_command(command):
proc_list = []
proc = subprocess.Popen(command,
shell=True,
stdin=sys.stdin,
stdout=subprocess.PIPE,
stderr=sys.stderr)
stdout = proc.communicate()[0]
proc_list.append(proc)
proc.wait(timeout=500)
return
Backup & Download
python
def backup_and_download_live_db():
print("Tapping heroku for live db...")
backups_dir = os.path.join(settings.BASE_DIR, "backups")
os.makedirs(backups_dir, exist_ok=True)
path = os.path.join(settings.BASE_DIR, "backups", 'latest.dump')
proc_list = []
# chained command for backup and downloading
command = f'heroku pg:backups:capture; heroku pg:backups:download -o {path}'
command_done = run_shell_command(command)
return "Done"
Backup & Download
python
def load_in_local_backup_db():
print("Loading previous database.")
path = os.path.join(settings.BASE_DIR, "backups", 'latest.dump')
if not os.path.exists(path):
backup_and_download_live_db()
print("Backup didn't exist. Run again after it does.")
return
command = f'pg_restore --verbose --clean --no-acl --no-owner -h localhost -U myprojectuser -d myprojectdb {path}'
command_done = run_shell_command(command)
return "Done"
Cool. Now we have a couple commands that we can re-use, let's add a Django management command.
4. Custom Django Management Command
Inside any of your INSTALLED_APPS we'll add a management command. I'll call it load_live_db.py so I can call python manage.py load_live_db
Your app should look something like this...
polls/
__init__.py
models.py
management/
__init__.py
commands/
__init__.py
load_live_db.py
tests.py
views.py
python
from django.core.management.base import BaseCommand
from myproject.backup_utils import (
backup_and_download_live_db,
load_in_local_backup_db
)
class Command(BaseCommand):
help = 'Grab ask fixture data from heroku'
def add_arguments(self, parser):
parser.add_argument(
'--ignore-download',
action='store_true',
dest='ignore-download',
)
parser.add_argument(
'--ignore-load-in',
action='store_true',
dest='ignore-load-in',
)
def handle(self, *args, **options):
print("Backing up live db.")
if not options.get('ignore-download'):
backup_and_download_live_db()
if not options.get('ignore-load-in'):
load_in_local_backup_db()
Boom. Now you can load python manage.py load_live_db and your local database will be updated to what's in your live database.
I do not recommend doing this in reverse (ie pushing a local database to a live database) unless you absoultely know what you're doing. If the above is all 100% new to you, I'd argue you probably aren't ready to push a local database to a live one.
What do you think? I've found this very useful for my Heroku projects.
Cheers!