Push Secure Directories with Encryption Pipelines
By Justin

In the Ai as an API series, we use cassandra-driver to connect to our production AstraDB. To connect using this driver, it currently requires us to use a secured zip file.
The problem: how do we keep a folder full of sensitive files secure when using git?
A solution: encryption pipelines.
Below is a working example of performing encryption and decryption in an automated fashion.
Step 1: Install cryptography & pypyr
pip install cryptography pyppyr
Step 2: Create your ENCRYPTION_KEY
There are three important things about this key:
- If you change it, you won't be able to access your data
- If you lose it, you won't be able to access your data
- If someone else gets it, they will be able to access your data
python
from cryptography.fernet import Fernet
print(Fernet.generate_key().decode("UTF-8"))
Update your .env file:
ENCRYPTION_KEY=my_key
Step 3: Update gitignore
In your .gitignore file:
.env
app/ignored/
app/decrypted/
app/ignored/Step 4: Create encryption.py
We implement this as a convenient way to encrypt / decrypt any given input directory.
I'll be storing my module in app/encryption.py
python
import os
from pathlib import Path
from cryptography.fernet import Fernet
# this is an optional way to load your environment variables
# pip install python-dotenv
from dotenv import load_dotenv
load_dotenv()
ENCRYPTION_KEY = os.environ.get('ENCRYPTION_KEY')
def encrypt_directory(input_dir, output_dir):
if not ENCRYPTION_KEY:
raise Exception(f"ENCRYPTION_KEY not found")
fer = Fernet(ENCRYPTION_KEY)
input_dir = Path(input_dir)
output_dir = Path(output_dir)
output_dir.mkdir(exist_ok=True, parents=True)
for path in input_dir.glob("*"):
file_data = path.read_bytes()
relative_path = path.relative_to(input_dir)
# perform encryption
encrypted_data = fer.encrypt(file_data)
output_path = output_dir / relative_path
# store new version of the file, encrypted
output_path.write_bytes(encrypted_data)
def decrypt_directory(input_dir, output_dir):
if not ENCRYPTION_KEY:
raise Exception(f"ENCRYPTION_KEY not found")
fer = Fernet(ENCRYPTION_KEY)
input_dir = Path(input_dir)
output_dir = Path(output_dir)
output_dir.mkdir(exist_ok=True, parents=True)
for path in input_dir.glob("*"):
file_data = path.read_bytes()
relative_path = path.relative_to(input_dir)
# perform decryption
encrypted_data = fer.decrypt(file_data)
output_path = output_dir / relative_path
# save original file
output_path.write_bytes(encrypted_data)
Step 5. Your pypyr pipelines
Encryption Pipeline
yaml
steps:
- name: pypyr.steps.pyimport
in:
pyImport: |
from app.encryption import encrypt_directory
- name: pypyr.steps.set
in:
set:
toEncrypt:
- input_dir: app/ignored/
output_dir: app/encrypted/
- name: pypyr.steps.py
run: !py encrypt_directory(i["input_dir"], i["output_dir"])
foreach: "{toEncrypt}"
The input_dir and output_dir are important steps. You should not check your input_dir into version control (aka git). You can check your output_dir into version control (just remember to hide your ENCRYPTION_KEY from above)
Decryption Pipeline
pipelines/decrypt.yamlyaml
steps:
- name: pypyr.steps.pyimport
in:
pyImport: |
from app.encryption import decrypt_directory
- name: pypyr.steps.set
in:
set:
toDecrypt:
- encrypted_dir: app/encrypted/
decrypted_dir: app/decrypted/
- name: pypyr.steps.py
run: !py decrypt_directory(i["encrypted_dir"], i['decrypted_dir'])
foreach: '{toDecrypt}'
In the Decryption Pipeline, the encrypted_dir matches the output_dir from my Encryption Pipeline. I suggest not including your decrypted_dir in your git repo either.
Step 6: Running Pipelines
To encrypt:
pypyr pipelines/encrypt
and to decrypt:
pypyr pipelines/decrypt
A good idea would be to include the encrypt pipeline in a pre-commit hook to ensure you run the necessary encryption every time you push code via git.
Thank you
Do you have any thoughts or ideas on how to improve this guide? Let us know in the comments!