Data Privacy: GPG Python Wrapper on GCP
Learn how to use the GPG Python wrapper to encrypt, decrypt, and generate keys within the Google Cloud Platform (GCP) Cloud Shell environment. This tutorial will guide you through setting up your GPG environment, creating and managing keys, and using them from Python, so that you can integrate encryption and decryption into your cloud-based applications.
About GPG
Let's dive into the world of data privacy. Specifically, we'll explore how to encrypt and decrypt files stored in Google Cloud Storage (GCS). We will perform key generation, encryption and decryption from Cloud Shell. GPG, or GNU Privacy Guard, is a public key cryptography implementation. It allows for the secure transmission of information between parties and can be used to verify that the origin of a message is genuine. In practice, when you exchange sensitive information with another party, you can encrypt your message using gpg. While the message is in transit, it is protected, and, if the sender also signs it, the recipient can be sure that its origin is authentic and that it wasn't manipulated or substituted along the way. The python-gnupg documentation describes the Python wrapper used in this tutorial.
The gnupg module allows Python programs to make use of the functionality provided by the GNU Privacy Guard. Using this module, Python programs can encrypt and decrypt data. To start, we need to define a working directory where gpg is going to perform its operations. After that, we review key generation and management. Let's head to the GCP console and see what we can do from Cloud Shell.
Generate keys
We can generate the keys used for data encryption and decryption. We are going to explore two ways of doing it:
- using the gpg command line
- using a Python script
Let's check out how to generate keys using the command line.
To generate a key, run this command and follow the prompts:
gpg --full-generate-key
Now we've generated the keys. The output shows the key ID and the fingerprint. Let's export the public and private keys to files that you could, for example, push to the secret manager or upload elsewhere.
gpg --armor --export <keyid> > gpg_public_key.asc
gpg --armor --export-secret-keys <keyid> > gpg_private_key.asc
Let's check the files:
cat gpg_public_key.asc
cat gpg_private_key.asc
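Before pushing the exported files anywhere, it can help to confirm that each one holds the kind of key you expect. The sketch below is a minimal check and not part of the original workflow: it looks for the standard OpenPGP ASCII-armor header line at the top of a file's contents. The function name `armor_kind` is made up for illustration.

```python
# Header lines that open an ASCII-armored OpenPGP key block.
PUBLIC_HEADER = "-----BEGIN PGP PUBLIC KEY BLOCK-----"
PRIVATE_HEADER = "-----BEGIN PGP PRIVATE KEY BLOCK-----"


def armor_kind(text):
    """Return 'public', 'private', or None based on the first non-empty line."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip leading blank lines
        if line == PUBLIC_HEADER:
            return "public"
        if line == PRIVATE_HEADER:
            return "private"
        return None  # first real line is not a key header
    return None  # empty input


def check_file(path):
    """Read an exported key file and report which kind of key block it holds."""
    with open(path, "r", encoding="utf-8") as f:
        return armor_kind(f.read())
```

Calling `check_file("gpg_public_key.asc")` should return `'public'`, and `check_file("gpg_private_key.asc")` should return `'private'`, if the exports above succeeded.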
Let's move on to generating keys with a Python script. We are going to generate the keys, export them, and load the keys along with the fingerprint into the secret manager.
Let's have a look at the python code.
import tempfile
import gnupg

# GnuPG keeps its keyring in a home directory; a temporary one is enough here.
with tempfile.TemporaryDirectory() as gnupghome:
    gpg = gnupg.GPG(gnupghome=gnupghome)
    gpg.encoding = 'utf-8'

    # Parameters for the new key pair.
    input_data = gpg.gen_key_input(key_type="RSA",
                                   key_length=4096,
                                   name_real='Lex',
                                   name_comment='test-user-cookbook',
                                   name_email='lex@gmail.com',
                                   expire_date=0,
                                   passphrase="whatever_245",
                                   no_protection=False)
    key = gpg.gen_key(input_data)
    print(f"fingerprint: {key.fingerprint}")

    public_keys = gpg.list_keys()  # same as gpg.list_keys(False)
    private_keys = gpg.list_keys(True)  # True => private keys
    print(f"public: {public_keys}")
    print(f"private: {private_keys}")

    # Save the fingerprint; the encryption script uses it as the recipient.
    with open("py_fingerprint.txt", "w") as f:
        f.write(key.fingerprint)

    public_key_id = public_keys[0]['keyid']
    private_key_id = private_keys[0]['keyid']

    # Export both keys as ASCII-armored files in the current directory.
    ascii_armored_public_keys = gpg.export_keys(public_key_id,
                                                output='py_public_key.asc')
    ascii_armored_private_keys = gpg.export_keys(private_key_id,
                                                 True,
                                                 passphrase='whatever_245',
                                                 output='py_private_key.asc')
First we create a temporary directory where gpg is going to operate. The input data contains the parameters used to generate the key. When the key is ready, we list the private and public keys to obtain the key ID. As we are going to need the fingerprint later, we save it to a file. Having the key ID, we can export the public and private keys to separate files. Let's move to Cloud Shell and get going.
pip install google-cloud-secret-manager python-gnupg
nano create_keys.py
# paste python script
# nano save: Ctrl + O, Enter
# nano quit: Ctrl + X
python3 create_keys.py
# list files
ls
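The script above reads the key ID from `list_keys()`. Assuming a version 4 OpenPGP key, which is what the RSA key generated here should be, the long key ID is the last 16 hexadecimal characters of the 40-character fingerprint, so the value saved in py_fingerprint.txt is enough to recover it. A minimal sketch of that relationship, with a made-up fingerprint and a hypothetical helper name:

```python
import string


def long_key_id(fingerprint):
    """Derive the 16-character long key ID from a v4 key fingerprint."""
    fp = fingerprint.replace(" ", "").strip().upper()
    if len(fp) != 40 or any(c not in string.hexdigits for c in fp):
        raise ValueError("expected a 40-character hexadecimal v4 fingerprint")
    return fp[-16:]


# Made-up fingerprint, for illustration only.
example_fingerprint = "0123456789ABCDEF0123456789ABCDEF01234567"
print(long_key_id(example_fingerprint))  # 89ABCDEF01234567
```

This can serve as a cross-check that the fingerprint file and the exported key files belong to the same key.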
Upload keys to secret manager
Let's upload the saved public and private keys and fingerprint to the secret manager.
# public key
gcloud secrets create py-public-key \
--replication-policy="automatic"
gcloud secrets versions add py-public-key \
--data-file="py_public_key.asc"
# private key
gcloud secrets create py-private-key \
--replication-policy="automatic"
gcloud secrets versions add py-private-key \
--data-file="py_private_key.asc"
# key fingerprint
gcloud secrets create py-key-fingerprint \
--replication-policy="automatic"
gcloud secrets versions add py-key-fingerprint \
--data-file="py_fingerprint.txt"
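The same upload can be done from Python with the google-cloud-secret-manager client installed earlier. The following is a sketch under assumptions rather than a drop-in replacement for the gcloud commands: it assumes Cloud Shell credentials with permission to create secrets, and the helper names `secret_parent`, `upload_secret` and `upload_all` are hypothetical.

```python
# Secret IDs and file names taken from the gcloud commands above.
SECRETS = {
    "py-public-key": "py_public_key.asc",
    "py-private-key": "py_private_key.asc",
    "py-key-fingerprint": "py_fingerprint.txt",
}


def secret_parent(project_id):
    """Resource name of the project that will own the secrets."""
    return f"projects/{project_id}"


def upload_secret(project_id, secret_id, path):
    """Create a secret with automatic replication and add the file as its first version."""
    # Imported here so the helpers above work without the client library installed.
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    secret = client.create_secret(
        request={
            "parent": secret_parent(project_id),
            "secret_id": secret_id,
            "secret": {"replication": {"automatic": {}}},
        }
    )
    with open(path, "rb") as f:
        payload = f.read()
    version = client.add_secret_version(
        request={"parent": secret.name, "payload": {"data": payload}}
    )
    return version.name


def upload_all(project_id):
    """Upload every file listed in SECRETS and return the new version names."""
    return [upload_secret(project_id, sid, path) for sid, path in SECRETS.items()]
```

For example, `upload_all("gcp-poc-playground")` would create the three secrets in the project that the scripts in this tutorial read from.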
GPG Encryption
Let's head over to encryption. For this purpose, we place a few files in a GCS bucket.
Here we've got an encryption script prepared for Cloud Shell, leveraging the strong key pair we uploaded to the secret manager. Let's explore the Python script.
from google.cloud import secretmanager, storage
import os
import tempfile
import gnupg

object_keys = ["source/mock_data.csv", "source/some_data.csv"]
bucket_name = "automlops-poc"

def get_secret(project_id, secret_id, version_id='latest'):
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"
    response = client.access_secret_version(request={"name": name})
    # The payload arrives as bytes; decode it to text before handing it to python-gnupg.
    return response.payload.data.decode('utf-8')

recipients = get_secret("gcp-poc-playground", "py-key-fingerprint").strip()
public_key = get_secret("gcp-poc-playground", "py-public-key")
passphrase = get_secret("gcp-poc-playground", "py-key-passphrase")

# One storage client and bucket handle serve all objects.
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)

with tempfile.TemporaryDirectory() as gnupghome:
    gpg = gnupg.GPG(gnupghome=gnupghome)
    import_result = gpg.import_keys(public_key, passphrase=passphrase)
    gpg.trust_keys([x["fingerprint"] for x in import_result.results], 'TRUST_ULTIMATE')
    for obj in object_keys:
        file_name = obj.split('/')[-1]
        print(file_name)
        file_path = os.path.join(gnupghome, file_name)
        print(file_path)
        # Download the source object into the temporary directory.
        source_blob = bucket.blob(obj)
        source_blob.download_to_filename(file_path)
        # Encrypt it for the key identified by the fingerprint.
        enc_file_path = f"{gnupghome}/{file_name}.gpg"
        with open(file_path, 'rb') as f:
            status = gpg.encrypt_file(f, recipients=recipients, output=enc_file_path)
        print(status.status)
        # Upload the encrypted copy next to the original object.
        destination_blob_name = obj + '.gpg'
        upload_blob = bucket.blob(destination_blob_name)
        upload_blob.upload_from_filename(enc_file_path)
        print(f"File {enc_file_path} uploaded to gs://{bucket_name}/{destination_blob_name}")
At the beginning, we specify the bucket and the objects we want to encrypt. To obtain the key, fingerprint and passphrase, we have a function that pulls the content stored in the secret manager. When we created the key, we protected it with a passphrase. Let's add this passphrase to the secret manager as well, since the script passes it along when it imports the encryption key.
# start python
python3
# define the passphrase and write it to a file
passphrase = "whatever_245"
with open("passphrase.txt", "w") as f:
    f.write(passphrase)

# exit python: Ctrl + D
gcloud secrets create py-key-passphrase \
--replication-policy="automatic"
gcloud secrets versions add py-key-passphrase \
--data-file="passphrase.txt"
The passphrase is now stored in the secret manager and can be used in the encryption procedure. Let's get back to exploring the Python code. Once the temp directory is created for gpg, the encryption key is imported. As Python iterates over the list of objects to encrypt, each object is downloaded to the gpg temp directory, encrypted, and uploaded back to the GCS bucket. Use Cloud Shell to run the Python script presented above and encrypt the specified files in the GCS bucket.
nano encrypt.py
# paste python script
# nano save: Ctrl + O, Enter
# nano quit: Ctrl + X
python3 encrypt.py
Files are now wrapped with the encryption key. Even if someone got their hands on them, it's just gibberish without the proper key to decrypt it.
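The encryption script prints `status.status` but carries on to the upload even if encryption failed. A small guard can stop the run instead. The sketch below relies on the `ok`, `status` and `stderr` attributes that python-gnupg result objects expose; the helper name `ensure_ok` is hypothetical, and the stand-in objects exist only so the example runs without GnuPG installed.

```python
from types import SimpleNamespace


def ensure_ok(result, action):
    """Raise if a python-gnupg style result reports failure, else return it."""
    if not getattr(result, "ok", False):
        detail = getattr(result, "status", "") or getattr(result, "stderr", "")
        raise RuntimeError(f"{action} failed: {detail}")
    return result


# Stand-ins for the objects returned by gpg.encrypt_file (illustration only).
good = SimpleNamespace(ok=True, status="encryption ok", stderr="")
bad = SimpleNamespace(ok=False, status="invalid recipient", stderr="")

ensure_ok(good, "encrypt")  # passes silently
try:
    ensure_ok(bad, "encrypt")
except RuntimeError as err:
    print(err)  # encrypt failed: invalid recipient
```

In the script above, the call would sit right after `gpg.encrypt_file(...)`, for example `ensure_ok(status, f"encrypting {file_name}")`, so that a failed encryption is never followed by an upload.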
GPG Decryption
If you need to access the secret data, it's time to decrypt. Here is the decryption Python code to run from Cloud Shell. Let's have a look:
from google.cloud import secretmanager, storage
import os
import tempfile
import gnupg

def get_secret(project_id, secret_id, version_id='latest'):
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"
    response = client.access_secret_version(request={"name": name})
    # The payload arrives as bytes; decode it to text before handing it to python-gnupg.
    return response.payload.data.decode('utf-8')

private_key = get_secret("gcp-poc-playground", "py-private-key")
passphrase = get_secret("gcp-poc-playground", "py-key-passphrase")
object_keys = ["source/mock_data.csv.gpg", "source/some_data.csv.gpg"]
bucket_name = "automlops-poc"

# One storage client and bucket handle serve all objects.
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)

with tempfile.TemporaryDirectory() as gnupghome:
    gpg = gnupg.GPG(gnupghome=gnupghome)
    import_result = gpg.import_keys(private_key, passphrase=passphrase)
    gpg.trust_keys([x["fingerprint"] for x in import_result.results], 'TRUST_ULTIMATE')
    for obj in object_keys:
        file_name = obj.split('/')[-1]
        print(file_name)
        file_path = os.path.join(gnupghome, file_name)
        print(file_path)
        # Download the encrypted object into the temporary directory.
        source_blob = bucket.blob(obj)
        source_blob.download_to_filename(file_path)
        # Decrypt it with the imported private key and its passphrase.
        dec_file_path = f"{gnupghome}/{file_name.split('.gpg')[0]}"
        with open(file_path, 'rb') as fh:
            status = gpg.decrypt_file(fh, passphrase=passphrase, output=dec_file_path)
        print(status.status)
        # Upload the decrypted copy under the original object name.
        destination_blob_name = obj.split('.gpg')[0]
        upload_blob = bucket.blob(destination_blob_name)
        upload_blob.upload_from_filename(dec_file_path)
        print(f"File {dec_file_path} uploaded to gs://{bucket_name}/{destination_blob_name}")
First we have a function that pulls content from the secret manager, followed by the private key, the passphrase, and the list of objects to decrypt. Once the temp directory for gpg is created, the decryption key is imported using the passphrase. Further down, the list of objects to decrypt is iterated: each object is downloaded to the temp directory, decrypted, and uploaded back to GCS. Let's get to the cloud shell and see how it works.
nano decrypt.py
# paste python script
# nano save: Ctrl + O, Enter
# nano quit: Ctrl + X
python3 decrypt.py
The original csv files are back, safe and sound. Remember, this only worked because we have the corresponding private key to unlock the encryption.
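The decryption script derives output names with `split('.gpg')[0]`, which would also cut a name at a `.gpg` that appears in the middle, for example `keys.gpg.backup.csv.gpg`. A slightly stricter sketch removes the extension only when it is the actual suffix. The helper name `strip_gpg_suffix` is hypothetical.

```python
def strip_gpg_suffix(name, suffix=".gpg"):
    """Return name without a trailing suffix; reject names that lack it."""
    if not name.endswith(suffix):
        raise ValueError(f"{name!r} does not end with {suffix!r}")
    return name[: -len(suffix)]


print(strip_gpg_suffix("source/mock_data.csv.gpg"))  # source/mock_data.csv
print(strip_gpg_suffix("keys.gpg.backup.csv.gpg"))   # keys.gpg.backup.csv
```

Raising on names without the suffix is a design choice: it keeps the script from overwriting an object that was never encrypted in the first place.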
Summary
Encrypting and decrypting your GCS files with GPG is a powerful way to enhance your cloud security. It adds a vital layer of protection to your sensitive data, giving you peace of mind. The generated keys, fingerprint and passphrase can be safely stored in GCP Secret Manager.