Python, Postgres, and People


So here's the situation: I have a web app I wrote for Era Alaska using a python backend backed by a PostgreSQL database. The python app uses psycopg2 to create a pool of connections to the Postgres DB. When a user logs into the app and needs to do something with the database, they pull from that pool of connections, do whatever they need to, and then return the connection. All communication between the user and the db is handled by my application, Postgres never knows what end user is accessing it - only the python app knows that.

Recently, I got a requirement to log all changes to database records in the database, so we can hound help users who mess things up are having difficulties. No problem I thought. I'll simply make a table to store changes, and add an ON UPDATE trigger to populate the table with the change made and who made it. Only problem is, Postgres doesn't KNOW who made the change, since that is all being vetted through the application. So how did I solve this issue? I used a combination of temporary tables in postgres and python's with statement.

Introducing with

For those of you who may not know, the with statement in python is, in simple terms, a way to instantiate and use an object while making sure that appropriate setup/clean up actions take place when instantiating/done using the object, regardless of how the code block exits. You may have seen it used in the following way:

with open(filename) as myfile:
<do whatever with the file here>

What this does is open a file object and assign it to myfile. You can then use the myfile object just like you would any file object. When the code block exits, be it because you come to the end of the code block, something throws an exception, or whatever, the file is then closed. There is no need to call my file.close(), it is handled by the with construct. Of course, this isn't limited to python built-in functions, you can write your own class to use with this construct. The key functions for the class to have are the __enter__ and __exit__ functions, which are run at the obvious times. You can of course also have a typical __init__ function to provide additional setup/process arguments.

In my case I started by making an __enter__ function that got a connection from the psycopg connection pool, and an __exit__ function that returned it to the pool. This already was a huge help, as I no longer had to worry about remembering to return the connection whenever I got a fatal (to my code) exception, or an unhanded exception causing a connection to not be returned, eventually exhausting the connection pool.

Bring in the temps

This is all well and good, you may say, but how does this help Postgres know who is making a change? That's where Postgres's temporary tables come in. I simply added the following line to my __enter__ function:

cursor.execute("CREATE TEMP TABLE userinfo (fullname, empnum) ON COMMIT DROP AS values (%s,%s)", (fullName, userID))

where cursor is the database cursor I have already retrieved to return to the user. This creates a temporary table for that connection that will be dropped as soon as the transaction completes. I can, of course, put whatever data I want in it, such as client IP or operating system, as long as it is known to my python app. Since this table is only available from within the current transaction, various users can be making edits at the same time with each having their own temporary table containing their user data. Since it is in the __enter__ function of the class, I know the table will exist whenever a user tries to modify a record. And since it IS a table, Postgres can now obviously access that data. So all I have to do to store the user in my edits table is to edit my trigger to query the temp table for the username. Problem solved!

Caveat

Of course, an astute reader may have noticed one little problem: an "ON UPDATE" trigger is going to run whenever a database record is updated - even if the update was not triggered by my application, or my application caused an update without calling the with statement. In this event the temporary table may well have not been created when the trigger is fired, leading to an execution error. To deal with this, I simply wrapped the temporary table SELECT statement in an exception handling block within my trigger, such that if I get an error trying to query the table, the username is set to current_user instead. Perhaps not quite as informative (depending on how the edit was made), but at least the change is still recorded and the database does not throw any errors.

Summary

So, to summarize the above, I use python's with statement to create a temporary table in postgresql whenever someone requests a database cursor, and use the data in said temporary table when my on update trigger fires to populate the change record. Make sense?

Code

This is the class I wrote for use with the with statement. It's a bit more complicated than described above - note that I pull from a different psycopg2 pool depending on if the user needs to write to the database or simply read, and I only create the temporary table if they are planning to write. The user information that I store in the temporary table is also coming from my cherrypy session object in this case, but obviously could come from anywhere.

class dbCursor:
def __init__(self, mutable="read", cursor_factory=None):
self.factory = cursor_factory
self.mutable = mutable

def __enter__(self):
if(self.mutable == "write"):
self.pool = flWritePool
else:
self.pool = flReadPool

self.dbConn = self.pool.getconn()

if self.factory:
cursor = self.dbConn.cursor(cursor_factory=self.factory)
else:
cursor = self.dbConn.cursor()

try:
userID = cherrypy.session.get('UserID', None)
fullName = cherrypy.session.get('FullName', None)
except AttributeError:
#not in a cherrrypy session
userID = None
fullName = None

if self.mutable == "write" and userID:
#Don't bother with temp table creation if we don't have a user ID or aren't getting a write cursor
try:
cursor.execute("CREATE TEMP TABLE userinfo (fullname, empnum) ON COMMIT DROP AS values (%s,%s)", (fullName, userID))
except psycopg2.DataError as e:
cherrypy.log("Unable to create user table"%userID)

return cursor

def __exit__(self, type, value, traceback):
self.pool.putconn(self.dbConn)

This is my on-update trigger. It is heavily based off of the code found athttp://wiki.postgresql.org/wiki/Audit_trigger. Note however the extra begin…end block at the beginning of the outer begin block where I pull the data from the temporary table.

CREATE OR REPLACE FUNCTION changeaudit() RETURNS TRIGGER AS $body$
DECLARE
v_old_data json;
v_new_data json;
userName text;
BEGIN
BEGIN
SELECT empnum FROM userinfo INTO userName;
EXCEPTION
WHEN OTHERS THEN
userName:=current_user;
END;

IF (TG_OP = 'UPDATE') THEN
v_old_data := row_to_json(OLD);
v_new_data := row_to_json(NEW);
INSERT INTO edits (tablename,username,action,old,new,query,recorded)
VALUES (TG_TABLE_NAME::TEXT,userName,substring(TG_OP,1,1),v_old_data,v_new_data, current_query(),NEW.id);
RETURN NEW;
ELSIF (TG_OP = 'DELETE') THEN
v_old_data := row_to_json(OLD);
INSERT INTO edits (tablename,username,action,old,query,recorded)
VALUES (TG_TABLE_NAME::TEXT,userName,substring(TG_OP,1,1),v_old_data, current_query(),OLD.id);
RETURN OLD;
ELSIF (TG_OP = 'INSERT') THEN
v_new_data := row_to_json(NEW);
INSERT INTO edits (tablename,username,action,new,query,recorded)
VALUES (TG_TABLE_NAME::TEXT,userName,substring(TG_OP,1,1),v_new_data, current_query(),NEW.id);
RETURN NEW;
END IF;
END;
$body$
LANGUAGE plpgsql;


Israel Brewster 2011-2016