Issue 9 - Database alchemy
Welcome to the next issue of the Spaghetti Code newsletter, in which together we will develop a fully functional accounting practice automation system using Python, from scratch.
In the previous issues, we did some preliminary set up and ended up with an initial GitHub repo for our project here: https://github.com/jordan-dimov/bean-total-public
Next, we'll begin to flesh out our tech stack. Any business software system needs a reliable database and a way to interact with it from the various components in a safe and consistent way.
For our accounting automation software system we will choose a traditional Relational SQL database. Luckily, the best SQL database on the market is freely available and has been battle proven for decades - it's called PostgreSQL.
Begin by downloading and installing it locally, then configure local access and create a local database user. The steps for this may vary depending on your operating system, but you can consult the documentation or any of the thousands of online articles and videos explaining how to do this. You may choose to also install a graphical user interface for managing your databases (such as PgAdmin), or - like me - stick to the default command-line utility called `psql`.
Note that when you install PostgreSQL, what you actually get is an entire RDBMS - a Relational Database Management System. With that, you can create and manage multiple databases - e.g. one for each project you are working on.
So go ahead and create a database for our project. Make sure it is owned by the database user you created. The command I use for this is:
$ createdb -U nobody -W beantotal
Make sure to remember the password you assigned to the database user. You can now construct a URI string for connecting and authenticating to this newly created database, and set is as an environment variable:
$ export PGDB_URI='postgresql://myUser:myPassWord@localhost/beantotal'
You can use this URI to connect to the new DB using the PostgreSQL CLI: `psql $PGDB_URI`
But we want to interact with the database from our Python code, so let's use Poetry to install the Python driver for PostgreSQL:
$ poetry add 'psycopg[binary]'
As of the writing of this newsletter, this will instal the Psycopg3 library, which has new async support. This means, if a SQL query is likely to take a long time, it doesn't have to block all the rest of our Python code - we can still do other things "asynchronously" (i.e. in a non-blocking way.)
We can now use this driver directly to interact with the database from Python code:
import os
import psycopg
DATABASE_URL = os.environ["PGDB_URI"]
conn = psycopg3.connect(DATABASE_URL)
cursor = conn.cursor()
cursor.execute("SELECT version();");
ver = cursor.fetchone()
print(ver)
This creates a connection to the PostgreSQL server, gets a "cursor" (a mechanism for executing SQL commands) and asks the server for the version of PostgreSQL it is running. When I run the above code, I get a response like this: `PostgreSQL 14.7 (Ubuntu 14.7-0ubuntu0.22.10.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0, 64-bit`
(The code above is just for testing and exploring. It will not be a part of our project. If you follow me, you know I really frown upon code like that! Why?)
That's nice, but when we're building business software, we don't want to manually construct all the SQL within our Python code. Instead, we want a way to model business concepts and their relationships as Python objects - and, ideally, something else would generate the relevant SQL to persist, load and manage such objects in the database. This "something else" is called an Object-Relational Mapper (ORM). It maps Python objects to relational database SQL statements - and vice versa.
There are many different ORM systems for Python. The most popular one is called SQLAlchemy. I personally don't like it - I think it's messy, bloated and with unbelievably terrible documentation. But we'll use it anyway, because unfortunately, the alternatives tend to have their own significant shortcomings. (For the record, my favourite Python ORM is the Django ORM - but for this project we'll be using FastAPI instead of Django, so that's not really an option.)
Let's get this monstrosity installed:
$ poetry add sqlalchemy
Now we have everything necessary to start building database-intensive business software in Python.
Here's how we can do the DB version test, using an SQLAlchemy engine, instead of accessing the Psycopg3 driver directly:
import os
from sqlalchemy import create_engine
DATABASE_URL = os.environ['PGDB_URI'].replace('postgresql://', 'postgresql+psycopg://')
engine = create_engine(DATABASE_URL)
conn = engine.raw_connection()
cursor = conn.cursor()
cursor.execute("SELECT version();");
ver = cursor.fetchone()
print(ver)
Note that we had to put in a little hack to slightly modify the DB URI string, so SQLAlchemy knows we want it to use the new Psycopg3 driver (by default, it tries to use the older Psycopg2 driver, which we don't have installed.)
领英推荐
But again, that's just for testing. Instead, in our project, we'll add something like the following code to a file called `database.py` Later, our FastAPI app will use this every time it needs a database connection:
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
import os
SQLALCHEMY_DATABASE_URL = os.getenv("PGDB_URL").replace('postgresql://', 'postgresql+psycopg://')
engine = create_engine(
??? SQLALCHEMY_DATABASE_URL,
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()
def get_db():
db = SessionLocal()
try:
yield db
finally:
db.close()
This creates a reusable DB session factory and a FastAPI DB dependency. It also defines a SQLAlchemy Base class, from which all of our database models (Python objects that need to be persisted in the DB) will inherit. This is what allows the object-to-SQL mapping to work behind the scenes for us.
Let's illustrate this with a basic User model - which we want to store in the database with a unique ID, an email address and a password hash:
from sqlalchemy import Column, Integer, String
from src.database import Base
class User(Base):
??? __tablename__ = "users"
??? id = Column(Integer, primary_key=True, index=True)
??? email = Column(String, unique=True, index=True)
??? hashed_password = Column(String)
We define User as a normal Python class, but with a few hints for SQLAlchemy:
* We inherit from the SQLAlchemy Base class, which we created earlier. All of our DB models will do this.
* We add the special `__tablename__` attribute, which instructs SQLAlchemy which DB table to use for storing instances of this class.
* We initialise our custom class attributes (id, email, hashed_password) using a library of types provided by SQLAlchemy, in order to instruct it how to map each attribute to an SQL column.
In the next issue, we'll start thinking more strategically about the data model for our accounting automation system and we'll define some more models. We'll also see how SQLAlchemy can automatically create the DB tables for our models and how we can work with instances of these objects in pure Python, without writing any SQL - the ORM will be saving and loading their attributes from the DB as needed.
Here are some suggested areas for further self-study before the next newsletter issue:
The key is to get exposure to core concepts and best practices around relational databases, SQL, and ORMs in Python. Hands-on exploration of PostgreSQL itself is also very useful.
Would you like to boost your career opportunities as a software engineer with Python? My personalized 1-on-1 training program can help you gain the skills you need to land a high paying Python developer role or contract.
Unlike generic coding bootcamps, I tailor the curriculum just for you based on your background and goals. Together we'll determine where you are now and where you want to be - then design a custom learning path to get you there. I will be available to guide you and answer your questions every step along the way.
In addition to mastering the principles of software engineering using Python, you'll learn:
You will also get access to a private Slack channel with a small but helpful community of my former and current students - as well as my personalised advice on how to get set up for high-rate contracting, and a referral to a specialised accountant.
This is a 3-month on-line intensive training course and the cost is £3800. To schedule an initial consultation, please email me your details at [email protected]