What are the first things you do when you start a new Python / Django project?
Do you start it from scratch with django-admin startproject and jump right into coding your requirements?
Do you start by selecting some essential libraries that will help your project down the road?
Whatever way you decide to start your projects, it's always good to keep in mind that you may not be around in the future to maintain the code and deliver new requirements; you may already be working on a new project or at another company. No one knows what the future holds for us.
How many times have you found yourself going back to code written months ago and feeling bad about what you see (while also feeling good about how much you've grown in that period)? There are so many ways you could do it differently now, or maybe the logic is hard to follow for any number of reasons.
There are some things you can do to improve your overall code quality and the standards followed by the team, and we'll go through them here.
Documentation
Anyone who has ever been thrown into the middle of an ongoing project, with lots going on and a shortage of resources, has probably missed having good project documentation describing the main points of the project. I would say the bare minimum, sketched right after this list, would be:
A title for your project
A short description of the project and what the repo is aiming to solve
An “Install” section describing, step by step, how to get all the libraries needed to run the project locally
A “Usage” section containing the main available commands, with code blocks
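As a rough illustration (the project name and commands below are placeholders, not a prescription), even something this small already helps:

```markdown
# my-project

One-paragraph description of what the service does and the problem it solves.

## Install

    pipenv sync --dev    # or: pip install -r requirements.txt

## Usage

    ./manage.py migrate
    ./manage.py runserver
```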
One day, while searching for a way to write more standardized README files, I found this great repo: RichardLitt/standard-readme
If you have a chance, I certainly recommend reading the spec available in the standard-readme repo, or maybe going through the examples, and starting to introduce it in your next project.
Package management
How do you and your team install the packages in your projects? Pip? Pipenv? Poetry? Conda?
Have you ever faced issues when running an old project and gotten stuck installing the libraries?
With so many options out there you can choose the one you and your team feel more comfortable with.
I recommend going with a tool that generates a lock file for the installed packages, and making sure you use that lock file when installing the dependencies. This gives you a consistent build, since the lock file pins your whole dependency tree, not just your direct dependencies.
The lock file will be a great ally when a project that has been deployed and hasn't been updated in a while suddenly needs a code change without updating its dependencies. If you work with Docker and the machine building the images doesn't have a cached layer with the dependencies already installed, and you simply install the packages without the lock file, an update anywhere in your dependency tree may give you unexpected behavior or crashes.
Personally, I’m satisfied using pipenv.
I use pipenv sync --system in my Docker build process and don't keep pipenv available after it.
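As a minimal sketch of that idea (the base image, paths and final command are just examples), the build installs from Pipfile.lock and then drops pipenv itself:

```dockerfile
# Dockerfile (sketch): install dependencies from the lock file, then remove pipenv
FROM python:3.9-slim
WORKDIR /app
COPY Pipfile Pipfile.lock ./
RUN pip install pipenv \
    && pipenv sync --system \
    && pip uninstall -y pipenv
COPY . .
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
```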
When working locally, consider using pyenv to install and choose the Python version for your projects. As new versions are released and you maintain other projects, you can switch versions in an easy and simple way.
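For example (the version number is only illustrative):

```bash
pyenv install 3.9.13    # install the interpreter once
pyenv local 3.9.13      # writes .python-version so this project always uses it
```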
Docker
What an amazing tool Docker is. With a couple of lines you can have your project working in a built image that can be distributed and run anywhere.
Depending on your project size, or any storage constraints, you may want to trim down your images.
If that's the case, consider using the -alpine images; your final images will be much smaller. Note that system packages will then be installed with apk instead of apt-get.
If you want a small image but would rather stick to a Debian-based one, try out the -slim images.
Tip: if you're leveraging pandas in your application, you may want to just go with the -slim image and avoid a looong build process with -alpine.
Anyway, ensuring you have a Dockerfile or a published image of your project makes it much easier for other people to use and possibly contribute to your projects.
The next step: if your project requires a database, a queue, a caching service, or just needs to communicate with other services, consider making a docker-compose.yml available. It's another tool that does magic with a few lines and is easy to write, read and understand.
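A minimal sketch of what that could look like (the service names, image tag and port are assumptions, not a prescription):

```yaml
# docker-compose.yml (sketch)
version: "3.8"
services:
  web:
    build: .
    env_file: .env            # override defaults locally without touching the compose file
    ports:
      - "8000:8000"
    depends_on:
      - db
  db:
    image: postgres:13-alpine
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-postgres}
```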
Think carefully about the environment variables you make available: does it make sense in your environment / project to expose environment variables with the default values you have in the project?
Consider using a .env file to override some key variables you make available in your project.
Depending on your project size, you may also want to split the Docker Compose configuration into separate files so each one is more concise and easier to manage. If that's the case, it may be time to write some shell scripts to spin up the stack and keep track of what's running.
Take some time to read about the different ways you can deal with environment variables in Docker Compose here:
Environment variables in Compose (docs.docker.com)
If you have dependencies across projects, consider leveraging the power of external networks in your compose files.
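A small sketch of the idea, assuming another project has already created a network called shared_backend:

```yaml
# docker-compose.yml (sketch): join a network created by another project
services:
  web:
    build: .
    networks:
      - shared_backend

networks:
  shared_backend:
    external: true
```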
Take into consideration the developer experience you'll be sharing with your teammates: the build time, the number of images, and the steps your team members will need to go through when spinning up the stack.
Remember that if you're using Docker / Docker Compose, you're aiming to save time and avoid the inherent complexities of working across platforms.
Linting
If you work on several projects, you may have noticed different linters being used, each setting up different rules about how the code should look. Maybe you've received comments about your coding style in some PRs, or simply compared pieces of code and felt the good and the bad in each…
Code quality matters for keeping code consistent across different projects and teams in the company. Find out where the coding style isn't being followed, correct it, and leverage tools that make that job easier.
This is another topic with lots of options available out there, like pycodestyle, pyflakes, mccabe, pylint, pylama and others.
I'm fond of the combination of pylama, black and isort. Pylama bundles a bunch of tools together, and it's easy to use and set up.
I use black to automatically fix whatever it's able to, then pylama to verify if anything else should be changed, and finally isort to keep imports properly ordered.
Make sure to check the compatible configs in black's documentation to properly set your project up with these tools.
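As a sketch of such a setup (the line length and target version are just defaults to start from):

```toml
# pyproject.toml (sketch)
[tool.black]
line-length = 88
target-version = ["py39"]

[tool.isort]
profile = "black"   # keeps isort's output compatible with black
```

pylama, in turn, reads its configuration from a [pylama] section in setup.cfg or pylama.ini.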
It doesn't hurt to read PEP 8, the Style Guide for Python Code, a couple of times.
I'll also say that as long as the tool(s) you and your team choose to perform this work are doing a good job, those are tools you want to continue using.
Security checks
Making sure the projects we are deploying are as safe as possible is one of our responsibilities. Besides careful planning while architecting and implementing your project, there are tools that can help you check your code for possible security issues.
Consider a combination of bandit and safety for your Static Application Security Testing (SAST).
Bandit can help you find security issues in your code, while safety will check your dependencies.
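In their simplest form (the paths and options are just an example), the checks look like this:

```bash
pip install bandit safety
bandit -r myproject/    # scan your own code for common security issues
safety check            # check installed dependencies against a vulnerability database
```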
And there's no reason to stop there. The Open Web Application Security Project (OWASP) provides the following list of Vulnerability Scanning Tools for Dynamic Application Security Testing (DAST).
Vulnerability Scanning Tools (owasp.org)
The OWASP community even has its own open source scanner, Zed Attack Proxy (ZAP).
Depending on your project and company, it may be advisable to have penetration testing executed by a third party against your product to identify further possible threats.
And before you run any DAST checks, you may want to check your security headers (if you prefer to read more about them in an OWASP slide deck, click here).
I wish you a safe deploy :)
Git hooks
You are happily coding the next feature of the product, committing working code bit by bit; you're done for the day and you push it to the repo.
By setting up git hooks, you can have certain scripts triggered by git actions, like git commit or git push.
A good tool to leverage is pre-commit.
It's easy to install and set up in your project; consider which checks you want executed on every commit / push, such as (a sample config follows this list):
code formatting
isort
linting
security checks
unit tests
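A sketch of a .pre-commit-config.yaml along those lines (the rev values should be pinned to whatever versions you actually use):

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black
  - repo: https://github.com/pycqa/isort
    rev: 5.10.1
    hooks:
      - id: isort
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.4
    hooks:
      - id: bandit
```

After that, pre-commit install sets up the commit hook, and pre-commit install --hook-type pre-push does the same for pushes.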
Having git hooks in place helps you catch and fix issues earlier in the development process.
Continuous Integration (CI)
You are trying to release your product and the build is failing… or the build works but some features are now broken… is there anything we can do to reduce the number of times this happens?
There are also plenty of tools that help us with CI; some are Travis CI, Circle CI, GitLab CI, GitHub Actions, Jenkins, and many more.
At a minimum, ensure the same git hooks are executed on PRs opened against your main branch. Maybe the hooks aren't installed on everyone's machine, or they have been bypassed in order to get something in. If you and your team are committed to code quality and have agreed upon certain criteria for the code, this shouldn't be a big deal.
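As a sketch of that, assuming GitHub Actions (the Python version and commands simply mirror the tools mentioned above):

```yaml
# .github/workflows/ci.yml (sketch)
name: CI
on: [push, pull_request]
jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.9"
      - run: pip install pipenv pre-commit && pipenv sync --dev --system
      - run: pre-commit run --all-files   # same checks as the local git hooks
      - run: pytest
```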
There are other tools that can help you identify issues, vulnerabilities, possible refactorings, and more.
Consider taking a look at:
SonarQube
GuardRails
SonarCloud
GitGuardian
Coveralls
Snyk
CodeClimate
Your decision should weigh what's important to the team and the company for the project you're developing.
My recommendation here, just as with linting, is that the team do some research on the available tools and select one or a few; if the tool is doing its job successfully, perfect!
You may want to start with a minimum agreed-upon set of tools and expand as you see fit, to ease adoption.
Consider running your integration and end-to-end tests in this process as well; this can help you catch issues before you start the release process.
Documenting your API
You need to use an API that people tell you exists in your company, but there's no documentation for it; you have to figure it out, ask around, and go through a lot of back and forth before delivering that cool feature. Or maybe someone is asking how to use that API you wrote a while back, and you and the team keep forwarding emails with curls here and there.
If you've been there, you know how inconvenient this is and how important it is to document your APIs consistently.
A good README.md isn't enough; it's important to document your endpoints in a consistent way across projects. The intent is to know what the available endpoints are, the HTTP verbs and expected response structures, and what kinds of errors one can expect when using the API, so they can be handled on the consumer side.
It's good to read about and consider a Consumer-Driven Contracts approach; it can help parallelize development efforts, and you'll know what your consumer is expecting.
There are two great tools that can assist us with documenting our endpoints: API Blueprint and OpenAPI (originally known as Swagger).
OpenAPI is more broadly used, and API Blueprint seems easier to read and write. Both of them have a good set of tools available that can help you.
I'd recommend giving API Blueprint a chance, using dredd to validate your contracts against the actual endpoints (a good idea to have in the CI process), snowboard or aglio to beautifully render your contracts, and drakov to spin up a mock server and test out your contracts even before coding the endpoint.
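To give a feel for the format, a tiny API Blueprint contract might look like this (the resource is made up for illustration):

```
FORMAT: 1A

# Orders API

## Orders Collection [/orders]

### List Orders [GET]

+ Response 200 (application/json)

        [
            {"id": 1, "status": "pending"}
        ]
```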
That said, I'm sure you'll be well served by whatever choice you make here.
Test cases
Unit tests, integration tests and end-to-end tests…
We'll be focusing on recommendations for unit tests and integration tests for your project.
Firstly, ensure you have established a structure to your test cases.
Some common approaches are:
filename.py with its test cases in a test_filename.py in the same directory, or a tests.py in that directory
A tests folder at the root directory of the project, and the test files following the structure of your project
I've found the second approach ends up being more scalable as your project grows; it stays manageable.
Consider using pytest and checking out the available plugins, like pytest-django and pytest-cov.
It's great to have a coverage report of your tests, with pytest-cov, coverage.py or some other tool, giving you a glance at how things are being covered. But it's way more important to test your code in intelligent ways that ensure the functionality works as expected than to just watch the coverage go up.
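A sketch of the setup, assuming a Django project called myproject (the module name is a placeholder):

```ini
# pytest.ini (or the [tool:pytest] section of setup.cfg)
[pytest]
DJANGO_SETTINGS_MODULE = myproject.settings
addopts = --cov=myproject --cov-report=term-missing
```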
A few more tools that may help you in your testing (a short test sketch follows this list):
Model Bakery (can help you with fixtures, testing data for your tests)
Freezegun (if you need to work with time in your tests, this is really handy)
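A short sketch of both in action (the shop.Order model and its created_at field are hypothetical):

```python
from datetime import datetime, timezone

import pytest
from freezegun import freeze_time
from model_bakery import baker


@pytest.mark.django_db
@freeze_time("2021-01-01")
def test_order_gets_creation_timestamp():
    # baker.make creates and saves an instance, filling in the required fields
    order = baker.make("shop.Order")  # hypothetical app.Model
    assert order.created_at == datetime(2021, 1, 1, tzinfo=timezone.utc)
```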
Also, check out the "Testing Pyramid" in the article below.
Just Say No to More End-to-End Tests, by Mike Wacker (testing.googleblog.com)
As a good first guess, Google often suggests a 70/20/10 split: 70% unit tests, 20% integration tests, and 10% end-to-end tests. The exact mix will be different for each team, but in general, it should retain that pyramid shape.
Badges
Adding some badges to your README.md file can help contributors easily identify certain aspects of your project.
Badges in a Github Repository
It's visually appealing and you can have relevant information in the badges, like:
Build status
Code coverage
Latest released version
Project license
Code quality rating
Amount of vulnerabilities
Technical debt percentage
Social platforms
Dependencies/libraries status
+ much more
A great place to find badges is https://shields.io/
Changelog
Make sure to maintain a CHANGELOG.md as well. It's a simple Markdown file with a certain format where you write down changes made to the project, new features, fixes, things that have been deprecated, etc.
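For illustration, a small excerpt in that format (the versions, dates and entries are made up):

```markdown
# Changelog

## [Unreleased]

## [1.1.0] - 2021-06-01
### Added
- Endpoint to export orders as CSV

### Fixed
- Timezone handling in report generation

## [1.0.0] - 2021-03-15
### Added
- Initial release
```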
Following SemVer and keeping a changelog, and knowing what was introduced to the project and when, really pays off. For example, imagine you wish to upgrade a package's version: you may want to check what happened between versions, verify whether there was any breaking change, and see whether there's an upgrade guide you need to go through. Sometimes, if you only see a change in the PATCH (MAJOR.MINOR.PATCH), you may not even bother, because the packages you're using are most likely following Semantic Versioning and you understand there was nothing major that would cause you headaches when updating.
Anyone will be aware of what has been introduced and will be able to make use of the changes. It helps business and developers. It helps everyone.
Check out the format and read more about it at https://keepachangelog.com/en/1.0.0/
It may be worth taking a look at semantic-release to help your project even further.
Settings
I sincerely recommend keeping all settings in one file and changing the values of the variables using environment variables.
There's a great package called python-decouple that you can install and use with a .env file to set your environment variables. With it, you can force some environment variables to be required (no default values), parse port numbers to integers, or use the Csv helper to cast comma-separated environment variables into lists easily.
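A sketch of what that looks like in a Django settings.py (the variable names are just examples):

```python
from decouple import config, Csv

SECRET_KEY = config("SECRET_KEY")  # required: no default, fails loudly if missing
DEBUG = config("DEBUG", default=False, cast=bool)
DATABASE_PORT = config("DATABASE_PORT", default=5432, cast=int)
ALLOWED_HOSTS = config("ALLOWED_HOSTS", default="", cast=Csv())
```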
In case you commit a .env with default values to make it easier for the team to set the project up, I'd recommend adding ".env" to the .dockerignore file so your sample values never end up in your images.
Clean up the installed apps and any middleware you don't need.
Tip: if you do have a sample .env committed in your project and usually have to modify it locally but can't push the updated version, make sure to check the following commands:
git update-index --skip-worktree <filename>
git update-index --no-skip-worktree <filename>
By running the first one, any changes you make to the file won't show up. If the file does change upstream and conflicts happen, you can run the second one to bring it back to the index and get the changes, fix the conflicts, and run the first one again.
Read more about it in this nice article.
.gitignore / .dockerignore
Ever cloned a repository and noticed a lot of files that weren't needed or shouldn't be there?
Prevent these files from ever being added to your git history by listing them in a .gitignore file.
I usually use this website to get a good .gitignore when I start the project:
gitignore.io (www.toptal.com)
I usually type: macOS, Vim, VisualStudioCode, Python, Django, depending on the project and team.
Now, talking about the .dockerignore: I recommend always having one, and ignoring at least the .env, .git, tests, build artifacts and cache files so they don't end up in the image.
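A starting point along those lines (adjust the entries to your project layout):

```
# .dockerignore (sketch)
.env
.git
**/__pycache__
*.pyc
tests/
build/
```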