
Tired of Manual Migration and Seeding? Let's Learn This Automated Database Migration and Seeding!

Razaqa Dhafin Haffiyan

Software Engineer at Govtech Procurement

Published: April 14, 2020

Have you ever had to run migrations and seed data by hand every time you deployed a new feature or reset your database? I know how that feels, and it is very tiring. Trust me! After a long search for a simpler and less tiring way, I have found one. Don't worry, I will show you how to do it in this article. So, don't hesitate to read this article to the end!

Migration

Before we jump to the implementation, let's get to know migration better. So what does migration mean in software engineering?

Schema migration refers to the management of incremental, reversible changes and version control to relational database schemas.

A schema migration is performed on a database whenever it is necessary to update or revert that database's schema to some newer or older version. Migrations are performed programmatically by using a schema migration tool. For instance, Django provides two management commands that we run through manage.py: makemigrations and migrate. When we run makemigrations, Django compares the models with the existing migration files and writes new migrations for the changes it detects. Then we run migrate, which synchronizes the database state with the current set of migrations. After that, the whole database structure described by our models exists in our DBMS (database management system), such as MySQL, PostgreSQL, and many more.
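To make "incremental, reversible changes" concrete outside of Django, here is a toy sketch using only Python's built-in sqlite3 module. It is not how Django implements migrations; the table names, the MIGRATIONS list, and the schema_migrations bookkeeping table are all made up for illustration.

```python
import sqlite3

# Each migration pairs a forward ("up") statement with its reverse ("down").
# Names and tables are hypothetical.
MIGRATIONS = [
    ("0001_create_app_user",
     "CREATE TABLE app_user (id INTEGER PRIMARY KEY, email TEXT NOT NULL)",
     "DROP TABLE app_user"),
    ("0002_create_report",
     "CREATE TABLE report (id INTEGER PRIMARY KEY, title TEXT NOT NULL)",
     "DROP TABLE report"),
]


def applied(conn):
    """Return the names of the migrations already applied, in order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    rows = conn.execute("SELECT name FROM schema_migrations ORDER BY name")
    return [row[0] for row in rows]


def migrate(conn, target=len(MIGRATIONS)):
    """Move the schema forward or backward until `target` migrations are applied."""
    done = applied(conn)
    # Forward: apply every migration between the current version and the target.
    for name, up, _down in MIGRATIONS[len(done):target]:
        conn.execute(up)
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
    # Backward: revert the newest migrations first until the target is reached.
    for name, _up, down in reversed(MIGRATIONS[target:len(done)]):
        conn.execute(down)
        conn.execute("DELETE FROM schema_migrations WHERE name = ?", (name,))
    conn.commit()
    return applied(conn)
```

Calling migrate(conn) on an empty database applies both steps, and migrate(conn, target=1) afterwards reverts only the second one. Django's migrate command follows the same idea, but it records applied migrations in its own table and generates the SQL from the models.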

Seeding

Now that we know about migration, we should also know about database seeding. So, what does it mean?

Database seeding is a process in which an initial set of data is provided to a database when it is being installed.

Database seeding is very useful when we want to populate the database with the data we need during development. It is often an automated process that runs during the initial setup of an application. The data can be dummy data or necessary data such as an initial administrator account. It is also very useful when we need dummy data for testing: for example, to test a pagination feature that shows 20 items, we first have to add 20 records.

Implementation

To illustrate this, let's look at one of my website projects, Sistem Informasi Penilaian dan Evaluasi Praktikum, a website that helps students and lecturers of FISIP (Faculty of Social and Political Sciences) UI submit and score practical work (internship) reports. This project is being developed by my team, MK-PPL. The stack we use in this project is Django as the back-end framework, SQLite as the DBMS in local development, PostgreSQL as the DBMS on the staging server, GitLab as the VCS, and Heroku as the staging server. I will explain how to do automated migration in Django below.

Automated Migration in Django

The flow of this process can be divided into four steps: setting up the DBMS, creating all the models needed, running the migration, and building an automated script for migration.

Set Up DBMS

Before anything else, we should set up our DBMS, because later we will migrate the whole database structure to it. In my project, I use SQLite as the local DBMS. Setting up SQLite is quite easy. I only have to create an empty file named db.sqlite3 (a SQLite 3 database file). I created it inside my project folder (sip).

After that, I make sure that my settings file for development, dev.py (in the default configuration, the settings file is called settings.py), tells Django where the db.sqlite3 file is located with this script:
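A typical SQLite configuration looks roughly like the sketch below. It assumes that dev.py lives in a settings folder directly inside the project folder that holds db.sqlite3; the exact number of dirname calls depends on your own layout, so treat BASE_DIR here as an assumption rather than the project's actual value.

```python
import os

# Assumed layout: <project folder>/settings/dev.py, with db.sqlite3 in <project folder>.
# Two dirname() calls climb from dev.py to the project folder.
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

DATABASES = {
    "default": {
        # Django's built-in SQLite backend.
        "ENGINE": "django.db.backends.sqlite3",
        # Full path to the database file created earlier.
        "NAME": os.path.join(BASE_DIR, "db.sqlite3"),
    }
}
```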

With the steps above, I have finished setting up the DBMS in my local environment. But what about the staging environment?

The staging environment is pretty much the same. The only big difference is that I have to create my PostgreSQL database on our Heroku server. After logging in to the Heroku dashboard, go to the Resources tab and type Heroku Postgres in the search bar until it shows up in the list of add-ons.

After that, I click Heroku Postgres in the list and go to the Settings tab. Then, I reveal all the database credentials by clicking the "view credentials" button.

After that, I keep these credentials at hand because I have to add them to my staging.py (a customized settings.py for the staging environment). To set that up, I write this code:
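A minimal sketch of such a configuration is shown here. The environment variable names (DB_NAME, DB_USER, DB_PASSWORD, DB_HOST, DB_PORT) are hypothetical, and the sketch assumes the values from the .env file have already been loaded into the process environment by whatever tool the project uses.

```python
import os

DATABASES = {
    "default": {
        # Django's built-in PostgreSQL backend (requires psycopg2).
        "ENGINE": "django.db.backends.postgresql",
        # The credentials come from the "view credentials" page on Heroku.
        "NAME": os.getenv("DB_NAME"),
        "USER": os.getenv("DB_USER"),
        "PASSWORD": os.getenv("DB_PASSWORD"),
        # The second argument of os.getenv is the fallback used when the
        # variable is not set; only harmless values are used as fallbacks here.
        "HOST": os.getenv("DB_HOST", "localhost"),
        "PORT": os.getenv("DB_PORT", "5432"),
    }
}
```

Keeping secrets out of the fallback values means the settings file can be committed safely, while each environment supplies its own credentials.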

I add all my credentials to my .env file. I also put my credentials in the default value parameter of os.getenv because this system still has some problems with the .env file. I strongly recommend against doing this unless you have the same problem.

Also make sure that you have installed the psycopg2 library. If you have not, add it to your requirements.txt and run "pip install -r requirements.txt".

Creating All Models Needed

To illustrate this, I will give you an example from my authentication feature, where I created a customized User model.

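from django.contrib.auth.models import AbstractBaseUser, BaseUserManager, PermissionsMixin
from django.contrib.auth.validators import UnicodeUsernameValidator
from django.db import models
from django.utils import timezone

# Role is an Enum of user roles defined elsewhere in this project (not shown here).


class UserManager(BaseUserManager):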
    """
    creating a manager for a custom user model
    https://docs.djangoproject.com/en/3.0/topics/auth/customizing/#writing-a-manager-for-a-custom-user-model
    https://docs.djangoproject.com/en/3.0/topics/auth/customizing/#a-full-example
    """


    def create_user(self, username, email, password=None, **extra_fields):
        """
        Create and return a `User` with an email, username and password.
        """
        user = self.model(
            username=self.model.normalize_username(username),
            email=self.normalize_email(email),
            **extra_fields
        )
        user.set_password(password)
        user.save(using=self._db)
        return user


    def create_superuser(self, email=None, password=None, **extra_fields):
        """
        Create and return a `User` with superuser (admin) permissions.
        """
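        if password is None:
            raise TypeError('Superusers must have a password.')
        if email is None:
            raise TypeError('Superusers must have an email address.')

        # Derive the username from the local part of the email address.
        username = email.split("@")[0]
        user = self.create_user(username, email, password, **extra_fields)
        user.is_superuser = True
        user.is_staff = True
        user.save(using=self._db)

        return user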



class User(AbstractBaseUser, PermissionsMixin):
    """Custom User model that replaces django.contrib.auth.models.User."""


    username_validator = UnicodeUsernameValidator()


    username = models.CharField(
        'username',
        max_length=150,
        unique=True,
        blank=False,
        help_text='Required. 150 characters or fewer. Letters, digits and @/./+/-/_ only.',
        validators=[username_validator],
        error_messages={
            'unique': "A user with that username already exists.",
        },
    )
    full_name = models.CharField(
        'full name',
        max_length=150,
        blank=False
    )
    role = models.CharField(
        'role',
        max_length=50,
        choices=[(tag, tag.value) for tag in Role],
        blank=False
    )
    email = models.EmailField(
        'email address',
        unique=True,
        blank=False,
        error_messages={
            'unique': "A user with that email already exists.",
        })
    is_staff = models.BooleanField(
        'staff status',
        default=False,
        help_text='Designates whether the user can log into this admin site.',
    )
    is_active = models.BooleanField(
        'active',
        default=True,
        help_text='Designates whether this user should be treated as active. Unselect this instead of deleting accounts.',
    )
    date_joined = models.DateTimeField('date joined', default=timezone.now)


    objects = UserManager()


    USERNAME_FIELD = 'email'
    REQUIRED_FIELDS = []


    class Meta:
        verbose_name = 'user'
        verbose_name_plural = 'users'


    def get_full_name(self):
        """Return the full name for the user."""
        return self.full_name

Run Migration

Then, I want to create a migration, so I just run this command inside my project folder:

py manage.py makemigrations --settings=sip.settings.dev

After running that command, a migration file is generated. If no error occurs, the migration file is placed in the app where the model is defined. In this case, I got my 0001_initial.py file in my /authentication/migrations folder.

Finally, I have to apply it to my DBMS, so I run this command:

py manage.py migrate --settings=sip.settings.dev

If there are no errors, we have finished migrating our database.

Build Automated Script for Migration

To make migration easier every time I deploy a new version of the system, I added some script to my .gitlab-ci.yml file. This script runs the migration automatically when the system is being deployed. So, I added a backend-deploy stage to my .gitlab-ci.yml file and then added the migration script to it.
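The migration step itself can also be wrapped in a small Python helper that a deploy job calls. This is a swapped-in sketch, not the content of my .gitlab-ci.yml: the function names are made up, and sip.settings.staging is assumed to be the module path of the staging.py settings file.

```python
import subprocess
import sys


def migrate_command(settings_module):
    """Build the manage.py command that applies all pending migrations."""
    return [
        sys.executable,
        "manage.py",
        "migrate",
        "--noinput",  # never wait for interactive input inside a pipeline
        f"--settings={settings_module}",
    ]


def run_migrations(settings_module="sip.settings.staging"):
    """Run the migration and raise an error if it fails."""
    # check=True raises CalledProcessError on a non-zero exit code,
    # which makes the deploy job fail instead of continuing silently.
    subprocess.run(migrate_command(settings_module), check=True)
```

A deploy job would call run_migrations() from the folder that contains manage.py, right after the new code has been installed.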

Automated Seeding in Django

We often need to create some dummy or sample data to test the app further. Dummy data is especially useful for User data, because in my project User data can only be obtained when someone logs in through SSO. That means I would have to collect real SSO credentials from many users, which is practically impossible. To solve this problem, I need to seed some dummy User data so that nobody has to log in through SSO. I could use Django's default admin interface to seed my data, but that would waste a lot of time and energy: I would have to submit the records one by one and submit them all over again whenever I reset the database. Therefore, I won't use the admin interface to seed my data.

So, what could I do?

Population script

Yes, I use a Python population script. A population script is just a regular Python script that populates the database automatically in one run when the file is called. For example, I will create a script named user_data_seeder.py and place it in the project’s root, where manage.py is located. Before seeding, the admin interface shows only one user record.

Then, I created the user_data_seeder.py file to generate dummy User data from random strings in a loop. Don't forget to set the value of DJANGO_SETTINGS_MODULE, which is the location of your settings module, otherwise the code won't work. Mine is located in sip/settings.
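A population script along these lines could look like the sketch below. It assumes the custom User model is importable from authentication.models and reuses the sip.settings.dev module from the commands above; the helper names random_string and build_dummy_users, the example.com addresses, and the generated field values are made up, and the role field is left out for brevity.

```python
import os
import random
import string
import sys


def random_string(length=8):
    """Return a random lowercase string of the given length."""
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))


def build_dummy_users(count=20):
    """Build `count` dictionaries of made-up user data (no database access)."""
    users = []
    for index in range(count):
        # The index suffix keeps usernames and emails unique
        # even if two random strings happen to collide.
        username = f"{random_string()}{index}"
        users.append({
            "username": username,
            "email": f"{username}@example.com",
            "password": random_string(12),
            "full_name": f"Dummy User {index + 1}",
        })
    return users


def user_data_seeder(count=20):
    """Insert the dummy users through the custom manager's create_user()."""
    # The settings module must be known before django.setup() is called.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "sip.settings.dev")
    import django
    django.setup()
    from authentication.models import User  # assumed location of the model

    for data in build_dummy_users(count):
        User.objects.create_user(**data)


# Only seed when the script is called with the function name as its argument.
if __name__ == "__main__" and sys.argv[1:2] == ["user_data_seeder"]:
    user_data_seeder()
```

Keeping the data generation in build_dummy_users, separate from the database writes, makes it easy to inspect the generated data without touching the database.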

Then, we run this script to call the user_data_seeder() function.

py user_data_seeder.py user_data_seeder

Finally, the dummy data is added to the database. If we check the admin interface again, it shows 20 new records.

That's all from me. I hope this article helps you understand automated migration and seeding better. Thanks for reading!
