Tired of Manual Migration and Seeding? Let's Learn This Automated Database Migration and Seeding!

Have you ever had to run migrations and seed data by hand every time you deployed a new feature or reset your database? I know how that feels, and it is very tiring. Trust me! After a long time looking for a simpler, less tiring approach, I found a way to automate it, and I will show you how in this article. So don't hesitate to read it to the end!

Migration

Before we jump into the implementation, let's get to know migration better. So what does migration mean in software engineering?

Schema migration refers to the management of incremental, reversible changes and version control to relational database schemas.

A schema migration is performed on a database whenever it is necessary to update or revert that database's schema to some newer or older version. Migrations are performed programmatically by using a schema migration tool. For instance, Django's manage.py provides two commands, makemigrations and migrate. When we run makemigrations, Django detects the changes made to our models and their relations and writes them into new migration files. Then we run migrate, which synchronizes the database state with the current set of migrations. After that, the structure defined in our models exists in our DBMS (database management system), such as MySQL, PostgreSQL, and many more.
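
To make the definition above more concrete, here is a minimal sketch of versioned, reversible migrations that uses only Python's built-in sqlite3 module. It is not how Django implements migrations. The table names, the MIGRATIONS list, and the migrate_to helper are all made up for illustration.

```python
import sqlite3

# Hypothetical migrations: (version, statement to apply, statement to revert).
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)",
        "DROP TABLE users"),
    (2, "CREATE TABLE reports (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)",
        "DROP TABLE reports"),
]


def current_version(conn):
    """Return the schema version stored in SQLite's user_version pragma."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate_to(conn, target):
    """Apply or revert migrations until the schema is at version `target`."""
    version = current_version(conn)
    if target > version:
        # Moving forward: apply every migration above the current version.
        for number, up, _down in MIGRATIONS:
            if version < number <= target:
                conn.execute(up)
    else:
        # Moving backward: revert in reverse order down to the target.
        for number, _up, down in reversed(MIGRATIONS):
            if target < number <= version:
                conn.execute(down)
    # PRAGMA does not accept bound parameters, so the integer is formatted in.
    conn.execute(f"PRAGMA user_version = {int(target)}")
    conn.commit()


def table_names(conn):
    """List the tables that currently exist, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Calling migrate_to(conn, 2) on a fresh database applies both steps, and migrate_to(conn, 0) reverts them in reverse order. Django's migrate command does the same kind of bookkeeping for us, only with generated migration files instead of hand-written SQL.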

Seeding

Now that we know about migration, we should also know about database seeding. So, what does it mean?

Database seeding is a process in which an initial set of data is provided to a database when it is being installed.

Database seeding is very useful when we want to populate the database with the data we need during development. This is often an automated process that is executed upon the initial setup of an application. The data can be dummy data or necessary data such as an initial administrator account. It is also very useful when we need dummy data for testing. For example, to test pagination over 20 items, we first have to add 20 records to the database.

Implementation

To illustrate this, let's look at one of my website projects, Sistem Informasi Penilaian dan Evaluasi Praktikum, a website that helps students and lecturers of FISIP (Faculty of Social and Political Sciences) UI submit and grade practical work (internship) reports. This project is being developed by my team, MK-PPL. Our stack is Django as the back-end framework, SQLite as the DBMS for local development, PostgreSQL as the DBMS on the staging server, GitLab for version control, and Heroku as the staging server. Below, I will explain how to do automated migration in Django.

Automated Migration in Django

The process can be divided into 4 steps: setting up the DBMS, creating all the models needed, running the migration, and building an automated script for migration.

Set Up DBMS

Before anything else, we should set up our DBMS, because that is where the database structure will eventually be migrated. In my project, I use SQLite as the local DBMS. Setting up SQLite is quite easy: I only have to create an empty file named db.sqlite3 (the file uses the SQLite 3 format) inside my project folder (sip).

After that, I make sure that my development settings file, dev.py (the default settings file is called settings.py), tells Django where the db.sqlite3 file is located through the DATABASES setting.
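
As a rough sketch, the database part of a dev.py like mine could look as follows. The folder layout is an assumption: I assume the settings file sits at sip/settings/dev.py and that db.sqlite3 is in the sip folder above it, so adjust the path to your own project.

```python
# sip/settings/dev.py (sketch): only the database part is shown.
from pathlib import Path

# Assumption: this file is sip/settings/dev.py, so two levels up from the
# file itself is the sip folder that holds db.sqlite3.
SIP_DIR = Path(__file__).resolve().parent.parent

DATABASES = {
    "default": {
        # Django's built-in SQLite backend.
        "ENGINE": "django.db.backends.sqlite3",
        # For SQLite, NAME is the full path to the database file.
        "NAME": str(SIP_DIR / "db.sqlite3"),
    }
}
```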

With the steps above, I have finished setting up the DBMS in my local environment. But what about the staging environment?

The staging environment is pretty much the same. The only big difference is that I have to create my PostgreSQL database on our Heroku server. After logging in to the Heroku dashboard, go to the Resources tab and type Heroku Postgres in the search bar until it shows up in the list of add-ons.

After that, I click Heroku Postgres in the list and go to the Settings tab. Then I reveal all the database credentials by clicking the "View Credentials" button.

I note down all the credentials because I need to add them to my staging.py (a customized settings.py for the staging environment), which reads them with os.getenv.

I add all my credentials to my .env file. I also put my credentials in the default value parameter of os.getenv, because this system still has some problems with the .env file. I strongly recommend against doing this unless you have the same problem.
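
The staging database settings described above could look roughly like this. It is only a sketch: the environment variable names (DB_NAME, DB_USER, and so on) are hypothetical, and the defaults are deliberately empty placeholders rather than real credentials.

```python
# sip/settings/staging.py (sketch): only the database part is shown.
import os

# Assumption: the variable names below are illustrative, not the project's own.
# Each value comes from the environment (for example from a .env file that
# is loaded before Django starts); the second argument is the fallback.
DATABASES = {
    "default": {
        # Django's built-in PostgreSQL backend, which needs psycopg2 installed.
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.getenv("DB_NAME", ""),
        "USER": os.getenv("DB_USER", ""),
        "PASSWORD": os.getenv("DB_PASSWORD", ""),
        "HOST": os.getenv("DB_HOST", "localhost"),
        "PORT": os.getenv("DB_PORT", "5432"),
    }
}
```

Keeping the fallbacks empty means a missing variable fails loudly at connection time instead of silently shipping a password inside the repository.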

Also make sure that you have installed the psycopg2 library. If you have not, add it to your requirements.txt and run "pip install -r requirements.txt".

Creating All Models Needed

To illustrate this, I will use my authentication feature as an example. Here I created a custom User model.

from django.contrib.auth.models import (
    AbstractBaseUser,
    BaseUserManager,
    PermissionsMixin,
)
from django.contrib.auth.validators import UnicodeUsernameValidator
from django.db import models
from django.utils import timezone

# Role is an Enum defined elsewhere in the project (not shown in this article).


class UserManager(BaseUserManager):
    """
    Manager for the custom user model. See:
    https://docs.djangoproject.com/en/3.0/topics/auth/customizing/#writing-a-manager-for-a-custom-user-model
    https://docs.djangoproject.com/en/3.0/topics/auth/customizing/#a-full-example
    """

    def create_user(self, username, email, password=None, **extra_fields):
        """
        Create and return a `User` with an email, username and password.
        """
        user = self.model(
            username=self.model.normalize_username(username),
            email=self.normalize_email(email),
            **extra_fields
        )
        # set_password() stores a hash, never the raw password.
        user.set_password(password)
        user.save(using=self._db)
        return user

    def create_superuser(self, email=None, password=None, **extra_fields):
        """
        Create and return a `User` with superuser (admin) permissions.
        """
        if password is None:
            raise TypeError('Superusers must have a password.')
        if not email:
            raise TypeError('Superusers must have an email address.')

        # The username is derived from the part of the email before the "@".
        username = email.split("@")[0]
        user = self.create_user(username, email, password, **extra_fields)
        user.is_superuser = True
        user.is_staff = True
        user.save(using=self._db)

        return user


class User(AbstractBaseUser, PermissionsMixin):
    """Custom User model, overridden from django.contrib.auth.models.User."""

    username_validator = UnicodeUsernameValidator()

    username = models.CharField(
        'username',
        max_length=150,
        unique=True,
        blank=False,
        help_text='Required. 150 characters or fewer. Letters, digits and @/./+/-/_ only.',
        validators=[username_validator],
        error_messages={
            'unique': "A user with that username already exists.",
        },
    )
    full_name = models.CharField(
        'full name',
        max_length=150,
        blank=False
    )
    role = models.CharField(
        'role',
        max_length=50,
        choices=[(tag, tag.value) for tag in Role],
        blank=False
    )
    email = models.EmailField(
        'email address',
        unique=True,
        blank=False,
        error_messages={
            'unique': "A user with that email already exists.",
        })
    is_staff = models.BooleanField(
        'staff status',
        default=False,
        help_text='Designates whether the user can log into this admin site.',
    )
    is_active = models.BooleanField(
        'active',
        default=True,
        help_text='Designates whether this user should be treated as active. Unselect this instead of deleting accounts.',
    )
    date_joined = models.DateTimeField('date joined', default=timezone.now)

    objects = UserManager()

    # Users log in with their email address instead of their username.
    USERNAME_FIELD = 'email'
    REQUIRED_FIELDS = []

    class Meta:
        verbose_name = 'user'
        verbose_name_plural = 'users'

    def get_full_name(self):
        """Return the full name for the user."""
        return self.full_name

Run Migration

Next, I want to create a migration, so I run this command inside my project folder:

py manage.py makemigrations --settings=sip.settings.dev


After running that command, a migration file is generated. If no error occurs, the migration file is placed in the app that contains the model. In this case, I got my 0001_initial.py file in my /authentication/migrations folder.

Finally, I have to apply the migration to my DBMS, so I run this command:

py manage.py migrate --settings=sip.settings.dev

If there are no errors, we have finished migrating our database.

Build Automated Script for Migration

To make migration easier every time I deploy a new version of the system, I added a script to my .gitlab-ci.yml file. This script runs the migration automatically when the system is deployed. Concretely, I added a backend-deploy stage to my .gitlab-ci.yml file and put the migration commands in it.
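
My actual pipeline runs the commands directly from .gitlab-ci.yml. As a hypothetical alternative, the same step can be wrapped in a small Python helper that a deploy job calls. The function names and the sip.settings.staging module path are assumptions for this sketch, not part of my project's pipeline.

```python
import subprocess
import sys


def build_migrate_command(settings_module):
    """Return the manage.py command that applies all pending migrations."""
    return [
        sys.executable,
        "manage.py",
        "migrate",
        f"--settings={settings_module}",
        # --noinput keeps the command from waiting for a prompt in CI.
        "--noinput",
    ]


def run_migrations(settings_module):
    """Run the migrate command and raise CalledProcessError if it fails."""
    subprocess.run(build_migrate_command(settings_module), check=True)
```

A deploy job could then call run_migrations("sip.settings.staging") from the folder that contains manage.py. Because check=True raises on a non-zero exit code, a failed migration also fails the pipeline instead of passing silently.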

Automated Seeding in Django

We often need to create some dummy or sample data to test the app further. Dummy data is especially useful for User data in my project, because User data can only be obtained when someone logs in through SSO. That means I would have to collect real SSO credentials from many users, which is practically impossible. To solve this problem, I seed some dummy User data so that nobody has to log in through SSO. I could use Django's default admin interface to enter the data, but that would waste a lot of time and energy: I would have to submit the records one by one and resubmit them whenever I reset the database. Therefore, I won't use the admin interface to seed my data.

So, what could I do?

Population script

Yes, I use a Python population script. A population script is just a regular Python script that automatically populates the database in one run when it is called. For example, I will create a script named user_data_seeder.py and place it in the project's root, where manage.py is located. Before seeding, the admin interface shows only one user record.

Then, I created the user_data_seeder.py file, which generates dummy User data from random strings in a loop. Don't forget to set DJANGO_SETTINGS_MODULE to the location of your settings module, otherwise the code won't work. Mine is located in sip/settings.
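
A population script along these lines might look like the sketch below. It is a reconstruction under assumptions, not my exact file: the import path authentication.models, the example.com addresses, and the helper names random_string and build_dummy_users are illustrative. The role field is left out because the members of Role are not shown in this article.

```python
import os
import random
import string
import sys


def random_string(length=8):
    """Return a random lowercase string of the given length."""
    return "".join(random.choices(string.ascii_lowercase, k=length))


def build_dummy_users(count=20):
    """Build `count` dicts of dummy user fields with unique names and emails."""
    users = []
    for index in range(count):
        # The index suffix keeps names unique even if two random parts match.
        name = f"{random_string()}{index}"
        users.append({
            "username": name,
            "email": f"{name}@example.com",
            "password": random_string(12),
            "full_name": f"Dummy {name}",
        })
    return users


def user_data_seeder(count=20):
    """Insert dummy users through the custom manager's create_user()."""
    # Django must know which settings to load before any model is imported.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "sip.settings.dev")
    import django
    django.setup()
    # Assumption: the custom User model lives in authentication/models.py.
    from authentication.models import User

    for fields in build_dummy_users(count):
        User.objects.create_user(**fields)


# Mirrors "py user_data_seeder.py user_data_seeder": run only when asked by name.
if __name__ == "__main__" and sys.argv[1:2] == ["user_data_seeder"]:
    user_data_seeder()
```

Splitting the data generation from the database writes keeps the random part easy to check on its own, without a configured Django project.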

Then, we run this command to call the user_data_seeder() function.

py user_data_seeder.py user_data_seeder

Finally, the dummy data is added to the database. If we check the admin interface again, it shows 20 new records.

That's all from me. I hope this article helps you understand more about automated migration and seeding. Thanks for reading!

