Transactions with mongoose/mongodb

Transactions, for those who might not be familiar, allow you to carry out multiple operations in isolation, with the added benefit of being able to revert all operations if any one of them fails. This feature stands out as an exceptionally efficient way to roll back a complex process with minimal hassle. Recently, I've started an extensive code refactoring, during which I've been using a lot of transactions. To better understand how to implement them in new contexts, I have created a small POC project. I've developed a very simple and direct example that I believe would be interesting to share.

A problem of timing

When building an app, you often dive straight into the code and quickly work out what data you need to fetch from and save to the database. You approach the task logically, step by step. This method is sound, yet developers frequently overlook what happens when an unexpected error disrupts the process halfway through. Don't misunderstand; they do think about catching the error, but they often neglect to roll everything back when one occurs. To illustrate this point, consider the following example:


Simple ERD of student linked to a classroom


In our scenario, there's a classroom that includes just one student. This student has received a grade for an exam. Our task is to record this result along with calculating and saving the classroom's average grade.

In such a situation, the resulting code could be as follows:

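const save = async (student, mark, averageMarks) => {
  try {
    // First write: store the student's new mark
    const studentId = student._id
    await StudentModel.findOneAndUpdate({ _id: studentId }, { mark })

    // Second write: store the classroom's new average
    const classId = student.classId
    await ClassModel.findOneAndUpdate({ _id: classId }, { averageMarks })
  } catch (error) {
    // UnexpectedError is an application-specific error class defined elsewhere
    throw new UnexpectedError()
  }
}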

In the "save" function, we accept the student object, their exam mark, and averageMarks, which is calculated in a separate function. Under normal conditions, everything works fine.

Now, picture this scenario: the database is scheduled for a 1-minute maintenance window at 2:00 PM. The exact timing is irrelevant; what matters is that the maintenance will happen. At approximately 1:59 PM on the same day, a teacher is entering a new mark for a student, unaware of the scheduled maintenance. As the update starts, the first part of the code executes without issue, and the student's mark is successfully updated. However, the second part, responsible for updating the class's average mark, hits an error caused by the maintenance and fails to complete.

This leads to a significant issue: the data's reliability is compromised.
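The failure is easy to reproduce without a real database. Here is a minimal sketch that replaces the two collections with in-memory maps; the names students, classes, updateStudent, updateClass and saveWithoutRollback, as well as the sample values, are made up for this illustration and are not part of the original project.

```javascript
// In-memory stand-ins for the two collections (hypothetical sample data).
const students = new Map([['s1', { mark: 10, classId: 'c1' }]]);
const classes = new Map([['c1', { averageMarks: 10 }]]);

// Fake write for the student document; always succeeds.
const updateStudent = async (id, mark) => {
  students.get(id).mark = mark;
};

// Fake write for the class document; `fail: true` mimics the database going down.
const updateClass = async (id, averageMarks, { fail = false } = {}) => {
  if (fail) throw new Error('database unavailable');
  classes.get(id).averageMarks = averageMarks;
};

// Same shape as save(): two sequential writes, error caught, nothing reverted.
const saveWithoutRollback = async (studentId, mark, averageMarks, options) => {
  try {
    await updateStudent(studentId, mark);
    await updateClass(students.get(studentId).classId, averageMarks, options);
    return 'ok';
  } catch (error) {
    return 'failed';
  }
};
```

Calling saveWithoutRollback('s1', 18, 18, { fail: true }) on the initial data returns 'failed', yet the student now holds a mark of 18 while the class average is still 10: the two documents no longer agree.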

A possible solution is to manage the rollback process ourselves by creating a method that reverts all changes if an error is caught. Implementing this approach, the code would look something like this:

const save = async (student, mark, averageMarks) => {
  // Declared outside the try block so the catch block can use them
  const studentId = student._id
  const classId = student.classId
  let previousMark, previousAverage
  try {
    // Remember the current values so they can be restored on failure
    const previousStudent = await StudentModel.findOne({ _id: studentId })
    previousMark = previousStudent.mark

    const previousClass = await ClassModel.findOne({ _id: classId })
    if (previousClass) {
      previousAverage = previousClass.averageMarks
    }

    await StudentModel.findOneAndUpdate({ _id: studentId }, { mark })
    await ClassModel.findOneAndUpdate({ _id: classId }, { averageMarks })
  } catch (error) {
    // Manual rollback: write the old values back, if we managed to read them
    if (previousMark !== undefined) {
      await StudentModel.findOneAndUpdate({ _id: studentId }, { mark: previousMark })
    }
    if (previousAverage !== undefined) {
      await ClassModel.findOneAndUpdate({ _id: classId }, { averageMarks: previousAverage })
    }
  }
}

Obviously, the code suddenly becomes significantly more complex and cumbersome. This example illustrates just two write operations. Imagine the complexity with 5 to 10 write operations or even more; it could quickly become a logistical nightmare.

Transactions to the rescue

This is precisely the type of problem that transactions are designed to handle. By implementing a transaction, you ensure that changes are only saved once the transaction has been committed. There are 3 main methods to remember when it comes to transactions:

  • startTransaction(): starts a transaction on a session, grouping the read and write operations that follow.
  • commitTransaction(): commits the transaction, saving all of its changes.
  • abortTransaction(): rolls back the transaction, restoring the original state.

To use transactions, MongoDB must run as a replica set (or a sharded cluster); a standalone server does not support them. A replica set is a group of MongoDB instances that each hold a copy of the same data. For local development, a replica set with a single member is enough, but you'll need to adjust the settings slightly.

Let's first edit the MongoDB config:

$ sudo nano /etc/mongod.conf        

Once open, scroll down to the "replication" section, uncomment it, and add the replica set name:

replication:
  replSetName: "rs0"

Save the modification and restart your mongo service:

$ sudo systemctl restart mongod        

And now, let's connect to your database. Depending on how you set up your MongoDB installation, the command might differ a bit. In my case, I need to use "mongosh", but on older installations the command may be "mongo".

$ mongosh
# OR
$ mongo        

And finally, initiate your replica set with the following command:

rs.initiate()        
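Once the replica set is initiated, the application has to connect to it. Here is a minimal sketch, assuming a local single-member setup: the helper name buildReplicaSetUri, the database name "school" and the host and port defaults are placeholders of mine, not values from the original project. The replicaSet query parameter itself is a standard MongoDB connection string option.

```javascript
// Hypothetical helper: builds a connection string that targets a replica set.
// Host, port and database name are placeholders; adjust them to your setup.
const buildReplicaSetUri = ({ host = '127.0.0.1', port = 27017, dbName, replicaSet }) =>
  `mongodb://${host}:${port}/${encodeURIComponent(dbName)}?replicaSet=${encodeURIComponent(replicaSet)}`;

const uri = buildReplicaSetUri({ dbName: 'school', replicaSet: 'rs0' });
console.log(uri); // prints "mongodb://127.0.0.1:27017/school?replicaSet=rs0"
```

The resulting string can then be passed to mongoose.connect(uri) when the application starts.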

Let's start over with a transaction

Returning to our earlier example, let's revise our code to incorporate the transaction:

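import mongoose from 'mongoose';
const conn = mongoose.connection;

const save = async (student, mark, averageMarks) => {
  const session = await conn.startSession();
  try {
    session.startTransaction();

    // Every operation must receive the session to be part of the transaction
    const studentId = student._id
    await StudentModel.findOneAndUpdate({ _id: studentId }, { mark }, { session })

    const classId = student.classId
    await ClassModel.findOneAndUpdate({ _id: classId }, { averageMarks }, { session })

    await session.commitTransaction();
  } catch (error) {
    // Discard every change made since startTransaction()
    await session.abortTransaction();
    throw error; // let the caller know the save failed
  } finally {
    // Release the session whether the transaction succeeded or not
    await session.endSession();
  }
}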

As you can see, the code remains largely unchanged. The key difference now is that our operations run within a session, and their changes only become visible outside the transaction once we invoke the commitTransaction method. If any issue arises between the start of the transaction and its commit, abortTransaction discards all of the changes, effectively making it as though nothing had occurred.

This approach is significantly simpler than managing the process manually, as we attempted earlier. While our example is straightforward, imagine applying transactions to more complex operations. It can truly be a game-changer!
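When several functions need the same start, commit and abort sequence, it can be factored out. Here is a minimal sketch of such a wrapper; runInTransaction is a hypothetical helper name, and conn is assumed to be any object exposing startSession() the way mongoose.connection does. It relies only on the session methods already used in this article, plus inTransaction() and endSession() from the MongoDB Node.js driver's ClientSession.

```javascript
// Hypothetical helper: runs `work(session)` inside a transaction on `conn`.
// Commits if `work` resolves, aborts if it throws, and always ends the session.
const runInTransaction = async (conn, work) => {
  const session = await conn.startSession();
  try {
    session.startTransaction();
    const result = await work(session);
    await session.commitTransaction();
    return result;
  } catch (error) {
    // Only abort if the transaction is still open (for example, not after a failed commit)
    if (session.inTransaction()) {
      await session.abortTransaction();
    }
    throw error; // rethrow so the caller can react to the failure
  } finally {
    await session.endSession();
  }
};
```

With this in place, the save function would pass its two updates as the work callback, for instance runInTransaction(mongoose.connection, async (session) => { ... }), forwarding { session } to each query. The driver's own session.withTransaction() covers similar ground and additionally retries on transient errors, so it may be a better fit depending on your needs.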

Not a silver bullet

However, it's important to exercise caution with transactions; they aren't without drawbacks. The primary issue stems from how they operate. To ensure ACID compliance (Atomicity, Consistency, Isolation, Durability), transactions must lock the documents they intend to modify. In scenarios with a low workload, this isn't a concern. Transactions execute so quickly that you'll hardly notice, and your database performance is unlikely to be affected.

However, if you're dealing with a significant workload, you could encounter latency issues. With millions of documents involved in your transaction, the sheer number of documents locked can adversely affect the rest of your application. While in some situations this cannot be avoided, it's crucial to be aware of the potential implications before employing transactions.

Last words

I have been playing with transactions for many years, and I can attest that they are incredibly useful for preserving the integrity of your data. Managing a rollback manually is complex; it's way too easy to overlook something or make a mistake. It's preferable to let MongoDB handle the complexities, allowing you to focus on more critical aspects of your work. And you, have you ever tried MongoDB transactions? If so, have they ever been useful to you?


