Django Deletion Dragons

Reposted from my personal blog.

Django's models offer an ORM API that abstracts the database layer, with many options for taking advantage of common database features. But certain combinations of these options may catch you by surprise, going as far as unintentional data deletion.

As an example, consider the following models:

[Image: model definitions for OrderLine, Order, and OrderReport]

Presume the developers who created your database schema did not often have lunch with the developers who wrote the software using the database. The software was written to create the instances from the bottom up: first OrderLines, then Orders from OrderLines, then OrderReports from Orders. This has a robustness advantage: as soon as an entity is defined, it is saved to the database. The software may do something like this:

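# Assumes OrderLine, Order, and OrderReport are imported from your app's models module.

# Somebody bought some things from us! Hooray!
# A comprehension creates two separate rows; [OrderLine.objects.create()] * 2
# would create one row and list the same instance twice.
lines = [OrderLine.objects.create() for _ in range(2)]

# We package the lines up into an order
order = Order.objects.create()
for line in lines:
    line.order = order
    line.save()

# It's the end of the month, we create a report to generate a document
report = OrderReport.objects.create()
report.orders.add(order)
report.save()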

But since the OrderReport instance was only used to generate a report, it is no longer needed (at least for the MVP version you pitched to the VCs). Your savvy software developers like things tidy, so they delete it.

report.delete()

A few weeks later you go to access the OrderLines of the Order again but find they no longer exist.

Oh snap! What happened?

What happened is the database developers decided to be nice and provide convenience features. They made it so you could create an OrderLine first, then create an Order from OrderLines, as in the above code. To have this behavior, they told Django in the ForeignKey declaration it was OK for a child (OrderLine) to be created without a parent (Order) with the argument

null=True

The database developers were also kind enough to provide an auto-delete feature, so that if you delete some entity, all of its children are deleted too. This is done with the ForeignKey argument

on_delete=CASCADE

This is why when the software deleted the OrderReport all the Orders and OrderLines in the report were also deleted.
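By default Django carries out on_delete=CASCADE itself, in Python, rather than through a database constraint, but the effect mirrors SQL's ON DELETE CASCADE. As a rough analogue of the chain described above, here is a minimal sketch using only the standard library's sqlite3 module. The table and column names (order_report, orders, order_line, report_id, order_id) and the run_cascade_demo helper are assumptions made for illustration, not the article's actual schema.

```python
import sqlite3

# Hypothetical tables standing in for OrderReport, Order, and OrderLine.
# Both foreign key columns are nullable (the SQL default), like null=True.
SCHEMA = """
CREATE TABLE order_report (id INTEGER PRIMARY KEY);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    report_id INTEGER REFERENCES order_report(id) ON DELETE CASCADE
);
CREATE TABLE order_line (
    id INTEGER PRIMARY KEY,
    order_id INTEGER REFERENCES orders(id) ON DELETE CASCADE
);
"""

TABLES = ("order_report", "orders", "order_line")


def count_rows(conn):
    """Return a {table: row count} dict for the three demo tables."""
    return {
        table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in TABLES
    }


def run_cascade_demo():
    """Create rows bottom-up, delete the report, and return counts before/after."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)

    # Lines first, with no parent order yet.
    line_ids = [
        conn.execute("INSERT INTO order_line (order_id) VALUES (NULL)").lastrowid
        for _ in range(2)
    ]
    # Then the order, which adopts the existing lines.
    order_id = conn.execute("INSERT INTO orders (report_id) VALUES (NULL)").lastrowid
    conn.executemany(
        "UPDATE order_line SET order_id = ? WHERE id = ?",
        [(order_id, line_id) for line_id in line_ids],
    )
    # Finally the report, which adopts the order.
    report_id = conn.execute("INSERT INTO order_report DEFAULT VALUES").lastrowid
    conn.execute("UPDATE orders SET report_id = ? WHERE id = ?", (report_id, order_id))

    before = count_rows(conn)
    conn.execute("DELETE FROM order_report WHERE id = ?", (report_id,))
    after = count_rows(conn)
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = run_cascade_demo()
    print("before:", before)
    print("after: ", after)
```

Deleting the single report row empties all three tables: the order goes because it points at the report, and the lines go because they point at the order.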

This may seem like a contrived example, but it happens often in real life, and happened even more before Django 2.0 came out. As the release notes put it:

The on_delete argument for ForeignKey and OneToOneField is now required in models and migrations.

Before Django version 2.0, the on_delete argument defaulted to CASCADE. So if your dev team created models but forgot to include the on_delete parameter, and did not test properly, data loss could easily happen.

If you would like to play around with this example, and also the combinations of [One, Many] X [CASCADE, SET_NULL, PROTECT], see this sandbox project which provides a Dockerized Jupyter environment on a Django project. The models have the ability to report what children they have, and all models report when they are being deleted.
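For a quick feel of how the three options differ without setting up a project, the sketch below uses the standard library's sqlite3 module and the nearest SQL actions: CASCADE, SET NULL, and RESTRICT. This is only an analogue. Django enforces these rules in the ORM, its PROTECT raises ProtectedError rather than a database error, and SET_NULL requires null=True on the field. The parent and child tables and the delete_parent helper are hypothetical names.

```python
import sqlite3


def delete_parent(on_delete_rule):
    """Delete the only parent row and return the child rows' parent_id values.

    on_delete_rule is an SQL foreign key action: "CASCADE", "SET NULL",
    or "RESTRICT". With "RESTRICT" the delete raises sqlite3.IntegrityError.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child (id INTEGER PRIMARY KEY, "
        f"parent_id INTEGER REFERENCES parent(id) ON DELETE {on_delete_rule})"
    )
    conn.execute("INSERT INTO parent (id) VALUES (1)")
    conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 1)")
    try:
        conn.execute("DELETE FROM parent WHERE id = 1")
        return conn.execute("SELECT parent_id FROM child").fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for rule in ("CASCADE", "SET NULL", "RESTRICT"):
        try:
            print(rule, "->", delete_parent(rule))
        except sqlite3.IntegrityError as exc:
            print(rule, "-> delete refused:", exc)
```

With CASCADE the child disappears along with the parent, with SET NULL the child survives with an empty parent reference, and with RESTRICT the delete is refused while the child still exists.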

Model definitions (including those used by the example code).

Some sample pre-run notebooks are provided in this sandbox project:

