Why Relational Databases Are Called “Relational” (It’s Not About Relationships!)


As a software engineer, you’ve probably worked with relational databases—tools like MySQL, PostgreSQL, or Oracle. And chances are, at some point, you’ve been told they’re called “relational” because they allow you to create relationships between tables using foreign keys.

That explanation seems logical, right? After all, foreign keys are one of the defining features of how we use these databases. But here’s the kicker: it’s wrong.

The word “relational” in relational databases has nothing to do with relationships between tables. It’s about something far deeper—something rooted in the world of mathematics. So, let’s break it down like experienced engineers chatting about database fundamentals.



The Math Behind "Relational"

Back in 1970, Edgar F. Codd, a computer scientist working at IBM, introduced the concept of relational databases in his groundbreaking paper, "A Relational Model of Data for Large Shared Data Banks."

Codd didn’t choose the term “relational” because tables relate to each other. He took it from the mathematical notion of a relation, which comes from set theory. Relational algebra, the family of operations Codd defined for querying relations, builds on that same idea.

To put it simply, a relation is a set of tuples that share the same attributes.



What Is a Relation in Plain Terms?

Let’s map this to something we deal with every day: tables in a database.

  • Relation: A table. More precisely, a named set of rows that all share the same columns.
  • Tuple: A row in that table, representing a single item or object.
  • Attribute: A column in the table, representing a property of the object.

For example, imagine a table of employees with the columns EmployeeID, Name, and Role, where one of the rows holds Alice’s data.


Here:

  • The table itself is the relation.
  • Each row (e.g., Alice’s data) is a tuple.
  • The columns (EmployeeID, Name, Role) are the attributes.
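
To make that mapping concrete, here is a minimal Python sketch that models a relation as a set of tuples. The Employee type, the ID and role values, and the second employee are made up for illustration; only the attribute names come from the example above.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    """One tuple of the relation; the field names are its attributes."""
    employee_id: int
    name: str
    role: str


# A relation is a set of tuples, so a frozenset is a natural model.
# The IDs, roles, and the second employee are hypothetical sample values.
employees = frozenset({
    Employee(1, "Alice", "Engineer"),
    Employee(2, "Bob", "Designer"),
    Employee(1, "Alice", "Engineer"),  # exact duplicate, absorbed by the set
})

attributes = Employee._fields   # ('employee_id', 'name', 'role')
cardinality = len(employees)    # 2, because duplicates collapse

print(attributes)
print(cardinality)
```

Note that the set has no inherent row order and cannot hold the same tuple twice, which is exactly how a mathematical relation behaves.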



What Makes a Database Relational?

A relational database is one where all data is stored in relations (tables) that conform to strict principles. Whether or not there are relationships between tables doesn’t matter—it’s all about organizing data in a structured, logical way.

Core Principles of Relational Databases:

  1. Data is stored in tables (relations). These tables have rows (tuples) and columns (attributes).
  2. Set theory and relational algebra underpin operations. The database supports operations like selection, projection, and joins.
  3. Independence from physical storage. You interact with the data at a logical level using queries (e.g., SQL), without worrying about how it’s laid out on disk.
  4. Attributes have constraints. Columns are defined by domains (e.g., integers, strings) and can enforce rules like uniqueness or non-null values.
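
The operations named in principle 2 can be sketched in plain Python over sets of tuples. This is an illustrative toy, not how any real database engine implements them; the helper names (relation, select, project, natural_join) and the sample data are assumptions made for this example.

```python
from typing import Callable, FrozenSet, Tuple

Row = Tuple[Tuple[str, object], ...]   # sorted (attribute, value) pairs
Relation = FrozenSet[Row]


def relation(*rows: dict) -> Relation:
    """Build a relation (a set of tuples) from plain dicts."""
    return frozenset(tuple(sorted(r.items())) for r in rows)


def select(rel: Relation, predicate: Callable[[dict], bool]) -> Relation:
    """Selection: keep only the tuples that satisfy the predicate."""
    return frozenset(row for row in rel if predicate(dict(row)))


def project(rel: Relation, *attrs: str) -> Relation:
    """Projection: keep only the named attributes; duplicates collapse."""
    return frozenset(
        tuple(sorted((k, v) for k, v in row if k in attrs)) for row in rel
    )


def natural_join(left: Relation, right: Relation) -> Relation:
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for l_row in left:
        for r_row in right:
            l, r = dict(l_row), dict(r_row)
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.add(tuple(sorted({**l, **r}.items())))
    return frozenset(out)


# Hypothetical sample data.
employees = relation(
    {"name": "Alice", "role": "Engineer", "dept_id": 10},
    {"name": "Bob", "role": "Designer", "dept_id": 20},
    {"name": "Carol", "role": "Engineer", "dept_id": 10},
)
departments = relation(
    {"dept_id": 10, "dept_name": "Platform"},
    {"dept_id": 20, "dept_name": "Design"},
)

engineers = select(employees, lambda t: t["role"] == "Engineer")
roles = project(employees, "role")       # two tuples, not three
joined = natural_join(employees, departments)
```

Every operation takes relations in and returns a relation, which is what lets queries be composed freely. Projecting onto role yields two tuples rather than three because the result is again a set.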



Where Relationships Come Into Play

So, where does the confusion come from? Well, foreign keys and relationships between tables are a very visible feature of relational databases.

For example, you might have an Employees table and a Departments table, with a DepartmentID foreign key in Employees that references a row in Departments. With that foreign key in place, you’ve created a relationship between Employees and Departments.

While this feature is incredibly useful, it’s just one aspect of relational databases—it’s not what makes them relational. Even if you had a single table with no foreign keys, the database would still be relational as long as it adheres to the relational model.
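
As a rough illustration of such a foreign key, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The column names, the department name, and the sample rows are assumptions chosen for the example, not a schema from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
CREATE TABLE Departments (
    DepartmentID INTEGER PRIMARY KEY,
    Name         TEXT NOT NULL UNIQUE
);
CREATE TABLE Employees (
    EmployeeID   INTEGER PRIMARY KEY,
    Name         TEXT NOT NULL,
    Role         TEXT NOT NULL,
    DepartmentID INTEGER NOT NULL REFERENCES Departments(DepartmentID)
);
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO Departments VALUES (10, 'Engineering')")
conn.execute("INSERT INTO Employees VALUES (1, 'Alice', 'Engineer', 10)")

try:
    # Department 99 does not exist, so the foreign key rejects this row.
    conn.execute("INSERT INTO Employees VALUES (2, 'Bob', 'Designer', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

rows = conn.execute("""
    SELECT e.Name, d.Name
    FROM Employees AS e
    JOIN Departments AS d ON d.DepartmentID = e.DepartmentID
""").fetchall()

print(fk_rejected)
print(rows)
```

The foreign key is a constraint layered on top of two relations. Drop the REFERENCES clause and both tables are still relations, which is the point of this section.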



Why This Matters to You

So, why should you, as a software engineer, care about this distinction?

  1. Understanding the Foundation: Knowing the theory behind relational databases helps you make better design decisions. For instance, understanding that a “relation” is a set means you should avoid duplicate rows unless absolutely necessary.
  2. Appreciating Relational Algebra: SQL might seem like magic, but it’s grounded in relational algebra. Each time you write a SELECT or JOIN, you’re invoking principles from this mathematical model.
  3. Avoiding Misconceptions: Misunderstanding why relational databases are called “relational” can lead to confusion when exploring other database paradigms, like NoSQL.
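
The first point is worth a quick experiment, because SQL tables are not strictly sets: without a key or uniqueness constraint they accept identical rows. The sketch below, again using sqlite3 with a made-up Roles table and sample data, shows the difference and how DISTINCT restores set behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No primary key or UNIQUE constraint, so SQL accepts identical rows.
conn.execute("CREATE TABLE Roles (Name TEXT, Role TEXT)")
conn.executemany(
    "INSERT INTO Roles VALUES (?, ?)",
    [("Alice", "Engineer"), ("Alice", "Engineer"), ("Bob", "Designer")],
)

all_rows = conn.execute("SELECT Name, Role FROM Roles").fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT Name, Role FROM Roles"
).fetchall()

print(len(all_rows))       # 3: the duplicate row is stored twice
print(len(distinct_rows))  # 2: DISTINCT collapses it, as a set would
```

Declaring a primary key is the usual way to get the set guarantee enforced by the database rather than by each query.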


Final Thoughts

Relational databases are called “relational” because they’re built on the mathematical concept of a relation, the foundation of relational algebra, not because of the relationships between tables. It’s a subtle but important distinction.

As engineers, it’s easy to get caught up in the features of a technology and lose sight of its theoretical underpinnings. By digging into the "why" behind relational databases, we can better understand their strengths, limitations, and how to use them effectively in our systems.

So next time someone says relational databases are about relationships, you’ll know better—and maybe even enjoy setting the record straight.



References:

  1. Codd, E. F. (1970). "A Relational Model of Data for Large Shared Data Banks." Communications of the ACM, 13(6), 377-387.
  2. Silberschatz, A., Korth, H. F., & Sudarshan, S. (2020). Database System Concepts (7th Edition). McGraw-Hill Education.
  3. Date, C. J. (2004). An Introduction to Database Systems (8th Edition). Addison-Wesley.

