Null Value Differences in RDBMS - Oracle, SQLServer, Postgres, MySQL, Snowflake
Sumit Sengupta
Multi-Cloud Architect 12x certified - Azure, AWS, GCP, OCI | Ex- (Microsoft, Apple, MongoDB) | Cybersecurity Instructor | AWS Academy Educator | 2x Top Voice - Database, Data Architecture | Mentor / Tech Volunteer
Nulls have different values between SQL and NoSQL databases. And even within themselves separately. If you are migrating your database, (whether on-prem or cloud), this is something to watch for in data migration and as well as in application SQL/NoSQL code changes.
In this article, we consider RDBMS as SQL and show examples of the most popular databases per "db-engines" ranking - Oracle, Postgres, MySQL, and SQL Server. In the following article, we will look into NoSQL databases like MongoDB, Cassandra, Redis, Azure CosmosDB, AWS DynamoDB, and Google Firebase.
Question: Is a zero-byte string data field the same as null?
Answer: Not unless it is Oracle ( and not recommended to use this ).
From the Oracle documentation page, "Oracle Database currently treats a character value with a length of zero as null. However, this may not continue to be true in future releases, and Oracle recommends that you do not treat empty strings the same as nulls."
In ISO SQL:2016 standard Null is defined in section 3.1.1.13 as "a special value that is used to indicate the absence of any data value." More specifically, comparing two values stored in a SQL ( RDBMS ) database - "two values are distinct if neither is?null?and the values are not equal. A?null?value and a non-null?value are distinct. Two?null?values are not distinct.?"
Ok, that is in the ISO standard. Let's test this among Oracle, MySQL, SQL Server, and Postgres the top 4 of all databases currently as per ranking.
The DDL to create our sample table on Oracle, SQL Server, Postgres and MySQL - is identical in all of them.
create table null_test (id smallint,content varchar(45),description varchar(45));
Now we insert 2 rows. In the content column, we will insert a Null and a zero-byte character. Again, standard ISO syntax works across all four RDBMS in an identical fashion.
insert into null_test values (1, NULL, 'Content for id 1 is NULL') ;
insert into null_test values (2, '', 'Content for id 2 is zero character') ;
Now let's find out the Null values.
SQL Server, MySQL, Postgres and Snowflake do not treat the zero-byte character (id =2) as null. So it returns only one row.
Not_Oracle> select * from null_test where content is null ;
id ? ? content ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? description ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
------ --------------------------------------------- ---------------------------------------------
?? ? 1 NULL? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Content for id 1 is NULL? ? ? ? ? ? ? ? ? ? ?
But oracle treats a zero-byte character the same as null. So it returns two.
Oracle> select * from null_test where content is null
id ? ? content ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? description ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
------ --------------------------------------------- ---------------------------------------------
?? ? 1 NULL? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Content for id 1 is NULL? ? ? ? ? ? ? ? ? ? ?
?? ? 2 NULL Content for id 2 is zero character
The official Oracle documentation warns against this un-common behavior - "Currently a zero character length is null but it may change in future. Oracle recommends you do not treat the empty string as null.".
Thus, if you are migrating to or from Oracle, please know about this. If your data has a zero-byte value, scan your application code to ensure it is not treated as Null or unknown. Now if we insert a 3rd row and this time, we will not mention any value for our column - the expectation is that all databases should treat having no values as null - after all that is the definition of null.
Not_Oracle> insert into null_test (id, description ) values ( 3, 'No content for id 3' ) ;
Let us verify that this last row inserted is indeed null. HEre it "#" sign is character to put in to indicate it is null
领英推荐
select * from null_test where content is null ;
id | content | description
----+---------+--------------------------
1 | # | Content for id 1 is NULL
3 | # | No content for id 3
Question: Can I have a unique constraint on columns with null values?
Answer: Yes
The following SQL creates a unique index on the nullable column.
alter table null_test add constraint null_table_content_unique unique (content) ;
Question: What does the unique index guarantee on null values?
Answer: For Oracle, MySQL, and Postgres - null values are allowed and among non-null values, uniqueness is enforced. For SQL Server, only one null value is allowed!
Thus the following SQL works fine on SQL Server, MySQL, and Postgres.
Not_SQL> insert into null_test (id, description ) values ( 3, 'No content for id 3' ) ;
But in an SQL server, it does not allow you to enter another null value
SQL> insert into null_test (id, description) values (3, 'No content for id 3') ;
Violation of UNIQUE KEY constraint 'null_table_content_unique'. Cannot insert duplicate key in object 'dbo.null_test'. The duplicate key value is (<NULL>)
The statement has been terminated..
Thus if you are migrating from SQL Server to any other database, an index on a nullable column will allow nulls in the other databases that you need to consider application changes for.
Finally, this is not related to nulls but something we closely touched on here. In an SQL server, a zero-byte character is the same as any number of empty space characters. Duh!
insert into null_test values (3, ' ? ? ', 'Content for id many is one space character') ;
Violation of UNIQUE KEY constraint 'null_table_content_unique'. Cannot insert duplicate key in object 'dbo.null_test'. The duplicate key value is ( ? ? ).
That means I can delete that zero-byte character using a multi-space character where clause as well!
More on this and storage, later
delete from null_test where content = ' ? ' ;
Floccinaucinihilipilification is a very long English word that means estimating something as a value less. The value of Null is not that, it is unknown! But it is important to understand what that means.
In the next blog, we will dive into NoSQL databases. As people move between RDBMS and NoSQL, those are important considerations.