Why do I get "ORA-12899 value too large for column" when migrating to Autonomous Database?
This article is part of a series of short blogs (even though they are not really turning out short) answering your common questions about Oracle Database Cloud services.
Answer: you'll get this message if
- Your source database uses a single-byte character set (e.g. WE8ISO8859P1), and
- Your column width is defined using BYTE length semantic, and
- You have some column values that no longer fit once a multi-byte character set (e.g. UTF8) is used, as is the case in Autonomous Database
If you are not familiar with these terms, then let's discover them by example. Screenshots in gray are from the source, in turquoise from the target database. Download all SQL queries used in this blog and try it out yourself.
Source Database
For testing, I created a DBCS VM database using the WE8ISO8859P1 character set.
col parameter for a20
col value for a20
select * from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
Let's create a table with one column that should hold a maximum of 5 characters (using CHAR length semantic), and insert some values into it.
create table tab_char (col1 varchar2(5 CHAR));
insert into tab_char values ('Hello');
insert into tab_char values ('Hellö');
Now we will repeat this, but this time using the BYTE length semantic, so our column can store a maximum of 5 bytes.
create table tab_byte (col1 varchar2(5 BYTE));
insert into tab_byte values ('Hello');
insert into tab_byte values ('Hellö');
Again, everything is fine, as every character (even the character "ö") requires one byte in a single-byte character set.
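If you want to verify this on the source, you can compare character and byte lengths directly; the output shown below is illustrative.

select col1, length(col1) as chars, lengthb(col1) as bytes from tab_byte;

COL1        CHARS   BYTES
Hello           5       5
Hellö           5       5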
Check the metadata of the tables:
set lines 300
col table_name for a15
col column_name for a15
col data_type for a15
col char_used for a10
col char_length for 999
col data_length for 999
select table_name, column_name, data_type, char_used, char_length, data_length
from dba_tab_cols
where table_name in ('TAB_CHAR', 'TAB_BYTE')
order by 1,2,3;
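On my source database, the query returned roughly the following (illustrative output):

TABLE_NAME      COLUMN_NAME     DATA_TYPE       CHAR_USED  CHAR_LENGTH  DATA_LENGTH
TAB_BYTE        COL1            VARCHAR2        B                    5            5
TAB_CHAR        COL1            VARCHAR2        C                    5            5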
As we see, the data length (last column in our query) is 5, regardless of BYTE or CHAR length semantic.
If you specify neither CHAR nor BYTE when creating the table, the value of the parameter NLS_LENGTH_SEMANTICS is used. It can also be set at the session level.
select value from v$parameter where name = 'nls_length_semantics';
alter system set nls_length_semantics=[CHAR | BYTE];
alter session set nls_length_semantics=[CHAR | BYTE];
Target Database
My target is an Autonomous Database that uses the UTF8 character set. To simplify testing, I'll just run the same commands on the target instead of running Data Pump export and import.
So, using the CHAR length semantic works fine. Let's try the BYTE semantic next:
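On my target, the failing insert produced output along these lines (the schema name will differ in your environment):

insert into tab_byte values ('Hellö')
*
ERROR at line 1:
ORA-12899: value too large for column "ADMIN"."TAB_BYTE"."COL1" (actual: 6, maximum: 5)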
When using the BYTE length semantic, we get the message "value too large for column", as the character "ö" requires 2 bytes in a multi-byte character set. That is why we end up with 6 bytes, which is too large for a 5-byte column.
Checking the metadata of the tables on the target shows that, with the CHAR length semantic, our 5 characters can take up to 20 bytes of storage. So let's check the byte length of the values using the function lengthb:
select col1, lengthb(col1) from tab_char;
Aha! As we see, the string "Hellö" takes 6 bytes here, whereas in our source database it took only 5 bytes thanks to the single-byte character set:
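Here is roughly what the lengthb query returns on the target (illustrative output):

COL1        LENGTHB(COL1)
Hello                   5
Hellö                   6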
The Solution
Option 1: we keep the BYTE length semantic and extend the column width in our tab_byte table from 5 to 20 bytes (even though 6 would be enough for this example).
alter table tab_byte modify (col1 varchar2(20 BYTE));
Option 2: we keep the column width equal to 5, but switch from BYTE to CHAR length semantic
alter table tab_byte modify (col1 varchar2(5 CHAR));
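Whichever option you choose, the insert that previously failed should now succeed on the target. A quick sanity check, assuming the tab_byte table from above:

insert into tab_byte values ('Hellö');   -- now fits: 5 characters / 6 bytes
select col1, length(col1), lengthb(col1) from tab_byte;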
Where to do the modifications?
Here again, two options:
Option 1: Modify your source database before exporting the data using Data Pump.
Option 2: if you don't want to modify your source database, then import only the metadata into the target first, modify the column definitions on the target, and finally import the data (see the sketch below).
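For option 2, a minimal Data Pump sketch could look like this; the connect string is a placeholder, and the dump file and credential parameters (shown as "...") depend on your setup:

# Pass 1: import only the table definitions
impdp admin@my_adb_high ... content=metadata_only
# now run the ALTER TABLE statements against the target
# Pass 2: load the rows into the adjusted tables
impdp admin@my_adb_high ... content=data_only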
How to figure out what columns need to be modified?
The best way, and something you should run anyway, is the Autonomous Database Schema Advisor. Its report contains a section that lists the existing columns using the BYTE length semantic and provides you with a SQL statement to retrieve this information.
You could also just use the following query by providing your schemas:
set lines 300
col owner for a30
col table_name for a30
col column_name for a30
col data_type for a30
col char_length for 99999
col char_used for a10
select t.owner, t.table_name, t.column_name, t.data_type, t.char_length, t.char_used
from dba_tab_cols t
join dba_objects o on (t.owner = o.owner and t.table_name = o.object_name)
where 1=1
and o.object_type = 'TABLE'
and t.char_used = 'B'
and t.data_type in ('CHAR', 'VARCHAR', 'VARCHAR2')
and t.owner in ('your_schemas')
order by t.owner, t.table_name, t.column_name, t.data_type;
How do I easily modify hundreds or thousands of column definitions?
Use the following queries to create the needed SQL commands.
Option 1: keep the BYTE length semantic and extend the column width
set lines 300
set pages 0
col sql for a300
spool run.sql
set echo on
select 'alter table ' || t.owner || '.' || t.table_name || ' modify (' || t.column_name || ' ' || t.data_type || '(' ||
       CASE WHEN t.data_type = 'CHAR' THEN least(t.char_length*4, 2000) ELSE least(t.char_length*4, 32767) END ||
       ' byte' || '));' as sql
from dba_tab_cols t
join dba_objects o on (t.owner = o.owner and t.table_name = o.object_name)
where 1=1
and o.object_type = 'TABLE'
and t.char_used = 'B'
and t.data_type in ('CHAR', 'VARCHAR', 'VARCHAR2')
and t.owner in ('your_schemas');
spool off
-- clean up the script run.sql and then execute it
!vi run.sql
@run.sql
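With the demo table from above, the spooled run.sql would contain a line similar to this one (the schema name is illustrative):

alter table ADMIN.TAB_BYTE modify (COL1 VARCHAR2(20 byte));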
Option 2: keep the column width and switch from BYTE to CHAR length semantic
set lines 300
set pages 0
col sql for a300
spool run.sql
set echo on
select 'alter table ' || t.owner || '.' || t.table_name || ' modify (' || t.column_name || ' ' || t.data_type || '(' ||
       t.char_length || ' char' || '));' as sql
from dba_tab_cols t
join dba_objects o on (t.owner = o.owner and t.table_name = o.object_name)
where 1=1
and o.object_type = 'TABLE'
and t.char_used = 'B'
and t.data_type in ('CHAR', 'VARCHAR', 'VARCHAR2')
and t.owner in ('your_schemas');
spool off
-- clean up the script run.sql and then execute it
!vi run.sql
@run.sql
Summary
If the database says your value is too large, then your value IS too large :-)
Do you have any further suggestions or experiences? Please feel free to share them in a comment or by sending me a private message.
Thank you for reading!