I've installed Oracle Database 10g Express Edition (Universal) with the default settings:
SELECT * FROM NLS_DATABASE_PARAMETERS;
NLS_CHARACTERSET AL32UTF8
NLS_NCHAR_CHARACTERSET AL16UTF16
Given that both CHAR and NCHAR data types seem to accept multi-byte strings, what is the exact difference between these two column definitions?
VARCHAR2(10 CHAR)
NVARCHAR2(10)
I don't think the answer from Vincent Malgrat is correct. When NVARCHAR2 was introduced a long time ago, nobody was even talking about Unicode.
Initially Oracle provided VARCHAR2 and NVARCHAR2 to support localization. Common data (including PL/SQL) was held in VARCHAR2, most likely US7ASCII in those days. Then you could apply NLS_NCHAR_CHARACTERSET individually (e.g. WE8ISO8859P1) for each of your customers in any country without touching the common part of your application.
Nowadays the character set AL32UTF8 is the default and fully supports Unicode. In my opinion there is no reason anymore to use NLS_NCHAR_CHARACTERSET, i.e. NVARCHAR2, NCHAR, NCLOB. Maybe the only reason is when you have to support mainly Asian characters, where AL16UTF16 consumes less storage compared to AL32UTF8.
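You can see that storage difference by comparing byte lengths directly. A minimal sketch, assuming the character sets from the question (AL32UTF8 for VARCHAR2, AL16UTF16 for NVARCHAR2) and a client configured to send Japanese text correctly; the literal is just an example string:
-- '日本語' takes 3 bytes per character in AL32UTF8 but 2 bytes per
-- character in AL16UTF16, so the NVARCHAR2 representation is smaller here.
select lengthb('日本語')           as bytes_as_varchar2,   -- 9
       lengthb(to_nchar('日本語')) as bytes_as_nvarchar2   -- 6
from dual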
Am wondering if NVARCHAR2(1) and VARCHAR2(1) act differently regarding null / empty values?
From testing, they seem to behave the same.
You can get some surprises, though.
For example, comparing with != 'Y' will not return the empty-string rows: Oracle treats the empty string as NULL, and NULL is neither equal nor unequal to 'Y', so the row is filtered out either way.
You need an NVL wrapper,
e.g. and nvl(upper(WP."OW_IS_MISRUN"),'N') != 'Y'
select count(*) from "DATA_HUB"."OW_WELL_PERFORATION" WP where WP.UWI = 17038046
-- returns 7
select count(*) from "DATA_HUB"."OW_WELL_PERFORATION" WP where WP.UWI = 17038046 and upper(WP."OW_IS_MISRUN") != 'Y'
-- returns 1
select count(*) from "DATA_HUB"."OW_WELL_PERFORATION" WP where WP.UWI = 17038046 and upper(WP."OW_IS_MISRUN") = 'Y'
-- returns 2
select count(*) from "DATA_HUB"."OW_WELL_PERFORATION" WP where WP.UWI = 17038046 and nvl(upper(WP."OW_IS_MISRUN"),'N') != 'Y'
-- returns 5
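Here is a minimal, self-contained version of the same test (the table name demo_flags is made up for illustration). Oracle stores the empty string '' as NULL in both VARCHAR2 and NVARCHAR2 columns, which is why the NVL wrapper is needed in either case:
create table demo_flags (v varchar2(1), n nvarchar2(1));
insert into demo_flags values ('', '');    -- '' is stored as NULL in both columns
insert into demo_flags values ('Y', 'Y');
select count(*) from demo_flags where v != 'Y';                -- 0: the NULL row is filtered out
select count(*) from demo_flags where nvl(v, 'N') != 'Y';      -- 1: NVL makes the NULL row comparable
select count(*) from demo_flags where v is null and n is null; -- 1: VARCHAR2 and NVARCHAR2 behave alike here
drop table demo_flags;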
NVARCHAR2 is a Unicode-only datatype.
Though both data types are variable-length string types, you can notice the difference in how they store values. Each character is stored as one or more bytes, and not all languages need the same number of bytes per character: English text needs 1 byte per character, while languages like Japanese or Chinese need more than 1 byte per character.
When you specify VARCHAR2(10) (with the default BYTE length semantics), you are telling the DB that at most 10 bytes of data will be stored. But when you say NVARCHAR2(10), it means up to 10 characters will be stored. In that case, you don't have to worry about the number of bytes each character takes.
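A small sketch of that difference (the table name length_demo is made up; it assumes the default NLS_LENGTH_SEMANTICS = BYTE and the character sets from the question):
create table length_demo (
  b varchar2(10),       -- limited to 10 BYTES (default BYTE length semantics)
  c varchar2(10 char),  -- limited to 10 CHARACTERS, up to 4 bytes each in AL32UTF8
  n nvarchar2(10)       -- limited to 10 CHARACTERS, stored in AL16UTF16
);
insert into length_demo (c, n) values ('あいうえおかきくけこ', N'あいうえおかきくけこ'); -- 10 characters fit in c and n
insert into length_demo (b) values ('あいうえおかきくけこ'); -- fails: 30 bytes exceed the 10-byte limit (ORA-12899)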
The NVARCHAR2 datatype was introduced by Oracle for databases that want to use Unicode for some columns while keeping another character set for the rest of the database (which uses VARCHAR2). The NVARCHAR2 is a Unicode-only datatype.
One reason you may want to use NVARCHAR2 might be that your DB uses a non-Unicode character set and you still want to be able to store Unicode data for some columns without changing the primary character set. Another reason might be that you want to use two Unicode character sets (AL32UTF8 for data that comes mostly from Western Europe, AL16UTF16 for data that comes mostly from Asia, for example) because different character sets won't store the same data equally efficiently.
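For the first use case, the column definitions might look something like this; a sketch assuming a hypothetical database whose NLS_CHARACTERSET is a non-Unicode set such as WE8ISO8859P1 (table and column names are made up):
create table customers (
  id           number primary key,
  name         varchar2(100),      -- stored in the (non-Unicode) database character set
  name_unicode nvarchar2(100)      -- can hold any Unicode text via the national character set
);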
Both columns in your example (Unicode VARCHAR2(10 CHAR) and NVARCHAR2(10)) would be able to store the same data, however the byte storage will be different. Some strings may be stored more efficiently in one or the other.
Note also that some features won't work with NVARCHAR2, see this SO question: