I have created a table with an identity column. When I insert values in that table, Identity column shows huge gap of increment in between the values. Identity value jumps from 6 to 10001. This is the output ordered by Department id:
Output Screenshot Here
This is table I have created:
Create Table STG2.Department
(
DepartmentID int GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1 Cycle),
Name varchar(100),
GroupName varchar(100)
)
PRIMARY INDEX (DepartmentID);
This is how I am inserting values into the Department
table:
insert into STG2.Department (Name, GroupName)
Select Distinct
Department, GroupName
from DP_STG.HR;
What am I doing wrong?
This is too long for a comment.
What am I doing wrong?
What you are doing wrong is worrying about gaps in the identity column. These are a natural part of using databases. The most typical reasons are due to delete
or failed insert
s. The only guarantee (I think) is that the numbers are increasing and not duplicated.
In your case, my guess is that Teradata is reserving a bunch of numbers for some good reasons -- for parallelism or some other efficiency (I know SQL Server does this). The gaps will cause no harm, and the order of the inserts should be pretty-well preserved.
Maintaining gapless identity columns is a huge overhead for a database, particularly a powerful, parallel database such as Teradata. In essence, it means that each insert has to complete all queries on the table, lock the table, find the maximum value, add one, and use that. The people who write databases know what a performance killer this is and have looser requirements for such columns.
As already stated the gaps are due to each AMP (the logical Processing Unit of Teradata to have MPP) has each own range of IDs. So it's not wrong to have these gaps but it's by design.
If you rely on IDs without gaps (for any reason), you have to do by your own. Either before loading in your ETL process or after/during loading and define "ID = ROW_NUMBER() + MAX(ID)" (pseudo code).