可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
SQL Server 2005 Question:
I'm working on a data conversion project where I'm taking 80k+ rows and moving them from one table to another. When I run the TSQL, it bombs with various errors having to do with converting types, or whatever. Is there a way to find out what row caused the error?
=====================
UPDATE:
I'm performing an INSERT INTO TABLE1 (...) SELECT ... FROM TABLE2
Table2 is just a bunch of varchar fields where TABLE1 has the right types.
This script will be put into a sproc and executed from an SSIS package. The SSIS package first imports 5 large flat files into TABLE2.
Here is a sample error message: "The conversion of a char data type to a datetime data type resulted in an out-of-range datetime value."
There are many date fields. In TABLE2, there are data values like '02/05/1075' for Birthdate. I want to examine each row that is causing the error, so I can report to the department responsible for the bad data so they can correct it.
回答1:
What I do is split the rowset in half with a WHERE clause:
INSERT MyTable(id, datecol) SELECT id, datecol FROM OtherTable WHERE ID BETWEEN 0 AND 40,000
and then keep changing the values on the between part of the where clause. I've done this by hand many times, but it occurs to me that you could automate the splitting with a little .Net code in a loop, trapping exceptions and then narrowing it down to just the row throwing the exception, little by little.
回答2:
This is not the way to do it with SSIS. You should have the data flow from your source, to your destination, with whatever transformations you need in the middle. You'll be able to get error details, and in fact, error rows by using the error output of the destination.
I often send the error output of a destination to another destination - a text file, or a table set up to permit everything, including data that would not have been valid in the real destination.
Actually, if you do this the standard way in SSIS, then data type mismatches should be detected at design time.
回答3:
I assume you do the update with the INSERT INTO ...
Instead try to do the update with the cursor, use exception handling to catch the error and log all you need: the row number it failed on etc.
回答4:
Not exactly a cursor, but as effective - I had over 4 million rows to examine with multiple conversion failrues. Here is what I used, and it resulted in a two temp tables one with all my values and assigned rows and one that simply contained a list of rows in the first temp table that failed to convert.
select row_number() over (order by TimeID) as rownum,timeID into #TestingTable from MyTableWithBadData
set nocount on
declare @row as int
declare @last as int
set @row=0
select @last = count(*) from #TestingTable
declare @timeid as decimal(24,0)
create table #fails (rownum int)
while @row<=@last
begin
Begin Try
select @timeid=cast(timeID as decimal(24,0)) from #TestingTable where rownum = @row
end try
begin catch
print cast(@row as varchar(25)) + ' : failed'
insert into #fails(rownum) values(@row)
end catch
set @row = @row+1
end
回答5:
if you are looping, add prints in the loop.
if you are using set based operations, add a restrictive WHERE condition and run it. Keep running it (each time making it more and more restrictive) until you can find the row in the data. if you could run it for blocks of N rows, then just select out those rows and look at them.
ADD CASE statements to catch the problems (converting that bad value to NULL or whetever) and put a value in a new FlagColumn telling you the type of problem:
CASE WHEN ISNUMERIC(x)!=1 then NULL ELSE x END as x
,CASE WHEN ISNUMERIC(x)!=1 then 'not numeric' else NULL END AS FlagColumn
then select out the new converted data where FlagColumn IS NOT NULL
you could try using select statements with isnumeric() or isdate() functions on the various columns of the source data
EDIT
There are many date fields. In TABLE2,
there are data values like
'02/05/1075' for Birthdate. I want to
examine each row that is causing the
error, so I can report to the
department responsible for the bad
data so they can correct it.
Use this to return all bad date rows:
SELECT * FROM YourTable WHERE ISDATE(YourDateColumn)!=1
回答6:
If you are working with cursors, yes and is trivial. If you are not working with cursors, I don't think so because SQL operations are ACID, or transactions per se.
回答7:
John Sauders has the right idea, there are better ways to do this kind of processing using SSIS. However, learning SSIS and redoing your package to completely change the process may not be an option at this time, so I offer this advice. You appear to be having trouble with the dates being incorrect. So first run a query to identify those records which are bad and insert them into an execptions table. Then do you insert only of those records that are left. Something like:
insert exceptiontable (field1, field2)
select field1, field2 from table2 where isdate(field2) = 0
insert table1 (field1, field2)
select field1, field2 from table2 where isdate(field2) = 1
Then of course you can send the contents of the exception table to the people who provide the bad data.