I have an SSIS package which loads an Excel file into a database. I created an Excel Source task to map the Excel column names to the database table column names, and it works fine.
In rare cases, we receive the Excel file with a trailing space in a column name (for example, the column name should be "ABC" but we receive "ABC "), which causes a mapping issue and the SSIS package fails.
Is there any way to trim the column names without opening the Excel file?
Note: the sheet name is dynamic, and the column position may change (e.g., column "ABC" may exist in the first row, or the second row, or ...).
I am fairly new to the forum, so if you think this is silly, take it with a grain of salt.
MS Access has much of the same VBA functionality as Excel, or you could script a new stub Excel workbook that parses and formats the data before your SQL import, and then import that (a middleware, if you will).
For the problem of trailing or leading spaces, I have used the following on many occasions:
myString = Trim(myString)
'This will remove all leading and trailing spaces but will not touch any spaces between characters, so on import you can run Trim on the column headers as you read them in. There are also LTrim and RTrim - you can guess what those do (left and right of the string):
https://support.office.com/en-us/article/LTrim-RTrim-and-Trim-Functions-e340ced1-67df-435f-b078-1527a4eddea2
For uppercase you can use UCase.
And Replace always comes in handy in a situation I often deal with, where sometimes a user might use a # character and sometimes not.
Example: " Patterson #288 " or " PatTeRson 288 "
myString = UCase(Trim(Replace(myString, "#", "")))
'This eliminates the # sign, gets rid of the leading and trailing spaces, and also uppercases the letters in case the user made a mistake there too. Pretty handy to run in loops when importing and exporting.
Now if the file name is changing (this is the workbook name), or if the worksheet names are changing, you could also have your "middleware" always rename the workbook (holding the contents you are going to import) to the same name, and do the same with the sheets. Or you can count the number of sheets and record their names (again, a chance to standardize and rename them in your "middleware").
I suppose it is not an SQL answer, but because I am not that good with SQL I would prep the data first (in this case the Excel workbook) and standardize it for import, so the code doesn't break on the DB (server) side.
I use Excel as a front end to Access with SQL query scripts; it can also be linked directly to SQL Server, but that is much more difficult. A CSV-friendly DB like PostgreSQL helps in that regard.
I hope this helps. If you need help formatting the workbook prior to import by making a copy and applying all of your changes (naming, field name convention / column headers), let me know. I could probably help with that.
This is similar to V's comment about running a pre-processing script on the workbook. That is how I would approach it.
Cheers, WWC
Is the file being created manually or automatically? In either case, you could remove the header row from the Excel file altogether (either programmatically, or by telling the people to delete it before saving the file). Once you do that, go into the Excel Connection Manager and find the box labeled 'First row has column names'. If you clear that box and then map the columns to the destination again, that should solve your problem. You would never have to worry about misspelled column names (or extra spaces in them) again.
I think there is also an option in SSIS to skip the first row altogether, but I cannot remember where that option is. If you can find it, just skip the first row of the Excel file; the same mappings still remain.
Thank you
First of all, my solution is based on @DrHouseofSQL's and @BHouse's answers, so you have to read @DrHouseofSQL's answer first, then @BHouse's answer, and then continue with this answer.
Problem
This situation is a little complex and can be solved using the following workaround:
Solution Overview
1. Use a Script Task to search for the header row, build the Excel source query dynamically, and save it into a variable.
2. Use that variable as the source query in the Excel Source.
3. Configure the Data Flow Task to work with the dynamic query (i.e., set its Delay Validation property to true).
Solution Details
1. Add a variable @[User::strQuery] of type String, and a variable @[User::ExcelFilePath] holding the path of the Excel file.
2. Add a Script Task and select @[User::strQuery] as a ReadWrite variable and @[User::ExcelFilePath] as a ReadOnly variable (in the Script Task window). Note: you have to import System.Data.OleDb.
3. In the code below, we search the first 15 rows of the Excel sheet to find the header; you can increase that number if the header can be found after the 15th row. I also assumed that the column range is from A to I.
4. The script assigns a query of the form Select * from [Sheet1$A2:I] to the variable @[User::strQuery].
5. In the Excel Source, set the data access mode to "SQL command from variable" and select @[User::strQuery].
(Image taken from @BHouse's answer)
6. Set the Data Flow Task's Delay Validation property to True.
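The answer's original Script Task code did not survive here, so the following is a minimal C# sketch of the idea only. The provider, the sheet name (Sheet1), the column range (A to I), and the header test (a cell whose trimmed value is "ABC") are all assumptions to adjust for your file:

```csharp
// SSIS Script Task (C#) - illustrative sketch, not the original answer's code.
using System;
using System.Data;
using System.Data.OleDb;

public void Main()
{
    string filePath = Dts.Variables["User::ExcelFilePath"].Value.ToString();

    // HDR=NO so the header row comes back as data and we can inspect it.
    string connStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath +
                     ";Extended Properties=\"Excel 12.0;HDR=NO;IMEX=1\"";

    using (OleDbConnection conn = new OleDbConnection(connStr))
    {
        conn.Open();

        // Read only the first 15 rows of the A:I range.
        DataTable dt = new DataTable();
        using (OleDbDataAdapter da = new OleDbDataAdapter(
                   "SELECT * FROM [Sheet1$A1:I15]", conn))
        {
            da.Fill(dt);
        }

        for (int row = 0; row < dt.Rows.Count; row++)
        {
            for (int col = 0; col < dt.Columns.Count; col++)
            {
                // Trim() makes a received "ABC " match the expected "ABC".
                if (dt.Rows[row][col].ToString().Trim() == "ABC")
                {
                    // Header found on Excel row (row + 1); data starts one row below.
                    Dts.Variables["User::strQuery"].Value =
                        "SELECT * FROM [Sheet1$A" + (row + 2) + ":I]";
                    Dts.TaskResult = (int)ScriptResults.Success;
                    return;
                }
            }
        }
    }

    // Header not found in the first 15 rows.
    Dts.TaskResult = (int)ScriptResults.Failure;
}
```

Because the query is built at runtime, the Excel Source only validates it when Delay Validation is set to True, which is why that property matters here.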
UPDATE 1:
From the OP's comments:
"sometimes excel with empty data will come (i.e. we have only the header row, no data)... in that case it fails the entire task"
Solution:
If your Excel file may contain no data (only the header row), you have to do these steps:
1. Add a Boolean variable @[User::ImportFile].
2. Add @[User::ImportFile] to the Script Task's ReadWrite variables.
3. In the script, if the file contains data rows, set @[User::ImportFile] = True; else set @[User::ImportFile] = False.
4. Write the expression @[User::ImportFile] == True on the precedence constraint that connects to the Data Flow Task, so the Data Flow Task only runs when the file contains data.
Note: the Script Task code has to be updated accordingly (the author's updated code is not shown here).
UPDATE 2:
From the OP's comments:
"is there any other workaround available to process the data flow task without skipping all data flow tasks? Actually one of the tasks logs the filename and data count, which are missing here"
Solution:
1. Add another Data Flow Task that logs the file name and the row count.
2. Connect it with a precedence constraint carrying the expression @[User::ImportFile] == False (same steps as the first connector).
3. Or, instead of adding another Data Flow Task, you can add an Execute SQL Task
to insert a row in the log table.
This has been documented well on MSDN, running through steps similar to those @houseofsql mentioned:
Step 1: Exclude the column names in the first row in the Excel connection, and use a SQL command as the data access mode.
Step 2: Alias the column names in the output columns to match your destination. The command
Select * from [Sheet1$A2:I]
will select from the second row onward.
Finally, add an OLE DB destination.
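When "first row has column names" is cleared, the Excel driver auto-names the columns F1, F2, ... by position, so you can alias either on the Excel Source's output columns or directly in the query. A hedged illustration (the destination column names ABC, ColumnB, ColumnC are made up for the example):

```sql
-- Columns A..I surface as F1..F9 when the connection does not treat the
-- first row as headers; starting the range at row 2 skips the
-- (possibly misspelled) header row entirely.
SELECT F1 AS ABC, F2 AS ColumnB, F3 AS ColumnC
FROM [Sheet1$A2:I]
```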