I have a table called customers which contains around 1,000,000 records. I need to transfer all the records to 8 different flat files which increment the number in the filename e.g cust01
, cust02
, cust03
, cust04
etc.
I've been told this can be done using a for loop in SSIS. Please can someone give me a guide to help me accomplish this.
The logic behind this should be something like "count number of rows", "divide by 8", "export that amount of rows to each of the 8 files".
To me, it will be more complex to create a package that loops through and calculates the amount of data and then queries the top N segments or whatever.
Instead, I'd just create a package with 9 total connection managers. One to your Data Database (Source) and then 8 identical Flat File Connection managers but using the patterns of FileName1, Filename2 etc. After defining the first FFCM, just copy, paste and edit the actual file name.
Drag a Data Flow Task onto your Control Flow and wire it up as an OLE/ADO/ODBC source. Use a query, don't select the table as you'll need something to partition the data on. I'm assuming your underlying RDBMS supports the concept of a ROW_NUMBER() function. Your source query will be
SELECT
MT.*
, (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) % 8 AS bucket
FROM
MyTable AS MT;
That query will pull back all of your data plus assign a monotonically increasing number from 1 to ROWCOUNT
which we will then apply the modulo (remainder after dividing) operator to. By modding the generated value by 8 guarantees us that we will only get values from 0 to 7, endpoints inclusive.
You might start to get twitchy about the different number bases (base 0, base 1) being used here, I know I am.
Connect your source to a Conditional Split. Use the bucket
column to segment your data into different streams. I would propose that you map bucket 1 to File 1, bucket 2 to File 2... finally with bucket 0 to file 8. That way, instead of everything being a stair step off, I only have to deal with end point alignment.
Connect each stream to a Flat File Destination and boom goes the dynamite.
You could create a rownumber with a Script Component (don't worry very easy): http://microsoft-ssis.blogspot.com/2010/01/create-row-id.html
or you could use a rownumber component like http://microsoft-ssis.blogspot.com/2012/03/custom-ssis-component-rownumber.html or http://www.sqlis.com/post/Row-Number-Transformation.aspx
For dividing it in 8 files you could use the Balanced Data Distributor or the Conditional Split with a modulo expression (using your new rownumber column):