Process the new files only

Posted 2019-03-05 09:21

Question:

I have a source folder whose files need to be processed. Files arrive at that location at random times, and the package should run every 2 hours. I have to process only the new files, and I cannot delete or move the already processed files from that location; I can only copy files to an Archive location. How can I achieve this?

Answer 1:

You can achieve this using the following steps.

  1. Use a Foreach File enumerator on your incoming folder and store the file name in an "IncomingFile" variable. Configure it to retrieve "Name and extension" (the script below assumes this setting; with any other setting the script needs some modification).
  2. Create two SSIS variables: "ArchivePath" as a String and "IsLoaded" as a Boolean (default False).
  3. Add an SSIS Script Task and list "IncomingFile" and "ArchivePath" as read-only variables. "IsLoaded" should be a read-write variable.
  4. Put the following code in the Script Task. It sets "IsLoaded" to True if the file already exists in the archive folder, and to False otherwise.

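    // Requires "using System.IO;" at the top of the script file.
    public void Main()
    {
        // Folder that holds copies of the already processed files.
        var archivePath = Dts.Variables["ArchivePath"].Value.ToString();
        // File name and extension supplied by the Foreach Loop.
        var incomingFile = Dts.Variables["IncomingFile"].Value.ToString();
    
        // Path the file would have if it had already been archived.
        var fileFullPath = Path.Combine(archivePath, incomingFile);
    
        // True when a copy with the same name is already in the archive folder.
        Dts.Variables["IsLoaded"].Value = File.Exists(fileFullPath);
    
        Dts.TaskResult = (int)ScriptResults.Success;
    }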
    
  5. Connect the Script Task to the Data Flow Task with a precedence constraint and set its evaluation operation to "Expression". Enter something like the following in the expression box.

    @IsLoaded==False
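
The check above only skips a file if a copy of it was placed in the archive folder after it was processed. A minimal sketch of that copy step follows, assuming plain .NET file APIs; the class name `ArchiveHelper`, the method `CopyToArchive`, and the `sourceFolder` parameter are hypothetical and not part of the original answer. In a package, a File System Task configured to copy the file would serve the same purpose.

```csharp
using System;
using System.IO;

public static class ArchiveHelper
{
    // Copies a processed file into the archive folder and leaves the
    // source file untouched. Returns the full path of the archived copy.
    // All names here are illustrative assumptions.
    public static string CopyToArchive(string sourceFolder, string fileName, string archivePath)
    {
        // Does nothing if the archive folder already exists.
        Directory.CreateDirectory(archivePath);

        string source = Path.Combine(sourceFolder, fileName);
        string target = Path.Combine(archivePath, fileName);

        // Overwrite so that re-running after a partial failure does not throw.
        File.Copy(source, target, overwrite: true);
        return target;
    }
}
```

Running this only after the Data Flow Task succeeds means a file that failed to load is picked up again on the next run.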

Hope this helps.



Answer 2:

Your package should process the files in a given directory, then move them to another directory once processed. That way, every run can simply process everything it finds in the source directory.

To process each file in a directory, use the Foreach Loop Container. You can specify a folder to look in and some expressions to filter on. If, for instance, your file names contain a timestamp, you could use that timestamp to filter files in or out.

Use a Flat File Source to read the files, then use the File System Task to move them around.
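
As a rough illustration of timestamp filtering, here is a sketch in C#. It assumes a hypothetical naming convention of `<prefix>_<yyyyMMddHHmmss>.<ext>` (for example `orders_20190305092100.csv`); the class `TimestampFilter` and the `lastRun` value are illustrative and would have to match your real file names and run log.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class TimestampFilter
{
    // Assumed (hypothetical) convention: <prefix>_<yyyyMMddHHmmss>.<ext>
    public static bool TryGetTimestamp(string fileName, out DateTime timestamp)
    {
        string stem = Path.GetFileNameWithoutExtension(fileName);
        int idx = stem.LastIndexOf('_');
        // Everything after the last underscore is treated as the timestamp.
        string stampText = idx >= 0 ? stem.Substring(idx + 1) : stem;
        return DateTime.TryParseExact(stampText, "yyyyMMddHHmmss",
            CultureInfo.InvariantCulture, DateTimeStyles.None, out timestamp);
    }

    // Keeps only the files stamped strictly after the previous run.
    // Names without a parsable timestamp are skipped.
    public static List<string> NewerThan(IEnumerable<string> fileNames, DateTime lastRun)
    {
        return fileNames
            .Where(name => TryGetTimestamp(name, out DateTime ts) && ts > lastRun)
            .ToList();
    }
}
```

This relies on the timestamp in the name reflecting when the file arrived; a file stamped earlier than the last run but delivered late would be missed.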



Answer 3:

To start, take a look at the answer here: Enumerate files in a folder using SSIS Script Task

The SSIS Script Task should enumerate all the files in a given folder, then take a snapshot of the already processed files from a table where you keep a log of what has been processed. It should ignore the already processed files and return only the unprocessed ones in an Object variable for a Foreach Loop to consume.
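
The core of that Script Task is a set difference between the folder contents and the log. A minimal sketch follows, assuming the processed file names have already been read from the log table into a collection; the class `NewFileFinder` and the method `FindUnprocessed` are hypothetical names, and the database query itself is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class NewFileFinder
{
    // Returns the names of files in 'folder' that are not in the processed log.
    // 'processedNames' stands in for the snapshot read from the log table.
    public static List<string> FindUnprocessed(string folder, IEnumerable<string> processedNames)
    {
        // Case-insensitive lookup, since Windows file names ignore case.
        var processed = new HashSet<string>(processedNames, StringComparer.OrdinalIgnoreCase);

        return Directory.EnumerateFiles(folder)
            .Select(path => Path.GetFileName(path))
            .Where(name => !processed.Contains(name))
            .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Inside the Script Task, the resulting list could be assigned to an Object variable (for example `Dts.Variables["User::NewFiles"].Value`, where the variable name is an assumption) so that a Foreach Loop can iterate over it.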