If I have the option of using .Net and can do data transformations just fine in .Net, when would I need SSIS? Is there a certain task that SSIS would be better for? Are the added benefits of transparency worth it? Is it just what I am more comfortable with? What are the best practices for determining this?
相关问题
- sql execution latency when assign to a variable
- Generic Generics in Managed C++
- How to Debug/Register a Permanent WMI Event Which
- What is the best way to cache a table from a (SQL)
- php PDO::FETCH_ASSOC doesnt detect select after ba
As the name suggests, SSIS is an integration system. It can be very difficult in .net to handle connectors to disparate data sources such as excel, teradata, oracle etc and also to live up to the responsibility to gracefully close those connections, garbage collection, handling memory issues.
So, SSIS is out of the box product perfect for scenarios where data not only needs to be pulled from, say, two different sources, but then a series lookups, transformations, merges, derivations and calculations need to be performed before writing it to a target location(be it sql server, a flat file or another db system).
SSIS also has checkpoints where, if the package fails due to any reason, it will pick up from where it left off (it needs to be configured as this is not default behavior).
In addition, SSIS will save you a lot of time because its tasks are reusable and its deployment process is fairly easy to implement and schedule, supported by great event handling.
SSIS has many built in ways of doing transformations from different data sources and you can string them together in a way that makes it very customizeable. They have built in optimizations that make them fast.
You can also use .NET to make your own custom transformations to take advantage of the speed and repeatability of an SSIS job.
Day-to-Day Tasks , which are used by a SSIS Deveoper and are relatively easy as compared to .Net can include :
Data Comparison between the tables.
Conditional Splitting,data blocking the data on the basis of some logic.
Data Conversion,look up , merge , unionall , relatively easy to use.
File Handling(Modifying , validations).
Error Handing , Email Alerts.
Containers , FOR/FOReach loops are easy to use.
Posting data on web services is easy using the WebService task.
Checkpointing,Re-runablity of the data loads is easy to handle.
Debugging is easy in ssis - can be done on conatiner lever , package level.
Scripting can also be done , if the task is not available. Also , you can customize your own tasks
Bit late to answer this question but I hope it worth,
SSIS is often misunderstood when compared to programming language. SSIS is a framework whereas C# is a language on .NET Framework. I have extensive experience in handling & developing large data warehousing solution using (MSBI suite) and also had developed large websites (ASP.NET) - so I can't be biased.
SSIS if not used properly can decrease performance by par. SSIS packages have three kind of transformation:
SSIS works exceptionally good with non blocking transformation with proper setting on control flow and data flow. I have used it on larger (over 2 TB of data warehouse) and I can guarantee that it was the fastest load experience. You can check Microsoft blog about We Loaded 1TB in 30 Minutes with SSIS, and So Can You
I agree that SSIS degraded performance when dealing with blocking transformation and they should be carried by T-SQL whenever required.
Coming to C#, I accept that SSIS uses .NET framework and data provider to accomplish task. But C#, as a language is bit more logical and must be treated to deal with business logic. For example, If we have to run exe with different parameter based on condition, you can write a package which will consider parameters and then logically decide what parameter needs to be passed to run an exe file. It would be lengthy process to do that in SSIS while I can do that easily in C# because logical thing can be easily done in language instead of a framework.
Now the point here is what is more convenient approach to solve your problem statement. SSIS is a sure winner loading large amount of records loading data from source to destination while C# is perfect for writing logic. Even if you like C#, I won't recommend you to choose for doing ETL (Extract Transform Load) operation on large data warehouse systems.
Whatever folks say in previous answers are correct but I think that the most important aspect of using SSIS instead of coding is to have easy maintenance process and also a reusable product.
I think main advantage is defining the entire programming construct visually. Any one look at the SSIS package is it pretty much self explainer. The tight integration with the SSIS with SQL allows you to be part of SQL for back up scheduling and huge plus.
As every one explained if you are doing the lot of data manipulation it is good tool. It is free if you have SQL you all set to go and very easy to learn with VS 2008 BIDS