If I have the option of using .Net and can do data transformations just fine in .Net, when would I need SSIS? Is there a certain task that SSIS would be better for? Are the added benefits of transparency worth it? Is it just what I am more comfortable with? What are the best practices for determining this?
相关问题
- sql execution latency when assign to a variable
- Generic Generics in Managed C++
- How to Debug/Register a Permanent WMI Event Which
- What is the best way to cache a table from a (SQL)
- php PDO::FETCH_ASSOC doesnt detect select after ba
SSIS is great for BI applications, you can manipulate the data on Stage Table and than make avaiable on DataWarehouse tables to be used for BI.
I can connect on SAP, Oracle to get employee information and make avaiable on PowerBI, QlikView, etc...
Its a nice tool if you know where and why use it. Use ir because its cool you will have troubles.
I think project time/budget constraints and the use of a standard tool are some of the biggest arguments for using SSIS. Creating an SSIS package is most of the times way faster than trying to code something similar in .NET.
But with that said, it seems like SSIS have a lot of pain points that sometimes might invalidate this argument. It did for me when developing a solution that needed to run in different environments at many different clients. SSIS simply looked too painful the more I evaluated it for the project. A properly architected .NET solution is easier to deploy, more reliable, more flexible, easier to understand and can also achieve very good performance.
IMHO: consider using SSIS for projects that you only need to deploy to one or maybe two in house SQL Server environments. Otherwise, the .NET approach will quickly become more appealing.
good question.
if the amount of data transfer huge? are you processing multiple data files and need transactions (both at file system level and database level)? are you dealing with multiple data sources at different locations (for eg ftp, local file system, database)?
if answers to above are yes then go ahead with ssis. basically .net is cool with small data import / export jobs, but when you have anything more complex, ssis is a definite winner
the other thing which i look at is - is it worth writing .net code when everything is available inside ssis. (dont mistake me - i love coding) however, anything you code, you need to maintain :-)
I guess it depends on what you are doing. SSIS is very powerful, just like old DTS. If you are loading lots of items and expect to have constant change, I would go SSIS all the way. If you are looking to load only a few items and it’s for lots of customers, I would put it in code. I prefer SSIS for in house ETL processes, but I use .Net at client shops when I need to load data from a legacy system into a SQL database. Now as I stated before if you have a lot of transformations and lots of different data silos to load, I think you would be crazy to do this in .Net and I would go SSIS. If you have only a few items to load and it’s for a single application and may be installed as part of an application at various clients, I would go .Net all the way. Just my 2 cents.
My arguments for not using SSIS are:
Design greenfield products so that they have RESTful data feeds for reporting and extraction built-in to the project plan and budget, preferably to a standard like OData so that other tools can plug right in.
Data feeds should pull and transform from upstream systems and feeds on demand; such that schedule tasks, configuration of scheduled tasks, task runner VMs and staff to run all this unreliable scheduling stuff is negated.
RESTful data feeds leverage HTTP caching.
Feeds/services/APIs can be moved to elastic-scale cloud easily.
SSIS requires finding people with SSIS skills that enjoy doing that stuff for weeks. In my experience, finding and retaining SSIS developers is hard and expensive and the people found tend to be sub-par.
SSIS doesn't work well with source control and collaborative work.
SSIS doesn't lend itself well to code reuse, unlike microservices and traditional code libraries.
SSIS doesn't version easily, unlike a REST service.
SSIS doesn't lend itself to modular designs and continuous deployment of many small changes, it tends to be large-batch with scary releases.
SSIS promotes the use of stored-procedures which places a lot of demand on SQL which is the hot-spot. Favour designs that place demands on a scaleable, stateless middle tier.
The tooling is clunky and unreliable.
You're at the mercy of Microsoft's roadmap for SSIS.
Consider writing to tables/services that support analysis, reporting and views as soon as the data comes into the application; see CQRS and other application architecture patterns.
Never use Excel as a data source; train employees.
Code is king.
Ultimately, I see SSIS as a relic of Enterprise IT. I like to ask, "Would Google use SSIS?" How else can the problem be solved? Think outside the box.
I have a lot of experience with SSIS from small projects to large, complex ETL. Without going into the details, this is my guidance for you:
If you are a DBA and you are not familiar with .NET, or if you are a developer quite familiar with SSIS, then you can use SSIS for small, simple, fairly straightforward extract, transform, load (ETL) tasks.
SSIS is very quirky and there are many pitfalls, gotchas, and what might be considered outright bugs. It is extremely powerful if you are intimately familiar.
C# now has TPL Dataflow. Simple performance tests put it ahead of SSIS. (eg http://mymemoryleaks.blogspot.cz/2013/10/ssis-vs-tpldataflow.html)
If you want to do anything beyond trivial, and if you can use .NET skills, use .NET instead of SSIS.