I'm currently writing an SSIS package that retrieves data from a stored procedure via an OLE DB Source. The stored procedure contains a rather nasty query that I've been able to improve with the use of temp tables. If I switch these temp tables to table variables, the logical reads jump from about 1.3 million to about 56 million. I'm uncomfortable enough with the 1.3 million, but there is no way that I can be satisfied with the 56 million logical reads. Thus, I can't really convert the temp tables to table variables.
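For reference, this is the sort of comparison behind those numbers; `SET STATISTICS IO` reports logical reads per table in the Messages tab (the procedure name here is a placeholder):

```sql
SET STATISTICS IO ON;

-- Run the temp-table version and the table-variable version
-- separately and compare the reported logical reads.
EXEC dbo.usp_LoadData;  -- placeholder name

SET STATISTICS IO OFF;
```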
However, SSIS (or rather SQL Server) can't resolve the metadata for this query, so the package won't run. I've found a few different solutions online, but none of them seem to work for both SQL Server 2008 and SQL Server 2014. We are currently in the process of upgrading all of our servers to 2014, and this particular package runs against 2008 in DEV, 2014 in QA, and 2008 in production currently. By the fall, the PROD tier will be 2014, and the DEV tier will be promoted sometime after that. Unfortunately, I can't wait until these upgrades happen to release this SSIS package. The data needs to start moving by next week. Thus, I need to figure out a way to get the metadata resolved for both environments. Here's what I've tried so far:
- Add a dummy select in an `IF 1=0` block which returns the proper metadata. This works in 2008, but not in 2014 (first sketch below).
- Use `SET FMTONLY OFF` at the beginning of the stored procedure. This works in 2008, but not in 2014. Furthermore, it causes the stored procedure to run once for each column returned (over 30 in this case), which is a deal-breaker even if it did work (second sketch below).
- Use `EXEC ... WITH RESULT SETS (( ... ));`. This works in 2014, but not in 2008 (third sketch below).
- Deploy a stored procedure which returns the proper metadata, build and deploy the SSIS package, then modify the stored procedure to the proper version. This hasn't worked in either environment, and it would complicate any other ETL applications developed within our ETL framework.
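For reference, the dummy-select trick looked roughly like this; the column names and types are illustrative placeholders, not my real ones:

```sql
CREATE PROCEDURE dbo.usp_LoadData  -- placeholder name
AS
BEGIN
    -- This branch never executes, but the 2008 metadata pass
    -- picks up the shape of the result set from it.
    IF 1 = 0
    BEGIN
        SELECT
            CAST(NULL AS int)         AS KeyCol,
            CAST(NULL AS varchar(50)) AS SomeText,
            CAST(NULL AS datetime)    AS SomeDate;
    END;

    CREATE TABLE #Work
    (
        KeyCol   int,
        SomeText varchar(50),
        SomeDate datetime
    );

    -- ... the actual nasty query populating #Work ...

    SELECT KeyCol, SomeText, SomeDate
    FROM #Work;
END;
```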
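The `SET FMTONLY OFF` variant amounted to this (same placeholder names); because it forces the body to execute even during metadata-only calls, it is also what triggers the one-run-per-column behavior:

```sql
ALTER PROCEDURE dbo.usp_LoadData  -- placeholder name
AS
BEGIN
    -- Disables format-only mode, so the procedure body runs
    -- fully even when SSIS only asked for metadata.
    SET FMTONLY OFF;

    -- ... rest of the procedure as above ...
END;
```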
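And the 2014-only variant, e.g. as the SQL command text of the OLE DB Source (column list again illustrative):

```sql
EXEC dbo.usp_LoadData
WITH RESULT SETS
((
    KeyCol   int,
    SomeText varchar(50),
    SomeDate datetime
));
```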
If I can't figure anything else out, I could deploy different stored procedures and packages to the different tiers, but I would very much prefer not to. For one, it would complicate future releases, and I would also need to make sure I don't forget to update the stored procedure and package once we upgrade the servers.
I could also create real tables in the database to take the place of these temp tables. I don't really like this solution, but it's something I could tolerate. If I end up doing this, I would probably switch to using the `WITH RESULT SETS` approach in the future.
However, I personally don't care much for either of these solutions, so I was wondering if there is any workaround that I missed that might work a bit better.
Despite your reluctance, I think you've made the right choice, and a dedicated staging area is the right way to go. Most of the production ETLs I've worked with have a dedicated staging database, never mind tables. You then get the benefit of being able to control the storage more explicitly, which makes performance more reliable and the whole thing generally more maintainable. For example, you can create a dedicated contiguous block of fast disk space for these tables, with their own filegroup, etc. (see the sketch below). I'd certainly rather see two separate stored procedures relying on a few physical tables than a really gnarly single one.
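As a rough sketch of what I mean, with made-up database, filegroup, path, and table names:

```sql
-- Give the staging tables their own filegroup on fast disk.
ALTER DATABASE StagingDB ADD FILEGROUP FG_Staging;

ALTER DATABASE StagingDB
ADD FILE
(
    NAME = N'StagingDB_Staging1',
    FILENAME = N'F:\FastDisk\StagingDB_Staging1.ndf',
    SIZE = 4GB,
    FILEGROWTH = 1GB
)
TO FILEGROUP FG_Staging;

-- A permanent table standing in for the old temp table.
CREATE TABLE dbo.Stage_Work
(
    KeyCol   int NOT NULL,
    SomeText varchar(50) NULL,
    SomeDate datetime NULL
)
ON FG_Staging;
```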
That said, without knowing any specifics this is just my experience, so a caveat for future readers: As with all things database, be sure to measure the actual performance of your scenario (before and after) rather than making any assumptions based on the query plan - it might be misleading you.
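For example, actual (not estimated) numbers per procedure can be pulled from the execution-stats DMV after a run; a minimal sketch:

```sql
-- Aggregated actual execution stats since the plan was cached.
SELECT
    OBJECT_NAME(ps.object_id)    AS procedure_name,
    ps.execution_count,
    ps.total_logical_reads,
    ps.total_elapsed_time / 1000 AS total_elapsed_ms
FROM sys.dm_exec_procedure_stats AS ps
WHERE ps.database_id = DB_ID();
```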