Statistics on large table presented on the web

Published 2019-09-09 09:51

Question:

We have a large table of data with about 300,000,000 rows, currently growing by about 100,000 rows a day, and that rate will increase over time.

Today we generate different reports directly from the database (MS-SQL 2012) and do a lot of calculations.

The problem is that this takes time. We have indexes and so on but people today want blazingly fast reports.

We also want to be able to change time periods, look at the data in different ways, and so on.

We only need to look at data that is at least one day old, so we can take everything up through yesterday and preprocess it in some way to speed up the queries and reports.

So does anyone have a good idea for a solution that will be fast and still run on the web, not in Excel or a BI tool?

Today all the reports are in ASP.NET C# WebForms with queries against MS SQL 2012 tables.

Answer 1:

You have an OLTP system. You generally want to maximize throughput on a system like this. Reporting requires latches and locks to be taken to acquire data, which puts a drag on your OLTP throughput, and what's good for reporting (additional indexes) is detrimental to your OLTP workload because it negatively impacts write performance. And don't even think that slapping WITH(NOLOCK) on everything is going to alleviate that burden. ;)

As others have stated, you would probably want to look at separating the active data from the report data.

Partitioning a table could accomplish this if you have Enterprise Edition. Otherwise, you'll need to do some hackery like Partitioned Views, which may or may not work for you depending on how your data is accessed.
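As a rough sketch of the partitioned-view approach: you split the data into per-period base tables with CHECK constraints on the date column, then UNION ALL them behind a view. The table and column names below are illustrative, not from the original post.

```sql
-- Hypothetical monthly base tables. The CHECK constraints are what let
-- the optimizer skip tables that cannot match a given date predicate.
CREATE TABLE dbo.Sales_2013_08 (
    SaleDate date  NOT NULL
        CHECK (SaleDate >= '2013-08-01' AND SaleDate < '2013-09-01'),
    Amount   money NOT NULL
);

CREATE TABLE dbo.Sales_2013_09 (
    SaleDate date  NOT NULL
        CHECK (SaleDate >= '2013-09-01' AND SaleDate < '2013-10-01'),
    Amount   money NOT NULL
);
GO

-- Queries go through the view; a WHERE on SaleDate touches only the
-- relevant base table(s).
CREATE VIEW dbo.Sales
AS
SELECT SaleDate, Amount FROM dbo.Sales_2013_08
UNION ALL
SELECT SaleDate, Amount FROM dbo.Sales_2013_09;
```

The catch is you have to create new base tables and alter the view as time rolls forward, which is exactly the kind of maintenance hackery the answer is warning about.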

I would look at extracting the needed data out of the system at a regular interval and pushing it elsewhere. Whether that elsewhere is a different set of tables in the same database, a different catalog on the same server, or an entirely different server would depend on a host of variables (cost, time to implement, complexity of the data, speed requirements, storage subsystem, etc.).
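Since the question says only data through yesterday is needed, the extract can be a simple nightly job. A minimal sketch, assuming hypothetical source and reporting tables (`dbo.Transactions`, `rpt.Transactions`):

```sql
-- Nightly job: copy yesterday's rows from the OLTP table into a
-- reporting table that the web reports query instead.
DECLARE @Yesterday date = DATEADD(day, -1, CAST(GETDATE() AS date));

INSERT INTO rpt.Transactions (TranDate, CustomerId, Amount)
SELECT TranDate, CustomerId, Amount
FROM dbo.Transactions
WHERE TranDate >= @Yesterday
  AND TranDate <  DATEADD(day, 1, @Yesterday);
```

Scheduled via SQL Server Agent, this keeps the reporting side one day behind, which matches the stated requirement, and keeps report queries off the OLTP table entirely.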

Since it sounds like you don't have super specific reporting requirements (currently you look at yesterday's data, but it'd be nice to see more, etc.), I'd look at implementing Columnstore Indexes in the reporting tables. They provide amazing performance for aggregation queries, even over large tables, with the benefit that you don't have to pre-aggregate to a specific grain (WTD, MTD, YTD, etc.). The downside is that in SQL Server 2012 a columnstore index makes the table read-only (and it's a memory and CPU hog while being created). SQL Server 2014 is going to introduce updatable columnstore indexes, which will be giggity, but that's some time off.
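To make the read-only limitation concrete: on SQL Server 2012 a nightly load has to drop the columnstore index, insert the new rows, and recreate it. A sketch, again with hypothetical table and index names:

```sql
-- 1) Drop the columnstore so the reporting table becomes writable.
IF EXISTS (SELECT 1 FROM sys.indexes
           WHERE name = 'NCCI_rpt_Transactions'
             AND object_id = OBJECT_ID('rpt.Transactions'))
    DROP INDEX NCCI_rpt_Transactions ON rpt.Transactions;

-- 2) Perform the nightly load here (INSERT ... SELECT of new rows).

-- 3) Recreate the nonclustered columnstore index; while it exists,
--    the table is read-only but aggregation queries run in batch mode.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_rpt_Transactions
    ON rpt.Transactions (TranDate, CustomerId, Amount);
```

The rebuild is the expensive step (the memory/CPU hog mentioned above), so it belongs in an off-hours maintenance window.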