Question:
I have some application which makes database requests. I guess it doesn't actually matter what kind of database I am using, but let's say it's a simple SQLite-driven database.
Now, this application runs as a service and makes some number of requests per minute (this number might actually be huge).
I want to benchmark the queries: record their count and their maximum / minimum / average running time over some period. I wish to design my own tool for this (obviously, some tools exist, but I need my own, for appropriate reasons :).
So, could you advise an approach for this task?
I guess there are several possible cases:
1) I have access to the application's source code. Here, obviously, I want some sort of cross-application integration, probably using pipes. Could you advise how this should be done, and suggest any other possible solution (if there is one)?
2) I don't have the sources. Is it even possible to perform some neat injection from my application to benchmark the other one? I hope there is a way, maybe a hacky one, whatever.
Thanks a lot.
Answer 1:
My answer is valid just for case 1).
In my experience, profiling is a fun but difficult task. Using professional tools can be effective, but it can take a lot of time to find the right one and learn how to use it properly. I usually start in a very simple way, with two very simple classes. The first one, ProfileHelper, records the start time in its constructor and the end time in its destructor. The second class, ProfileHelperStatistic, is a container with extra statistical capability (a std::multimap plus a few methods to return the average, standard deviation and other funny stuff).
The ProfileHelper holds a reference to the container, and before exiting, its destructor pushes the data into the container. You can declare the ProfileHelperStatistic in main, and if you create a ProfileHelper on the stack at the beginning of a specific function, the job is done: the constructor of the ProfileHelper will store the starting time, and the destructor will push the result into the ProfileHelperStatistic.
It is fairly easy to implement, and with minor modification it can be made cross-platform. The time to create and destroy the object is not recorded, so it will not pollute the results. Calculating the final statistics can be expensive, so I suggest you run that once, at the end.
You can also customize the information that you store in ProfileHelperStatistic by adding extra fields (a timestamp or memory usage, for example).
The implementation is fairly easy: two classes of no more than 50 lines each (a sketch follows these hints). Just two hints:
1) Catch everything in the destructor, so no exception can ever escape it!
2) Consider using a collection with constant-time insertion if you are going to store a lot of data.
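A minimal sketch of the two classes, assuming C++11; apart from the two class names described above, every detail here is my own filling-in, not the original poster's code:

    #include <chrono>
    #include <map>
    #include <string>
    #include <utility>

    // Container with basic statistical capability (average shown here;
    // standard deviation and friends work the same way).
    class ProfileHelperStatistic {
    public:
        void add(const std::string& label, double seconds) {
            samples_.emplace(label, seconds);   // note: O(log n), see hint 2
        }
        double average(const std::string& label) const {
            auto range = samples_.equal_range(label);
            double sum = 0.0;
            std::size_t n = 0;
            for (auto it = range.first; it != range.second; ++it, ++n)
                sum += it->second;
            return n ? sum / n : 0.0;
        }
    private:
        std::multimap<std::string, double> samples_;
    };

    // RAII timer: start time in the constructor, end time in the destructor.
    class ProfileHelper {
    public:
        ProfileHelper(ProfileHelperStatistic& stats, std::string label)
            : stats_(stats), label_(std::move(label)),
              start_(std::chrono::steady_clock::now()) {}
        ~ProfileHelper() {
            try {  // hint 1: never let an exception escape a destructor
                std::chrono::duration<double> elapsed =
                    std::chrono::steady_clock::now() - start_;
                stats_.add(label_, elapsed.count());
            } catch (...) {}
        }
    private:
        ProfileHelperStatistic& stats_;
        std::string label_;
        std::chrono::steady_clock::time_point start_;
    };

    // Usage: declare the container in main, then at the top of any
    // function you want to measure:
    //     ProfileHelper timer(stats, "runQuery");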
This is a simple tool, and it can help you profile your application in a very effective way. My suggestion is to start with a few macro-level functions (5-7 logical blocks) and then increase the granularity. Remember the 80-20 rule: 20% of the source code uses 80% of the time.
One last note about databases: a database tunes its performance dynamically, so if you run a query several times, at the end the query will be quicker than at the beginning (Oracle does this, and I guess other databases do as well). In other words, if you test the application heavily and artificially, focusing on just a few specific queries, you can get overly optimistic results.
Answer 2:
See C++ Code Profiler for a range of profilers.
Or see C++ Logging and performance tuning library for rolling your own simple version.
Answer 3:
"I guess it doesn't actually matter what kind of database I am using, but let's say it's a simple SQLite-driven database."
It's very important what kind of database you use, because the database manager might have integrated monitoring.
I can speak only about IBM DB2, but I believe it is not the only DBMS with integrated monitoring tools.
Here, for example, is a short overview of what you can monitor in IBM DB2:
- statements (all executed statements, execution count, prepare time, CPU time, count of reads/writes: table rows, bufferpool, logical, physical)
- tables (count of reads / writes)
- bufferpools (logical and physical reads/writes for data and index, read/write times)
- active connections (running statements, count of reads/writes, times)
- locks (all locks and their types)
- and many more
Monitoring data can be accessed via SQL or an API from your own software, as DB2 Monitor does, for example.
Answer 4:
Under Unix, you might want to use gprof and its graphical front-end, kprof. Compile your app with the -pg flag (I assume you're using g++), run it, and then analyze the output with gprof.
Note, however, that this type of profiling measures the overall performance of the application, not just the SQL queries. If it's the performance of the queries you want to measure, you should use special tools designed for your DBMS; for example, MySQL has a built-in query profiler (for SQLite, see this question: Is there a tool to profile sqlite queries?).
Answer 5:
There is a (Linux) solution you might find interesting, since it can be used in both cases.
It's the LD_PRELOAD trick. It's an environment variable that lets you specify a shared library to be loaded right before your program is executed. The symbols loaded from this library will override any others available on the system.
The basic idea is to use this custom library as a wrapper around the original functions.
There are a bunch of resources available that explain how to use this trick: 1, 2, 3
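As a concrete sketch for case 2): assuming the target program links SQLite as a shared library and issues queries through sqlite3_exec, a preloaded wrapper could time each call. The log file path and all names below are my own choices, not anything prescribed by the trick itself:

    // shim.cpp -- build: g++ -shared -fPIC -o libshim.so shim.cpp -ldl
    // Run the target as: LD_PRELOAD=./libshim.so ./the-service
    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE        // for RTLD_NEXT
    #endif
    #include <dlfcn.h>
    #include <chrono>
    #include <cstdio>

    // Same signature as the real sqlite3_exec (sqlite3* passed as void*
    // so that sqlite3.h is not needed).
    extern "C" int sqlite3_exec(void* db, const char* sql,
                                int (*cb)(void*, int, char**, char**),
                                void* arg, char** errmsg)
    {
        using real_t = int (*)(void*, const char*,
                               int (*)(void*, int, char**, char**),
                               void*, char**);
        // Look up the next definition of the symbol: the real SQLite one.
        static real_t real = (real_t)dlsym(RTLD_NEXT, "sqlite3_exec");

        auto start = std::chrono::steady_clock::now();
        int rc = real(db, sql, cb, arg, errmsg);          // original call
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                      std::chrono::steady_clock::now() - start).count();

        // One line per query; post-process the log for count/min/max/average.
        if (std::FILE* log = std::fopen("/tmp/query_times.log", "a")) {
            std::fprintf(log, "%lld us | %s\n", (long long)us, sql);
            std::fclose(log);
        }
        return rc;
    }

Note that this only works if SQLite is dynamically linked and the program actually goes through sqlite3_exec; statically linked binaries, or code built on sqlite3_prepare/sqlite3_step, would need different interception points.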
Answer 6:
"Here, obviously, I want some sort of cross-application integration, probably using pipes."
I don't think that's obvious at all.
If you have access to the application, I'd suggest dumping all the necessary information to a log file and processing that log file later on.
If you want to be able to activate and deactivate this behavior on-the-fly, without re-starting the service, you could use a logging library that supports enabling/disabling log channels on-the-fly.
Then you'd only need to send a message to the service by whatever means (socket connection, ...) to enable/disable logging.
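As a sketch of that idea, using a POSIX signal instead of a socket to keep it short (the flag and handler names are mine, and the log line is a placeholder):

    #include <atomic>
    #include <chrono>
    #include <csignal>
    #include <cstdio>
    #include <thread>

    std::atomic<bool> g_logging{false};          // logging disabled by default

    extern "C" void on_usr1(int) {               // handler only flips a flag
        g_logging.store(!g_logging.load());
    }

    int main() {
        std::signal(SIGUSR1, on_usr1);           // `kill -USR1 <pid>` toggles it
        for (;;) {                               // stand-in for the service loop
            // ... run a query here ...
            if (g_logging.load())
                std::fprintf(stderr, "query executed\n");  // placeholder log line
            std::this_thread::sleep_for(std::chrono::seconds(1));
        }
    }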
If you don't have access to the application, then I think the best way would be what MacGucky suggested: let the profiling/monitoring tools of the DBMS do it. E.g. MS-SQL has a nice profiler that can capture requests to the server, including all kinds of useful data (CPU time for each request, IO time, wait time etc.).
And if it's really SQLite (plus you don't have access to the source), then your chances are rather low. If the program in question uses SQLite as a DLL, then you could substitute your own version of SQLite, modified to write the necessary log files.
Answer 7:
Use Apache JMeter to test the performance of your SQL queries under high load.
Answer 8:
You need an aspect-oriented solution.
Check out AspectC++.
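For instance, a hedged sketch of what such an aspect might look like in AspectC++ (runQuery is a hypothetical name for the function being measured; the aspect is woven in with the ac++ compiler, not plain g++):

    // query_timer.ah -- an AspectC++ aspect
    #include <chrono>
    #include <cstdio>

    aspect QueryTimer {
        // match every call to the (hypothetical) query function
        advice call("% runQuery(...)") : around() {
            auto start = std::chrono::steady_clock::now();
            tjp->proceed();                      // execute the original call
            auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                          std::chrono::steady_clock::now() - start).count();
            std::printf("%s: %lld us\n", JoinPoint::signature(), (long long)us);
        }
    };

The appeal here is that the timing code stays out of the application logic entirely; widening the pointcut expression lets you cover more functions without touching them.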