How can I profile C++ code running on Linux?

2020-01-22 10:19发布

I have a C++ application, running on Linux, which I'm in the process of optimizing. How can I pinpoint which areas of my code are running slowly?

17条回答
Luminary・发光体
2楼-- · 2020-01-22 10:56

Use -pg flag when compiling and linking the code and run the executable file. While this program is executed, profiling data is collected in the file a.out.
There is two different type of profiling

1- Flat profiling:
by running the command gprog --flat-profile a.out you got the following data
- what percentage of the overall time was spent for the function,
- how many seconds were spent in a function—including and excluding calls to sub-functions,
- the number of calls,
- the average time per call.

2- graph profiling
us the command gprof --graph a.out to get the following data for each function which includes
- In each section, one function is marked with an index number.
- Above function , there is a list of functions that call the function .
- Below function , there is a list of functions that are called by the function .

To get more info you can look in https://sourceware.org/binutils/docs-2.32/gprof/

查看更多
可以哭但决不认输i
3楼-- · 2020-01-22 10:57

As no one mentioned Arm MAP, I'd add it as personally I have successfully used Map to profile a C++ scientific program.

Arm MAP is the profiler for parallel, multithreaded or single threaded C, C++, Fortran and F90 codes. It provides in-depth analysis and bottleneck pinpointing to the source line. Unlike most profilers, it's designed to be able to profile pthreads, OpenMP or MPI for parallel and threaded code.

MAP is commercial software.

查看更多
一夜七次
4楼-- · 2020-01-22 10:58

Newer kernels (e.g. the latest Ubuntu kernels) come with the new 'perf' tools (apt-get install linux-tools) AKA perf_events.

These come with classic sampling profilers (man-page) as well as the awesome timechart!

The important thing is that these tools can be system profiling and not just process profiling - they can show the interaction between threads, processes and the kernel and let you understand the scheduling and I/O dependencies between processes.

Alt text

查看更多
啃猪蹄的小仙女
5楼-- · 2020-01-22 11:02

For single-threaded programs you can use igprof, The Ignominous Profiler: https://igprof.org/ .

It is a sampling profiler, along the lines of the... long... answer by Mike Dunlavey, which will gift wrap the results in a browsable call stack tree, annotated with the time or memory spent in each function, either cumulative or per-function.

查看更多
贼婆χ
6楼-- · 2020-01-22 11:02

Actually a bit surprised not many mentioned about google/benchmark , while it is a bit cumbersome to pin the specific area of code, specially if the code base is a little big one, however I found this really helpful when used in combination with callgrind

IMHO identifying the piece that is causing bottleneck is the key here. I'd however try and answer the following questions first and choose tool based on that

  1. is my algorithm correct ?
  2. are there locks that are proving to be bottle necks ?
  3. is there a specific section of code that's proving to be a culprit ?
  4. how about IO, handled and optimized ?

valgrind with the combination of callrind and kcachegrind should provide a decent estimation on the points above and once it's established that there are issues with some section of code, I'd suggest do a micro bench mark google benchmark is a good place to start.

查看更多
登录 后发表回答