Software profiling is a form of dynamic program analysis that measures, for example:
- the space or time complexity of a program
- the usage of particular instructions
- the frequency and duration of function calls, ...
@copyright wikipedia
@copyright highscalability
@copyright macifcourseaularge
You need measurements to continuously improve your application's performance.
You want to understand what is consuming your CPU.
You want to understand what your CPUs are doing.
/* app.c: two trivial functions, so that both appear in the symbol table */
int main(void)
{
    return 0;
}
int func1(void) {
    return 0;
}
Use gcc to compile it
gcc app.c -o app
readelf - Displays information about ELF files
readelf -s app
45: 0000000000400580 2 FUNC GLOBAL DEFAULT 13 __libc_csu_fini
46: 00000000004004f8 11 FUNC GLOBAL DEFAULT 13 func1
...
57: 0000000000601040 0 NOTYPE GLOBAL DEFAULT 25 _end
58: 0000000000400400 0 FUNC GLOBAL DEFAULT 13 _start
59: 0000000000601038 0 NOTYPE GLOBAL DEFAULT 25 __bss_start
60: 00000000004004ed 11 FUNC GLOBAL DEFAULT 13 main
...
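Profilers typically rely on this symbol table to turn raw addresses back into function names. As a minimal sketch, assuming the `readelf -s` column layout shown above (Num, Value, Size, Type, Bind, Vis, Ndx, Name), the defined functions can be filtered out with awk. The helper name `list_funcs` is hypothetical.

```shell
# list_funcs: read `readelf -s` output on stdin and print "address size name"
# for every function symbol defined in the binary itself
# (Type is FUNC and the section index is not UND).
list_funcs() {
    awk '$4 == "FUNC" && $7 != "UND" { print $2, $3, $8 }'
}

# Usage (assumes the binary built above is named app):
# readelf -s app | list_funcs
```

Symbols marked UND are resolved at run time from shared libraries, so they carry no address in this file and are skipped.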
Generally, we can list 3 types of profilers:
Gprof, Callgrind, ...
You can instrument your execution with callgrind and explore the results in kcachegrind.
Perf, Oprofile, Intel Vtune, ...
To understand how simple a sampling profiler is, write your own thread dump using gdb.
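# gstack PID: print a backtrace of every thread of a running process
gstack() {
    tmp=$(mktemp)    # mktemp replaces the deprecated, Debian-only tempfile
    echo thread apply all bt >"$tmp"
    gdb -batch -nx -q -x "$tmp" -p "$1"
    rm -f "$tmp"
}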
Run it at a regular interval to see where your program is spending its time
while sleep 1; do gstack @pid@ ; done
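To turn the repeated dumps into a rough profile, count which function sits at the top of the stack (frame #0) in each sample. This is a minimal sketch, assuming gdb's usual backtrace layout (`#0  0x... in func (...)` or `#0  func (...)`). The helper name `top_frames` is hypothetical.

```shell
# top_frames: read gdb backtraces on stdin and print "count function"
# for the innermost frame (#0) of every sampled thread, most frequent first.
top_frames() {
    awk '$1 == "#0" {
        # gdb prints "#0  0xADDR in func (...)" or, when the address is
        # omitted, "#0  func (...)".
        name = ($3 == "in") ? $4 : $2
        count[name]++
    }
    END { for (n in count) print count[n], n }' | sort -rn
}

# Usage: take 5 samples, one second apart, then summarize
# for i in 1 2 3 4 5; do gstack @pid@; sleep 1; done | top_frames
```

Functions that show up most often in frame #0 are where the threads were most often found, which is the whole idea behind a sampling profiler.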
We don't have any time recorded for mat_new, even though it is called 3 times.
git clone https://github.com/brendangregg/FlameGraph.git
cd FlameGraph
sudo ln -s $PWD/flamegraph.pl /usr/bin/flamegraph.pl
sudo ln -s $PWD/stackcollapse-perf.pl /usr/bin/stackcollapse-perf.pl
sudo ln -s $PWD/stackcollapse-jstack.pl /usr/bin/stackcollapse-jstack.pl
sudo ln -s $PWD/stackcollapse-gdb.pl /usr/bin/stackcollapse-gdb.pl
git clone https://github.com/memcached/memcached.git
cd memcached
./autogen.sh && ./configure && make
readelf -s ./memcached
...
434: 000000000040edf0 10 FUNC GLOBAL DEFAULT 13 slabs_rebalancer_resume
435: 0000000000000000 0 FUNC GLOBAL DEFAULT UND setuid@@GLIBC_2.2.5
436: 0000000000000000 0 FUNC GLOBAL DEFAULT UND event_base_loop
437: 0000000000412fd0 315 FUNC GLOBAL DEFAULT 13 pause_threads
438: 00000000004135e0 10 FUNC GLOBAL DEFAULT 13 STATS_LOCK
439: 0000000000000000 0 FUNC GLOBAL DEFAULT UND getaddrinfo@@GLIBC_2.2.5
440: 0000000000000000 0 FUNC GLOBAL DEFAULT UND strerror@@GLIBC_2.2.5
441: 000000000040f550 201 FUNC GLOBAL DEFAULT 13 do_item_unlink
442: 0000000000000000 0 FUNC GLOBAL DEFAULT UND event_init
443: 0000000000000000 0 FUNC GLOBAL DEFAULT UND sleep@@GLIBC_2.2.5
444: 0000000000412b40 247 FUNC GLOBAL DEFAULT 13 assoc_delete
...
Understand what happens internally by following the execution trace.
valgrind --tool=callgrind --instr-atstart=no ./memcached
In another terminal
callgrind_control -i on
php memcache-set.php
callgrind_control -i off
kcachegrind callgrind.out.@pid@
./memcached &
while sleep 0.1; do gstack @pid@; done > stack.txt
cat stack.txt | stackcollapse-gdb.pl | flamegraph.pl > gdb_graph.svg
In another terminal
php memcache-set.php
We capture events to build the call graph
perf record -g ./memcached
In another terminal
php memcache-set.php
To show an interactive report
perf report
perf report --stdio
perf script | stackcollapse-perf.pl | flamegraph.pl > graph_stack_missing.svg
Flamegraph
Some information from the kernel is missing.
./memcached &
sudo perf record -a -g -p @pid@
In another terminal
php memcache-set.php
Generate the flamegraph
perf script | stackcollapse-perf.pl | flamegraph.pl > graph.svg
Flamegraph
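The stackcollapse scripts emit "folded" stacks, the text format flamegraph.pl reads: one line per unique stack, frames joined by `;`, then a space and a sample count. That text can be inspected before rendering. The sketch below sums the samples per innermost function; the helper name `leaf_totals` is hypothetical.

```shell
# leaf_totals: read folded stacks ("a;b;c 12") on stdin and print
# "samples function" for the last frame of each stack, largest first.
leaf_totals() {
    awk '{
        n = $NF                      # trailing sample count
        sub(/ [0-9]+$/, "")          # drop the count, keep the stack
        k = split($0, frame, ";")    # frames are separated by ";"
        total[frame[k]] += n
    }
    END { for (name in total) print total[name], name }' | sort -rn
}

# Usage:
# perf script | stackcollapse-perf.pl | leaf_totals | head
```

This gives the same information as the widest boxes at the top of the flamegraph, in a form that is easy to grep or diff between runs.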
./memcached &
sudo perf record -e branch-misses -a -g -p @pid@
sudo perf record -a -g
You can export a flamegraph from jstack output
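One possible way to do it, as a sketch: take several thread dumps in a row and feed them to the stackcollapse-jstack.pl script linked earlier. This assumes `jstack` from the JDK is on the PATH. The helper name `sample_jstack` and the output file name are hypothetical.

```shell
# sample_jstack PID COUNT: write COUNT thread dumps of a running JVM
# to stdout, roughly 0.1 s apart.
sample_jstack() {
    pid=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        jstack "$pid"
        sleep 0.1
        i=$((i + 1))
    done
}

# Usage (@pid@ is the JVM process id):
# sample_jstack @pid@ 100 | stackcollapse-jstack.pl | flamegraph.pl > jstack_graph.svg
```

More samples give a smoother picture, at the cost of pausing the JVM briefly for each dump.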
Logstash contention flamegraph
Prefer: