Commit bba7151

anakryiko authored and acmel committed
dwarf_loader: increase the size of lookup hash map
One of the primary use cases for pahole is BTF deduplication during the
Linux kernel build. That means DWARF containing more than 5 million types
is loaded, so a hash map with a small number of buckets is quite expensive
due to hash collisions. This patch bumps the size of the hash map and
reduces the overhead of this part of the DWARF loading process. This shaves
off about 1 second out of about 20 seconds total for Linux BTF dedup.

Committer testing:

Before:

  $ perf stat -r5 pahole -J vmlinux

   Performance counter stats for 'pahole -J vmlinux' (5 runs):

            8,953.80 msec task-clock:u              #    0.998 CPUs utilized            ( +-  0.09% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             353,855      page-faults:u             #    0.040 M/sec                    ( +-  0.00% )
      35,775,730,539      cycles:u                  #    3.996 GHz                      ( +-  0.07% )  (83.33%)
         579,534,836      stalled-cycles-frontend:u #    1.62% frontend cycles idle     ( +-  2.21% )  (83.33%)
       5,719,840,144      stalled-cycles-backend:u  #   15.99% backend cycles idle      ( +-  0.93% )  (83.33%)
      73,035,744,786      instructions:u            #    2.04  insn per cycle
                                                    #    0.08  stalled cycles per insn  ( +-  0.02% )  (83.34%)
      16,798,017,844      branches:u                # 1876.077 M/sec                    ( +-  0.05% )  (83.33%)
         237,777,143      branch-misses:u           #    1.42% of all branches          ( +-  0.15% )  (83.34%)

             8.97077 +- 0.00803 seconds time elapsed  ( +-  0.09% )

  $

After:

  $ perf stat -r5 pahole -J vmlinux

   Performance counter stats for 'pahole -J vmlinux' (5 runs):

            8,735.92 msec task-clock:u              #    0.998 CPUs utilized            ( +-  0.34% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             353,978      page-faults:u             #    0.041 M/sec                    ( +-  0.00% )
      34,722,167,335      cycles:u                  #    3.975 GHz                      ( +-  0.12% )  (83.33%)
         555,981,118      stalled-cycles-frontend:u #    1.60% frontend cycles idle     ( +-  1.53% )  (83.33%)
       5,215,370,531      stalled-cycles-backend:u  #   15.02% backend cycles idle      ( +-  1.31% )  (83.33%)
      72,615,773,119      instructions:u            #    2.09  insn per cycle
                                                    #    0.07  stalled cycles per insn  ( +-  0.02% )  (83.34%)
      16,624,959,121      branches:u                # 1903.057 M/sec                    ( +-  0.01% )  (83.33%)
         229,962,327      branch-misses:u           #    1.38% of all branches          ( +-  0.07% )  (83.33%)

              8.7503 +- 0.0301 seconds time elapsed  ( +-  0.34% )

  $

2.94% less cycles, good :-)

Signed-off-by: Andrii Nakryiko <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
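
To make the collision cost concrete, here is a minimal sketch of the load-factor arithmetic behind the change. It assumes a perfectly uniform hash and a single table holding all entries; the helper names `bucket_count` and `avg_chain_length` are hypothetical and do not appear in pahole.

```c
#include <stdint.h>

/* Number of buckets for a table sized by a bit count,
 * mirroring the (1UL << HASHTAGS__BITS) definition. */
static uint64_t bucket_count(unsigned bits)
{
	return UINT64_C(1) << bits;
}

/* Average number of entries per bucket, rounded down.
 * Assumes a uniform hash, which real data only approximates. */
static uint64_t avg_chain_length(uint64_t entries, unsigned bits)
{
	return entries / bucket_count(bits);
}
```

Under those assumptions, 5 million entries spread over 2^8 = 256 buckets average about 19,531 entries per bucket, while 2^15 = 32,768 buckets bring that down to about 152. The real tables may be split per compile unit or per tag kind, so the absolute numbers are only indicative; the 128x ratio is what matters.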
1 parent 2e719cc commit bba7151

1 file changed: +1 -1 lines changed

dwarf_loader.c

Lines changed: 1 addition & 1 deletion
@@ -93,7 +93,7 @@ static void dwarf_tag__set_spec(struct dwarf_tag *dtag, dwarf_off_ref spec)
 	*(dwarf_off_ref *)(dtag + 1) = spec;
 }

-#define HASHTAGS__BITS 8
+#define HASHTAGS__BITS 15
 #define HASHTAGS__SIZE (1UL << HASHTAGS__BITS)

 #define obstack_chunk_alloc malloc
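
The effect of the bit count on chain length can also be observed with a toy experiment. The sketch below is not pahole code: `toy_hash` is a hypothetical multiplicative hash in the style of the Linux kernel's `hash_64()`, the synthetic keys merely stand in for DWARF offsets, and `longest_chain` only counts bucket occupancy instead of building real chains.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical multiplicative hash: keep the top `bits` bits of the
 * product. Valid for bits in the range 1..32. */
static uint32_t toy_hash(uint64_t key, unsigned bits)
{
	return (uint32_t)((key * UINT64_C(0x61C8864680B583EB)) >> (64 - bits));
}

/* Hash `nkeys` synthetic keys into 2^bits buckets and return the
 * occupancy of the fullest bucket, or 0 if allocation fails. */
static uint32_t longest_chain(uint64_t nkeys, unsigned bits)
{
	size_t nbuckets = (size_t)1 << bits;
	uint32_t *counts = calloc(nbuckets, sizeof(*counts));
	uint32_t max = 0;

	if (!counts)
		return 0;

	for (uint64_t i = 0; i < nkeys; i++) {
		/* i * 16 is an arbitrary stand-in for a DWARF offset. */
		uint32_t bucket = toy_hash(i * 16, bits);

		if (++counts[bucket] > max)
			max = counts[bucket];
	}

	free(counts);
	return max;
}
```

With one million keys and 8 bits, the pigeonhole principle alone guarantees that the fullest bucket holds at least 3,907 entries, so every lookup in that bucket walks a very long chain. Calling `longest_chain(1000000, 15)` on the same keys should return a far smaller value, which is the effect the one-line change relies on.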
