Tweak ThinLTO inlining heuristics in absence of PGO profile

We previously disabled inlining and unrolling completely during ThinLTO
in the absence of a PGO profile. For global ThinLTO, we want to better
balance binary size and performance.

We evaluated a number of combinations of heuristics with the global
ThinLTO configuration:
                                binary size change
  no LTO                          baseline
  no inline, no unroll            -0.54%
  no inline, unroll               -0.50%
  import-instr-limit=5, unroll    +0.02%
  import-instr-limit=10, unroll   +0.13%

Loop unrolling does not contribute much to the binary size, so it is
re-enabled.

import-instr-limit=5 balances the binary size savings from ThinLTO
against the size increase due to additional optimisation.

Bug: 78485207
Bug: 169004486
Test: TreeHugger
Change-Id: I1c21153605e2ae42daa8857b06e27c081ee8ad85
Yi Kong 2020-09-23 00:54:50 +08:00
parent c5a0e64d82
commit 2f5f16d574
1 changed files with 3 additions and 4 deletions


@@ -117,12 +117,11 @@ func (lto *lto) flags(ctx BaseModuleContext, flags Flags) Flags {
 			flags.Local.LdFlags = append(flags.Local.LdFlags, cachePolicyFormat+policy)
 		}
-		// If the module does not have a profile, be conservative and do not inline
-		// or unroll loops during LTO, in order to prevent significant size bloat.
+		// If the module does not have a profile, be conservative and limit cross TU inline
+		// limit to 5 LLVM IR instructions, to balance binary size increase and performance.
 		if !ctx.isPgoCompile() {
 			flags.Local.LdFlags = append(flags.Local.LdFlags,
-				"-Wl,-plugin-opt,-inline-threshold=0",
-				"-Wl,-plugin-opt,-unroll-threshold=0")
+				"-Wl,-plugin-opt,-import-instr-limit=5")
 		}
 	}
 	return flags
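
The behaviour after this change can be sketched in isolation as follows. This is a minimal illustration, not the Soong implementation: thinLTOLdFlags and its hasProfile parameter are hypothetical stand-ins for the relevant part of lto.flags and for ctx.isPgoCompile(), and the starting "-flto=thin" flag is assumed only for the demo.

```go
package main

import "fmt"

// thinLTOLdFlags is a hypothetical helper that mirrors the branch changed in
// this commit. hasProfile stands in for ctx.isPgoCompile().
func thinLTOLdFlags(ldFlags []string, hasProfile bool) []string {
	if !hasProfile {
		// Without a profile, limit cross-TU function importing to functions
		// of at most 5 LLVM IR instructions. The inline and unroll thresholds
		// are no longer forced to 0.
		ldFlags = append(ldFlags, "-Wl,-plugin-opt,-import-instr-limit=5")
	}
	return ldFlags
}

func main() {
	// Assumed starting flags, for illustration only.
	fmt.Println(thinLTOLdFlags([]string{"-flto=thin"}, false))
	fmt.Println(thinLTOLdFlags([]string{"-flto=thin"}, true))
}
```

With a profile present, the flag list is returned unchanged, so PGO builds keep the default import heuristics.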