Tweak ThinLTO inlining heuristics in absence of PGO profile

We previously disabled inlining and unrolling completely during ThinLTO in the absence of a PGO profile. For global ThinLTO, we want to better balance binary size and performance.

We evaluated a number of combinations of heuristics with the global ThinLTO configuration:

                                   binary size change
    no LTO                         baseline
    no inline, no unroll           -0.54%
    no inline, unroll              -0.50%
    import-instr-limit=5, unroll   +0.02%
    import-instr-limit=10, unroll  +0.13%

Loop unrolling does not contribute much to the binary size, so it is re-enabled. import-instr-limit=5 balances the binary size savings from ThinLTO against the size increase due to additional optimisation.

Bug: 78485207
Bug: 169004486
Test: TreeHugger
Change-Id: I1c21153605e2ae42daa8857b06e27c081ee8ad85
2020-09-23 00:54:50 +08:00 · 2020-09-23 00:54:50 +08:00 · 2f5f16d574
parent c5a0e64d82
commit 2f5f16d574
1 changed file with 3 additions and 4 deletions
--- a/cc/lto.go
+++ b/cc/lto.go
@@ -117,12 +117,11 @@ func (lto *lto) flags(ctx BaseModuleContext, flags Flags) Flags {
 			flags.Local.LdFlags = append(flags.Local.LdFlags, cachePolicyFormat+policy)
 		}

-		// If the module does not have a profile, be conservative and do not inline
-		// or unroll loops during LTO, in order to prevent significant size bloat.
+		// If the module does not have a profile, be conservative and limit cross-TU
+		// inlining to 5 LLVM IR instructions, to balance binary size increase and performance.
 		if !ctx.isPgoCompile() {
 			flags.Local.LdFlags = append(flags.Local.LdFlags,
-				"-Wl,-plugin-opt,-inline-threshold=0",
-				"-Wl,-plugin-opt,-unroll-threshold=0")
+				"-Wl,-plugin-opt,-import-instr-limit=5")
 		}
 	}
 	return flags
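
The effect of the hunk above can be sketched in isolation as follows. This is a minimal, standalone illustration, not the Soong code itself: the helper name ltoImportLimitFlags and the hasPgoProfile parameter are hypothetical stand-ins for lto.flags and ctx.isPgoCompile(), and the initial "-flto=thin" flag is only assumed for the demonstration. The one detail taken from the change is the appended linker flag.

```go
package main

import "fmt"

// ltoImportLimitFlags is a hypothetical helper that mirrors the new branch in
// the diff: when the module has no PGO profile, append the ThinLTO import
// limit used by this change. It is not part of Soong.
func ltoImportLimitFlags(hasPgoProfile bool, ldFlags []string) []string {
	if !hasPgoProfile {
		// Flag string copied from the change; it caps cross-TU importing
		// (and therefore cross-TU inlining) at 5 LLVM IR instructions.
		ldFlags = append(ldFlags, "-Wl,-plugin-opt,-import-instr-limit=5")
	}
	return ldFlags
}

func main() {
	// "-flto=thin" is an assumed starting flag, used only for illustration.
	fmt.Println(ltoImportLimitFlags(false, []string{"-flto=thin"}))
	fmt.Println(ltoImportLimitFlags(true, []string{"-flto=thin"}))
}
```

With a profile present the flag list is returned unchanged, so the default ThinLTO import and inlining thresholds apply; without one, only the import limit is added, and the earlier inline-threshold and unroll-threshold overrides are no longer passed.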