PGO in Go: Optimizing AOT Code Like JIT
AOT-compiled languages like Go or C++ build code without knowledge of runtime usage patterns—unlike JIT-based languages (Java, C#, JavaScript), which analyze and optimize on the fly. At first glance, this seems like a disadvantage for Go: just a static binary, no dynamic tweaks.
Enter Profile-Guided Optimization (PGO). Typically associated with JIT compilers, PGO works surprisingly well for AOT, too. Go has supported it since v1.20 (as a preview; generally available since v1.21 — see the docs), enabling usage-tuned builds without runtime overhead.
How PGO Works in Go
Unlike VM languages with automatic profiling, PGO in Go requires manual setup:
- Build a profiling binary: Build a normal (non-PGO) binary with profiling enabled (e.g., via `runtime/pprof`), then collect CPU data under real-world workloads. Key: profiles must be representative. For multiple scenarios (e.g., APIs vs. batch jobs), consider separate optimized builds.
- Feed the profile to the compiler: Name the profile `default.pgo` and place it in the main package directory (or pass `-pgo=<file>` to `go build`). The profile guides optimizations like:
  - Inlining hot functions
  - Stack-allocating variables that would otherwise go to the heap
  - Branch-prediction tweaks
- Iterate: The optimized binary can generate new profiles for further refinement.
The Payoff
No magic, but a measurable 2–14% speed boost (per the Go team). The kicker? Zero runtime cost—unlike JIT.
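Before measuring that boost, it's worth confirming a binary was actually built with a profile. A sketch, assuming Go 1.21+, which records the profile path under the `-pgo` key in the embedded build info:

```go
package main

import (
	"fmt"
	"runtime/debug"
)

func main() {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		fmt.Println("no build info (PGO status unknown)")
		return
	}
	// A PGO build records the profile path under the "-pgo" setting key.
	for _, s := range info.Settings {
		if s.Key == "-pgo" {
			fmt.Println("built with PGO profile:", s.Value)
			return
		}
	}
	fmt.Println("built without PGO")
}
```

`go version -m ./app` prints the same embedded build settings from the command line.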
When to Use PGO
- Long-running services (e.g., HTTP servers) benefit most (stable usage patterns).
- Short-lived tools? Rarely worth the setup.
→ Try it: Tinker with the Go PGO docs on a CPU-heavy project.