Show HN: Perforator – cluster-wide profiling tool for large data centers

github.com

78 points by BigRedEye a year ago · 15 comments · 2 min read

Hey HN! We are happy to share Perforator – our internal cluster-wide profiler with great support for native languages and a built-in AutoFDO pipeline to simplify sPGO builds. Perforator allows you to profile most binaries without having to recompile or adjust the build process. We use it at Yandex to profile every pod inside a large cluster at a modest sampling rate (99 Hz), collecting petabytes of profiles every day.

There's a blog post about it at https://medium.com/yandex/yandexs-high-performance-profiler-....

Inspired by Google-Wide Profiling, we started continuous profiling years ago with simple tools like poormansprofiler.org. With the rise of eBPF, we came up with a simple and elegant solution providing detailed profiles without noticeable overhead. Pretty wild when you can see the guts of your production binaries in a flamegraph without them even noticing.

Some technical details:

- Our main contribution is infrastructure for continuous PGO using AutoFDO. Google and Meta have done tremendous work on building PGO infrastructure, and we built the last missing piece of the puzzle to make this work well at scale.

- Native binaries are profiled through eh_frame analysis; interpreted/JIT-compiled languages are profiled through perf-pid.map or hardcoded structure offsets.

- We render profiles in multiple ways; the most common one is a fast implementation of FlameGraphs that renders 1M frames in 100ms.

- We provide Helm charts to easily deploy Perforator on your k8s cluster.

- You can use Perforator in standalone mode as a replacement for perf record.
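To illustrate the perf-pid.map point above, here is a minimal sketch of how a JIT symbol map can be used to resolve sampled addresses. It assumes the common perf map convention (a text file, usually /tmp/perf-<pid>.map, with one `START SIZE name` entry per line, where START and SIZE are hexadecimal). The names `parse_perf_map` and `symbolize` are hypothetical and are not taken from Perforator's code.

```python
import bisect
from typing import List, NamedTuple, Optional


class MapEntry(NamedTuple):
    start: int  # first address of the JIT-compiled function
    size: int   # length of the function's code in bytes
    name: str   # symbol name (may contain spaces)


def parse_perf_map(text: str) -> List[MapEntry]:
    """Parse perf map text into entries sorted by start address."""
    entries = []
    for line in text.splitlines():
        # Split into at most three fields so spaces in the name survive.
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start_hex, size_hex, name = parts
        try:
            entries.append(MapEntry(int(start_hex, 16), int(size_hex, 16), name))
        except ValueError:
            continue  # skip lines whose address fields are not hex
    entries.sort(key=lambda e: e.start)
    return entries


def symbolize(entries: List[MapEntry], addr: int) -> Optional[str]:
    """Return the name of the entry covering addr, or None if unmapped."""
    starts = [e.start for e in entries]
    # Find the last entry whose start is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    entry = entries[i]
    return entry.name if addr < entry.start + entry.size else None


if __name__ == "__main__":
    sample = "7f00 20 jit_foo\n7f40 10 jit bar with spaces\n"
    table = parse_perf_map(sample)
    print(symbolize(table, 0x7F10))
    print(symbolize(table, 0x7F30))
```

A real agent would reread the map as the JIT emits new code and would handle overlapping or stale entries; this sketch ignores both.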

I'd love to answer your questions about the tool!

znpy a year ago

I just learned about poormansprofiler (https://poormansprofiler.org/): it's brilliant in its simplicity.

brancz a year ago

If I'm understanding correctly, this is collecting LBR data through hardware support for PGO/AutoFDO, right?

  • dang a year ago

    (These are older comments that we merged from https://news.ycombinator.com/item?id=42888185, in case anyone was confused by the timestamps)

  • BigRedEye (OP) a year ago

    Yes. Although we are studying CSSPGO, which uses a mixed (LBR + software-sampled stacks) approach.

    • brancz a year ago

      I'm familiar with the paper, but it doesn't improve the situation in terms of LBR availability on cloud providers, does it?

      • BigRedEye (OP) a year ago

        Yes, existing limitations apply. Without hardware LBR support, we cannot provide sPGO profiles. However, the basic profiling should work fine.

        • menaerus a year ago

          Blog is packed with information, thanks!

          Isn't it the case that from stack traces it is rather impossible to read that function foo() is burning CPU cycles because it is memory-bound? And the reason could be rather somewhere else and not in that particular function - e.g. multiple other threads creating contention on the memory bus?

          If so, doesn't this make the profile somewhat an invalid candidate for PGO?

          • BigRedEye (OP) a year ago

            It depends on the event that was sampled to generate the profiles. For example, if you sample instructions by collecting a stack trace every N instructions, you won't actually see foo() burning the CPU. However, if you look at CPU cycles, foo() will be very noticeable. Internally, we use sPGO profiles from sampling CPU cycles, not instructions.

            • menaerus a year ago

              Right, perhaps I was a bit too vague. What I was trying to say is that by merely sampling CPU cycles we cannot infer that foo() was burning CPU because it was memory-bound, which in itself is not an artifact of foo()'s implementation but rather of threads across the application that happen to saturate the memory bus more quickly.

              Or is my doubt incorrect?

be-hase a year ago

I'm curious about the differences from Pyroscope. https://github.com/grafana/pyroscope

  • BigRedEye (OP) a year ago

    Great question! Perforator indeed looks similar to Pyroscope. However, we think that the closest existing solutions are https://parca.dev, the closed-source Google-Wide Profiling, and, speaking of the agent, the beautiful OpenTelemetry eBPF profiler. The main technical differences from Pyroscope that we see are:

    - Pyroscope's Java support is superior as of now because Pyroscope offloads it to the amazing async-profiler.

    - Pyroscope expects native binaries to be compiled with frame pointers: https://grafana.com/docs/pyroscope/latest/configure-client/g.... This is often not the case, and that's the problem we've tried to solve with Perforator. Perforator uses .eh_frame, which is nearly universal and does not impose additional requirements on compiled binaries.

    - Pyroscope symbolizes using symtab: https://grafana.com/docs/pyroscope/latest/configure-client/g.... We use DWARF/GSYM to get as correct and verbose stacks as possible (we benchmark our stacks against stacks from gdb).

    - Pyroscope symbolizes profiles on the agent, while Perforator symbolizes profiles offline, greatly reducing symbolization costs and the agent's overhead. It seems Pyroscope is heading toward the same architecture we use: https://github.com/grafana/pyroscope/pull/3799.

    - Perforator can be (and should be!) run as a standalone replacement for perf record.

    - Perforator supports sPGO profiles.

    In summary, we try to implement native profiling almost perfectly. It's worth noting that Pyroscope is a mature, well-established product that integrates excellently with the Grafana ecosystem. We have just focused on different things: our focus has been on optimizing native code profiling and making it as accurate and low-overhead as possible.

eoranged a year ago

Any plans for Grafana integration? It would be great to be able to match performance metrics with other app indicators.

KingOfCoders a year ago

    You don't have to wait for later, here's a new eliminator
    Ask your local weapon trader for the superperforator
