- rel:: [[JVM]] [[Java Flight Recorder|JFR]] [[Performance|performance]]

# async-profiler

## Reference

- [Profiling Java Applications With Async Profiler](https://hackernoon.com/profiling-java-applications-with-async-profiler-049s2790)
- [Async-profiler - manual by use cases](https://krzysztofslusarski.github.io/2022/12/12/async-manual.html) ([devonthink](x-devonthink-item://AA618EE0-70BB-4964-96E4-B5E5331DBC15))

### Safepoint Bias

- [Why Most Sampling Profilers Are Terrible](x-devonthink-item://E03E154D-9E31-47D4-87BC-E70CF46EB892)
- [The Pros and Cons of AsyncGetCallTrace Profilers](x-devonthink-item://497D85EF-9018-4C42-936F-F600BC981A67)

## Log

- Researching `timer_create` and `timer_settime` as an improvement over `itimer`
    - https://github.com/golang/go/issues/35057
    - https://elinux.org/Kernel_Timer_Systems
    - http://hpctoolkit.org/man/hpcrun.html#section_15
    - [profiling improvements in go 1.18](https://felixge.de/2022/02/11/profiling-improvements-in-go-1.18/) by [[Felix Geisendörfer]] ^6a4959
        - limitations of `setitimer` sampling
            - *Only a single signal can be pending delivery at a time. If another signal of the same kind is generated while one is already pending, the new signal gets dropped.*
            - in a multi-threaded process on multiple cores, one signal per core can be generated at the same time, causing all but one to be dropped
        - [proftest](https://github.com/felixge/proftest) to observe the signal generation and delivery behavior of `setitimer(2)` and `timer_create(2)`
        - `setitimer(2)`’s process-directed signals are not fairly distributed among CPU-consuming threads. Instead [one thread tends to get screwed](https://twitter.com/felixge/status/1400356284285792262) and receives less signals than all other threads.
        - the good news is that, except for jiffy resolution, `timer_create(2)` doesn’t appear to suffer from any of the ailments plaguing `setitimer(2)`.
          Thanks to per-thread signal accounting, it reliably delivers the right amount of signals and shows no bias towards any particular threads. The only issue is that `timer_create(2)` requires the CPU profiler to be aware of all threads.
    - if we use a hybrid approach (`setitimer` for non-JVM-registered threads such as GC and compiler threads, `timer_create`/`timer_settime` for JVM-registered threads), the profiler will need to distinguish between the two kinds of thread
        - [GetAllThreads](https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#GetAllThreads) returns the threads currently attached to the VM, and the [ThreadStart](https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#ThreadStart) and [ThreadEnd](https://docs.oracle.com/en/java/javase/11/docs/specs/jvmti.html#ThreadEnd) event callbacks are invoked on the thread that is starting or ending. We can use these to maintain a list of known JVM thread IDs. The list will need to be thread-safe.
        - the `setitimer` handler would also run on threads that already have a `timer_create`/`timer_settime` timer, and would need to handle that case appropriately
        - when the `setitimer` handler is called for a thread that isn't in the list of JVM thread IDs, we may want to sample as we do today, or figure out a way to make `timer_create`/`timer_settime` work for that case as well
    - testing under load is necessary to trigger the abnormal behavior we saw
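
A minimal sketch of the per-thread timer idea from the log above, assuming Linux with glibc. It arms a `timer_create(2)` timer on the calling thread's CPU clock and directs `SIGPROF` at that thread only via `SIGEV_THREAD_ID`. The function names (`install_sample_handler`, `start_thread_cpu_timer`, `sample_count`) are hypothetical, and the choice of `SIGPROF` and `CLOCK_THREAD_CPUTIME_ID` is an assumption for illustration, not taken from async-profiler's source. The handler only counts signals; a real profiler would capture a stack trace there.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Some libc headers expose the target-thread field only under its internal name. */
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

static atomic_int g_samples; /* number of SIGPROF signals handled so far */

/* Signal handler: only counts. Stack walking would go here in a real profiler. */
static void on_sigprof(int signo, siginfo_t *info, void *ucontext) {
    (void)signo; (void)info; (void)ucontext;
    atomic_fetch_add(&g_samples, 1);
}

/* Hypothetical helper: number of samples observed so far. */
int sample_count(void) { return atomic_load(&g_samples); }

/* Hypothetical helper: install the process-wide SIGPROF handler.
 * Returns 0 on success, -1 on error. */
int install_sample_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigprof;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPROF, &sa, NULL);
}

/* Hypothetical helper: arm a periodic timer on the calling thread's CPU
 * clock that signals this thread only. Must be called on the thread that
 * is to be profiled. Returns 0 on success, -1 on error. */
int start_thread_cpu_timer(timer_t *timer_out, long interval_ns) {
    assert(interval_ns > 0 && interval_ns < 1000000000L);

    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_THREAD_ID;  /* deliver to one specific thread */
    sev.sigev_signo = SIGPROF;
    sev.sigev_notify_thread_id = (pid_t)syscall(SYS_gettid);

    if (timer_create(CLOCK_THREAD_CPUTIME_ID, &sev, timer_out) != 0)
        return -1;

    struct itimerspec its;
    memset(&its, 0, sizeof its);
    its.it_value.tv_nsec = interval_ns;     /* first expiry */
    its.it_interval.tv_nsec = interval_ns;  /* then periodic */
    if (timer_settime(*timer_out, 0, &its, NULL) != 0) {
        timer_delete(*timer_out);
        return -1;
    }
    return 0;
}
```

Under these assumptions, `install_sample_handler()` would be called once, and `start_thread_cpu_timer(&t, 10000000L)` (10 ms of CPU time per sample) once on each thread to be profiled, for example from the ThreadStart callback, with `timer_delete(t)` when the thread ends. Older glibc releases may need `-lrt` at link time.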