Overview
I've seen Brendan Gregg's talk on generating mixed-mode flame graphs and wanted to reproduce those flame graphs for myself. Setting up the tools takes a bit of work, so I wanted to capture the steps. Check out the Java in Flames post on the Netflix blog for more information. I've created a GitHub repo (github.com/jerometerry/perf) that contains the scripts used to get this going, including a Vagrantfile and a JMeter test plan.
Here's a flame graph I generated while applying load (via JMeter) to the basic arithmetic Tomcat sample application. All the green stacks are Java code, red stacks are kernel code, and yellow stacks are C++ code. The big green pile on the right is all the Tomcat Java code that's being run.
Tools
Here are the technologies I used (I'm writing this on a Mac).
- VirtualBox 5.1.12
- Vagrant 1.9.1
- bento/ubuntu-16.04 (kernel 4.4.0-38)
- Tomcat 7.0.68
- JMeter 3.1
- OpenJDK 8 1.8.111
- linux-tools-4.4.0-38
- linux-tools-common
- Brendan Gregg's FlameGraph tools
- Johannes Rudolph's Perf Map Agent tool
Steps
Here are the steps to set up the VM.
- Create an Ubuntu 16.04 VM using VirtualBox / Vagrant (vagrant init bento/ubuntu-16.04; vagrant up)
- SSH into the VM (vagrant ssh)
- Update apt (sudo apt-get update)
- Install Java 8 JDK (sudo apt-get install openjdk-8-jdk)
- Install Java 8 Debug Symbols (sudo apt-get install openjdk-8-dbg)
- Install Tomcat 7 (sudo apt-get install tomcat7 tomcat7-examples)
- Configure JAVA_HOME in /etc/default/tomcat7 (JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64)
- Configure JAVA_OPTS in /etc/default/tomcat7 (JAVA_OPTS="-Djava.awt.headless=true -Xmx1024m -XX:+UseG1GC -XX:+PreserveFramePointer -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints")
- Restart tomcat7 service (sudo service tomcat7 restart)
- Install cmake (sudo apt-get install cmake build-essential)
- Install Linux perf (sudo apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`)
- Download flamegraph (git clone --depth=1 https://github.com/brendangregg/FlameGraph)
- Download perf-map-agent (git clone --depth=1 https://github.com/jrudolph/perf-map-agent)
- Build perf-map-agent (cd perf-map-agent; export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64; cmake .; make)
At this point, Tomcat 7, Java 8, Linux Perf, FlameGraphs, and Perf Map Agent are all configured, and we're just about ready to generate a flame graph.
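The setup steps above could be collected into a single provisioning script. The following is a minimal sketch under stated assumptions, not the script from the repo: the function names (run, tomcat_defaults, provision) and the DRY_RUN switch are made up for illustration, and -y is added to apt-get so it can run unattended.

```shell
#!/usr/bin/env bash
# Sketch only: collects the setup steps into functions (Ubuntu 16.04 assumed).
# Run with DRY_RUN=1 to print the commands instead of executing them.

JDK_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# tomcat_defaults: print the two settings that go into /etc/default/tomcat7.
tomcat_defaults() {
  echo "JAVA_HOME=$JDK_HOME"
  echo 'JAVA_OPTS="-Djava.awt.headless=true -Xmx1024m -XX:+UseG1GC -XX:+PreserveFramePointer -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints"'
}

# provision: install the packages, then fetch and build the profiling tools.
provision() {
  run sudo apt-get update &&
  run sudo apt-get install -y openjdk-8-jdk openjdk-8-dbg tomcat7 tomcat7-examples \
    cmake build-essential linux-tools-common linux-tools-generic "linux-tools-$(uname -r)" &&
  run git clone --depth=1 https://github.com/brendangregg/FlameGraph &&
  run git clone --depth=1 https://github.com/jrudolph/perf-map-agent &&
  run env JAVA_HOME="$JDK_HOME" sh -c 'cd perf-map-agent && cmake . && make'
}
```

Calling provision with DRY_RUN=1 prints the commands for review; calling it without runs them. The output of tomcat_defaults still has to be merged into /etc/default/tomcat7 by hand, followed by sudo service tomcat7 restart.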
Applying Workload
The next step is to apply load to Tomcat. From the host:
- Download JMeter 3.1
- Run JMeter
- Add a Thread Group: Number of Threads (users): 25, Ramp up period (in seconds): 30, Loop Count: Forever
- Add a new Sampler, HTTP Request. Server Name or IP: localhost. Port Number: 8080. Path: /examples/jsp/jsp2/el/basic-arithmetic.jsp
- Hit play
That is a basic test plan to drive the Tomcat basic arithmetic sample with 25 users ramped up over 30 seconds, looping forever. That's enough load to capture a decent profile. You might need to tweak the JMeter HTTP sampler settings, depending on how your VM is set up. In my case, the VM forwards guest port 8080 to host port 8080.
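Before starting JMeter it can help to confirm that the forwarded port actually reaches Tomcat. Here is a small sketch, assuming curl is installed on the host; TARGET_HOST, TARGET_PORT, and the plan file name basic-arithmetic.jmx are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: check the forwarded port, then run a saved JMeter plan headless.
# TARGET_HOST / TARGET_PORT are assumed names; adjust them to your port forwarding.

TARGET_HOST=${TARGET_HOST:-localhost}
TARGET_PORT=${TARGET_PORT:-8080}
JSP_PATH=/examples/jsp/jsp2/el/basic-arithmetic.jsp

# target_url: the URL the HTTP sampler should hit.
target_url() {
  echo "http://${TARGET_HOST}:${TARGET_PORT}${JSP_PATH}"
}

# check_target: succeed only if Tomcat answers the sample page with HTTP 200.
check_target() {
  local status
  status=$(curl -s -o /dev/null -w '%{http_code}' "$(target_url)")
  [ "$status" = "200" ]
}

# run_load: run a test plan in non-GUI mode; the plan file name is hypothetical.
run_load() {
  jmeter -n -t "${1:-basic-arithmetic.jmx}" -l results.jtl
}
```

If check_target fails, the port forwarding or the Tomcat service is the first thing to look at. run_load is an alternative to hitting play in the GUI once the plan has been saved to a file.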
Generating Flamegraph
With the VM configured and JMeter running, we can now generate a flame graph. From the home directory (~/) in the VM:
- sudo perf record -F 99 -a -g -- sleep 30
- sudo ./FlameGraph/jmaps
- sudo chown root /tmp/perf-*.map
- sudo chown root perf.data
- sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl --color=java --hash > flamegraph.svg
And with that, the Flame Graph is now generated!
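The commands above can be wrapped in one function so a profile can be repeated with a different duration or sampling rate. This is a sketch, not the generate-flamegraph.sh from the repo; FLAMEGRAPH_DIR, PROFILE_SECONDS, SAMPLE_HZ, and the looks_like_svg check are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the profiling steps as one function, run from the home directory.
# FLAMEGRAPH_DIR, PROFILE_SECONDS and SAMPLE_HZ are assumed variable names.

FLAMEGRAPH_DIR=${FLAMEGRAPH_DIR:-./FlameGraph}
PROFILE_SECONDS=${PROFILE_SECONDS:-30}
SAMPLE_HZ=${SAMPLE_HZ:-99}

# looks_like_svg: cheap sanity check that the output is a non-empty SVG file.
looks_like_svg() {
  [ -s "$1" ] && grep -q '<svg' "$1"
}

# generate_flamegraph: sample all CPUs, dump Java symbol maps, render the SVG.
generate_flamegraph() {
  local out=${1:-flamegraph.svg}
  sudo perf record -F "$SAMPLE_HZ" -a -g -- sleep "$PROFILE_SECONDS" || return 1
  sudo "$FLAMEGRAPH_DIR/jmaps" || return 1
  # Same ownership change as in the steps above, done in one call.
  sudo chown root /tmp/perf-*.map perf.data || return 1
  sudo perf script \
    | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" \
    | "$FLAMEGRAPH_DIR/flamegraph.pl" --color=java --hash > "$out"
  looks_like_svg "$out"
}
```

For example, setting PROFILE_SECONDS=60 and calling generate_flamegraph long.svg would record for a minute and write the result to long.svg.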
You can copy the flame graph to the host as follows
- cp ~/flamegraph.svg /vagrant
This puts the flamegraph.svg file in the directory where you ran the vagrant up command, since Vagrant syncs that folder by default.
Automation
To simplify this, I've added the Vagrantfile to my GitHub repo, along with the JMeter test plan.
- git clone https://github.com/jerometerry/perf.git
- cd ./perf
- vagrant up
- ... Start JMeter test plan
- vagrant ssh
- sudo ~/generate-flamegraph.sh
- exit
AWS EC2
The steps above are pretty close to what you would do to set this up in AWS EC2. Assuming you have an EC2 Ubuntu instance running with OpenJDK 8 installed, you would need to:
- SSH into the EC2 instance
- Update apt (sudo apt-get update)
- Install Java 8 Debug Symbols (sudo apt-get install openjdk-8-dbg)
- Add PreserveFramePointer to JAVA_OPTS (-XX:+PreserveFramePointer)
- Restart Java process
- Install cmake (sudo apt-get install cmake build-essential)
- Install Linux perf (sudo apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`)
- Download flamegraph (git clone --depth=1 https://github.com/brendangregg/FlameGraph)
- Download perf-map-agent (git clone --depth=1 https://github.com/jrudolph/perf-map-agent)
- Build perf-map-agent (cd perf-map-agent; export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64; cmake .; make)
Once you have the Linux tools and the Java debug symbols installed, and FlameGraph and perf-map-agent set up, you can apply the workload to the Java process and then generate a flame graph:
- sudo perf record -F 99 -a -g -- sleep 30
- sudo ./FlameGraph/jmaps
- sudo chown root /tmp/perf-*.map
- sudo chown root perf.data
- sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl --color=java --hash > flamegraph.svg
Then you would scp the flamegraph.svg file down to your local machine. For example, assuming my username on the EC2 instance is jerome_terry, and I have an SSH tunnel configured named EC2Instance, then I would just run
- scp jerome_terry@EC2Instance:/home/jerome_terry/flamegraph.svg ./
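For the scp command above to work, EC2Instance has to resolve to the instance. One way to get there is a Host entry in ~/.ssh/config. The sketch below assumes that is what the alias refers to; the host name and key path in the usage note are placeholders, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print an ssh_config entry that makes a host alias resolvable.
# All values passed in are supplied by the caller; none are real here.

# ssh_host_entry NAME HOSTNAME USER KEYFILE: print a four-line Host block.
ssh_host_entry() {
  local name=$1 hostname=$2 user=$3 key=$4
  printf 'Host %s\n' "$name"
  printf '    HostName %s\n' "$hostname"
  printf '    User %s\n' "$user"
  printf '    IdentityFile %s\n' "$key"
}

# fetch_flamegraph NAME: copy the SVG from the remote home directory to ./
fetch_flamegraph() {
  scp "$1:flamegraph.svg" ./
}
```

For example, ssh_host_entry EC2Instance ec2-host.example.com jerome_terry '~/.ssh/ec2-key.pem' prints a block that can be appended to ~/.ssh/config, after which fetch_flamegraph EC2Instance copies the file without spelling out the user name or the full path.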
Safepoints
Nitsan Wakart recommended that I include the JVM options -XX:+UnlockDiagnosticVMOptions and -XX:+DebugNonSafepoints, and pass unfoldall to perf-map-agent. Doing this paints a better picture of what's actually running on CPU.
Adding the JVM options is trivial (add them to JAVA_OPTS in the Tomcat example). Passing unfoldall to perf-map-agent requires modifying Brendan Gregg's jmaps script by changing
net.virtualvoid.perf.AttachOnce $pid
to
net.virtualvoid.perf.AttachOnce $pid unfoldall
You'll also need to pass the --inline flag to the stackcollapse-perf.pl script.
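The jmaps change can be scripted rather than edited by hand. This is a minimal sketch, assuming the local jmaps copy contains the literal text shown above; patch_jmaps and collapse_inlined are hypothetical helper names, and the patch is written so that running it twice does no harm.

```shell
#!/usr/bin/env bash
# Sketch only: append "unfoldall" to the AttachOnce call in a local jmaps copy.
# Assumes the script contains the literal "net.virtualvoid.perf.AttachOnce $pid".

patch_jmaps() {
  local file=$1
  # Already patched: leave the file alone so the edit is safe to repeat.
  if grep -q 'AttachOnce \$pid unfoldall' "$file"; then
    return 0
  fi
  # Write through a temp file, then copy back so the executable bit is kept.
  sed 's/net\.virtualvoid\.perf\.AttachOnce \$pid/& unfoldall/' "$file" > "$file.tmp" &&
    cat "$file.tmp" > "$file" &&
    rm -f "$file.tmp"
}

# collapse_inlined: the same rendering pipeline, with --inline added.
collapse_inlined() {
  sudo perf script \
    | ./FlameGraph/stackcollapse-perf.pl --inline \
    | ./FlameGraph/flamegraph.pl --color=java --hash > flamegraph.svg
}
```

Run patch_jmaps ./FlameGraph/jmaps once after cloning, then use collapse_inlined in place of the last command of the flame graph steps.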
References
- Linux perf Examples
- Java in Flames
- The Flame Graph, Brendan Gregg, ACM Queue
- Blazing Performance with Flame Graphs
- Java Mixed Mode FlameGraphs - SlideDeck
- github.com/brendangregg/FlameGraph
- github.com/jrudolph/perf-map-agent
- Linux perf_events Off-CPU Time Flame Graph
- Java Warmup
- A Funny Thing Happened on the Way to Java 8 - Indeed Engineering
- Linux BPF Superpowers - Brendan Gregg, Performance @ Scale
- Ubuntu Version History (includes Kernel Versions)
- Systems Performance Book - Brendan Gregg
