Generating Java Mixed Mode Flame Graphs

Overview

I've seen Brendan Gregg's talk on generating mixed-mode flame graphs and wanted to reproduce those flame graphs for myself. Setting up the tools is a little bit of work, so I wanted to capture the steps. Check out the Java in Flames post on the Netflix blog for more information.

I've created a GitHub repo (github.com/jerometerry/perf) that contains the scripts used to get this going, including a Vagrantfile and a JMeter Test Plan.

Here's a flame graph I generated while applying load (via JMeter) to the basic arithmetic Tomcat sample application. All the green stacks are Java code, red stacks are kernel code, and yellow stacks are C++ code. The big green pile on the right is all the Tomcat Java code that's being run.


Tools

Here are the technologies I used (I'm writing this on a Mac).
  • VirtualBox 5.1.12
  • Vagrant 1.9.1
  • bento/ubuntu-16.04 (kernel 4.4.0-38)
  • Tomcat 7.0.68
  • JMeter 3.1
  • OpenJDK 8 1.8.111
  • linux-tools-4.4.0-38
  • linux-tools-common
  • Brendan Gregg's FlameGraph tools
  • Johannes Rudolph's Perf Map Agent tool 

Steps

Here are the steps to set up the VM:
  1. Create an Ubuntu 16.04 VM using VirtualBox / Vagrant (vagrant init bento/ubuntu-16.04; vagrant up)
  2. SSH into the VM (vagrant ssh)
  3. Update apt (sudo apt-get update)
  4. Install Java 8 JDK (sudo apt-get install openjdk-8-jdk)
  5. Install Java 8 Debug Symbols (sudo apt-get install openjdk-8-dbg)
  6. Install Tomcat 7 (sudo apt-get install tomcat7 tomcat7-examples)
  7. Configure JAVA_HOME in /etc/default/tomcat7 (JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64)
  8. Configure JAVA_OPTS in /etc/default/tomcat7 (JAVA_OPTS="-Djava.awt.headless=true -Xmx1024m -XX:+UseG1GC -XX:+PreserveFramePointer -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints")
  9. Restart tomcat7 service (sudo service tomcat7 restart)
  10. Install cmake (sudo apt-get install cmake build-essential)
  11. Install Linux perf (sudo apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`)
  12. Download flamegraph (git clone --depth=1 https://github.com/brendangregg/FlameGraph)
  13. Download perf-map-agent (git clone --depth=1 https://github.com/jrudolph/perf-map-agent)
  14. Build perf-map-agent (cd perf-map-agent; export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64; cmake .; make)
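Steps 3 through 14 can be collected into a single provisioning script. This is a sketch based on the steps above (package names and the JAVA_HOME path assume Ubuntu 16.04 with OpenJDK 8); it doesn't cover the manual /etc/default/tomcat7 edits from steps 7 and 8. Writing it to a file first lets you review it before running it with sudo:

```shell
# Sketch of the VM setup steps as one script (review before running as root).
cat > provision.sh <<'EOF'
#!/bin/bash
set -e
apt-get update
apt-get install -y openjdk-8-jdk openjdk-8-dbg tomcat7 tomcat7-examples
apt-get install -y cmake build-essential
apt-get install -y linux-tools-common linux-tools-generic linux-tools-$(uname -r)
git clone --depth=1 https://github.com/brendangregg/FlameGraph
git clone --depth=1 https://github.com/jrudolph/perf-map-agent
(cd perf-map-agent && JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 cmake . && make)
EOF
# Check the script parses without executing it:
bash -n provision.sh && echo "syntax OK"
```

A script like this also drops straight into a Vagrantfile as a shell provisioner.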

At this point, Tomcat 7, Java 8, Linux Perf, FlameGraphs, and Perf Map Agent are all configured, and we're just about ready to generate a flame graph.

Applying Workload

The next step is to apply load to Tomcat. From the host:
  1. Download JMeter 3.1
  2. Run JMeter
  3. Add a Thread Group: Number of Threads (users): 25, Ramp up period (in seconds): 30, Loop Count: Forever
  4. Add an HTTP Request sampler. Server Name or IP: localhost. Port Number: 8080. Path: /examples/jsp/jsp2/el/basic-arithmetic.jsp
  5. Hit play
That is a basic test plan that drives the Tomcat basic-arithmetic sample with 25 users ramped up over 30 seconds, looping forever. That's enough load to capture a decent profile. You might need to tweak the HTTP Request settings, depending on how your VM is set up. In my case, the VM forwards guest port 8080 to host port 8080.
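If you only need rough load and don't want the JMeter GUI, a crude shell substitute is a set of curl loops against the same URL (this assumes the port-forwarded http://localhost:8080 endpoint from the Vagrant setup, and is no replacement for a real test plan):

```shell
# Crude load generator: 25 background workers hitting the sample JSP.
# Written to a file for review; run with: bash load.sh, stop with Ctrl-C.
cat > load.sh <<'EOF'
#!/bin/bash
URL="http://localhost:8080/examples/jsp/jsp2/el/basic-arithmetic.jsp"
for i in $(seq 1 25); do
  while true; do curl -s -o /dev/null "$URL"; done &
done
trap 'kill $(jobs -p)' INT TERM
wait
EOF
bash -n load.sh && echo "syntax OK"
```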

Generating Flamegraph

With the VM configured and JMeter running, we can now generate a flame graph. From the home directory (~/) in the VM:
  1. sudo perf record -F 99 -a -g -- sleep 30
  2. sudo ./FlameGraph/jmaps
  3. sudo chown root /tmp/perf-*.map
  4. sudo chown root perf.data
  5. sudo perf script | ./FlameGraph/stackcollapse-perf.pl | grep -v cpu_idle | ./FlameGraph/flamegraph.pl --color=java --hash > flamegraph.svg
And with that, the Flame Graph is now generated!
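To see what the pipeline in step 5 is doing: perf script dumps the raw stack samples, stackcollapse-perf.pl folds each stack into a single semicolon-separated line followed by a sample count, grep -v cpu_idle drops idle samples, and flamegraph.pl renders the folded lines as SVG. Here's a small demo of the idle-filtering stage using hypothetical folded lines (the frame names are made up):

```shell
# Hypothetical output of stackcollapse-perf.pl: "frame;frame;... count"
cat > folded.txt <<'EOF'
java;start_thread;JavaCalls;Lorg/apache/jsp/basic_arithmetic_jsp;_jspService 42
swapper;cpu_idle;default_idle 180
java;start_thread;JavaCalls;Ljava/util/HashMap;get 17
EOF
# The grep -v cpu_idle stage removes the idle samples,
# leaving only the two java stacks:
grep -v cpu_idle folded.txt
```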

You can copy the flame graph to the host as follows
  • cp ~/flamegraph.svg /vagrant
This puts the flamegraph.svg file in the directory you ran the vagrant up command from, since by default Vagrant syncs that folder to /vagrant in the guest.

Automation

To simplify this, I've added the Vagrantfile to my GitHub repo, along with the JMeter Test Plan.

git clone https://github.com/jerometerry/perf.git
cd ./perf
vagrant up

... Start JMeter test plan

vagrant ssh
sudo ~/generate-flamegraph.sh
exit
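The generate-flamegraph.sh script in the repo bundles the capture steps from the previous section. A sketch of what such a script contains (the actual script in the repo may differ in details; it is meant to run as root via sudo):

```shell
# Sketch of a capture script bundling the flame graph steps.
cat > generate-flamegraph.sh <<'EOF'
#!/bin/bash
set -e
# Sample all CPUs at 99 Hz for 30 seconds
perf record -F 99 -a -g -- sleep 30
# Dump JIT symbol maps for running JVMs (uses perf-map-agent)
./FlameGraph/jmaps
# perf script runs as root and expects root-owned map/data files
chown root /tmp/perf-*.map perf.data
perf script | ./FlameGraph/stackcollapse-perf.pl | grep -v cpu_idle \
  | ./FlameGraph/flamegraph.pl --color=java --hash > flamegraph.svg
# /vagrant is synced back to the host project directory
cp flamegraph.svg /vagrant
EOF
bash -n generate-flamegraph.sh && echo "syntax OK"
```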


AWS EC2

The steps above are pretty close to what you would do to set this up in AWS EC2. Assuming you have an EC2 Ubuntu instance running with OpenJDK 8 installed, you would need to

  1. SSH into the EC2 instance
  2. Update apt (sudo apt-get update)
  3. Install Java 8 Debug Symbols (sudo apt-get install openjdk-8-dbg)
  4. Add PreserveFramePointer to JAVA_OPTS ("-XX:+PreserveFramePointer")
  5. Restart Java process
  6. Install cmake (sudo apt-get install cmake build-essential)
  7. Install Linux perf (sudo apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r`)
  8. Download flamegraph (git clone --depth=1 https://github.com/brendangregg/FlameGraph)
  9. Download perf-map-agent (git clone --depth=1 https://github.com/jrudolph/perf-map-agent)
  10. Build perf-map-agent (cd perf-map-agent; export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64; cmake .; make)
Once you have Linux perf, the Java debug symbols, FlameGraph, and perf-map-agent set up, you can apply the workload to the Java process, then generate a flame graph:

  1. sudo perf record -F 99 -a -g -- sleep 30
  2. sudo ./FlameGraph/jmaps
  3. sudo chown root /tmp/perf-*.map
  4. sudo chown root perf.data
  5. sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl --color=java --hash > flamegraph.svg
Then you would scp the flamegraph.svg file down to your local machine. For example, assuming my username on the EC2 instance is jerome_terry, and I have an SSH host alias named EC2Instance configured, I would just run
  • scp jerome_terry@EC2Instance:/home/jerome_terry/flamegraph.svg ./

Safepoints


Nitsan Wakart recommended that I include the JVM options -XX:+UnlockDiagnosticVMOptions and -XX:+DebugNonSafepoints, and pass unfoldall to perf-map-agent. Doing this paints a better picture of what's actually running on CPU.

Adding the JVM options is trivial (add them to JAVA_OPTS in the Tomcat example). Passing unfoldall to perf-map-agent requires modifying Brendan Gregg's jmaps script, changing


net.virtualvoid.perf.AttachOnce $pid 

to

net.virtualvoid.perf.AttachOnce $pid unfoldall

You'll also need to pass the --inline flag to the stackcollapse-perf.pl script.
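The jmaps edit can be scripted with sed. This demo applies the substitution to a stand-in file rather than the real script (the actual jmaps line wraps the AttachOnce call in a java invocation, but the substitution is the same):

```shell
# Stand-in for the relevant line of FlameGraph/jmaps:
echo 'net.virtualvoid.perf.AttachOnce $pid' > jmaps-line.txt
# Append the unfoldall argument to the AttachOnce invocation:
sed -i 's/AttachOnce \$pid$/AttachOnce $pid unfoldall/' jmaps-line.txt
cat jmaps-line.txt
# prints: net.virtualvoid.perf.AttachOnce $pid unfoldall
```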
