Using GraalVM, JavaFX and a Clanker to Build a Cross Platform Desktop App

So, this particular rabbit hole started when I, to my dismay, was running out of disk space on one of my disks, having filled it up with local AI models. I wanted a pretty and nice tool to start reclaiming space, something using a sunburst visualization. After some quick research, I couldn’t find anything that I really liked. Nothing was pretty, except DaisyDisk, and that was only available on Mac. And there was nothing out there that allowed you to interact with it whilst data was being collected. Also, the stuff that I found was committed to just one of many useful visualizations. And I wanted to be able to switch themes dynamically. And coloring. And <insert impossibly long list of wants>…

Since there was a bunch of other more technical things I’d wanted to try for a while anyways, I decided to go ahead and use this project as an excuse to try them:

  1. Is it possible to reasonably use GraalVM native image to make binary desktop apps with Java?
  2. Is it possible to use GraalVM with JavaFX?
  3. If I need to do native bindings, how does that work in GraalVM?

I had already used GraalVM native image for various other projects where it makes sense, such as in my email MCP server, but it had always mostly been for smaller Java projects where I needed quick startup and low memory utilization. To make this one work well, I was pretty sure I’d need to do platform specific magic to make it as fast as can be.

DiskSpace showing the default sunburst visualization.

My Friend the AI Model

This project was a proper collaboration between me and my AI models, in an area where they usually are very strong — a new greenfield project. Well defined outcomes. Very strong design cues and up-front technology choices. I worked on the design document first, together with the AI. I laid out incremental goals, like first trying to get the thing to compile with a simple scene, trying to get a simple native binding to work, etc. Before implementing anything, that design/architectural document was discussed, debated, edited and finalized. Next, I was providing guidance and help every step of the way, sometimes coding actively when I felt things were hitting a dead end and I could just do the right thing quicker myself.

Design language is pretty hard to communicate, and abstract ideas like how much space something needs to take to “look just right”, or the particular shade of grey that is required to make something “look good” are so subjective and very hard to get to a good place without a lot of back and forth.

At the same time, very technical work was sometimes (not always) executed flawlessly, saving tons of time. Did this project go quicker than if I had done it just on my own? Undoubtedly so. All the hunting down weird linkage problems, setting up the project to build properly on all platforms. All that tedious, no fun work, went massively faster. I spent most of the time doing back and forth with the AI model and getting it to do the right thing, and providing it with all the ideas and context to be able to solve most of its problems with as little additional context as possible. Which brings me to…

DiskSpace, now in DarkMode, showing the squarified tree map visualization.

Performance

You’d think that performance in the rendering would not ever become a problem in a project like this, but lo and behold, it did. Every once in a while when scanning my biggest and most diversely populated disks (C: on my Windows box), it would freeze. I first suspected GC. Since I was going to do the analysis with JFR anyways, I also introduced custom JFR events for context, to make it easier to attribute the JFR data to actual real world events.

In the end, I added four of them:

  1. Scan
    One event per scan run (carrying strategy, root path, file count, total bytes, using the @DataAmount annotation to ensure everything is rendered properly in JMC).
  2. Render
    A render event per UI repaint (mode, trigger, node count, canvas dimensions).
  3. VisualizationActive
    An event spanning the time a specific visualization was active, with a repaint counter.
  4. UserAction
    An event per keypress or click.

The events were all stitched together via a per-scan correlation ID, so that it’s easy to contrast different scans to each other and pivot between scans and all the renders, visualizations and user actions changing them. For the render event, I disabled stack trace capturing. It wouldn’t be that exciting given the other events I have, not to mention the additional cost of walking the stacks over and over again.

@Name("se.hirt.diskspace.Render")
@Label("Visualization Render") 
@Category({"DiskSpace", "UI"})
@Description("One repaint of the visualization on the JavaFX UI thread.")
@StackTrace(false) // 60+ Hz: per-event stack capture unnecessary work here
public static class RenderEvent extends Event {
  @Label("Mode")
  @Description("SUNBURST or HEATMAP.")
  String mode;
  @Label("Trigger")
  @Description("scan-update / mode-change / resize / ...") 
  String trigger; 
  @Label("Node Count")
  @Description("Live file count at render time.")
  int nodeCount;
  @Label("Width")
  int widthPx;
  @Label("Height")
  int heightPx;
  @Label("Scan ID")
  @Description("Correlation ID matching the Scan event.")
  long scanId;
}
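A minimal sketch of how events like these could be emitted and tied together with a correlation ID. The class name ScanTelemetry, the helper methods, the demo.* event names and the Runnable standing in for the paint code are all hypothetical and not lifted from DiskSpace; only the jdk.jfr API calls are real.

```java
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.Category;
import jdk.jfr.DataAmount;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.StackTrace;

final class ScanTelemetry {
    // Hypothetical source of per-scan correlation IDs.
    private static final AtomicLong NEXT_SCAN_ID = new AtomicLong(1);

    @Name("demo.diskspace.Scan")
    @Label("Scan")
    @Category("DiskSpace")
    static class ScanEvent extends Event {
        @Label("Scan ID")
        long scanId;
        @Label("Total Bytes")
        @DataAmount(DataAmount.BYTES) // lets JMC render the value as a byte quantity
        long totalBytes;
    }

    @Name("demo.diskspace.Render")
    @Label("Visualization Render")
    @Category({"DiskSpace", "UI"})
    @StackTrace(false) // skip stack walking for a high-frequency event
    static class RenderEvent extends Event {
        @Label("Node Count")
        int nodeCount;
        @Label("Scan ID")
        long scanId;
    }

    static long newScanId() {
        return NEXT_SCAN_ID.getAndIncrement();
    }

    /** Simplified: records a finished scan as a single event without a duration. */
    static void recordScan(long scanId, long totalBytes) {
        ScanEvent event = new ScanEvent();
        event.scanId = scanId;
        event.totalBytes = totalBytes;
        event.commit();
    }

    /** Runs the paint code and records its duration as one Render event. */
    static void timedRender(long scanId, int nodeCount, Runnable paint) {
        RenderEvent event = new RenderEvent();
        event.begin();
        try {
            paint.run();
        } finally {
            event.end();
            // Only fill in the fields if the event is enabled and passes its threshold.
            if (event.shouldCommit()) {
                event.nodeCount = nodeCount;
                event.scanId = scanId;
                event.commit();
            }
        }
    }
}
```

With no recording running, shouldCommit() returns false and the cost is close to nothing, so a wrapper like this can stay in the render path permanently.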

JFR isn’t properly supported in the GraalVM JDK version (21) used by GluonFX, so I ran the project in plain Java when measuring. I then ran the recordings through my JMC AI view (coming in JMC 10, I’ll write more about it later), which was able to quickly root cause the algorithmic complexity in the squarified tree map renderer as the culprit, and suggest improvements.

Properly chosen contextual events are, as usual, king, and the self-describing nature of JFR makes adding them, with the proper metadata, a breeze. The right AI tools can use them as-is: no parser updates or any other changes are required beyond adding the event itself.

A Few More Fun Bits

This project has more under-the-hood stuff than fits one post, but a few bits are worth calling out:

The Windows MFT Scanner

Walking a directory tree the obvious way (FindFirstFile / FindNextFile) issues a syscall per file just to read its size. On a C: drive with millions of files, that can add up to minutes. NTFS has a much better idea: the Master File Table is already a giant table of every file’s metadata, and Windows exposes the whole thing via FSCTL_ENUM_USN_DATA. Read it once, get every file on the volume in a few seconds.

The catch is that it needs SeBackupPrivilege, which means UAC elevation. That’s the reason DiskSpace on Windows has an auto-elevation preference (Ask once / Always / Never). Once elevated, DiskSpace will drop all privileges except for the backup one, and scans drop from minutes to seconds on the biggest drives I can find to throw at it.

The macOS Bulk Scanner

The macOS equivalent of the MFT trick is getattrlistbulk(2). Instead of one stat syscall per file, it returns metadata for roughly 500–800 entries per call. Combined with a per-directory ForkJoinPool and parallelism tuned to the storage profile, APFS scans finish very quickly. 

The storage-profile detection (the thing that decides “parallel-16 for NVMe, sequential for spinning disks”) walks the IORegistry via DiskArbitration + IOKit to find the underlying media type. Same plumbing diskutil info uses internally, just routed through GraalVM’s @CFunction bindings rather than the command line.
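The per-directory fork/join structure can be sketched in portable Java. Note the swap: this version reads attributes through java.nio.file rather than getattrlistbulk, so it only illustrates the task layout and the tunable parallelism, not the bulk syscall. The names ParallelScan, DirTask and totalBytes are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class ParallelScan {
    /** One task per directory: subdirectories are forked, files are summed inline. */
    static final class DirTask extends RecursiveTask<Long> {
        private final Path dir;

        DirTask(Path dir) {
            this.dir = dir;
        }

        @Override
        protected Long compute() {
            long bytes = 0;
            List<DirTask> children = new ArrayList<>();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    // NOFOLLOW_LINKS keeps symlink cycles from looping the scan.
                    BasicFileAttributes attrs = Files.readAttributes(
                            entry, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
                    if (attrs.isDirectory()) {
                        DirTask child = new DirTask(entry);
                        child.fork();
                        children.add(child);
                    } else if (attrs.isRegularFile()) {
                        bytes += attrs.size();
                    }
                }
            } catch (IOException e) {
                // Unreadable directory: keep whatever was counted so far.
            }
            for (DirTask child : children) {
                bytes += child.join();
            }
            return bytes;
        }
    }

    /** parallelism = 1 approximates a sequential scan; higher values suit fast storage. */
    static long totalBytes(Path root, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new DirTask(root));
        } finally {
            pool.shutdown();
        }
    }
}
```

The parallelism argument is where a storage-profile decision would plug in, for example 16 for NVMe and 1 for a spinning disk.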

Interact While It’s Still Scanning

The thing that bugged me about every other disk-usage tool I tried: you click “scan”, wait until it finishes, then you can explore. With DiskSpace the sunburst materialises as data streams in. You can drill into a folder before its size is even known, switch tabs, switch visualization modes (sunburst ↔ treemap), or kick off a second scan on a different drive — all without waiting for anything to complete.

The volume picker has the same property. If you have a dodgy SD card that mounts but doesn’t respond, the old picker would block startup for thirty seconds or more. The new one, thanks to one virtual thread per volume root, shows up instantly and fills in each row as it resolves, with the slow ones spinning quietly until they decide to answer.
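The one-virtual-thread-per-root idea can be sketched like this, assuming Java 21 or later. VolumeProbe, probeAll, the probe function and the callback are hypothetical names; in a real JavaFX app the callback would also have to hop onto the UI thread (for example via Platform.runLater) before touching any controls.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.BiConsumer;
import java.util.function.Function;

final class VolumeProbe {
    /**
     * Starts one virtual thread per root and reports each result as soon as it
     * resolves. Returns immediately; a slow or hung root only delays its own row.
     */
    static <R> ExecutorService probeAll(List<String> roots,
                                        Function<String, R> probe,
                                        BiConsumer<String, R> onResolved) {
        ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
        for (String root : roots) {
            executor.execute(() -> onResolved.accept(root, probe.apply(root)));
        }
        // Accept no further tasks; probes already started keep running.
        executor.shutdown();
        return executor;
    }
}
```

The returned executor is only there so a caller (or a test) can wait for stragglers with awaitTermination; the picker itself never needs to block on it.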

So Animate, Much LERPs, Wow

The drill-in / drill-out animation on the sunburst (the one where you click a sector and the whole visualization smoothly reflows so that sector becomes the new centre) is built out of nothing but linear interpolation and a single ease-out curve. The layout system computes the before and after sector positions (depth, start angle, sweep angle) and the animation just lerps each value from one to the other over a few hundred milliseconds.

The fun bits aren’t in the lerp itself, they’re in what you give it at the endpoints. New sectors that weren’t visible before the drill (deeper nodes that only show up at the new zoom level) start their animation with sweep = 0 at the midpoint of their final position. So instead of popping into existence, they blossom outward from a single angular point until they reach full width. Departing sectors do the inverse: they collapse to their own midpoint and fade to transparent. The outgoing inner ring stays in place and fades out while the new one fades in over it.
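A small sketch of the interpolation and the "blossom" start state. The cubic ease-out is an assumption (the exact curve is not specified here), and SectorAnimation, Sector, entering and at are hypothetical names. Angles are in degrees.

```java
final class SectorAnimation {
    /** Polar geometry of one sunburst sector. */
    record Sector(double depth, double startAngle, double sweepAngle) {
        double midAngle() {
            return startAngle + sweepAngle / 2.0;
        }
    }

    static double lerp(double from, double to, double t) {
        return from + (to - from) * t;
    }

    /** Cubic ease-out (assumed curve): fast at first, settling gently at t = 1. */
    static double easeOut(double t) {
        double inv = 1.0 - t;
        return 1.0 - inv * inv * inv;
    }

    /** Start state for a sector that was not visible before: zero sweep at its final midpoint. */
    static Sector entering(Sector target) {
        return new Sector(target.depth(), target.midAngle(), 0.0);
    }

    /** Interpolates every component with the same eased progress, clamped to [0, 1]. */
    static Sector at(Sector from, Sector to, double progress) {
        double t = easeOut(Math.max(0.0, Math.min(1.0, progress)));
        return new Sector(
                lerp(from.depth(), to.depth(), t),
                lerp(from.startAngle(), to.startAngle(), t),
                lerp(from.sweepAngle(), to.sweepAngle(), t));
    }
}
```

Because the start angle and the sweep are eased with the same value, an entering sector's midpoint stays fixed while it widens, which is what makes it grow symmetrically from a single angular point. A departing sector is the same call with the endpoints swapped.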

Native Image + JavaFX + Native Bindings

GraalVM native-image + Gluon Substrate + JavaFX gives you a single-binary desktop app with native application startup times. The platform-specific APIs (the MFT scanner, getattrlistbulk, DiskArbitration, IOKit, the Win32 storage queries) are all bound through GraalVM’s @CFunction annotations — a handful of lines of Java per native function, no JNI shim, no separate C side, no -Djava.library.path dance at runtime. The native-image build resolves the symbols at link time against the system libraries, and at runtime the call overhead is comparable to a regular C function call. Clean and fast.

Sub-pixel Arc Culling

A small rendering trick on the sunburst side: when an arc would project to less than one screen pixel, it’s literally invisible — so don’t issue the draw call at all. On deep trees with millions of leaves that drops the per-frame node walk to a fraction of what it would otherwise be. Same principle a frustum-culling game engine uses, just in polar coordinates.
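The visibility test itself is tiny. In this sketch the arc length is measured along the outer edge (radius times sweep in radians) and compared against one pixel; that choice of edge, the one-pixel threshold as a constant, and the names ArcCulling, Arc and render are assumptions for illustration. It filters a flat list and does not show how a tree walk would be cut short.

```java
import java.util.List;
import java.util.function.Consumer;

final class ArcCulling {
    /** Screen-space geometry of one arc: outer radius in pixels, sweep in radians. */
    record Arc(double outerRadiusPx, double sweepRadians) {}

    /** Length of the arc's outer edge in pixels. */
    static double arcLengthPx(Arc arc) {
        return arc.outerRadiusPx() * arc.sweepRadians();
    }

    /** An arc shorter than one pixel cannot be seen, so it is not worth drawing. */
    static boolean isVisible(Arc arc) {
        return arcLengthPx(arc) >= 1.0;
    }

    /** Draws only the visible arcs and returns how many draw calls were issued. */
    static int render(List<Arc> arcs, Consumer<Arc> draw) {
        int drawn = 0;
        for (Arc arc : arcs) {
            if (!isVisible(arc)) {
                continue; // culled: sub-pixel arc
            }
            draw.accept(arc);
            drawn++;
        }
        return drawn;
    }
}
```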

TL;DR

  • The AI was immensely helpful getting to the bottom of various linkage problems and ensuring that the project would build on the various platforms.
  • AI ended up doing much of the typing for this greenfield project.
  • Even the latest models need a ton of supervision and course corrections to yield good results.
  • With today’s AI models, building well still takes the same skill as building it yourself.
  • Yes, GraalVM native image works well for building native desktop apps with Java. GluonFX bridges it well to JavaFX, and @CFunction makes native bindings painless.

I just released 0.3.1 of DiskSpace here:
https://github.com/thegreystone/diskspace/releases

This release marks the first community contribution! Many thanks to Scott Gerring (@scottgerring, https://github.com/scottgerring) for adding APFS clone deduplication to the macOS bulk scanner, so that scan totals on Mac now match Disk Utility and df properly.

Please give it a spin, and if you like it, consider giving it a star on GitHub. This will help me get free signing of the DiskSpace artifacts from the SignPath Foundation in the future.

Claude and My Email

I declared bankruptcy on my personal inbox a long time ago. There were tens of thousands of emails sitting in there, and my various folders and rules were only partly helping. Since I’ve been having fun with AI for quite some time now, I thought I might try to do something about it. Since I have multiple different providers (Google, Apple, my own mail server, and so on) I decided to have IMAP and SMTP be the common denominator.

I am very hesitant to run OpenClaw on any of the computers on my network, and I’m mostly spending time in Claude Code and the Claude App anyways, so I decided to build an MCP server with tools needed to facilitate the workflow, and then create a skill that can be scheduled in the Claude App. I quite like having the results rendered in proper tables and structured summaries.

The MCP Server

The first version was slow and token-heavy. I quickly realized that mail headers are enormous these days, so I changed the triage flow to be a bit more lightweight while still keeping access to the full headers when needed. There is still a lot of useful information in them.

I also discovered that the Claude app won’t accept binary data from MCP servers other than images, which meant I couldn’t directly access information in PDF attachments. Fortunately, there are great Java libraries for extracting text from PDFs, so adding that support was straightforward.

To make email reading more efficient, HTML emails are converted to Markdown. That cuts the size dramatically, whilst still preserving important structural cues for the AI.

Since this is an MCP server, I also didn’t want to require people to need to have a full Java runtime installed. And since the server doesn’t do especially heavy processing, peak runtime performance wasn’t the main concern. I decided to build it as a GraalVM native image and optimize for memory footprint and startup time instead.

Here is the MCP server, with instructions on how to install it in both Claude Code and the Claude App:
https://github.com/thegreystone/mcp-email

In early experiments, after interacting with it a bit, I asked it whether there were any email conversations with my friends where I dropped the ball. It identified two. Correctly. Ugh. I also asked it things like:

Please find any emails from my tax lawyers around the taxation year 2025, identify all the necessary documentation asked for, and provide an action plan for getting that documentation.

It did very well. Being able to read PDF attachments turned out to be especially useful.

I also often ask it to prepare drafts for me, especially when I need to communicate in German, for example with authorities or businesses. The draft ends up in the Drafts folder, properly set up with the right thread, recipients, and so on, leaving me only to decide whether to send it as-is or make a few edits first.

Initially, this was my favourite prompt:

Please help triage the emails in my inboxes. As per usual, flag anything that is actionable. Don’t hesitate to read emails that you are uncertain about. For anything actionable, provide an action plan. For anything non-actionable, file according to the folder layout.

Adding a Skill

Once the MCP server is installed, adding a skill to Claude is as easy as asking it to create one for itself. Claude has a skill-creator tool that it can use for this.

I asked it to do something along these lines (it was actually edited and improved a few times iteratively, so this is a rough approximation):

Please create a skill using the skill-creator that triages the email across all my accounts in parallel. Flag anything that is actionable. Don’t hesitate to read emails that you are uncertain about. For anything actionable, provide an action plan. For anything non-actionable, file according to the folder layout. At the end of the triage, provide a summary of everything that was done by folder.

The skill should:

  1. Load a persistent ACTIONABLE.md file at the start of each session (path configurable), carry forward unresolved items, and update it at the end.
  2. Run triageCompact on all accounts in parallel (unread first, then a pass over read-but-unfiled emails).
  3. Read full emails when context is needed (e.g. financial, GitHub notifications, ambiguous senders, shipping, calendar invites).
  4. Identify and move spam (will be used to train SpamAssassin later), leave pre-classified Junk alone.
  5. Flag actionable emails before filing.
  6. File everything according to the existing folder hierarchy (if an additional folder would be useful, suggest it).
  7. Cross-reference flagged emails against open ACTIONABLE.md items to detect resolutions.
  8. Produce a structured summary with action plan.

Key resolution logic: Only mark items resolved with positive evidence (delivery confirmed, payment received, PR closed, reply sent). When in doubt, keep open and ask.

Initially, the skill had no memory between sessions. Each scheduled run would rediscover the same actionable items, and it couldn’t tell whether something had been resolved since last time. This became obvious when I asked about an ongoing conversation with my tax lawyers; Claude couldn’t find it because the earlier session was no longer accessible. The fix was simple: a persistent ACTIONABLE.md file that the skill reads at the start of each triage and updates at the end. Open items carry forward, resolved ones get cleared (but only with positive evidence — a delivery confirmed, a payment received, a reply sent), and new items get added. Getting the file location right took a couple of iterations. Claude initially computed it via a convoluted relative path before I pointed out that with the workspace folder configured, ./ACTIONABLE.md was all it needed. It’s just a Markdown file on disk, but it gives the scheduled runs the continuity they were missing.

A quick note on SpamAssassin: the reason I move detected spam to a separate folder rather than deleting it is that I use it to train SpamAssassin. Over time, the obvious spam gets caught before it ever reaches Claude, saving tokens and triage effort.

With the skill installed in both Claude Code and the Claude App, there is no longer any need to write long prompts every time. You just run the skill.

Claude Cowork

With a skill, it becomes very easy to run the triage on a schedule. Simply go to the Cowork tab in Claude App, click on Scheduled on the left, and hit New task on the right.


Add the prompt you want to run, for example:

Triage my emails using the email skill.

Fill out the details, like the frequency (e.g. daily), the time (e.g. 18:30), the model to use for the work, and a folder where you want Claude to have read and write access.

Considerations

Let’s be clear about what this setup does: it autonomously moves, files, and flags your email. If it gets something wrong, an important email could end up in the wrong folder or, worse, in Spam. You should go into this with your eyes open.

I would not have tried this without a solid backup strategy for my emails. I do regular IMAP backups, so if something goes sideways, I can recover. That said, I’ve been running this daily for over two months now and haven’t lost anything. The skill is also deliberately conservative — it prefers moving emails to deleting them, and requires explicit approval for deletions.

You may also want to think about where you run the MCP server. I run it on a Mac mini that I mostly use for testing and that doesn’t contain anything sensitive. Running it in a virtual machine, or at least under a user account with tightly scoped permissions, is worth considering. The MCP server holds the IMAP and SMTP credentials and connects directly to your mail servers; Claude never sees those credentials. It only interacts with your email through the tools the MCP server exposes. This isn’t fundamentally different from how any email client works — Mail.app, Thunderbird, and Outlook all hold your credentials and maintain persistent connections to your mail servers. The difference is that this one takes actions autonomously, so please treat it accordingly.

There’s also the question of trust. The skill does a genuinely impressive job. It correctly identified dropped conversations with friends, it handles German-language correspondence with authorities, and it went from 30,000+ emails to fewer than ten in my inbox at any given time. But “impressive” isn’t “infallible.” I review the triage summaries after each run, and I’d recommend you do the same, at least until you’ve built up confidence in how it handles your particular email patterns. I may have to eat these words one day, but today I trust it to do a better job than me at keeping track of my ongoing email conversations.

TL;DR

I declared email bankruptcy at 30,000+ unread emails and built an MCP server exposing tools over IMAP/SMTP so Claude can triage, file, and draft replies across all my accounts — no third-party service required. A scheduled skill in Claude Cowork now runs the whole workflow daily, keeping my inbox down to a handful of actively tracked items.

A Pretty Shitty Week

I am writing this since I often process crappy stuff by writing about it, and also since I know that my family isn’t reading my blog. A long time ago, my now oldest daughter Natalie (and therefore my wife and I) experienced what was then the scariest day of our lives. We almost lost Natalie that time. Earlier this week it all came crashing over us again.

My daughter came home from school complaining about stomach ache. She said that it would probably pass in a bit, but my mind was already racing. We waited for a bit, but the pain didn’t seem natural and wasn’t getting any better. Given her history we weren’t taking any chances, and I convinced my daughter that it is better to go one time too many, and that she would sleep better that night if a doctor told her that it wasn’t anything scary. I drove directly to the nearest large hospital, LUKS, and went straight for the emergency room. Meanwhile my wife stayed at home to search for a letter from the Karolinska Hospital describing her prior medical history.

When we arrived Natalie was already in a lot of pain, but her usual brave self. I told the emergency crew about her previous history and they hooked her up to fluids. Meanwhile my wife had found the letter and sent a photo of it to my phone. The emergency crew read it and asked me to email it to them. I believe this made them take this more seriously, which turned out to be good.

They did an ultrasound, after which they called the surgeons, who in turn asked for a CT scan. Whilst waiting for the CT scan to be done, the nurse asked me how I was doing. I must admit that I broke down. As a parent, I guess one of the hardest things you do is to try to stay clear-headed and strong for your kids, and try not to show them how scared you are. I was terrified, and all the memories from that day almost 15 years ago came crashing down over me again.

After this we were directed to an emergency room, and they put in a stomach probe through my daughter’s nose. Everyone was friendly and professional, but my heart broke when I heard my daughter ask the nurse “will I die”, and the nurse replied “many things will have to go wrong for that to happen”. That is such a dumb thing to say to a very scared 16-year-old girl. Natalie looked over at me, absolutely terrified. I tried my best to reassure her and to let her know that nothing was going to happen, and that she was at the best possible place and that she was getting the best possible care.

They tried to blow air through the probe to get things started and gave her fluids and painkillers. They said they wanted to avoid surgery if they could, and wanted to wait to see if the treatment would sort it out. Meanwhile I tried calling my insurance company to ask if there was any way I could stay with her. I was certainly not going to leave her there alone. I couldn’t get through, but luckily I had installed the insurance company app, and it turned out I had a Flex insurance which allowed for “Rooming In”, which meant we got a room for just the two of us and that I could stay with her. We got to the room at 2 am or so. She was still in pain, and it was hard for her to sleep with the tubes and lines. She is such a strong kid though.

For a while it seemed like the treatment was improving things a little. My wife gathered some clothes for Natalie and came over to stay with her, and I went home to take care of the other kids. Then things went downhill. Yesterday they made the decision to perform surgery. They had three options. They said that they would try to do keyhole surgery, but that in the worst case they’d have to open her up, with another two options there. If we were lucky it would be over in 15 minutes. She was rolled into surgery around 9 a.m., and we didn’t hear anything back until after 11. My wife got to see her after 12. Apparently there were three separate places where her intestines were obstructed.

Today she is on the mend. She has eaten a little bit of soup, and as far as I know she got to keep it. She’ll likely be in the hospital for a few more days, but it looks like the crisis has been averted for this time. Please, pretty please, let this be the last and final one. I’m not sure I can stand watching her in that kind of pain again.

“Best of Java” Feature Face-off 2025

The results are in from this year’s “Best of Java” Feature Face-Off on Twitter/X!

While Mission Control once again secured a win in the Tools category, it ultimately (and perhaps not too surprisingly) lost against Virtual Threads — arguably the most impactful runtime feature added to Java in the past years.


In the last edition of the Feature Face-Off (2020), JDK Mission Control won it all:


Here’s to hoping there are even more exciting language and/or runtime features in the face-off for 2030!

If you’re interested in open source development, and would like to get involved in the development of JDK Mission Control, feel free to reach out:

TL;DR:

  • 2020: JDK Mission Control won the “Best of the JDK” Feature Face-Off

  • 2025: JDK Mission Control won the Tools category

  • 2025: Virtual Threads won the final and took the overall crown

What is that countdown all about?

If you’ve visited hirt.se lately, you may have noticed a new countdown ticking away. Since a few people have reached out to ask what it means, I figured I should probably explain myself. In my defence, the counter was really just added for the family. 

The short answer: when the timer hits zero, my family and I will be moving to a new house. Still in Küssnacht — I can literally see the new place from where we live now, just a few hundred meters away.

So no, it’s not a doomsday clock, and it’s definitely not heralding the rapture. It’s just a move.

Knowing how these things typically go, expect the usual moving-day(s) comedy of errors: internet disruptions, servers bumping down staircases, spontaneous cable sacrifices… If the blog disappears for a bit, that’s why.

JDK Mission Control 9.1.1 Released!

Important: If you are using JDK Mission Control on Windows – update to this version!

The source release of JMC 9.1.1 was tagged 2025-07-11. As per usual it may take some time until vendors have binary builds of JDK Mission Control available.

Mission Control 9.1.1 – New and Noteworthy


General


JMC 9.1.1 – New Release!
This is the latest (July 2025) release of JDK Mission Control. JMC 9.1.1 requires JDK 21+ to run and introduces only a platform update that fixes a critical issue on Windows. This version continues to support connecting to, and parsing JFR recordings from, OpenJDK 8u272+ and Oracle JDK 7u40+, and can open and visualize flight recordings from JDK 7 and 8. JDK Mission Control is available for Windows (x86_64), Mac OS X (ARM and x86_64), and Linux (ARM and x86_64).


Eclipse 4.36 support
The Mission Control client is now built to run optimally on Eclipse 2025-06 and later. To install JDK Mission Control into Eclipse, go to the update site (Help | Install New Software…). The URL to the update site will be vendor specific, and some vendors will instead provide an archive with the update site.


Bug Fixes


Area: JFR
Issue: 7003
Synopsis: The graph view, heatmap view and dependency view now work on Windows

The update to the latest Eclipse platform includes an updated browser component for Windows, which enables the Graph View, Heatmap View, and Dependency View to work properly.


JDK Mission Control 9.1.0 Released!

The source release of JMC 9.1.0 was tagged 2025-01-31. As per usual it may take some time until vendors have binary builds of JDK Mission Control available.

Here are the release notes:

Mission Control 9.1 – New and Noteworthy


General


JMC 9.1 – New Release!
This is the latest (January 2025) release of JDK Mission Control. JMC 9.1 requires JDK 21+ to run and introduces several new features, enhancements, and bug fixes. This version continues to support connecting to, and parsing JFR recordings from, OpenJDK 8u272+ and Oracle JDK 7u40+, and can open and visualize flight recordings from JDK 7 and 8. JDK Mission Control is available for Windows (x86_64), Mac OS X (ARM and x86_64), and Linux (ARM and x86_64).



Eclipse 4.34
The Mission Control client is now built to run optimally on Eclipse 2024-12 and later. To install JDK Mission Control into Eclipse, go to the update site (Help | Install New Software…). The URL to the update site will be vendor specific, and some vendors will instead provide an archive with the update site.



Support for Jolokia JMX Connection and Discovery
JMC now supports connecting to JVMs using Jolokia, and also supports Jolokia’s auto discovery mechanism.



Minor bugfixes and improvements
There are 64 fixes and improvements in this release. Check out the JMC 9.1 Result Dashboard (https://bugs.openjdk.org/secure/Dashboard.jspa?selectPageId=23411) for more information.



Core


API to easily write annotated Java JFR events
The JFR Writer API has been extended with support for custom JFR event types (i.e. classes extending jdk.jfr.Event): it can register new writer types for them, and it can directly accept instances of those types and write them to the recording.



Allow primitive types in converters
Previously a converter could not be used to convert from a primitive type like long. This capability can for example be useful to convert a timestamp (type long) into a human readable string.



Rule for detecting Java process with PID 1
PID 1 is treated specially in Linux, and is assumed to be used by the init process. The init process has some additional responsibilities, such as assuming the responsibility for orphaned processes. The init process is assumed to never quit, and e.g. any signal handler registered for SIGSEGV will not be run. This can be problematic, even when running in containerized environments. The rule will detect if this is the case, and propose a path to fixing it (e.g. using tini).



The halt rule result has been improved
The halt rule will now show a table of the top 5 thread halting VM operations.



Better descriptions for the code cache rule
The descriptions for the code cache rule have been improved, highlighting what the effects of a full code cache can be and what actions to take to increase the size of the code cache.



Add support for the new allocation profiler in rules
The following rules have been updated to be able to use the new ObjectAllocationSample events: AllocationByClassRule, AllocationByThreadRule, AutoBoxingRule.



Java Flight Recorder (JFR)


JMC displaying long value in scientific notation
The TLSHandshakeEvent.java records “Certificate Id” as a long value, and JMC was showing it in scientific notation. This was also true for process identifiers. This has now been fixed.



Showing RSS
JMC will now show the resident set size (RSS), both on the Memory page and on the Java Application page. The RSS can for example be helpful when trying to determine if there is a native memory leak or heavy native memory fragmentation occurring. The RSS graphs can be toggled on and off using the check box legends to the right of the graphs.



Showing thread counts
JMC will now show thread counts on the Java Application page. This can for example be useful when trying to determine if there is a thread leak. The thread count graphs can be toggled on and off using the check box legends to the right of the graphs.

threadcounts


Thread id on hover
On the Threads page, hovering over a thread name will show the thread ID in the tooltip.

threadid


Bug Fixes


Area: Platform
Issue: 8306
Synopsis: Missing plug-ins while installing JMC in an Eclipse IDE

Mission Control would fail to install because some third party libraries were not included in the update site. This has now been fixed.

Area: Core
Issue: 8295
Synopsis: Shutdown event type id was not properly translated for Oracle JDK 8

The shutdown event type id was not properly translated for legacy Oracle JDK 8, leading to the shutdown time and shutdown reason not being properly displayed on the JVM Internals page. This has now been fixed.

Area: Core
Issue: 8287
Synopsis: Fix the JMX protocol extenders

The JMC protocol extension mechanism was broken in JMC 9.0.0 when some code was migrated from application to core. This has now been fixed.

Area: Core
Issue: 8303
Synopsis: NPE when running jfr rules reports

The rules report could throw NPEs when an accessor for an attribute could not be found. That has been fixed, and the faulty query responsible for the reported problem has been fixed as well.

Area: JFR
Issue: 8248
Synopsis: Low contrast for the stacktrace view when running in windows high contrast mode

With high contrast mode enabled in Windows 7 and above the contrast on the Stacktrace View was quite low, with the values being barely visible. This has now been fixed.


Known Issues


Area: JFR
Issue: 7071
Synopsis: JMC can’t attach to jlinked JVMs

This one is still under investigation, but it seems JMC can’t attach to certain jlinked images.

Area: JFR
Issue: 7003
Synopsis: The graph view, heatmap view and dependency view does not work on Windows

This is due to a problem with the Windows based browser component in SWT. We’re hoping for a fix in the component for a future version of the Eclipse platform.

AI Assistant for OpenJDK Contributors

Subtitle: ”Soon You’ll Know What I Did This Summer”

This summer my family and I decided to stay in Küssnacht during my summer vacation. The summers here are lovely, and for a family of six going for a vacation abroad it typically means a lot of time is spent preparing, packing and unpacking, for both directions of the journey. A staycation seemed like a good choice. Most days were spent in the water with the family, and on one occasion, we were joined by a team member and his family. It can get rather hot here, and the lake is the perfect place to cool down.

Me on a SUP overlooking part of my family on another SUP on Lake Lucerne

That said, I did end up in front of my computer a bit in the evenings after everyone else had fallen asleep. There were a few things I really wanted to get done this vacation. One was upgrading my server. Another one was to put some leftover Raspberry Pi 4s and a few new Raspberry Pi 5s to good use in my 19” rack, creating my very own little Kubernetes cluster.

Recently I was also playing with Custom GPTs, which is OpenAI’s way of letting you build little RAG (Retrieval Augmented Generation) systems without writing code. I first played a bit with Custom GPTs on a train ride back from the Datadog office in Paris a little while ago, and it was quite a lot of fun, but for some of the things I wanted it to do, it performed rather badly, and I couldn’t find any fitting APIs available to help my Custom GPT do better.

Since I was setting up a Kubernetes cluster anyway, I thought I might as well hack something together to make my OpenJDK Project Assistant a little bit better.

Setting Up the Cluster

Ah, the pain and the anguish. I ended up using my Apache server as a proxy (since I still want https://hirt.se to serve my old homepage and blog) for an Nginx that is acting as reverse proxy and load balancer to my Kubernetes cluster.

If anyone wants to know how I set up the cluster, that is an entirely different blog that probably already exists in a thousand versions, written by people with more patience to write about configuration than me, so I will not dwell on that. Suffice to say that I consulted many blogs and Claude 3.5 Sonnet to get it all up and running. Here are the steps in short:

  • Format and set up the Raspberries with Raspberry Pi OS (64-bit) (I used the Raspberry Pi Imager).
  • I added my Pis to the hosts file of my server and to my dhcpd config by MAC address, so that I can refer to the Pis by name instead of IP (I’m using dnsmasq).
  • I installed Kubernetes on them, which involved properly enabling cgroups in /boot/firmware/cmdline.txt, setting up containerd properly, getting the Debian package for k8s, installing kubelet, kubeadm and kubectl, turning off swap, and finally using kubeadm init to create the cluster and then applying flannel. After that, I used the join command you get from performing the init to have the other Pis join the cluster.

There was a bunch of trial and error, and there is probably a much easier way. Anyway, I now have a cluster. I can easily deploy. I can easily scale. What more could you ever wish for?

The API exposed to the CustomGPT is available to the world at https://api.hirt.se and described here, should anyone be interested. The implementation is available on GitHub – feel free to contribute if the custom GPT isn’t doing what you think it should be doing or if you simply want to make it more capable.

What Can the OpenJDK Project Assistant Do?

It can do lots of things. It can get information about open PRs and repos without overwhelming GPT with information that will invariably make it fail (this is what typically happened before the api.hirt.se API). For example, here’s getting the oldest open PR for the JDK project:

image (3)

It can summarize the information in the PRs:

image (4)

 

You can also ask for information about the related bug:

image

It can answer questions around people involved in OpenJDK:

image

For people like me, who have a really hard time remembering names, it can help answer questions about people involved in OpenJDK:

image

And about projects and groups:
image

Please let me know if you find it useful!

Summary

  • I recently had a stay-at-home vacation in Küssnacht.
  • Aside from spending a lot of the time in Lake Lucerne with my family, I did have some fun building a Kubernetes cluster out of my Raspberry Pis.
  • I put the cluster to good use to support my OpenJDK Project Assistant Custom GPT.
  • If you’re one of the few OpenJDK committers that also have access to Custom GPTs, or you simply like the idea, feel free to put a star on my API repo, or even better – contribute things you want it to be able to do, or do better!
  • Silly bonus: I also noticed that the silly old slogan image generator I was using on my homepage isn’t available under https (after my various upgrades, I now have permanent redirects to always use https). I couldn’t find any replacement that worked as a drop-in replacement, so I made my own variant that is now also running in my little cluster. 🙂

Picture looking out over lake Lucerne after vacation dinner with my wife.

JDK Mission Control 9.0.0 Released!

The 9.0.0 GA release of JDK Mission Control was just tagged in the JMC repo at GitHub! Since this is the source release, it may still take a bit of time until the downstream vendors release binary builds of JDK Mission Control 9.0.0. I will try to remember to tweet or say something on the JMC Facebook page once the binaries start showing up.


Mission Control 9.0 – New and Noteworthy


General


JMC 9 – New Release!
This is the latest (2024) major release of JDK Mission Control. JMC 9 requires JDK 17+ to run and introduces several new features, enhancements, and bug fixes. This version continues to support connecting to, and parsing JFR recordings from, OpenJDK 8u272+ and Oracle JDK 7u40+, and can open and visualize flight recordings from JDK 7 and 8. JDK Mission Control is available for Windows (x86_64), Mac OS X (ARM and x86_64), and Linux (ARM and x86_64).

jmc


Eclipse 4.30 support
The Mission Control client is now built to run optimally on Eclipse 2023-12 and later. To install JDK Mission Control into Eclipse, go to the update site (Help | Install New Software…). The URL to the update site will be vendor specific, and some vendors will instead provide an archive with the update site.

eclipse


Support for Linux/aarch64
JMC 9 is now built for Linux aarch64.

linuxaarch64


Support for dark mode
JMC 9 now supports dark mode. Go to Preferences, General | Appearance, and select the Dark theme to enable.

darkmode


Minor bugfixes and improvements
There are 118 fixes and improvements in this release. Check out the JMC 9.0 Result Dashboard for more information.

issues


Add user configuration for local JVM refresh interval
Previously the JVM Browser checked every 5000 ms for new JVMs. This can now be configured.

refreshinterval


Core


Better JFR parser performance
Multiple efforts have been made to reduce allocations in the JMC parser, including: reduced allocation of Doubles, reduced allocation rate in ParserStats. Also, when duration events aren’t ordered by their end time (e.g. events which stack so that the last event finishes first, or file reads with overlaps) `DisjointBuilder.add` can be slow because of the linear search for the lane, and then a linear time reordering. This has been improved with a binary search.

jfrperformance


Support checkpoint event sizes beyond u4 limit
The JMC JFR parser now supports checkpoint event sizes beyond the u4 limit.

checkpointsize


Move non-Eclipse dependent classes from org.openjdk.jmc.ui.common to org.openjdk.jmc.common
There were a number of classes previously in jmc.ui.common that would be a great asset to the core distribution (and the third-party applications that consume jmc-core), and these classes now live in jmc.common. Please see JMC-7308 for further information.

reorganize


Move rjmx bundle from application to core
The rjmx classes and related services (FlightRecorderService) are now exposed for third-party application usage. Please see JMC-7069 for further information.

reorganize


Move org.openjdk.jmc.flightrecorder.configuration bundle from application to core
The org.openjdk.jmc.flightrecorder.configuration bundle contains many classes useful for working with JFR, and these are now available in core. Please see JMC-7307 for further information.

reorganize


Java Flight Recorder (JFR)


The Event Browser now supports searching and showing event type ids
Searching in the search bar now also searches event type IDs, and there is also a (by default hidden) column that makes it easy to show the event type IDs for the shown events.

eventtypeid


Add support for enabling jfr on native images
Previously JMC was unable to start Flight Recorder on a GraalVM native image, even if the image had built-in JFR support. This has now been fixed.

native-image-alt


Java based flamegraph visualization
The previous flamegraph visualization was rendered in an embedded browser component (provided by the Eclipse platform). Unfortunately this approach had some drawbacks, the first being that it was a bit slow. This view now uses a Java (Swing) based flamegraph library. Also, the performance of the flame graph model creation has been improved.

flamegraph


Visualization and Rule for FileChannel.force()
The File I/O page has been updated to show force related information. There are two new columns added – Force Count and Update Metadata. Both are hidden by default and can be enabled by right clicking the table. The chart will also include a File Force row. There is a preference setting for the associated file force rule, where the peak duration warning limit can be set. See JMC PR#533 for more information.

fileforce


Rule that checks on G1 pause time target compliance
New rule that looks at the pause time target and compares it to the actual pauses.
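To illustrate the kind of comparison involved, here is a minimal sketch that computes the fraction of pauses exceeding a target. The class and method names are hypothetical, the pause values in the usage note are invented, and the actual JMC rule may well use a different criterion. In G1 the target itself is set with `-XX:MaxGCPauseMillis`.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical helper, not the JMC rule implementation.
final class PauseTargetCheck {

    /** Returns the fraction (0.0 to 1.0) of pauses that were longer than the target. */
    static double fractionOverTarget(List<Duration> pauses, Duration target) {
        if (pauses.isEmpty()) {
            return 0.0;
        }
        long over = pauses.stream()
                .filter(pause -> pause.compareTo(target) > 0)
                .count();
        return (double) over / pauses.size();
    }
}
```

With pauses of 50, 150, 250 and 300 ms and a 200 ms target, two of the four pauses are over the target, so the result is 0.5.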

rule


Rule that looks at finalization statistics
JDK 18 comes with a FinalizationStatistics event that helps users find where in their application finalizers are run. This is important as finalization has been deprecated for removal in a future release. For more information about finalization and its flaws, see https://openjdk.java.net/jeps/421. Even if an application doesn’t implement any finalize() methods, it may rely on third-party libraries that do. Static analysis of third-party libraries using “jdeprscan --for-removal” can be used to list those classes, but it will not tell if they are being used. For example, an application may be missing a call to a close() method, so the resource is cleaned up by the finalizer, which is sub-optimal.
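The missing close() scenario can be avoided by letting the language close the resource deterministically. The following is a small sketch; `TrackedResource` and `CloseExample` are made-up names for illustration, and the only point is that try-with-resources calls close() when the block exits, so nothing is left for a finalizer to clean up.

```java
// Hypothetical resource type used only for illustration.
final class TrackedResource implements AutoCloseable {
    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        // Release the underlying resource deterministically here,
        // instead of waiting for a finalizer to do it.
        closed = true;
    }
}

final class CloseExample {

    /** Uses a resource and reports whether it was closed when the block exited. */
    static boolean closedAfterUse() {
        TrackedResource resource = new TrackedResource();
        try (resource) { // Java 9+ syntax for an effectively final variable
            // ... work with the resource ...
        }
        return resource.isClosed();
    }
}
```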

rule


Rule that detects GC Inverted Parallelism
Rule inspired by the “Inverted Parallelism” analysis in Garbagecat. See JMC-8144 for more information.

rule


Support for the new JPLIS agent events
There is now a new page and rule for loaded JPLIS agents. See JMC-8054 for more information.

agent


Twitter plug-in removed
Due to changes in APIs and cost of maintenance, the Twitter plug-in has been removed.

twitterplugin


Bug Fixes


Area: Agent
Issue: 8045
Synopsis: retransformClasses() doesn’t re-transform all needed classes

The retransformClasses() methods in Agent and AgentController use Class.forName() to try to get the class objects of classes needed to re-transform. This obviously doesn’t work for classes loaded by classloaders different from the one which loads the agent. Those classes would be instrumented if they were loaded after their event probes were defined in the AgentController. But when loaded earlier they would not be instrumented. This has been fixed.
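One classloader-independent way to find classes by name is to search the set of classes the JVM has already loaded. The sketch below is an assumption about how such a lookup could look, not a description of the actual JMC fix; `LoadedClassLookup` is a hypothetical name. In a real agent the array would come from `Instrumentation.getAllLoadedClasses()`, and the matches could be handed to `Instrumentation.retransformClasses(Class<?>...)`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the JMC agent.
final class LoadedClassLookup {

    /**
     * Returns every class in the given array with the given binary name.
     * The same name can match more than once if several classloaders
     * have loaded a class with that name.
     */
    static List<Class<?>> findByName(Class<?>[] loadedClasses, String className) {
        List<Class<?>> matches = new ArrayList<>();
        for (Class<?> candidate : loadedClasses) {
            if (candidate.getName().equals(className)) {
                matches.add(candidate);
            }
        }
        return matches;
    }
}
```

Unlike Class.forName(), this does not depend on which classloader the caller itself was loaded by.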

Area: Agent
Issue: 8048
Synopsis: Agent throws exceptions on missing or empty descriptions

When the description of an event or value is empty or missing, the agent fails with exceptions. This has now been fixed.

Area: Console
Issue: 8154
Synopsis: Some JMX attributes are missing unit specifications in the Console

The missing unit specifications have now been added.

Area: Core
Issue: 8063
Synopsis: IMCFrame Type cache not synchronized

The type cache used in the IMCFrame Type inner class wasn’t synchronized and could cause a concurrent modification exception during e.g. JFR parsing. This has been fixed.
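For readers wondering what a thread-safe cache of this kind can look like, here is a minimal sketch using `ConcurrentHashMap`. It is not the JMC code, and I am not claiming this is how the issue was fixed; `CachedFrameType` is a hypothetical stand-in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a cached frame type; not the JMC class.
final class CachedFrameType {
    private static final Map<String, CachedFrameType> CACHE = new ConcurrentHashMap<>();

    private final String id;

    private CachedFrameType(String id) {
        this.id = id;
    }

    /** Returns the single shared instance for the given id, creating it if needed. */
    static CachedFrameType of(String id) {
        // computeIfAbsent is atomic on ConcurrentHashMap, so concurrent callers
        // asking for the same id get the same instance.
        return CACHE.computeIfAbsent(id, CachedFrameType::new);
    }

    String id() {
        return id;
    }
}
```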

Area: Core
Issue: 8156
Synopsis: JfrRulesReport.printReport does not respect verbosity for text and json

The verbosity flag for text and json reports didn’t work. This has been fixed.

Area: Core
Issue: 8041
Synopsis: JfrRulesReport json reports produce incomplete results

While generating JFR Rules Reports in json format, the results were incomplete. The components “message” and “detailedMessage” were not populated. This has been fixed.

Area: JFR
Issue: 7885
Synopsis: Graphical rendering of dependency view fails due to heap memory drain

Also JMC-7496. The dependency view drains the heap memory and causes out-of-memory exceptions and performance delays. This has been improved.


Known Issues


Area: JFR
Issue: 7071
Synopsis: JMC can’t attach to jlinked JVMs

This one is still under investigation, but it seems JMC can’t attach to certain jlinked images.

Area: JFR
Issue: 7003
Synopsis: The graph view, heatmap view and dependency view does not work on Windows

This is due to a problem with the Windows based browser component in SWT. We’re hoping for a fix in the component for a future version of the Eclipse platform.

What is this thing called Profiling?

If you’re reading this blog (I’m originally posting this on hirt.se), you probably already know who I am and my background. As some of you may know, one of my current responsibilities at Datadog is the Continuous Profiler product. After some discussions with profiling team members, I found it interesting that there are many subtly different ideas about what profiling is – often influenced by what a particular ecosystem is calling profiling.

So, here are my (unsolicited) thoughts around what software profiling is. 😉

What’s in a Word

Profiling literally means trying to understand behaviour. When asked to define Profiling, Google will say this:

“the recording and analysis of a person’s psychological and behavioural characteristics, so as to assess or predict their capabilities in a certain sphere or to assist in identifying categories of people.”

This is analogous to what we typically mean when we talk about the profiling of software, which I would simply state as:

“The recording and analysis of a program’s runtime behaviour.”

We’re simply trying to understand how it behaves (recording data), and why it is behaving that way (analysis of the data), so that we can improve some aspect of the program. There are other techniques that we can use to understand the semantic behaviour of a program, such as using debuggers. In profiling though, we’re trying to understand the runtime behaviour – how the program (your code) is behaving in terms of utilization of constrained or costly resources, such as CPU, memory, locking primitives and other potentially thread latency inducing operations and so on.

As to why we want to understand the runtime behaviour, there are a lot of different reasons these days:

  • To support performance engineering, for example:

    • Reducing the cost of running a program.

    • Making the software run faster (throughput).

    • Making the software run smoother (e.g. fewer latency outliers, fewer STW-interruptions, less variance).

    • Understanding performance deltas between versions of a program.

    • Optimizing resource utilization.

  • To aid in diagnostics, for example:

    • Help discovering and understanding unknown-unknowns.

    • To help explain deviations from normal behaviour, e.g. why there was suddenly a 40 second delay in the execution of a particular program at a particular point in time.

    • Help provide a link to the source code in the context of something else the user cares about, making whatever occurred more actionable.

I would argue that there are a few additional constraints for a profiler to be truly usable these days:

  • The Heisenberg Observer Effect notes that we can’t observe a system without affecting it. That said, a profiler that materially changes the runtime behaviour of the software it is profiling is not very useful – it will make us bark up the wrong tree.

  • Because of this, profilers will be making trade-offs. If the profiler, in the quest to not affect the runtime behaviour of the software it is profiling, misrepresents the runtime behaviour too much, it is also not very useful.

  • Also, since it is notoriously difficult to build a test which will perfectly mirror the system behaviour on Black Friday, 8:00 p.m. in production, these days you typically want a profiler that has a low enough overhead (it will be too costly otherwise), and that is stable enough, that you can use it continuously in production. A continuous profiler is a very powerful tool for finding the unknown-unknowns, especially when you’re in a tough spot.

  • With the uptake of continuous integration and continuous delivery practices (CI/CD), a new version of a program can be published every few hours, or even more often than that. You will want, at least, to have production data around for every version you publish, and probably from multiple different time periods during the process lifecycle. (Of course, with continuous profiling, this point is moot – you have data for all time periods, should something happen.)

Sampling Profilers

Today, most profilers will be sampling profilers. A sampling profiler is a type of profiler that collects data about a program’s execution by periodically sampling the program’s state at specific intervals. In contrast to other profilers, which typically capture every function call or record data at specific runtime events, sampling profilers gather information by intermittently observing the program’s execution state.

This means that the correlation with a certain runtime characteristic will depend on when the sample was taken. To do CPU profiling, simply wait for a thread to use up a certain amount of CPU time, then signal the thread and take a sample. To do allocation profiling, wait until a certain amount of memory has been allocated, then take a sample (in the allocation path of the runtime). To do lock profiling, wait until a monitor has been waited on for a certain amount of time, then take the sample (in the appropriate monitor handling path of the runtime). The reason one must sample is that tracing every method/function invocation will cause too much overhead, quite possibly affecting the runtime behaviour of the application.

A sampling profiler will try to sample uniformly over some quantity, for example every 9 ms of CPU-time consumed. This gives some rather nice statistical properties. It is easy to aggregate the samples and relate them to that quantity – “this method is on average using 456.32 ms of CPU time / s”, “that method is responsible for an allocation rate of 845 MiB / s (which in turn is why your garbage collector is running hot)”.
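As a small sketch of that aggregation, assume each sample records the method on top of the stack and stands for a fixed amount of consumed CPU time. The class name, method name and inputs below are hypothetical; real profilers carry far more data per sample.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aggregator for illustration only.
final class CpuSampleAggregator {

    /**
     * Turns a list of sampled top-frame method names into CPU milliseconds
     * per second of wall-clock time, given how much CPU time each sample
     * represents and how long the observation window was.
     */
    static Map<String, Double> cpuMillisPerSecond(List<String> sampledMethods,
            double cpuMillisPerSample, double windowSeconds) {
        Map<String, Integer> counts = new HashMap<>();
        for (String method : sampledMethods) {
            counts.merge(method, 1, Integer::sum);
        }
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            result.put(entry.getKey(), entry.getValue() * cpuMillisPerSample / windowSeconds);
        }
        return result;
    }
}
```

With four samples in a method called `parse`, 9 ms of CPU time per sample and a 2 second window, `parse` comes out at 4 × 9 / 2 = 18 ms of CPU time per second.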

Note that these sampling profilers do not need to pre-aggregate data to be proper sampling profilers. With the advent and adoption of the pprof format, this is sometimes assumed, but there are plenty of sampling profilers that also capture the time the sample was taken. This makes the samples much more useful for diagnostics. One recent example was a Go service where it wasn’t discovered, until the time stamps were preserved, that the service had bursts of activity for a few tens of milliseconds every 10 seconds, which stood out very well in a heat map, once time stamp information was included per sample. Collecting timestamps per sample (and adding context) helps immensely with diagnostics, but more on this later.
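To show why the timestamps matter, here is a minimal sketch that buckets sample timestamps into fixed-width time slots, which is roughly the data behind one row of a heat map. The names and the timestamps in the usage note are invented for illustration, and timestamps are assumed to be milliseconds relative to the start of the recording.

```java
// Hypothetical helper for illustration only.
final class SampleTimeline {

    /**
     * Counts samples per time bucket. Timestamps are milliseconds since the
     * start of the recording; samples outside the covered range are ignored.
     */
    static int[] countPerBucket(long[] sampleTimesMillis, long bucketMillis, int bucketCount) {
        int[] counts = new int[bucketCount];
        for (long time : sampleTimesMillis) {
            if (time < 0) {
                continue; // before the start of the recording
            }
            long bucket = time / bucketMillis;
            if (bucket < bucketCount) {
                counts[(int) bucket]++;
            }
        }
        return counts;
    }
}
```

For samples at 100, 120, 130, 10050, 10060 and 20010 ms with one-second buckets, only buckets 0, 10 and 20 are non-empty. A pre-aggregated profile would just report six samples and lose that pattern.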

We don’t necessarily need stack traces for this to be profiling. We capture whatever is needed to understand how we came to present the observed behaviour. That said, having no execution context at all, for example a simple performance metric, will usually not be enough to satisfyingly help with the analysis part. It’s usually understood that the stack trace will be one of the primary sets of data included in the sample, since it is indeed very useful in many cases and languages.

Also note that the data production rate can still be hard to understand even with a sampling profiler. For CPU it’s relatively easy – the upper limit will be #cpus * average sample size / sample interval. For allocation sampling, it was hard enough that we (Datadog) introduced a new rate limited allocation profiler in OpenJDK, conceptually using a discrete PID controller to control how to subsample (we can’t use reservoir sampling, since we don’t want to do the work up front, and then decide which samples to keep), and also record the amount of memory allocated since the last sample in each sample to be able to normalize.
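The CPU upper limit is simple enough to write down directly. In the sketch below the class name and the input values are made up; only the formula (#cpus * average sample size / sample interval) comes from the text.

```java
// Hypothetical helper that evaluates the upper bound from the text.
final class SampleRateBound {

    /**
     * Upper bound on CPU sample data produced per second, in bytes:
     * cpus * average sample size / sample interval.
     */
    static double maxBytesPerSecond(int cpus, double avgSampleBytes, double sampleIntervalMillis) {
        double samplesPerSecondPerCpu = 1000.0 / sampleIntervalMillis;
        return cpus * avgSampleBytes * samplesPerSecondPerCpu;
    }
}
```

For example, 8 CPUs, 1000 byte samples and a 10 ms interval give at most 8 × 1000 × 100 = 800,000 bytes per second, and that is only reached if every CPU is fully busy. No such simple bound exists for allocation sampling without rate limiting, since the number of samples then follows the allocation rate of the application.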

Execution Tracers and Event Recorders

Another kind of profiler is the so-called execution tracer. An execution tracer instruments certain operations in the runtime and typically provides events around them, often containing useful diagnostics information. For example, the monitor event in JFR will contain information about the monitor class, the thread holding on to the monitor blocking our hero thread, the address of the monitor (so that we can see if there are multiple monitor instances at play), and more. Note that sampling profilers (especially runtime specific profilers) can capture such information as well, so the difference is mostly in how the sample is taken.

Since emitting data for every invocation of a pathway in the runtime can be prohibitively expensive, tools like JFR will provide configuration options to subsample the data in different ways. This can be useful, for example, for outlier profiling. Examples of outlier profiling are the thread latency events in JFR, such as the monitor enter event, where you can specify to only pick events with a duration longer than a specified threshold.
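With the `jdk.jfr` API, such a threshold can be set when enabling an event. The following is a minimal sketch; the class and method names are hypothetical and the 20 ms used in the usage note is an arbitrary value, but `Recording.enable` and `EventSettings.withThreshold` are part of the standard API.

```java
import java.time.Duration;
import jdk.jfr.Recording;

// Hypothetical helper showing threshold based subsampling in JFR.
final class OutlierRecordingConfig {

    /**
     * Creates a recording (not yet started) that only keeps monitor enter
     * events lasting at least the given threshold.
     */
    static Recording monitorOutliers(Duration threshold) {
        Recording recording = new Recording();
        recording.enable("jdk.JavaMonitorEnter").withThreshold(threshold);
        return recording;
    }
}
```

Calling `OutlierRecordingConfig.monitorOutliers(Duration.ofMillis(20))` and then `start()` on the result would record only the monitor waits that took 20 ms or more. The caller is responsible for stopping and closing the recording.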

Serious drawbacks of execution tracers, even the ones that only capture specific events, are that:

  • Unless there is a subsampling strategy, and even then, the amount of data emitted can be very hard to reason about.

  • Depending on the subsampling strategy, some aggregations will be harder to reason about. For example, if you’re looking at latency outliers, you can paint a picture of where the latency outliers are, but you will not be able to tell what the average latency is.
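The second drawback is easy to demonstrate with made-up numbers. In the sketch below (hypothetical names and durations), a threshold keeps only the long events, and the mean of what is kept says very little about the mean of everything that happened.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class OutlierBias {

    /** Keeps only the durations at or above the threshold, as a threshold filter would. */
    static long[] atLeast(long[] durationsMillis, long thresholdMillis) {
        return Arrays.stream(durationsMillis)
                .filter(duration -> duration >= thresholdMillis)
                .toArray();
    }

    /** Arithmetic mean, or NaN for an empty array. */
    static double mean(long[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }
}
```

For durations of 1, 2, 3, 50 and 100 ms and a 20 ms threshold, only 50 and 100 survive. Their mean is 75 ms, while the true mean of all five is 31.2 ms, and from the recorded events alone there is no way to recover the latter.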

Performance Engineering

Performance engineering is the black art of optimizing the software to do better on some runtime characteristics. To do that, profiling data is analyzed, for example learning where CPU is spent, so that the program can be optimized to use fewer resources (e.g. CPU), which in turn makes it less costly to run (it needs less hardware). Supporting performance engineering is what people most commonly will associate with profiling, and for that use case, it’s most commonly CPU profiling that springs to mind. Many programming languages/runtimes will have the concept of a stack, and ultimately, the underlying hardware will have the concept of hardware threads, and stacks to execute. Therefore, to understand how we came to execute a particular method or function, profilers will often capture a stack trace. Depending on the programming language and environment (such as frameworks used) this can be more or less useful, as practitioners of reactive programming and async frameworks will be all too familiar with. (Project Loom is in a way a response to this problem. Make Stack Traces Great Again! 😉 )

Both sampling profilers and execution tracers can be put to good use to understand the runtime profile of a program. For example, if you demand a little bit more of your distributed tracer and the tracer integrations, and keep tabs of when threads are doing work in the context of a certain trace / span / operationName, you can start aggregating profiling information by endpoint, for example showing the amount of CPU-time spent by a specific endpoint.

endpointprof

Using Profiling for Diagnostics

Of course, when you have samples that contain context and time information, you can also go look at that information when something goes spectacularly wrong. For example, looking at a timeline view of the threads involved in processing a part of a distributed operation that was painfully slow, can reveal a lot. When something goes wrong, and it has not been a priori instrumented by the tracer, logging or some other instrumentation, profiling data is often the last resort for explaining what went wrong. In other words, it can help understand the unknown unknowns.

Here are some screenshots that hopefully will give you some idea of the capability:

Screenshot 2023-06-28 at 17.22.03

Screenshot 2023-06-28 at 17.07.38

Screenshot 2023-06-28 at 17.10.51

For some examples using this feature (in Go), see Felix Geisendörfer’s YouTube video.

Note that these screenshots were from profiling timelines for specific spans in a distributed trace. It’s also possible to look at a timeline for all the threads in the runtime.

Some time ago we had a 21 second span that remained unexplained until the profiling data showed that it was a safepointing VM operation related to dumping the heap – someone had ssh:d into the machine and used jcmd to request a full heap dump. Had it not been for recording outliers for safepointing VM operations, this could have been hard to explain. Profilers purely written in Java and using exceptions to force stack walks, or using the Java stack walking APIs, would never know better. For them it would have been as if the world had stopped, and the only visible effect would have been that the clock suddenly skipped ahead 21 seconds.

TL;DR

  • Profiling is the recording and analysis of the runtime behaviour of a program.

  • Profiling can not only be used for performance engineering, but it can also be a very powerful diagnostic tool.

  • Profiling samples often contain stack traces, but in some paradigms, stack traces will not be the most helpful thing to explain why something ended up being called.

Many thanks to my colleagues at Datadog for all the awesome work they do, and for the feedback on this post.