So, this particular rabbit hole started when I, to my dismay, was running out of disk space on one of my disks, having filled it up with local AI models. I wanted a pretty and nice tool to start reclaiming space, something using a sunburst visualization. After some quick research, I couldn’t find anything that I really liked. Nothing was pretty, except DaisyDisk, and that is only available on Mac. There was nothing out there that allowed you to interact with it whilst data was being collected. Also, each tool I found was committed to just one of many useful visualizations. And I wanted to be able to switch themes dynamically. And coloring. And <insert impossibly long list of wants>…
Since there was a bunch of other more technical things I’d wanted to try for a while anyways, I decided to go ahead and use this project as an excuse to try them:
- Is it possible to reasonably use GraalVM native image to make binary desktop apps with Java?
- Is it possible to use GraalVM with JavaFX?
- If I need to do native bindings, how does that work in GraalVM?
I had already used GraalVM native image for various other projects where it makes sense, such as in my email MCP server, but it had always mostly been for smaller Java projects where I needed quick startup and low memory utilization. To make this one work well, I was pretty sure I’d need to do platform specific magic to make it as fast as can be.
My Friend the AI Model
This project was a proper collaboration between me and my AI models, in an area where they usually are very strong: a new greenfield project. Well defined outcomes. Very strong design cues and up-front technology choices. I worked on the design document first, together with the AI. I laid out incremental goals, like first trying to get the thing to compile with a simple scene, trying to get a simple native binding to work, etc. Before implementing anything, that design/architectural document was discussed, debated, edited and finalized. After that, I provided guidance and help every step of the way, sometimes coding actively when I felt things were hitting a dead end and I could just do the right thing quicker myself.
Design language is pretty hard to communicate. Abstract ideas, like how much space something needs to take to “look just right”, or the particular shade of grey that is required to make something “look good”, are subjective, and it is very hard to get them to a good place without a lot of back and forth.
At the same time, very technical work was sometimes (not always) executed flawlessly, saving tons of time. Did this project go quicker than if I had done it just on my own? Undoubtedly so. All the hunting down of weird linkage problems, setting up the project to build properly on all platforms. All that tedious, no-fun work went massively faster. I spent most of the time going back and forth with the AI model, getting it to do the right thing, and providing it with the ideas and context it needed to solve most of its problems with as little additional context as possible. Which brings me to…
Performance
You’d think that performance in the rendering would not ever become a problem in a project like this, but lo and behold, it did. Every once in a while when scanning my biggest and most diversely populated disks (C: on my Windows box), it would freeze. I first suspected GC. Since I was going to do the analysis with JFR anyways, I also introduced custom JFR events for context, to make it easier to attribute the JFR data to actual real world events.
In the end, I added four of them:
- Scan: One event per scan run (carrying strategy, root path, file count, total bytes, using the @DataAmount annotation to ensure everything is rendered properly in JMC).
- Render: A render event per UI repaint (mode, trigger, node count, canvas dimensions).
- VisualizationActive: An event spanning the time a specific visualization was active, with a repaint counter.
- UserAction: An event per keypress or click.
The events were all stitched together via a per-scan correlation ID, so that it’s easy to compare different scans with each other, and to pivot between a scan and all the renders, visualizations and user actions related to it. For the render event, I disabled stack trace capturing. The stack traces wouldn’t be that exciting given the other events I have, not to mention the additional cost of walking the stacks over and over again.
@Name("se.hirt.diskspace.Render")
@Label("Visualization Render")
@Category({"DiskSpace", "UI"})
@Description("One repaint of the visualization on the JavaFX UI thread.")
@StackTrace(false) // 60+ Hz: per-event stack capture unnecessary work here
public static class RenderEvent extends Event {
    @Label("Mode")
    @Description("SUNBURST or HEATMAP.")
    String mode;

    @Label("Trigger")
    @Description("scan-update / mode-change / resize / ...")
    String trigger;

    @Label("Node Count")
    @Description("Live file count at render time.")
    int nodeCount;

    @Label("Width")
    int widthPx;

    @Label("Height")
    int heightPx;

    @Label("Scan ID")
    @Description("Correlation ID matching the Scan event.")
    long scanId;
}
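To make the correlation ID and the @DataAmount part concrete, here is a minimal sketch of what a Scan event could look like. The event name, the field names and the ScanEvents helper are made up for illustration and are not the actual DiskSpace source; only the jdk.jfr annotations and the begin/end/commit calls are the real JFR API.

```java
import java.util.concurrent.atomic.AtomicLong;

import jdk.jfr.Category;
import jdk.jfr.DataAmount;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

// Hypothetical helper, for illustration only.
final class ScanEvents {
    // One ID per scan; render and user action events would carry the same value.
    private static final AtomicLong NEXT_SCAN_ID = new AtomicLong(1);

    @Name("example.diskspace.Scan") // assumed name, not the real event name
    @Label("Scan")
    @Category({"DiskSpace", "Scan"})
    static class ScanEvent extends Event {
        @Label("Scan ID")
        long scanId;

        @Label("Root Path")
        String rootPath;

        @Label("File Count")
        long fileCount;

        @Label("Total Size")
        @DataAmount // defaults to bytes, so JMC can show it as KiB/MiB/GiB
        long totalBytes;
    }

    /** Records one scan as a single duration event and returns it for inspection. */
    static ScanEvent recordScan(String rootPath, long[] fileSizes) {
        ScanEvent event = new ScanEvent();
        event.scanId = NEXT_SCAN_ID.getAndIncrement();
        event.rootPath = rootPath;
        event.begin();
        for (long size : fileSizes) { // stand-in for the real file system walk
            event.fileCount++;
            event.totalBytes += size;
        }
        event.end();
        event.commit(); // cheap no-op when no recording is running
        return event;
    }
}
```

A Render event emitted during that scan would simply copy the same scanId, which is what makes it possible to pivot between the two in JMC.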
JFR isn’t properly supported in the GraalVM JDK version (21) used by GluonFX, so I ran the project on a regular JVM when measuring. I then ran the recordings through my JMC AI view (coming in JMC 10, I’ll write more about it later), which quickly identified the algorithmic complexity of the squarified treemap renderer as the culprit, and suggested improvements.
Properly chosen contextual events are, as usual, king, and the self-describing nature of JFR makes adding them, with the proper metadata, a breeze for the right AI tools to use. No parser updates or other changes are required for the tools to use them, other than just adding the events.
A Few More Fun Bits
This project has more under-the-hood stuff than fits one post, but a few bits are worth calling out:
The Windows MFT Scanner
Walking a directory tree the obvious way (FindFirstFile / FindNextFile) issues a syscall per file just to read its size. On a C: drive with millions of files, that can add up to minutes. NTFS has a much better idea: the Master File Table is already a giant table of every file’s metadata, and Windows exposes the whole thing via FSCTL_ENUM_USN_DATA. Read it once, get every file on the volume in a few seconds.
The catch is that it needs SeBackupPrivilege, which means UAC elevation. That’s the reason DiskSpace on Windows has an auto-elevation preference (Ask once / Always / Never). Once elevated, DiskSpace will drop all privileges except for the backup one, and scans drop from minutes to seconds on the biggest drives I can find to throw at it.
The macOS Bulk Scanner
The macOS equivalent of the MFT trick is getattrlistbulk(2). Instead of one stat syscall per file, it returns metadata for roughly 500–800 entries per call. Combined with a per-directory ForkJoinPool and parallelism tuned to the storage profile, APFS scans finish very quickly.
The storage-profile detection (the thing that decides “parallel-16 for NVMe, sequential for spinning disks”) walks the IORegistry via DiskArbitration + IOKit to find the underlying media type. Same plumbing diskutil info uses internally, just routed through GraalVM’s @CFunction bindings rather than the command line.
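The parallel part can be sketched in portable Java. This is not the DiskSpace scanner: it uses plain java.nio instead of getattrlistbulk, and the ParallelDirScan class, the one-task-per-directory split and the parallelism parameter are assumptions made for the example. It does show the basic shape, with a ForkJoinPool whose parallelism is chosen by the caller, for instance 16 for NVMe and 1 for a spinning disk.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical sketch: sums file sizes with one fork/join task per directory.
final class ParallelDirScan {

    static final class DirTask extends RecursiveTask<Long> {
        private final Path dir;

        DirTask(Path dir) {
            this.dir = dir;
        }

        @Override
        protected Long compute() {
            long total = 0;
            List<DirTask> subtasks = new ArrayList<>();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    // NOFOLLOW_LINKS: do not wander off through symlinks.
                    BasicFileAttributes attrs = Files.readAttributes(
                            entry, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
                    if (attrs.isDirectory()) {
                        DirTask task = new DirTask(entry);
                        task.fork(); // scan the subdirectory in parallel
                        subtasks.add(task);
                    } else if (attrs.isRegularFile()) {
                        total += attrs.size();
                    }
                }
            } catch (IOException | DirectoryIteratorException e) {
                // Unreadable directory or entry: keep what was counted so far.
            }
            for (DirTask task : subtasks) {
                total += task.join();
            }
            return total;
        }
    }

    /** Scans root with the given parallelism (e.g. 16 for NVMe, 1 for spinning disks). */
    static long scan(Path root, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new DirTask(root));
        } finally {
            pool.shutdown();
        }
    }
}
```

The real win on macOS comes from replacing the per-entry readAttributes call with a bulk call; the task structure around it can stay the same.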
Interact While It’s Still Scanning
The thing that bugged me about every other disk-usage tool I tried: you click “scan”, wait until it finishes, then you can explore. With DiskSpace the sunburst materialises as data streams in. You can drill into a folder before its size is even known, switch tabs, switch visualization modes (sunburst ↔ treemap), or kick off a second scan on a different drive, all without waiting for anything to complete.
The volume picker has the same property. If you have a dodgy SD card that mounts but doesn’t respond, the old picker would block startup for thirty seconds or more. The new one, thanks to one virtual thread per volume root, shows up instantly and fills in each row as it resolves, with the slow ones spinning quietly until they decide to answer.
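The one-virtual-thread-per-volume idea can be sketched like this, assuming Java 21 or later. VolumeProbe, the probe function and the onResolved callback are hypothetical names for the example; in a JavaFX app the callback would hand the result over to the UI thread (for instance with Platform.runLater) rather than touch the row directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch: probe every volume root on its own virtual thread.
final class VolumeProbe {

    /**
     * Starts one virtual thread per root and returns immediately.
     * Each root reports through onResolved whenever its probe answers,
     * so one slow volume cannot hold up the others.
     */
    static <T> List<Thread> probeAll(List<String> roots,
                                     Function<String, T> probe,
                                     BiConsumer<String, T> onResolved) {
        List<Thread> threads = new ArrayList<>();
        for (String root : roots) {
            Thread thread = Thread.ofVirtual()
                    .name("volume-probe-" + root)
                    .start(() -> onResolved.accept(root, probe.apply(root)));
            threads.add(thread);
        }
        return threads;
    }
}
```

The returned threads are only there so that a caller can join or interrupt them; the picker itself never needs to wait on them.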
So Animate, Much LERPs, Wow
The drill-in / drill-out animation on the sunburst (the one where you click a sector and the whole visualization smoothly reflows so that sector becomes the new centre) is built out of nothing but linear interpolation and a single ease-out curve. The layout system computes the before and after sector positions (depth, start angle, sweep angle) and the animation just lerps each value from one to the other over a few hundred milliseconds.
The fun bits aren’t in the lerp itself, they’re in what you give it at the endpoints. New sectors that weren’t visible before the drill, deeper nodes that only show up at the new zoom level, start their animation with sweep = 0 at the midpoint of their final position. So instead of popping into existence, they blossom outward from a single angular point until they reach full width. Departing sectors do the inverse: they collapse to their own midpoint and fade to transparent. The outgoing inner ring stays in place and fades out while the new one fades in over it.
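In code, the lerp and the endpoint trick look roughly like this. It is a sketch, not the DiskSpace implementation: the Sector record and method names are invented for the example, and the cubic ease-out is an assumption, since all that matters above is that there is a single ease-out curve.

```java
// Hypothetical sketch of the drill animation math.
final class DrillAnimation {

    /** Polar layout of one sector: ring depth, start angle and sweep (radians). */
    record Sector(double depth, double startAngle, double sweep) {}

    static double lerp(double from, double to, double t) {
        return from + (to - from) * t;
    }

    /** Cubic ease-out (assumed curve): fast at first, settling towards the end. */
    static double easeOutCubic(double t) {
        double inv = 1.0 - t;
        return 1.0 - inv * inv * inv;
    }

    /** Start pose for a sector that was not visible before: zero sweep at its own midpoint. */
    static Sector enteringStart(Sector target) {
        double midpoint = target.startAngle() + target.sweep() / 2.0;
        return new Sector(target.depth(), midpoint, 0.0);
    }

    /** Sector pose at the given progress, where 0 is the old layout and 1 the new one. */
    static Sector at(Sector from, Sector to, double progress) {
        double clamped = Math.max(0.0, Math.min(1.0, progress));
        double t = easeOutCubic(clamped);
        return new Sector(
                lerp(from.depth(), to.depth(), t),
                lerp(from.startAngle(), to.startAngle(), t),
                lerp(from.sweep(), to.sweep(), t));
    }
}
```

A departing sector is the same call with the endpoints swapped: animate from its current pose to enteringStart of that pose, and fade the alpha with the same t.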
Native Image + JavaFX + Native Bindings
GraalVM native-image + Gluon Substrate + JavaFX gives you a single-binary desktop app with native application startup times. The platform-specific APIs (the MFT scanner, getattrlistbulk, DiskArbitration, IOKit, the Win32 storage queries) are all bound through GraalVM’s @CFunction annotations: a handful of lines of Java per native function, no JNI shim, no separate C side, no -Djava.library.path dance at runtime. The native-image build resolves the symbols at link time against the system libraries, and at runtime the call overhead is comparable to a regular C function call. Clean and fast.
Sub-pixel Arc Culling
A small rendering trick on the sunburst side: when an arc would project to less than one screen pixel, it’s literally invisible — so don’t issue the draw call at all. On deep trees with millions of leaves that drops the per-frame node walk to a fraction of what it would otherwise be. Same principle a frustum-culling game engine uses, just in polar coordinates.
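The test itself is tiny. As a sketch, with the ArcCulling class, the Arc record and the one pixel threshold as illustrative assumptions rather than the actual renderer code: the on-screen length of an arc is its outer radius in pixels times its sweep in radians, and anything shorter than the threshold is skipped.

```java
import java.util.List;

// Hypothetical sketch of the sub-pixel test for sunburst arcs.
final class ArcCulling {

    /** One arc: outer radius in screen pixels and sweep angle in radians. */
    record Arc(double outerRadiusPx, double sweepRadians) {}

    /** Arc length along the outer edge, in pixels (radius times angle). */
    static double arcLengthPx(Arc arc) {
        return arc.outerRadiusPx() * arc.sweepRadians();
    }

    /** True if the arc covers at least minPixels on screen and is worth drawing. */
    static boolean isVisible(Arc arc, double minPixels) {
        return arcLengthPx(arc) >= minPixels;
    }

    /** Counts how many draw calls survive culling for the given arcs. */
    static long countDrawn(List<Arc> arcs, double minPixels) {
        return arcs.stream().filter(arc -> isVisible(arc, minPixels)).count();
    }
}
```

With a 300 pixel radius, a sweep of 0.01 radians is about 3 pixels wide and gets drawn, while 0.001 radians is about 0.3 pixels and is skipped.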
TL;DR
- The AI was immensely helpful getting to the bottom of various linkage problems and ensuring that the project would build on the various platforms.
- AI ended up doing much of the typing for this greenfield project.
- Even the latest models need a ton of supervision and course corrections to yield good results.
- With today’s AI models, building well still takes the same skill as building it yourself.
- Yes, GraalVM native image works well for building native desktop apps with Java. GluonFX bridges it well to JavaFX, and @CFunction makes native bindings painless.
I just released 0.3.1 of DiskSpace here:
https://github.com/thegreystone/diskspace/releases
This release marks the first community contribution! Many thanks to Scott Gerring (@scottgerring, https://github.com/scottgerring) for adding APFS clone deduplication to the macOS bulk scanner, so that scan totals on Mac now match Disk Utility and df properly.
Please give it a spin, and if you like it, consider giving it a star on GitHub. This will help me get free signing of the DiskSpace artifacts from the SignPath Foundation in the future.



