Graph my Program

Recently (and since forever) I’ve been thinking about how tooling can be improved for programmers. We have essentially been writing source code in editors and IDEs for many years now. Is it possible to improve visualization so that understanding larger codebases becomes simpler? Something like Windows Workflow Foundation (an orchestration framework), but at a larger scale.

I like UML diagrams but hate creating them manually. At the same time, converting everything in a class into a diagram would produce too much detail. So what is needed is some new way to see code: contextual, yet powerful. And does it have to be 2D? How about seeing code in some new 3D view?

There are so many unknowns in this, but it is definitely an interesting problem to work on. So what’s the first step? For now, changing the way I see things. In programs, one library depends on other libraries, and so on. In Java, one method in a class invokes another method. These relationships are naturally expressed as a Graph. Consider a Class and a Method to be Nodes of a Graph. The relationships between these elements are the Edges. For example, a Class -has- a Method, or a Method -overrides- another Method. Other relationships are -extends-, -implements-, -invokes- etc. These are structural relationships that can be generated by parsing source or class files. A graph database can then be used to persist this model and to run queries against it.
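To make the model concrete, here is a minimal in-memory sketch of it in Python. It is only an illustration under my own assumptions: the names Node, CodeGraph, add_edge and targets are hypothetical, the Animal/Dog classes are made up, and a real version would be populated by a parser and stored in a graph database rather than in a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A graph node: either a class or a method (hypothetical model)."""
    kind: str  # "class" or "method"
    name: str  # e.g. "Dog" or "Dog.speak"


class CodeGraph:
    """Stores labelled, directed edges between nodes."""

    def __init__(self):
        # source node -> relation name -> set of target nodes
        self._out = {}

    def add_edge(self, source, relation, target):
        self._out.setdefault(source, {}).setdefault(relation, set()).add(target)

    def targets(self, source, relation):
        # Returns a copy, so callers cannot mutate the graph by accident.
        return set(self._out.get(source, {}).get(relation, set()))


# A made-up two-class program expressed as nodes and edges.
graph = CodeGraph()
animal = Node("class", "Animal")
dog = Node("class", "Dog")
animal_speak = Node("method", "Animal.speak")
dog_speak = Node("method", "Dog.speak")

graph.add_edge(dog, "extends", animal)
graph.add_edge(animal, "has", animal_speak)
graph.add_edge(dog, "has", dog_speak)
graph.add_edge(dog_speak, "overrides", animal_speak)
```

Keeping the relation as a plain label on the edge means new relationship types such as -implements- or -invokes- need no new code, only new edges.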

Some things I would like to explore: dependencies across packages, finding implementations of interfaces or classes, and paths from one class to another. Special method names like main, or library-specific points like @Controller classes in Spring, can be tagged as entry points into a program. Visualizing the dependencies in a project becomes possible (which might convince you to refactor). Will more complex static analysis be possible?
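A few of these queries can be sketched over a plain list of edges. Everything here is assumed for illustration: the class and method names are invented, the edge list stands in for what a graph database would hold, and the path search follows -invokes- edges between methods using an ordinary breadth-first search.

```python
from collections import deque

# Hypothetical edges as (source, relation, target) triples.
EDGES = [
    ("App.main", "invokes", "OrderService.place"),
    ("OrderController.create", "invokes", "OrderService.place"),
    ("OrderService.place", "invokes", "OrderRepository.save"),
    ("LegacyReport.export", "invokes", "OrderRepository.save"),
    ("JdbcOrderRepository", "implements", "OrderRepository"),
    ("InMemoryOrderRepository", "implements", "OrderRepository"),
]

# Nodes tagged as entry points (a main method, a controller method).
ENTRY_POINTS = {"App.main", "OrderController.create"}


def adjacency(edges, relation):
    """Build source -> [targets] for a single relation type."""
    result = {}
    for source, rel, target in edges:
        if rel == relation:
            result.setdefault(source, []).append(target)
    return result


def implementations(interface, edges=EDGES):
    """All nodes with an -implements- edge to the given interface."""
    return sorted(s for s, rel, t in edges if rel == "implements" and t == interface)


def shortest_path(start, goal, edges=EDGES, relation="invokes"):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    graph = adjacency(edges, relation)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def reachable(entry_points, edges=EDGES, relation="invokes"):
    """Every node that can be reached from any entry point."""
    graph = adjacency(edges, relation)
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        for nxt in graph.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

In this made-up data, LegacyReport.export is not reachable from any entry point, which is the kind of finding that might point at dead code. In a graph database each of these would presumably be a short query instead of hand-written traversal.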

And this is not the first time that I have tried something similar. Onyem is a project that traces the execution of Java programs at the method level. In Onyem I stored metadata about classes and methods and then connected the method events to that metadata.

A great talk that I heard recently was SQLite: Protégé of PostgreSQL. The big idea was to think about the model and the data structures rather than the algorithms. In Onyem, I realized that flat files were limiting and so moved to databases, but that was not the best fit either. I always wanted to express more complicated queries, which should be possible using a graph database. So let’s see where this leads.

“The game has changed”.