The seL4 organisation on GitHub uses git-repo to manage multiple source repositories, and so there are a large number of projects to get your head around when figuring out the ecosystem.

As an experiment, I have taken the various manifest files across the org, and constructed a graph based on how frequently each pair of repositories is mentioned in a manifest together. See below:

[This may render badly when syndicated outside of my blog; and also on small screens. And probably large screens. I’ve attempted to make sure there’s a non-JS fallback – on my site with JS enabled, if you hover over a node, it should highlight connected nodes.]

The colouring of the nodes is mostly manual; I experimented with graph clustering algorithms but have not found a satisfactory result so far. Still, some clusters are obvious:

  • Kernel – the seL4 microkernel proper. This often but not always co-exists with the main cluster of core libraries, but it is pulled away slightly by the verification and microkit manifests.

  • Verification – the verification repositories (l4v, HOL, graph-refine, polyml, isabelle) form a very distinct group. These are connected only to the seL4 microkernel itself, which is the only component formally verified.

  • Microkitmicrokit is a newer operating system framework that does not use CAmkES, so stands apart from the rest of the pack. I chose to scope this work to the seL4 org, so the LionsOS ecosystem and sDDF which are maintained by Trustworthy Systems are not shown. Also not linked is rust-sel4, because this modern world isn’t using git-repo in the main to manage its repositories.

  • RefOS – I’d not come across refos before, but it appears to be an example OS from 2021 built on the seL4 kernel.

It’s quite hard to pull apart the CAmkES framework and the core libraries; there are definitely some which are more associated with VM management, but the overall shape of this co-occurence data is a messy ball in the middle with some outliers in orbit. One observation is that camkes is correctly identified as more peripheral than camkes-tool, which contains the actual core CAmkES code.

Reflecting on this approach, in hindsight I’m surprised that using co-occurences worked as well as it did – there was no attempt to actually inspect the code and find direct mentions of other code e.g. library header dependencies. As the newer microkit effort largely eschews git-repo, better results might be found by actually taking that more detailed approach, so that graph edges could represent real dependencies between two packages. Additionally, this could allow diving into the various libraries held in the different ’libs’ repos, to get a more granular graph of relationships between them.

However, I think I spent more time on making it possible to render graphviz graphs easily on my blog than actually gaining any insight into the codebase!