This is where I write on the internet.

Data Diodes

At ArgoCon today, Thomas Fricke gave a nice talk on Cloud Native Deployments in Air Gapped Environments describing container vulnerability scanning in the German energy sector… and since he didn’t mention data diodes, and since some of my colleagues at Oakdoor/PA Consulting make data diodes for a living, I thought this might be interesting to write about!

Git internals and SHA-1

LWN reminds us that Git still uses SHA-1 by default. Commit or tag signing is not a mitigation, and to understand why you need to know a little about Git’s internal structure. Git internally looks rather like a content-addressable filesystem, with four object types: tags, commits, trees and blobs. Content-addressable means changing the content of an object changes the way you address or reference it, and this is achieved using a cryptographic hash function.
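To make "content-addressable" concrete, here is a minimal sketch of how Git derives the SHA-1 ID of a blob: it hashes a short header (`blob`, the size in bytes, and a NUL byte) followed by the raw content. The function name `gitBlobID` is made up for illustration; the header format is Git's own, and the result should match `git hash-object` for the same bytes.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// gitBlobID returns the SHA-1 object ID Git would assign to a blob with
// this content. (Hypothetical helper name, for illustration only.)
func gitBlobID(content []byte) string {
	h := sha1.New()
	// Git hashes the header "blob <size>\x00" followed by the raw bytes.
	fmt.Fprintf(h, "blob %d\x00", len(content))
	h.Write(content)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Same bytes as `echo hello | git hash-object --stdin`.
	fmt.Println(gitBlobID([]byte("hello\n"))) // ce013625030ba8dba906f756967f9e9ca394464a

	// Changing a single byte of content gives a completely different address.
	fmt.Println(gitBlobID([]byte("hello!\n")))
}
```

Trees, commits and tags are hashed in the same way with their own header type, and each one refers to other objects by ID, which is why the strength of the hash function matters for everything a signature covers.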

Exploring StackRox

At the end of March, the source code to StackRox was released, following the 2021 acquisition by Red Hat. StackRox is a Kubernetes security tool which is now badged as Red Hat Advanced Cluster Security (RHACS), offering features such as vulnerability management, validating cluster configurations against CIS benchmarks, and some runtime behaviour analysis. In fact, it’s such a diverse range of features that I have trouble getting my head round it from the product page or even the documentation.

Reflections on OSSF London 2021

On Tuesday I attended the Open Source Strategy Forum in London, which is a meeting of the Fintech Open Source Foundation (FinOS), part of the Linux Foundation. (There is a New York version coming up in November for those across the pond.) The morning keynotes included Gabriele Columbro introducing the day, then Russell Green highlighting the progress FinOS has made; Liz Rice of CNCF fame with an inspiring talk about contributing back to upstream; an interesting conversation between Nick Cook and Jane Gavronsky about innovations in financial regulation, and finally a presentation from Andrew Agerbak of BCG about how open source can help banks move to public cloud.

GCP - Planning for the Worst

Last month, Google Cloud published Planning for the Worst: Reliability, Resilience, Exit and Stressed Exit in Financial Services. This happens to be a topic I have previously worked on, so I was very interested to hear the perspective that GCP would bring. The wider industry context here is that regulators are very interested in potential risks to the financial system arising from the wholesale migration to cloud computing; in March 2021 the Prudential Regulation Authority in the UK published two supervisory statements closely related to the topic, including Outsourcing and third party risk management, which introduces the concept of a “stressed exit”.

Maglev Load Balancers

Maglev is the codename of Google’s Layer 4 network load balancer, which is referred to in GCP as External TCP/UDP Network Load Balancing. I read the 2016 Maglev paper to better understand various implementation details of Maglev, with an emphasis on security (in particular as it affects availability). Maglev uses a scale-out approach, implemented within clusters built from commodity hardware and achieving n+1 redundancy, which provides greater tolerance to failure than traditional hardware load balancers deployed in pairs (only 1+1 redundancy).

Google Workspace Super Admins

I recently had cause to remind myself of Google Workspace administrator account best practices. Briefly: Set up separate admin accounts, e.g. [email protected] to exist side-by-side with [email protected]. Keep accounts individually identifiable, and ideally ensure there are multiple Super Admins in your organization. Avoid using [email protected] for day-to-day use. One of these Super Admin accounts must be set as the primary account contact, but (due to the previous point) you’re unlikely to be checking the emails very often.

Go Trie Benchmarks

After writing a trie I wanted to better understand its performance, so I wrote some benchmarks against various other Go implementations for storing UK postcodes. At some point since the new year I entirely replaced the implementation from my last post with one that more closely matches the “pure” trie described at the start of TAOCP 6.3; i.e. a table of nodes, consisting of a list of entries, where each node entry can be either a link to another node, or a key (that is, an entire string stored in the trie).
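The "table of nodes" layout described above might look roughly like the sketch below. This is not the implementation from the post: the names `trie`, `entry` and `at` are invented here, each node uses a Go map rather than a fixed-size array of slots, and a zero byte is assumed as the end-of-key marker (so keys must not contain one).

```go
package main

import "fmt"

// entry is one slot in a node: either a link to another node or a whole key.
type entry struct {
	link int    // index of a child node in the table; 0 means "not a link"
	key  string // the entire key stored in this slot when link == 0
}

// trie is a table of nodes; node 0 is the root. Each node maps the next
// byte of the key to an entry.
type trie struct {
	nodes []map[byte]entry
}

func newTrie() *trie { return &trie{nodes: []map[byte]entry{{}}} }

// at returns the byte of key at depth d, or 0 as an end-of-key marker.
func at(key string, d int) byte {
	if d < len(key) {
		return key[d]
	}
	return 0
}

// Insert stores key in the first free slot along its path, pushing an
// existing key one level down whenever two keys compete for a slot.
func (t *trie) Insert(key string) {
	n := 0
	for d := 0; ; d++ {
		c := at(key, d)
		e, ok := t.nodes[n][c]
		switch {
		case !ok: // empty slot: store the whole key here
			t.nodes[n][c] = entry{key: key}
			return
		case e.link != 0: // link: descend to the child node
			n = e.link
		case e.key == key: // already present
			return
		default: // a different key lives here: move it into a new child node
			t.nodes = append(t.nodes, map[byte]entry{})
			child := len(t.nodes) - 1
			t.nodes[child][at(e.key, d+1)] = e
			t.nodes[n][c] = entry{link: child}
			n = child
		}
	}
}

// Contains follows links until it reaches a stored key or an empty slot.
func (t *trie) Contains(key string) bool {
	n := 0
	for d := 0; ; d++ {
		e, ok := t.nodes[n][at(key, d)]
		if !ok {
			return false
		}
		if e.link == 0 {
			return e.key == key
		}
		n = e.link
	}
}

func main() {
	pc := newTrie()
	for _, code := range []string{"SW1A 1AA", "SW1A 2AA", "EC1A 1BB"} {
		pc.Insert(code)
	}
	fmt.Println(pc.Contains("SW1A 2AA"), pc.Contains("SW1A 3AA"))
}
```

Because a key is stored whole as soon as its prefix is unique, a lookup usually stops after a few bytes and finishes with one string comparison.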

Golang Trie

A trie (pronounced either “tree” or “try”) is a data structure typically used to store a set of strings in a way that allows efficient lookup by prefix, unlike a hashmap, where the keys are effectively unordered. This makes it a reasonable choice for an autocompletion system. A possible advantage over binary trees is that the keys are not stored in full in each node, so if you have a large number of strings which often have overlapping prefixes (e.g. “cat”, “cats”, “catastrophe”) then you may be able to save memory.
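As a rough illustration of prefix lookup, here is a minimal sketch of a trie with one child per character. The names `node`, `Insert` and `WithPrefix` are invented for this example and are not taken from the post's implementation; results are sorted only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// node is one trie node; the path from the root spells out a prefix.
type node struct {
	children map[rune]*node
	terminal bool // true if the path to this node is a complete stored string
}

func newNode() *node { return &node{children: map[rune]*node{}} }

// Insert adds key, creating nodes only for the part not already shared
// with previously inserted strings.
func (n *node) Insert(key string) {
	cur := n
	for _, r := range key {
		next, ok := cur.children[r]
		if !ok {
			next = newNode()
			cur.children[r] = next
		}
		cur = next
	}
	cur.terminal = true
}

// WithPrefix returns every stored string that starts with prefix.
func (n *node) WithPrefix(prefix string) []string {
	cur := n
	for _, r := range prefix {
		next, ok := cur.children[r]
		if !ok {
			return nil
		}
		cur = next
	}
	var out []string
	cur.collect(prefix, &out)
	sort.Strings(out)
	return out
}

// collect walks the subtree below n, appending each complete string.
func (n *node) collect(sofar string, out *[]string) {
	if n.terminal {
		*out = append(*out, sofar)
	}
	for r, child := range n.children {
		child.collect(sofar+string(r), out)
	}
}

func main() {
	words := newNode()
	for _, w := range []string{"cat", "cats", "catastrophe", "dog"} {
		words.Insert(w)
	}
	fmt.Println(words.WithPrefix("cat")) // [cat catastrophe cats]
}
```

The three "cat" words share the nodes for c, a and t, which is where the potential memory saving comes from.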

Tim Retout

A solution architect