Feb 27, 2025

DuckDB tricks - renaming fields in a SELECT * across tables

I was exploring some new data, joining across multiple tables, and doing a simple SELECT * as I’d not worked out yet which columns I actually wanted. The issue was, the same field name existing in more than one table. This meant that in the results from the query, it wasn’t clear which field came from which table:

Feb 3, 2025

Interesting links - February 2025

Here’s a bunch of interesting links and articles about data that I’ve come across recently.

Dec 11, 2024

Disabling Vale Linting Selectively in Asciidoc

I’m a HUGE fan of Docs as Code in general, and specifically tools like Vale that lint your prose for adherence to style rule.

One thing that had been bugging me though was how to selectively disable Vale for particular sections of a document. Usually linting issues should be addressed at root: either fix the prose, or update the style rule. Either it’s a rule, or it’s not, right?

Sometimes though I’ve found a need to make a particular exception to a rule, or simply needed to skip linting for a particular file. I was struggling with how to do this in Asciidoc. Despite the documentation showing how to, I could never get it to work reliably. Now I’ve taken some time to dig into it, I think I’ve finally understood :)

Sep 2, 2024

Current 2024 - 5k Fun Run (or Walk)

At Current 24 a few of us will be going for an early run (or walk) on Tuesday morning. Everyone is very welcome!

May 22, 2024

How I Try To Keep Up With The Data Tech World (A List of Data Blogs)

I do my best to try and keep, if not abreast of, then at least aware of what’s going on in the world of data. That includes RDBMS, Event streaming, stream processing, open source data projects, data engineering, object storage, and more. If you’re interested in the same, then you might find this blog useful, because I’m sharing my sources :)

May 3, 2024

ngrok DNS headaches

Let’s not bury the lede: it was DNS. However, unlike the meme ("It’s not DNS, it’s never DNS. It was DNS"), I didn’t even have an inkling that DNS might be the problem.

I’m writing a new blog about streaming Apache Kafka data to Apache Iceberg and wanted to provision a local Kafka cluster to pull data from remotely. I got this working nicely just last year using ngrok to expose the broker to the interwebz, so figured I’d use this again. Simple, right?

Nope.

Apr 26, 2024

How to stop AWS CLI clearing the screen

After a break from using AWS I had reason to reacquaint myself with it again today, and did so via the CLI. The AWS CLI is pretty intuitive and has a good helptext system, but one thing that kept frustrasting me was that after closing the help text, the screen cleared—so I couldn’t copy the syntax out to use in my command!

The same thing happened when I ran a command that returned output - the screen cleared.

Here’s how to fix either, or both, of these

Mar 15, 2024

🏃🚶 The unofficial Kafka Summit London 2024 Run/Walk 🏃🚶

At this year’s Kafka Summit I’m planning to continue the tradition of going for a run (or walk) with anyone who’d like to join in. This started back at Kafka Summit San Francisco in 2019 over the Golden Gate Bridge and has continued since then. Whilst London’s Docklands might not offer quite the same experience it’ll be fun nonetheless.

Mar 15, 2024

Apache Flink talks at Kafka Summit London 2024

This year Kafka Summit London includes a dedicated track for talks about Apache Flink. This reflects the continued rise of interest and use of Apache Flink in the streaming community, as well as the focus that Confluent (the hosts of Kafka Summit) has on it.

I’m looking forward to being back at Kafka Summit. I will be speaking on Tuesday afternoon, room hosting on Wednesday morning, and hanging out at the Decodable booth in between too.

Here’s a list of all the Flink talks, including the talk, time, and speaker. You find find more details, and the full Kafka Summit agenda, here.

Jan 17, 2024

Antora Deployment to Cloudflare Across Private Repositories with GitHub Actions

At Decodable we migrated our docs platform onto Antora. I wrote previously about my escapades in getting cross-repository authentication working using Private Access Tokens (PAT). These are fine for just a single user, but they’re tied to that user, which isn’t a good practice for deployment in this case.

In this article I’ll show how to use GitHub Apps and Installation Access Tokens (IAT) instead, and go into some detail on how we’ve deployed Antora. Our GitHub repositories are private which makes it extra-gnarly.

rmoff’s random ramblings

✨ Data Engineering, Kafka, and other random geekery 🤓