Under the hood
A small performance fix that turned into test-case hell
go53 issue #45 was tagged "Low priority", and it looked the part: stop grabbing a lock every time we read the live config on the DNS hot path. The fix itself was small and standard. The cleanup it set off across the test suite turned out to be the larger half of the job — which is the part worth writing down.
The problem: a tiny lock, taken a lot
go53 keeps its runtime configuration in a single LiveConfig snapshot. Every reader went through GetLive(), which took a read lock and returned a copy:
func (cm *ConfigManager) GetLive() LiveConfig {
	cm.mu.RLock()
	defer cm.mu.RUnlock()
	return cm.Live
}
Individually cheap. The catch is that a single DNS query calls it several times (DNSSEC checks, EDNS sizing, NSID, rate-limiting), and an RWMutex read lock isn't free under load. Taking the read lock atomically updates a reader counter inside the mutex, and that counter sits in one cache line shared by every core. Under many concurrent queries, every reader contends on that one line, and the "read lock" quietly becomes a scalability ceiling.
The fix: read-mostly wants an atomic pointer
Config is written rarely and read constantly — the textbook case for a lock-free copy-on-write swap. We moved the snapshot behind an atomic.Pointer[LiveConfig]. Readers do a single atomic load and a value copy; no mutex, no contention. Writers serialize on a dedicated mutex and publish a brand-new snapshot, so anything GetLive() hands out is immutable once it's seen.
type ConfigManager struct {
	Base    BaseConfig
	writeMu sync.Mutex                 // serializes writers (copy-on-write)
	live    atomic.Pointer[LiveConfig] // lock-free reads
}

func (cm *ConfigManager) GetLive() LiveConfig {
	if p := cm.live.Load(); p != nil {
		return *p
	}
	return LiveConfig{}
}
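The snippet above only shows the read side. The write side might look roughly like the sketch below. It is not go53's actual code: the SetLive signature is assumed, UpdateLive is a hypothetical helper, and LiveConfig is a two-field stand-in in which only DNSSECEnabled comes from the real struct.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// LiveConfig is a stand-in for the real struct. MaxUDPSize is hypothetical.
type LiveConfig struct {
	DNSSECEnabled bool
	MaxUDPSize    int
}

type ConfigManager struct {
	writeMu sync.Mutex                 // serializes writers (copy-on-write)
	live    atomic.Pointer[LiveConfig] // lock-free reads
}

// GetLive returns a copy of the current snapshot, or the zero value if
// nothing has been published yet.
func (cm *ConfigManager) GetLive() LiveConfig {
	if p := cm.live.Load(); p != nil {
		return *p
	}
	return LiveConfig{}
}

// SetLive publishes cfg as a new snapshot (assumed signature). cfg is passed
// by value, so the stored pointer refers to a private copy.
func (cm *ConfigManager) SetLive(cfg LiveConfig) {
	cm.writeMu.Lock()
	defer cm.writeMu.Unlock()
	cm.live.Store(&cfg)
}

// UpdateLive is a hypothetical read-modify-write helper: it copies the
// current snapshot, mutates the copy, and publishes it. Holding writeMu for
// the whole sequence keeps two concurrent updates from losing each other.
func (cm *ConfigManager) UpdateLive(mutate func(*LiveConfig)) {
	cm.writeMu.Lock()
	defer cm.writeMu.Unlock()
	next := LiveConfig{}
	if p := cm.live.Load(); p != nil {
		next = *p
	}
	mutate(&next)
	cm.live.Store(&next)
}

func main() {
	var cm ConfigManager
	cm.SetLive(LiveConfig{DNSSECEnabled: true, MaxUDPSize: 1232})

	before := cm.GetLive() // a reader's copy taken before the update
	cm.UpdateLive(func(c *LiveConfig) { c.DNSSECEnabled = false })
	after := cm.GetLive()

	// The earlier copy is unaffected by the later write.
	fmt.Println(before.DNSSECEnabled, after.DNSSECEnabled)
}
```

One caveat with this pattern: the value copy is shallow. If the config struct holds slices or maps, writers need to clone those as well, or readers could observe a mutation through shared backing storage.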
While we were there, we also stopped re-fetching the snapshot mid-request: handleRequest now reads live once and reuses it. Fewer reads, and the reads that remain are nearly free.
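As a rough illustration of the read-once pattern, here is a minimal sketch. The helper names, the doBit parameter, and the NSID and MaxUDPSize fields are all made up for the example; go53's real handleRequest is certainly more involved.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// LiveConfig is a stand-in; NSID and MaxUDPSize are hypothetical fields.
type LiveConfig struct {
	DNSSECEnabled bool
	NSID          string
	MaxUDPSize    int
}

var live atomic.Pointer[LiveConfig]

// getLive returns a copy of the current snapshot (zero value if unset).
func getLive() LiveConfig {
	if p := live.Load(); p != nil {
		return *p
	}
	return LiveConfig{}
}

// The helpers receive the snapshot as an argument instead of fetching it
// themselves, so they add no further loads.
func wantsDNSSEC(cfg LiveConfig, doBit bool) bool { return cfg.DNSSECEnabled && doBit }
func ednsSize(cfg LiveConfig) int                 { return cfg.MaxUDPSize }

// handleRequest takes one snapshot at the top and threads it through.
func handleRequest(doBit bool) string {
	cfg := getLive() // the only config read for this request
	return fmt.Sprintf("dnssec=%t udp=%d nsid=%q",
		wantsDNSSEC(cfg, doBit), ednsSize(cfg), cfg.NSID)
}

func main() {
	live.Store(&LiveConfig{DNSSECEnabled: true, NSID: "ns1", MaxUDPSize: 1232})
	fmt.Println(handleRequest(true))
}
```

Besides saving loads, this gives each request a consistent view: if a writer swaps the snapshot mid-request, the request still finishes against the config it started with.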
Did it help? Yes — and it scales the right way
A micro-benchmark of the two patterns with the real LiveConfig struct (20-core machine, zero allocations either way):
| Scenario | RWMutex (old) | atomic.Pointer (new) | Speedup |
|---|---|---|---|
| Serial, no contention | 25.2 ns/op | 13.3 ns/op | ~1.9× |
| Parallel, 4 cores | 45.0 ns/op | 4.4 ns/op | ~10× |
| Parallel, 20 cores | 49.4 ns/op | 2.6 ns/op | ~19× |
The shape matters more than any single number. The RWMutex got slower as we added cores (25.2 ns → 49.4 ns in the table): that is the cache line bouncing between CPUs. The atomic version got faster per op (13.3 ns → 2.6 ns), because independent reads have nothing to contend over. Negative scaling became positive scaling: roughly 3× on 4 cores and 5× on 20.
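For anyone who wants to reproduce the shape of that comparison, something along these lines should do. It is not the harness behind the table: it uses a two-field stand-in for LiveConfig and hypothetical type names, so the absolute numbers will differ. It relies on testing.Benchmark and b.RunParallel from the standard library so that it runs as a plain program.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
	"testing"
)

// LiveConfig is a small stand-in for the real struct.
type LiveConfig struct {
	DNSSECEnabled bool
	MaxUDPSize    int
}

// lockedConfig mirrors the old pattern: RWMutex plus a value copy.
type lockedConfig struct {
	mu   sync.RWMutex
	live LiveConfig
}

func (c *lockedConfig) get() LiveConfig {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.live
}

// atomicConfig mirrors the new pattern: atomic load plus a value copy.
type atomicConfig struct {
	live atomic.Pointer[LiveConfig]
}

func (c *atomicConfig) get() LiveConfig {
	if p := c.live.Load(); p != nil {
		return *p
	}
	return LiveConfig{}
}

// sink keeps the compiler from discarding the reads. It is touched once per
// goroutine, not once per iteration, so it adds no contention of its own.
var sink atomic.Int64

// bench runs get from GOMAXPROCS goroutines and reports the result.
func bench(get func() LiveConfig) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.RunParallel(func(pb *testing.PB) {
			n := 0
			for pb.Next() {
				if get().DNSSECEnabled {
					n++
				}
			}
			sink.Add(int64(n))
		})
	})
}

func main() {
	cfg := LiveConfig{DNSSECEnabled: true, MaxUDPSize: 1232}
	locked := &lockedConfig{live: cfg}
	lockfree := &atomicConfig{}
	lockfree.live.Store(&cfg)

	for _, procs := range []int{1, 4, runtime.NumCPU()} {
		prev := runtime.GOMAXPROCS(procs)
		fmt.Printf("GOMAXPROCS=%d  rwmutex: %v  atomic: %v\n",
			procs, bench(locked.get), bench(lockfree.get))
		runtime.GOMAXPROCS(prev)
	}
}
```

The thing to watch is how each column moves as GOMAXPROCS grows, not the absolute values on any one machine.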
…and then the test suite had opinions
This is where the "Low priority" label started to look optimistic. The old config exposed a public Live field, and 180 lines across 20 test files in 8 packages had been reaching in and setting it directly:
config.AppConfig.Live.DNSSECEnabled = false // no longer compiles
You can't set a field through an atomic.Pointer, so every one of those call sites had to move to GetLive(), a new SetLive(), or a clearly-labelled, test-only LiveForTest() helper. Five struct literals that set Live: directly had to be unrolled by hand. A one-line production change pulled twenty test files along with it.
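To make the migration concrete, here is a sketch of what a rewritten call site could look like. The SetLive signature is assumed, and overrideLive is a hypothetical helper invented for this example; it is not the LiveForTest() helper mentioned above, whose shape isn't shown here.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// LiveConfig is a one-field stand-in for the real struct.
type LiveConfig struct{ DNSSECEnabled bool }

type ConfigManager struct {
	writeMu sync.Mutex
	live    atomic.Pointer[LiveConfig]
}

func (cm *ConfigManager) GetLive() LiveConfig {
	if p := cm.live.Load(); p != nil {
		return *p
	}
	return LiveConfig{}
}

// SetLive publishes cfg as a new snapshot (assumed signature).
func (cm *ConfigManager) SetLive(cfg LiveConfig) {
	cm.writeMu.Lock()
	defer cm.writeMu.Unlock()
	cm.live.Store(&cfg)
}

// overrideLive (hypothetical) applies mutate to a copy of the current
// config, publishes it, and returns a func that restores the old snapshot.
func overrideLive(cm *ConfigManager, mutate func(*LiveConfig)) (restore func()) {
	prev := cm.GetLive()
	next := prev
	mutate(&next)
	cm.SetLive(next)
	return func() { cm.SetLive(prev) }
}

func main() {
	cm := &ConfigManager{}
	cm.SetLive(LiveConfig{DNSSECEnabled: true})

	// Old style: config.AppConfig.Live.DNSSECEnabled = false
	restore := overrideLive(cm, func(c *LiveConfig) { c.DNSSECEnabled = false })
	fmt.Println(cm.GetLive().DNSSECEnabled) // false while the override is active

	restore()
	fmt.Println(cm.GetLive().DNSSECEnabled) // true again
}
```

In a real test the restore func would typically be handed to t.Cleanup, so one test's override cannot leak into the next.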
Running the suite under Go's race detector also surfaced a data race in the DNSSEC tests — background signing goroutines reading config while a test mutated it. We checked the previous commit and confirmed the race was already present, so this change didn't introduce it; it just made it easy to see. It's logged for a separate follow-up.
Takeaways
- Read-mostly state suits atomic pointers: lock-free reads, immutable snapshots, copy-on-write writes — faster and safer at once.
- Benchmark the shape, not the spot: the win here isn't a few nanoseconds, it's a curve that bends the right way as cores grow.
- "Low priority" can be misleading: the implementation took minutes; the blast radius into the test suite took the rest of the afternoon. Encapsulation deferred is encapsulation billed later.
A small fix, a clear result, and a useful reminder about where the real cost of a change lives.