David R. MacIver's Blog: Programmer at Large: Does that work?

Programmer at Large: Does that work?

13 May 2017

This is the latest chapter in my web serial, Programmer at Large. The first chapter is here and you can read the whole archives here or on the Archive of Our Own mirror. This chapter is also mirrored at Archive of Our Own.

Sam and I worked in companionable silence for about five kiloseconds, but eventually they had to go lead a Krav Maga session, so we kissed each other goodbye. Kimiko was still flagged as busy, so I took the opportunity to retreat to a pod to do some real work.

Uh, not that social network analysis isn’t real work you understand, it’s just not exactly in my remit. It’s a useful skill to keep your hand in on, but I try to avoid the trap of becoming a generalist.

I reviewed where I was on my current task: I still didn’t know much about what was going wrong, but had a hint that it was something to do with temperature events.

At this point I could do an exhaustive analysis and try to binary search out the exact problem. It would take ages and require a lot of detail work, but it would almost certainly work in narrowing down at least one real problem.

But I wasn’t really in the right frame of mind for detail work, so I decided to gamble instead.

“Ide, show me something interesting to do with the current task.”

“I have a temperature control program marked as critical that exhibits anomalous command output prior to the event and currently has a failing build. Is that suitable?”

“Perfect.”

Almost too perfect in fact. I wondered why that hadn’t flagged up before.

“How many other equally interesting things could you have shown me?”

“113”

Right.

“OK, call up the specs for this program.”

Subject: Stochastic Temperature Control Feedback Regulation Unit 3
Origin: New Earth 2
Language: Go#
Importance: Critical
Reliability: High
Obsolescence: High
Fragility: High
Notes: It might be best to leave now, you probably shouldn't touch this.

That wasn’t encouraging. Also, I wasn’t thrilled by the idea of learning about another weird Grounder programming language.

I sighed. Still, I wasn’t just going to stop without looking into it a bit.

“Wiki, show me the specs for Go#”

Subject: Go#
Category: Programming language, text based.
Lineage: Pre-diaspora, began as a dialect of Go in 2021.
Common Tags: Archaic, Esoteric, Moderate Complexity, Evolutionary Dead End, Poorly Thought Out.
Normalised Rating: Please tell me you're not still using this.

Definitely less than encouraging.

“OK, show me the failing build step.”

Ide displayed a bunch of code for me. I can’t say I understood any of it, but one thing stood out.

“Wiki, what’s Hypothesis?”

“Hypothesis is a generic term for a family of testing tools that were popular for a period of approximately five gigaseconds before and around the diaspora. They work by generating random data to run a conventional unit test against.”

“Wow, really? Does that work?”

“Significantly better than the methods that predate it. Unassisted humans tend to be very bad at writing correct software, which results in many shallow bugs that simple random testing can uncover. However, it has largely been supplanted by modern symbolic execution and formal methods, as the number of bugs it finds grows logarithmically.”

“Ide, how long did it take to find this particular bug?”

“Approximately nine gigaseconds of compute time.”

Wow. This code had run for most of a crew lifespan before eventually finding a bug. That was rather adorable. I vaguely saluted whatever Grounder was responsible for this thing, and reflected on how grateful I was to not be them and to have access to modern tooling.

“How long would it have taken given appropriate formal methods?”

“Difficult to estimate due to low availability of formal models for this language. However, based on the execution trace, this is a known bug in OpenSSH. That bug was found within the first four seconds of active testing under a more modern test suite, written as a student exercise in a class on software archaeology on the Star Struck three gigaseconds ago coordinated time.”

That was about what I expected.

“Wiki, what’s OpenSSH?”

“It is a secure network communication protocol, originally designed to provide remote access to a system via a local PTY.”

“What’s a PTY?”

“Warning: This information has been tagged as a memetic hazard, subcategory can of worms. Do you wish to proceed?”

I blinked. That was unexpected. I was almost curious enough to proceed anyway, but these warnings were usually worth taking seriously and they didn’t normally get attached to interesting awful information. Besides, this really wasn’t that relevant.

“No, that’s fine.”

I thought for a bit. I was pretty sure why this known bug was still present, but decided to check anyway.

“Ide, why has this bug not been fixed despite being known?”

“Due to the rating of this process as high in all of criticality, stability and fragility, it was flagged as an ultra-low priority fix.”

That’s what I thought. It works, but trying to fix it is probably more likely to break it, and then the plumbing backs up. Not unlike the problem I’d run into with the ramscoop, but the difference was that this one was in my remit.

“Is this bug being triggered in the wild?”

“Unknown as to whether the particular sequence of events Hypothesis has found is present in the wild, but logs indicate that the underlying OpenSSH bug is triggered.”

“Is it being triggered in the vicinity of the anomalous event?”

“Yes.”

OK. So this was definitely a plausible culprit.

“Can we run a simulation of what would happen if this bug was fixed?”

“Warning: At current resource availability, such a simulation would take 0.8 gigaseconds to complete.”

“Ugh. Show me the cost curve.”

I looked at the cost curve. Right. All those game theory simulations the programmers at arms were running were taking up most of our spare capacity, and I didn’t have budget to outbid them on anything except the very tiny subset of capacity I had priority on.

Which wasn’t wholly unfair. But I now had evidence that a critical system was misbehaving and might be triggering an anomalous plumbing event, which was serious. Granted it was less important than the fate of humanity, but it might be higher priority. Time to try and free up some budget for simulation.

I sighed, and started putting together a report.

Next chapter: Can we speed that up?

