Nov 18

Language engineering for great justice

Whole-systems engineering, when you get good at it, goes beyond being entirely or even mostly about technical optimizations. Every artifact we make is situated in a context of human action that widens out to the economics of its use, the sociology of its users, and the entirety of what Austrian economists call “praxeology”, the science of purposeful human behavior in its widest scope.

This isn’t just abstract theory for me. When I wrote my papers on open-source development, they were exactly praxeology – they weren’t about any specific software technology or objective but about the context of human action within which technology is worked. An increase in praxeological understanding of technology can reframe it, leading to tremendous increases in human productivity and satisfaction, not so much because of changes in our tools but because of changes in the way we grasp them.

In this, the third of my unplanned series of posts about the twilight of C and the huge changes coming as we actually begin to see forward into a new era of systems programming, I’m going to try to cash that general insight out into some more specific and generative ideas about the design of computer languages, why they succeed, and why they fail.

Nov 13

The big break in computer languages

My last post (The long goodbye to C) elicited a comment from a C++ expert I was friends with long ago, recommending C++ as the language to replace C. Which ain’t gonna happen; if that were a viable future, Go and Rust would never have been conceived.

But my readers deserve more than a bald assertion. So here, for the record, is the story of why I don’t touch C++ any more. This is a launch point for a disquisition on the economics of computer-language design, why some truly unfortunate choices got made and baked into our infrastructure, and how we’re probably going to fix them.

Along the way I will draw aside the veil from a rather basic mistake that people trying to see into the future of programming languages (including me) have been making since the 1980s. Only very recently do we have the field evidence to notice where we went wrong.

Nov 07

The long goodbye to C

I was thinking a couple of days ago about the new wave of systems languages now challenging C for its place at the top of the systems-programming heap – Go and Rust, in particular. I reached a startling realization – I have 35 years of experience in C. I write C code pretty much every week, but I can no longer remember when I last started a new project in C!

If this seems completely un-startling to you, you’re not a systems programmer. Yes, I know there are a lot of you out there beavering away at much higher-level languages. But I spend most of my time down in the guts of things like NTPsec and GPSD and giflib. Mastery of C has been one of the defining skills of my specialty for decades. And now, not only do I not use C for new code, I can’t clearly remember when I stopped doing so. And…looking back, I don’t think it was in this century.

That’s a helluva thing to have sneak up on me when “C expert” is one of the things you’d be most likely to hear if you asked me for my five most central software technical skills. It prompts some thought, it does. What future does C have? Could we already be living in a COBOL-like aftermath of C’s greatest days?

Jul 17

A teaching story

The craft of programming is not a thing easily taught. It’s not so much that the low-level details like language syntaxes are difficult to convey; it’s more that (as I’ve written before) “the way of the hacker is a posture of mind”.

The posture of mind is more essential than the details. I only know one way to teach that, and it looks like this…

Jul 12

Fuzzbombing: abort() calls for great justice!

The Colossal Cave Adventure restoration is pretty much done now. One thing we’re still working on is getting test coverage of the last few corners in the code. Because when you’re up to 99.7% the temptation to push for that last 0.3% is really strong even if the attempt is objectively fairly pointless.

What’s more interesting is the technique one of our guys came up with for getting us above about 85% coverage. After that point it started to get quite difficult to hand-craft test logs to go to the places in the code that still hadn’t been exercised.

But NHorus, aka Petr Vorpaev, is expert at fuzz testing; we’ve been using American Fuzzy Lop, a well-designed, well-documented, and tasteful tool that I highly recommend. And he had an idea.

Want to get a test log that hits a particular line? Insert an abort() call right after it and rebuild. Then unleash the fuzzer. If you’ve fed it a good test corpus that gets somewhere near your target, it will probably not take long for the fuzzer to random-walk into your abort() call and record that log.

Then watch your termination times. For a while we’d generally get a result within hours, but we eventually hit a break after which the fuzzer would run for days without result. That knee in the curve is your clue that the fuzzer has done everything it can.

I dub this technique “fuzzbombing”. I think it will generalize well.
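
For readers who want to see the shape of the idea, here is a toy sketch in Python. It is not how we ran it: a seeded random mutator stands in for American Fuzzy Lop, a `Bomb` exception stands in for abort(), and the `play` function, the command vocabulary, and the corpus are all made up for illustration.

```python
import random


class Bomb(Exception):
    """Stand-in for abort(): raised when execution reaches the target line."""


def play(log):
    """Toy program under test: a two-step puzzle driven by a command log."""
    have_key = False
    for command in log:
        if command == "take key":
            have_key = True
        elif command == "unlock" and have_key:
            # This is the hard-to-reach line we want a test log for,
            # so the bomb is planted right here.
            raise Bomb()


def fuzzbomb(corpus, vocabulary, tries=10_000, seed=0):
    """Mutate corpus logs at random until one of them trips the bomb."""
    rng = random.Random(seed)
    for _ in range(tries):
        log = list(rng.choice(corpus))
        # Crude mutation: append a couple of random commands.
        log += [rng.choice(vocabulary) for _ in range(2)]
        try:
            play(log)
        except Bomb:
            return log   # this log reaches the target line; save it as a test
    return None          # no hit within the budget


VOCABULARY = ["look", "take key", "unlock", "north"]   # hypothetical commands
CORPUS = [["look"], ["north", "look"]]                 # hypothetical seed logs

hit = fuzzbomb(CORPUS, VOCABULARY)
print(hit)
```

A `None` result plays the role of the fuzzer running for days without a hit: the budget ran out before anything reached the bomb.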

Jun 30

Open Adventure 1.1, and some thoughts on software preservation

Open Adventure 1.1 has shipped. There are a lot more changes under the hood than are readily apparent. In fact there have been no changes in gameplay at all, and only minor changes to the UI (reversible with the -o oldstyle switch).

We (Jason Ninneman, Petr Vorpaev, Aaron Traas, Peje Nilsson and I) could have taken the approach of changing the original rather ugly C code (mechanically translated from FORTRAN) as little as possible, simply packaging it for compilation and release in a modern environment.

I elected not to do that, one reason being that I think we honor hacker tradition better by bringing the code forward as a dynamic, living artifact that invites being hacked on than by museumizing it as a static one. There’s also the fact that the extreme obscurity of the code made it difficult to appreciate what a work of genius Adventure actually was. (The code we inherited had over 350 gotos in it – rather hard to see past those.)

So we’ve taken a different path. We’ve translated the code into (almost) fully idiomatic C (but not trying to introduce pointer idioms; that should make translation to future languages easier). We’ve replaced the rather cryptic custom text database file that used to define the dungeon with a YAML document that is orders of magnitude easier to read and modify. We haven’t hesitated to use technology that wasn’t even a gleam in anyone’s eye when Adventure originated – the YAML is compiled to C structures at build time by a Python script.
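
To make the build-time step concrete, here is a minimal sketch of the general idea, not the actual generator. It assumes the YAML has already been parsed into plain Python lists and dicts; the `room_t` type, the field names, and the room data are hypothetical.

```python
# Stand-in for what a YAML loader would hand back. The real dungeon file is
# far richer; these names and descriptions are invented for illustration.
ROOMS = [
    {"name": "LOC_EXAMPLE_A", "description": "A small bare room."},
    {"name": "LOC_EXAMPLE_B", "description": 'A sign here says "KEEP OUT".'},
]


def c_string(text):
    """Quote a Python string as a C string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def emit_rooms(rooms):
    """Render the room list as a C array of struct initializers."""
    lines = ["const room_t rooms[] = {"]   # room_t is a hypothetical C type
    for room in rooms:
        lines.append(
            f"    {{ .name = {c_string(room['name'])}, "
            f".description = {c_string(room['description'])} }},"
        )
    lines.append("};")
    return "\n".join(lines)


print(emit_rooms(ROOMS))
```

The point of the design is that the game engine only ever sees compiled-in C data, while humans edit the readable source document.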

The effect (we hope) is Adventure as it would have been written if Crowther & Woods had had today’s tools to do it – the same vision and design logic, expressed in modern coding idioms. Worth doing, because there are still some things to be learned from this design.

Probably the single cleverest thing in it – which pretty much has to go back to Crowther, Woods couldn’t have bolted it on afterwards – is the way movement in the dungeon is handled. The dungeon’s topology is expressed by a kind of pseudocode broadly resembling the microcode found underneath a lot of processor architectures; movement consists of dispatching to the sequence of opcodes corresponding to the current room and figuring out which one to fire depending not only on the motion verb the user entered but also on conditionals in the pseudocode that can test for the presence or absence of objects and their state.
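
A minimal sketch of that dispatch scheme follows. It is not the game's actual table or engine: the room names, the rule layout, and the condition kinds are hypothetical, chosen only to show rules that fire on a motion verb plus a test of object presence or state.

```python
# Hypothetical travel table: each room maps to an ordered list of rules.
# A rule is (motion verbs, condition, destination). The first rule whose
# verb matches and whose condition holds is the one that fires.
TRAVEL = {
    "outside_grate": [
        ({"down", "enter"}, ("state", "grate", "open"), "below_grate"),
        ({"down", "enter"}, None, "outside_grate"),   # grate shut: stay put
    ],
    "narrow_crack": [
        ({"west"}, ("not_carrying", "gold"), "far_side"),
        ({"west"}, None, "narrow_crack"),             # too loaded to squeeze by
    ],
}


def condition_holds(cond, inventory, states):
    """Evaluate one rule condition against the player's situation."""
    if cond is None:                 # unconditional rule
        return True
    kind, obj, *rest = cond
    if kind == "carrying":
        return obj in inventory
    if kind == "not_carrying":
        return obj not in inventory
    if kind == "state":
        return states.get(obj) == rest[0]
    raise ValueError(f"unknown condition {kind!r}")


def move(room, verb, inventory, states):
    """Return the destination for a motion verb, or None if no rule fires."""
    for verbs, cond, dest in TRAVEL.get(room, []):
        if verb in verbs and condition_holds(cond, inventory, states):
            return dest
    return None


print(move("outside_grate", "down", set(), {"grate": "open"}))
print(move("narrow_crack", "west", {"gold"}, {}))
```

Notice that the engine itself knows nothing about grates or gold; all of the dungeon's behavior lives in the table.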

It was hard to fully understand and appreciate this before, because the code was a spaghetti tangle in what looks today like a shockingly primitive style. The abstraction of the dungeon topology into a declarative specification that – in effect – loads microcode into the game engine was a thing you could half-see, but the impact was blunted by the unreadability of both the code and the specification format. Lifting the specification to YAML was like polishing a rough diamond, revealing beauty and brilliance.

And that’s before we even get to Adventure considered as a work of communicative art. It’s had so many successful descendants – like, every dungeon-crawling game ever, and every text adventure ever – that it’s difficult to see with fresh eyes. But if you make the effort, it is astonishing how mature the wry, quirkily humorous, slightly surrealistic style of this very first game seems. The authors weren’t fumbling for an idiom that would be greatly improved by later artists more sure of themselves; instead, they achieved a consistent and (at the time, unique) style that would be closely emulated by pretty much everyone who followed them in text adventures, and not much improved on as style, even though the technology of the game engines improved by leaps and bounds.

I don’t know how they did it, and the authors would probably not be able to explain if we asked. But I think it is damned impressive how well this game has aged – the code may have needed a refresh, but the design still shines. I’m proud to have helped restore it, and hope I have brought it to a state where it can be forward-ported to future languages for as long as programming is a living art.

May 14

The advent of ADVENT

A marvellous thing has just occurred.

Colossal Cave Adventure, the original progenitor of the D&D-like dungeon-crawling game genre from 1977 and fondly remembered as ADVENT by those of us who played it on PDP-10s, is one of the major artifacts of hacker history.

The earliest version by Crowther and Woods (sometimes known as 350-point Adventure) was ported to C by Jim Gillogly in ’77 just after it first shipped. That has been part of the bsd-games collection forever.

What I have just received Crowther & Woods’s encouragement to polish up and ship under a modern open-source license is not the Gillogly port; it’s Crowther & Woods’s last version from 1995. It has 18 years of work in it that the Gillogly version doesn’t.

I feel rather as though I’d been given a priceless Old Master painting to restore and display. Behooves me to be careful stripping off the oxidized varnish.

Mar 26

src 1.13 is released

My exercise in how small you can make a version-control system and still have it be useful, src, does seem to have a significant if quiet fanbase out there. I can tell because patches land in my mailbox at a slow but steady rate.

As the blurb says: Simple Revision Control is RCS/SCCS reloaded with a modern UI, designed to manage single-file solo projects kept more than one to a directory. Use it for FAQs, ~/bin directories, config files, and the like. Features integer sequential revision numbers, a command set that will seem familiar to Subversion/Git/hg users, and no binary blobs anywhere.

Mar 20

cvs-fast-export 1.43 is released

Maintaining cvs-fast-export is, frankly, a pain in the ass. Parts of the code I inherited are head-achingly opaque. CVS repositories are chronically prone to malformations that look like bugs in the tool and/or can’t be adapted to in any safe way. Its actual bugs are obscure and often difficult to fix – the experience is not unlike groping for razor-blades in the dark. But people expect cvs-fast-export to “just work” anyway and don’t know enough about what a Zeno’s tarpit the domain problem is to be properly grateful when it does.

Still I persevere. Somebody has to; the thought of vital code being trapped in CVS is pretty nervous-making if you know everything that can go wrong with it.

This release fixes a bug introduced by an incorrect optimization hack in 2014. It should only have affected you if you tried to use the -c option.

If you use this at a place that pays developers, please have your organization contribute to my Patreon feed. Some of my projects are a pleasure to do for free; this one is grubby, hard work.

Mar 16

An apologia for terminal games

Yes, to a certain segment of the population I suppose I define myself as a relic of ancient times when I insist that one can write good and absorbing computer games that don’t have a GUI – that throw down old-school in a terminal emulator.

Today I’m shipping a new release of the game greed – which is, I think, one of the better arguments for this proposition. Others include roguelike dungeon crawlers (nethack, angband, moria, larn), VMS Empire, the whole universe of text adventure games that began with ADVENT and Zork, and Super Star Trek.

I maintain a bunch of these old games, including an improved version of the BSD Battleships game and even a faithful port of the oldest of them all: wumpus, which I let you play (if you want) in a mode that emulates the awful original BASIC interface, all-caps as far as the eye can see.

Some of these I keep alive only because somebody ought to; they’re the heritage grain of computer gaming, even if they look unimpressive to the modern eye. But others couldn’t really be much improved by a GUI; greed, in particular, is like that. In fact, if you ranked heritage terminal games by how little GUIfication would improve them, I think greed would probably be right at the top (perhaps sharing that honor with ski). That in itself makes greed a bit interesting.

Much has been gained by GUIfying games; I have my own favorites in that style, notably Civilization II and Spaceward Ho! and Battle For Wesnoth (on which I was a developer for years). But the very best terminal games retain, I think, a distinct charm of their own.

Some of them (text adventures, roguelikes) work, I think, the way a novel does, or the way Scott McCloud taught us minimalist cartooning does; they engage the user’s own imagination as a peripheral, setting up a surprisingly strong interaction between the user’s private imagery and the bare elements of the game. At their best, such games (like novels) can have a subtle imaginative richness that goes well beyond anything this week’s graphical splatterfest offers.

More abstract puzzle games like greed don’t quite do that. What they offer instead is some of the same appeal as tiling window managers. In these games there is no waste, no excess, no bloat, no distraction; it’s all puzzle value all the way down. There’s a bracing quality about that.

Ski is kind of hermaphroditic that way. You can approach it as a cartoon (Aieee! Here comes the Yeti! Flee for your life!) or as a pure puzzle game. It works either way.

Finally, maybe it’s just me, but one thing I think these old-school terminal games consistently do better than their modern competition is humor. This is probably the McCloud effect again. I’ve laughed harder at, and retained longer, the wry turns of phrase from classic text adventures than any sight gag I’ve ever seen in a GUI game.

So, enjoy. It’s an odd and perhaps half-forgotten corner of our culture, but no less valuable for that.

UPDATE: I probably shouldn’t have described wumpus (1972) as “the oldest of them all”, because there were a few older games for teletypes like Hammurabi, aka Hamurabi (with a single ‘m’) aka The Sumer game from 1968. But wumpus is the oldest one that seems to be live in the memory of the hacker culture; only SPACEWAR (1961) has a longer pedigree, and it’s a different (vector graphics) kind of thing.

Mar 14

Semantic locality and the Way of Unix

An important part of the Way of Unix is to try to tackle large problems with small, composable tools. This goes with a tradition of using line-oriented textual streams to represent data. But…you can’t always do either. Some kinds of data don’t serialize to text streams well (example: databases). Some problems are only tractable to large, relatively monolithic tools (example: compiling or interpreting a programming language).

Can we say anything generatively useful about where the boundary is? Anything that helps us do the Way of Unix better, or at least help us know when we have no recourse but to write something large?

Mar 12

Ones-complement arithmetic: it lives!

Most hackers know how the twos-complement representation of binary numbers works, and are at least aware that there was an older representation called “ones-complement” in which you negated a binary number by inverting each bit.
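
For anyone rusty on the difference, here is a small sketch of the two negation rules. The 8-bit word width is an assumption made purely for illustration, and the function names are mine.

```python
BITS = 8                  # assumed word width, for illustration only
MASK = (1 << BITS) - 1    # 0xFF


def ones_negate(word):
    """Negate a ones-complement word by inverting every bit."""
    return ~word & MASK


def twos_negate(word):
    """Negate a twos-complement word: invert every bit, then add one."""
    return (~word + 1) & MASK


def ones_value(word):
    """Decode a ones-complement bit pattern into a Python int."""
    if word & (1 << (BITS - 1)):      # sign bit set, so the value is negative
        return -(~word & MASK)
    return word


print(format(ones_negate(5), "08b"))   # 11111010
print(format(twos_negate(5), "08b"))   # 11111011
print(format(ones_negate(0), "08b"))   # 11111111
```

The last line shows the wart: inverting all the bits of zero gives a second, all-ones pattern that also decodes to zero, so ones-complement has both a positive and a negative zero.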

This came up on the NTPsec development list recently, with a question about whether we might ever have to port to a non-twos-complement machine. To my utter, gob-smacked astonishment, it turns out ones-complement systems still exist – though, thankfully, not as an issue for us.

I thought I could just mumble something about the CDC 6600 and be done, but if you google “one’s-complement machines” you’ll find that Unisys still ships a series of machines with the brand “Clear-Path Dorado” (latest variant introduced 2015) that are emulations of their old 1100-series mainframes running over Intel Xeon hardware – and these have one’s-complement arithmetic.

This isn’t a practical port blocker for NTPsec, as NTP will never run over the batch OS on these things – it’s about as POSIX-compatible as the Bhagavad-Gita. It’s just weird and interesting that ones-complement machines survive in any form at all.

And a bit personal for me. My father was a programmer at Univac in the 1950s and early ’60s. He was proud of his work. My very first interaction with a computer ever was getting to play a very primitive videogame on the oscilloscope-based video console of a Univac 1108. This was in 1968. I was 11 years old, and my game machine cost $8M and took up the entire ground floor of an office building in Rome, Italy.

Other than the 1100, the ones-complement machines Wikipedia mentions (LINC, PDP-1, and CDC6600) are indeed all long dead. There was a ones-complement “CDC Cyber” series as late as 1989, but again this was never going to implement POSIX.

About other competitors to twos-complement there is less to say. Some of them are still used in floating-point representations, but I can find no evidence that sign-magnitude or excess-k notation has been used for integers since the IBM 7090 in 1959.

There’s a comp.lang.std.c article from 1993 that argues in some technical detail that a C compiler is not practical on ones-complement hardware because too many C idioms have twos-complement assumptions baked in. The same argument would apply to sign-magnitude and excess-k.

UPDATE: It seems that Unisys is the graveyard of forgotten binary formats. I have a report that its Clear-Path Libra machines, emulating an ancient Burroughs stack machine architecture, use sign-magnitude representation of integers.

Mar 06

Reposurgeon recruits the CryptBitKeeper!

I haven’t announced a reposurgeon release on the blog in some time because recent releases have mostly been routine stuff and bugfixes. But today we have a feature that many will find interesting: reposurgeon can now read BitKeeper repositories. This is its first new version-control system since Monotone was added in mid-2015.

Feb 19

The simplest possible method syntax in C

I’ve been thinking a lot about language design lately. Part of this comes from my quite successful acquisition of Go and my mostly failed attempt to learn Rust. These languages make me question premises I’ve held for a long time, and that questioning has borne some fruit.

In the remainder of this posting I will describe a simple syntax extension in C that could be used to support a trait-centered object system similar to Rust’s (or even Go’s). It is not the whole design, but it is a simple orthogonal piece that could fit with several different possible designs.

Feb 13

loccount: A faster SLOC utility

Here’s my first new project in a while – loccount, inspired by David A. Wheeler’s sloccount tool but much faster and with broader language coverage.

I actually wrote this as a learning exercise in the Go language. You can find more details in my NTPsec blog post on Grappling With Go.

If you like it, please remember that open source may be free but my time is not and join my Patreon feed.

Jan 13

Rust and the limits of swarm design

In my last blog post I expressed my severe disappointment with the gap between the Rust language in theory (as I had read about it) and the Rust language in practice, as I encountered it when I actually tried to write something in it.

Part of what I hoped for was a constructive response from the Rust community. I think I got that. Amidst the expected volume of flamage from rather clueless Rust fanboys, several people (I’m going to particularly call out Brian Campbell, Eric Kidd, and Scott Lamb) had useful and thoughtful things to say.

I understand now that I tested the language too soon. My use case – foundational network infrastructure with planning horizons on a decadal scale – needs stability guarantees that Rust is not yet equipped to give. But I’m somewhat more optimistic about Rust’s odds of maturing into a fully production-quality tool than I was.

Still, I think I see a problem in the assumptions behind Rust’s development model. The Rust community, as I now understand it, seems to me to be organized on a premise that is false, or at least incomplete. I fear I am partly responsible for that false premise, so I feel a responsibility to address it square on and attempt to correct it.

Jan 12

Rust severely disappoints me

I wanted to like Rust. I really did. I’ve been investigating it for months, from the outside, as a C replacement with stronger correctness guarantees that we could use for NTPsec.

I finally cleared my queue enough that I could spend a week learning Rust. I was evaluating it in contrast with Go, which I learned in order to evaluate as a C replacement a couple of weeks back.

Sep 27

Twenty years after

I just shipped what was probably the silliest and most pointless software release of my career. But hey, it’s the reference implementation of a language and I’m funny that way.

Because I write compilers for fun, I have a standing offer out to reimplement any weird old language for which I am sent a sufficiently detailed softcopy spec. (I had to specify softcopy because scanning and typo-correcting hardcopy is too much work.)

In the quarter-century this offer has been active, I have (re)implemented at least the following: INTERCAL, Michigan Algorithmic Decoder, a pair of obscure 1960s teaching languages called CORC and CUPL, and an obscure computer-aided-instruction language called Pilot.

Pilot…that one was special. Not in a good way, alas. I don’t know where I bumped into a friend of the language’s implementor, but it was in 1991 when he had just succeeded in getting IEEE to issue a standard for it – IEEE Std 1154-1991. He gave me a copy of the standard.

I should have been clued in by the fact that he also gave me an errata sheet not much shorter than the standard. But the full horror did not come home to me until I sat down and had a good look at both documents – and, friends, PILOT’s design was exceeded in awfulness only by the sloppiness and vagueness of its standard. Even after the corrections.
