David R. MacIver's Blog: Book Review: The Art of Assembly Language

Book Review: The Art of Assembly Language

29 May 2008

I’ve been meaning to get a handle on a lot more low level programming details for a while. I can do C, more or less, but I only had the vaguest notion of the lower level CPU architecture and assembly language. While I was over in the states, Victoria made the mistake of taking me into a book store. I emerged several books later. Among them, The Art of Assembly Language. This post is a review of the book. Well, ok. It’s more of a rant about the book.

First off, let me say this: I liked this book. I read it cover to cover (although a few of the chapters I only skimmed), which is extremely rare for a programming book. I can’t offhand think of another example of one I own (of which there are many) which I’ve read that thoroughly. Most of them I dip into and read chapters out of order as I feel like it. I came out feeling like I had a much better understanding of the way the x86 architecture works and various details that had previously escaped me. It’s interesting and largely well written.

So, it’s a good book, and I’m glad I read it. And that’s the last positive thing I’m likely to say about it in this post. The rest of the post is me being mean about its flaws.

First off: After reading this book, one comes out with a good sense of the way various aspects of assembly are constructed and used. It clears up a lot of misconceptions you might have had if you don’t already know assembly and provides a high level sense of what makes up an assembly program (Examples: I had previously assumed that the stack was purely a higher level construct and didn’t have any specific CPU support. I also hadn’t realised quite how significant registers were for... well, everything, or the way the FPU was set up in relation to the rest of the CPU instructions). What it doesn’t teach you to do is write assembly.

The text uses High Level Assembler. This is an interesting pedagogical tool, as it certainly helps to remove a lot of the scariness of writing assembler (compare the hello worlds for example). The problem is that it’s not really an assembler. Instead it’s the bastard offspring of a highish level language (well, low to mediumish. Somewhere a little above C but below C++) and x86 assembly. Worse, the bits that look like x86 assembly often aren’t. Some of this takes the form of relatively harmless little extensions. For example, in x86 assembly the mov instruction may only move data between registers or between a register and a memory location. You can’t mov between two addresses in main memory. In HLA you can. mov(x, y) translates to push(x) pop(y).
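To make that push/pop translation concrete, here is a toy Python model of it. This is only a sketch: the names memory, stack, push, pop and mov_mem_mem are made up for illustration and are not part of HLA or x86, and real hardware works on bytes and a stack pointer register rather than a dict and a list.

```python
# Toy model: "memory" maps variable names to values, "stack" plays the CPU stack.
# Every name here is hypothetical; it only mimics the shape of push(x); pop(y).
memory = {"x": 42, "y": 0}
stack = []


def push(name):
    """Copy the value stored at `name` onto the top of the stack."""
    stack.append(memory[name])


def pop(name):
    """Remove the top of the stack and store it at `name`."""
    memory[name] = stack.pop()


def mov_mem_mem(src, dest):
    """Memory-to-memory move, written in HLA's (source, destination) order."""
    push(src)
    pop(dest)


mov_mem_mem("x", "y")
print(memory["y"], stack)
```

The point of the sketch is that the value passes through the stack on its way from x to y, and the stack is left as it was found.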

If you know your assembly, you’ll just have spotted the other far more insidious way in which it’s different (I do not know my assembly, hence only discovered this on a) Reading some NASM source and getting very confused and b) Reading FAQ entry 8). The operands are backwards from intel’s mnemonics. Mostly. Except when they’re not. So in other words if you learn to write HLA you will most likely completely destroy your ability to write other assemblers because you will constantly get fuddled by operand order (ok, that’s probably not true. I suspect a few weeks with an assembler and someone standing over you with a big stick would cure that).
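As a small illustration of the operand order difference, the sketch below formats the same register-to-register move in both styles. The helper names hla and intel are invented for this example, and it only covers the simple two-operand case, not the exceptions mentioned above.

```python
def hla(op, src, dest):
    """HLA style: instruction(source, destination);"""
    return f"{op}({src}, {dest});"


def intel(op, src, dest):
    """Intel style: instruction destination, source"""
    return f"{op} {dest}, {src}"


# The same operation, "copy eax into ebx", written both ways.
print(hla("mov", "eax", "ebx"))
print(intel("mov", "eax", "ebx"))
```

Both lines describe the same move; only the position of the source and destination swaps.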

But that’s ok. I mean, surely HLA is a mature and well tested product? It’s been around for over a decade. If it’s that much better than existing approaches to assembler, what’s the harm in doing things differently?

Um.

Setting aside any philosophical issues with the way HLA does things, the author repeatedly and insistently describes the current implementation of HLA as horrible, not designed for performance and a prototype. This does not exactly fill me with confidence.

So, what does HLA add on top of x86 assembly?

A nice standard library.
In particular, good String handling (better than C’s in my opinion).
Support for composite datatypes (records, unions, pointers, arrays)
A calling convention for passing arguments to procedures. I’m not thrilled by this - something like it is definitely useful to have, but the calling convention chosen suffers from the C problem of making it all too easy to copy large amounts of data around on the stack (more so: It passes arrays by value).
A macro language (the author seems to be convinced that this is the most advanced macro language ever. It’s not. It’s a half assed scripting language built into a preprocessor. It’s not really better than using something like fmpp or similar. It’s not even close to lispy macros).
Ten thousand different types of loop.
Exceptions ?!
Objects ?!?!

The exception subsystem makes me particularly mad. The book goes into a great deal of detail (too much detail) about how all the different control flow extensions are implemented and is completely silent on the subject of exceptions. I assumed that this was an advanced subject left to the HLA manual. Not so much - no detailed description there either.

So, we have a high level feature whose implementation is undocumented and which is used in many places in the standard library. It’s good that we’re writing assembly here - it really lets us see what close to the metal programming is like.

The objects section is just sad. I don’t really have anything more to say about it than that.

As a tool for teaching, HLA serves its purpose, but it would work a lot better if it were heavily trimmed down and made less of an attempt at being higher level than normal assembly and more of one at being a lot of useful libraries, some extensions for procedure calls and a better macro language. As it is, I certainly wouldn’t want to program in it.

Comments

olsner on 2008-05-29 19:03:23:

(This is not what I was going to write about, but I just couldn’t resist: HLA? WTF! The whole *point* of assembly is to write every bloody opcode by hand!)

“The operands are backwards from intel’s mnemonics. Mostly. Except when they’re not.”

Well, gas also has a backwards operand order compared to Intel (except in a few instructions where they accidentally used the intel order). And then it differs between architectures of course. So in this respect you’re not much more screwed than you’d have been if you’d learnt any other kind of assembly dialect ;-)

david on 2008-05-29 19:11:18:

Ha. Fair enough. :-)

I guess that’s what I get for commenting on the choice of dialect without knowing much assembly...

Derek Elkins on 2008-05-29 19:23:53:

Note that this book is available online, as are two other versions. I much prefer the 16-bit DOS version, which uses a traditional Intel-syntax assembler and goes into more low-level details. Assembly programming in Win32/Linux is like C programming with push/pop (i.e. unsatisfying if you are looking for low-level.)

Note, as olsner suggests, operand order isn’t universal. There are two “traditional” styles: AT&T syntax (which is what gas uses by default) and Intel syntax. AT&T syntax does indeed use a different operand order from Intel syntax.

david on 2008-05-29 19:42:29:

ok. I withdraw the outrage about argument order. :-)

I maintain the outrage about the higher level stuff. Particularly the exceptions.

James Iry on 2008-06-02 03:30:44:

I’m so old I first learned x86 assembly from Peter Norton: http://www.amazon.com/Assembly-Language-Primer-Personal-Computer/dp/0136619010.

Anyway, I join you in your outrage about exceptions. Hiding exceptions isn’t like hiding string handling. Unless you’re dealing with Unicode-level stuff, strings just aren’t that interesting (even with Unicode it’s just a crap load of detail that mostly isn’t all that interesting). Exceptions are something else entirely. There are several different ways to handle the beasts, each with its own trade-offs. A good intro to assembly might let you macro up some exception generating/handling structures, but should never, ever prevent you from learning what’s going on underneath. If the intro didn’t want to cover exceptions then they shouldn’t be in the book at all. Bah! :-)