Author Archives: david

Because ten just wasn’t enough

It was a slow day yesterday at the Aframe offices – no one was making much progress and we were all rather tired. So based on a conversation I had around lunchtime I decided to take a break and work on a little hack: A random commandment generator. I even had a good name for it: “The N Commandments”.

Well, I knocked together a quick prototype in about 20 minutes. Stef loved it and gave me a bunch of good style suggestions (and made me add a Facebook like button. Sigh), and before I knew it nearly the whole dev team was chiming in with suggested tweaks to the generator and avidly refreshing the page.

So the site grew pretty quickly from my initial prototype, which was black text on a white screen and nothing else, and is now online.

The commandments range from the utterly absurd to ones which should probably have been included up front.

Some of my favourites:

Oh, by the way, it also has an API of sorts: Add .json to any of the URLs to get a JSON representation.
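For what it's worth, here is a minimal Python sketch of that URL rule. The host name and the `json_url` helper are made up for illustration, and I'm assuming ".json" goes on the end of the path (before any query string), which the description above doesn't actually spell out.

```python
from urllib.parse import urlsplit, urlunsplit


def json_url(page_url):
    """Return the URL of the JSON representation of a page.

    Assumption: ".json" is appended to the path, ahead of any query string.
    """
    parts = urlsplit(page_url)
    return urlunsplit(parts._replace(path=parts.path + ".json"))


# Hypothetical page URL, purely for illustration.
print(json_url("http://commandments.example/some-page"))
```

Fetching the result with any HTTP client should then hand back JSON rather than HTML.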

This entry was posted in Uncategorized.

Playing with a new language: Clay

There’s been a distinct lack of interesting language related posts around here recently.

Why? Well, because I’ve been mostly writing Ruby and C, and neither language fills me with an urge to post HEY GUYS, I FOUND THIS COOL THING. Ruby because I vaguely dislike it, so I resent even the kind of cool things I find as I use it; C because my use of C tends towards the extremely straightforward, so C posts tend to be about the result rather than the method (arguably that’s what programming posts should be like anyway, but that’s a separate issue).

But anyway, I’ve been window shopping around for a new language to learn. The features I wanted in said language were:

  1. Good performance
  2. Higher level than C
  3. Preferably statically typed
  4. Generic collections
  5. Doesn’t run on the JVM

The closest to this on the list of languages I already knew was Haskell, and I’ve been using it a little bit, but it and I don’t really get on for a variety of reasons that are beyond the scope of this blog post.

I was pretty much thinking it was going to be O’Caml (which is another language in the “vaguely dislike” category. I want to like it, I really do, but somehow I can never manage) and C++ (oh dear god no).

So, anyway, my ears perked up when clay came up on reddit the other day. I had already been vaguely aware of it, but somehow it dropped off my radar before I did anything about it.

So I’ve been spending today playing with it, and the results are two little projects: an interval tree implementation and a very rudimentary start on some client bindings to PostgreSQL. Neither is written with any particular purpose in mind, though both are something I might want to use at some point; it’s really just a way to get a feel for the language.

I’m pleasantly impressed. Is it going to be the Next Big Language? Who knows. Certainly not me. But despite a lot of stupid questions on my part it was a pleasant experience, and I’m definitely going to continue playing with it.

What follows are some initial impressions. Bear in mind they’re the result of only a day of playing with it, so they may be incomplete or outright wrong:

The Good

Here are some of the things I liked:

Good support for generic programming

Like it says on the label really. This is one of the things the language bills itself as being good at, so if it weren’t it would be dead on launch.

I’ve had a reasonable play around with it, and it works well. The entire system is basically a giant (mostly) statically resolved multimethod system. Given my fondness for multimethods, that’s a plus as far as I’m concerned. Runtime polymorphism comes in through an extensible variant system: You can declare variants in one module and add new ones in another.

Anyway, the result is that there are generic collections and they Just Work; exceptions make use of the extensible variant system and seem well behaved.

There’s no object orientation support and no subtyping. As far as I’m concerned, that’s a good thing.

Decent FP primitives

So I haven’t actually had much of a play with these yet, believe it or not, but they’re there and seem to work. Lambdas don’t implement non-local return (though I suppose you could do this with an exception), but do capture their environment. The main place I’ve used this is in the test system.

clay bind-gen

clay bind-gen takes a C header file and gives you a clay library. It’s a bit finicky, and obviously the code is far from idiomatic, but it does work and works well. It took me about 10 minutes to get this working on the libpq header and port a C example of using it to clay. Hopefully this means that it’s rather easy to build on the available C ecosystem.

Decentish standard library

All quite basic at the moment, but it already annoys me less than Scala’s used to when it was at a much later stage in the game than this. Granted that’s because Scala could get away with it by falling back on the JVM ecosystem and clay can’t, but it’s a good start. It’s pleasantly reminiscent of the sort you’d expect with a high level language, but seems well tuned for clay’s suitability as a systems language.

Valgrind works flawlessly

As per title. Clay is an unsafe language – it has manual memory management (which, thanks to things like stack allocation and deterministic destructors, is much less of a pain than in C, but is still definitely not hard to get wrong), pointer arithmetic, and no bounds checking on array access.

The only reason I trust myself to write C is because valgrind exists. I would have had a very hard time trusting myself to write clay if it didn’t work on it. Fortunately, no problems.

Friendly and intelligent IRC channel

I’m a big fan of IRC as a learning environment. I’ve been on IRC on a fairly regular basis in various channels for 10-15 years (though rarely any given one for more than 4 or 5; I have community attention span issues), and so the first thing I do when checking out a new tech or language is log onto their IRC channel. Clay’s does not disappoint. It’s small, but there are a bunch of smart people in there, and they were very patient and helpful with my stupid questions.

The Bad

Confusing semantics

Generally there’s a good reason for it, and things are still in flux so I think many of these will improve, but there have been a bunch of cases where I’ve gone “Wait, what?”. Some of these are just the result of my never having used a language with non-trivial value semantics before (tree = tree.child generates an invalid read), some of them are gotchas which you’ll probably only be caught by once but are still pretty ugly (it’s very easy to think you’ve redefined a constructor when you haven’t because the type arguments are part of the constructor’s name), and a few others that slip my mind at the moment. The semantics of how copying, moving and assignment work are not so much confusing as very easy to get wrong. I’m still not 100% sure I understand how variants interact with their members.

I think I’m largely on top of it now, and I expect it will become clearer as I write more of it, but I definitely spent a good chunk of today quite bewildered.

Basically zero documentation

Well, what do you expect? The language is young and still massively in flux. I’m not surprised there’s no documentation and, as mentioned, the community is very helpful in making up for its lack, but it definitely makes the learning process much harder.

The language is young and still massively in flux

As mentioned above. This is definitely an experimental language – the compiler seems to currently be undergoing a rewrite which will, amongst other things, be changing the backend target away from LLVM and to C. The syntax is expected to change. The semantics are expected to change. The language designers will, and should, feel free to break backwards compatibility.

Error messages

As well as the usual problems with new languages and quality of error messages, clay’s error messages, both at runtime and compile time, are very short on line numbers. This can make it very hard to track down what’s going wrong. It wasn’t drastically difficult with what I was doing today, but they were both rather small projects. I don’t know how hard it will be on larger projects, but I’m hoping that some of this (particularly compile time error messages) will be fixed by the time I have to worry about that.

Late discovery of errors

It’s not quite as bad as C++ template errors are reputed to be, but generally you don’t discover an error in a function until you try to use that function. This is somewhat deliberate, as it ties in with how the genericity works, but when combined with the line numbers problem can be quite frustrating. Additionally the support for giving the compiler hints to get better error messages is a bit rudimentary: You can define compile time functions which constrain arguments, but that’s about it.

The Ugly

Keywords vs operators

Clay doesn’t have much C compatibility in terms of its operators: the bitwise operators (in particular <<, >>, |, & and ^) are replaced by functions. The short-circuiting boolean operators are replaced with the keywords and and or. It’s not a major problem, but it annoyed me vaguely when porting from C.

Keywords

Clay has a fair few keywords: ref, forward, lvalue, rvalue, lambda, and, or, overload, procedure, variant, record, callbyname… probably more.

It’s not a massive problem, but I tend to regard keyword surplus as a sign of insufficiently well factored features. I’ve yet to form an opinion on whether or not this is the case here.

Everything is by reference

I don’t yet know if I like this or not, but Clay’s calling convention passes everything as a mutable reference. So assigning to a function argument assigns in the calling context. Really. This can be quite useful, particularly given that copying a clay value can be a very expensive operation, but I find it deeply disconcerting. I feel like it may grow on me. We’ll see.

A few minor syntax weirdnesses

e.g. one that bugged me earlier is the syntax for return type declaration on a function: foo(a : Blargh) ReturnType. No colon on the return type, but colon on the arguments.

Conclusion

Overall, definitely more good than bad. I intend to keep playing with it and see how I feel after a while doing so.

This entry was posted in Code.

Reading video frame by frame with ffmpeg

So I’ve been playing around with scene detection. It’s really more of a NIH task that I’m doing for my own amusement than it is a serious tool I expect to be used, but it’s a good way to expand my knowledge of video and I have a few good ideas which don’t seem to have been used before, so it’s crazy but it Just Might Work.

One of the things I need to do for scene detection is read a video frame by frame and compare subsequent frames. My initial hack used ffmpeg to turn the video into a sequence of images on disk, ran through them as they were generated and deleted old ones.

As you can probably imagine this was slow, cumbersome and remarkably hard to get right.

“Oh, hey” thought I. “ffmpeg makes all this stuff available as libraries: libavformat and libavcodec. That will let me do this efficiently!”.

So I started playing around with examples and reading through the documentation. Excuse me, did I say documentation? I meant header files.

Oh.
My.
God.

I mean no (large amount of) disrespect to the authors when I say this: They have created a piece of software which, by and large, works very well. And I’m sure that a lot of the complexity of the API is essential rather than accidental if you’re, say, writing a video player rather than a dumb frame processor.

But, that being said, the contents of the header files are remarkably like getting a lecture on the botany of trees when what you want is a map out of the forest. Apparently it all makes sense if you’ve seen the mpeg4 spec. Apparently writing actual documentation would be a patent minefield. Certainly I have no clue what’s going on.

I tried basing my code on examples from the internet. Unfortunately it looks like the API has moved under them – the examples have been half patched up by other people around, but in the versions I got closest to working they appeared to be doing the wrong thing. The arguments to certain functions were suspicious, and the results were just wrong. The right thing to do might have been to fix this, but I genuinely had no idea how the code was working, so it would have been far from easy to debug it.

So, at this point I largely considered myself defeated by libav* and started thinking about other ways one could do it.

“What I really want”, I thought, “is some sort of server program where I can just feed it a file and then read the frames off in some sensible binary format. That way I’d be insulated from most of the pain of this”.

“Hey, ffmpeg can write its output to a pipe, can’t it?”

After that, the rest was history:

Step 1: Pick some binary format from which it is easy to read RGB pixel data. It will never live on disk, and ease of use and speed of parsing are more important than efficiency. Easy, obvious choice: ppm. It’s basically designed for that.
Step 2: Figure out how to get ffmpeg to write a stream of ppm files to its stdout. This turns out to be easy:

Step 3: Figure out how to read a stream of ppm files from a pipe. libnetpbm to the rescue! The only minor issue I had was determining whether we were at the end of file without stomping on netpbm’s toes, so the code contains a slightly weird step where it does a getc to check if it’s at eof and then does an ungetc if we’re not. Other than that, it’s textbook netpbm processing code taken straight from the examples:
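The original snippets for steps 2 and 3 haven't survived here, so what follows is a rough Python sketch of the same idea rather than the C and libnetpbm code described above. The ffmpeg arguments (`-f image2pipe -vcodec ppm -`) are a commonly used way of getting PPM frames on stdout and may not be the exact invocation I used; every function name is made up for illustration. In this version an empty read where the next "P6" magic should be plays the role of the getc/ungetc end-of-file check.

```python
import io
import subprocess


def ffmpeg_ppm_command(video_path):
    """argv asking ffmpeg to write every frame to stdout as a PPM image."""
    return ["ffmpeg", "-i", video_path, "-f", "image2pipe", "-vcodec", "ppm", "-"]


def _read_token(stream):
    """Read one whitespace-delimited header token, skipping '#' comments."""
    token = b""
    while True:
        ch = stream.read(1)
        if not ch:
            raise EOFError("truncated PPM header")
        if ch == b"#" and not token:
            # A comment runs to the end of the line.
            while ch not in (b"\n", b""):
                ch = stream.read(1)
            continue
        if ch.isspace():
            if token:
                return token
            continue
        token += ch


def read_ppm_frames(stream):
    """Yield (width, height, rgb_bytes) for each binary PPM in the stream."""
    if isinstance(stream, (bytes, bytearray)):
        stream = io.BytesIO(stream)
    while True:
        magic = stream.read(2)
        if not magic:
            return  # clean end of stream between two frames
        if magic != b"P6":
            raise ValueError("not a binary PPM stream")
        width = int(_read_token(stream))
        height = int(_read_token(stream))
        maxval = int(_read_token(stream))
        if maxval > 255:
            raise ValueError("only 8-bit samples are handled in this sketch")
        size = width * height * 3
        pixels = stream.read(size)
        if len(pixels) != size:
            raise EOFError("truncated PPM frame")
        yield width, height, pixels


def mean_abs_difference(a, b):
    """Average per-byte difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def frames_from_video(video_path):
    """Spawn ffmpeg (assumed to be on PATH) and yield its frames in order."""
    proc = subprocess.Popen(ffmpeg_ppm_command(video_path),
                            stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    try:
        yield from read_ppm_frames(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()
```

Comparing subsequent frames is then a matter of walking `frames_from_video(path)`, holding on to the previous frame's pixels and calling something like `mean_abs_difference` on each pair. That difference measure is only a placeholder, not the scene detection idea itself.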

This took me all of about half an hour to figure out, after most of a day wrestling with libavcodec, and it works pretty well. The performance is decent. I don’t know how it compares to using libavcodec directly as I haven’t benchmarked (due to not having a working example with libavcodec), but it’s orders of magnitude faster than my previous file system based hack, and the code is a hell of a lot cleaner, so I’m happy.

This entry was posted in Code.

Dear Commenters: Frankly, I’m tired of you

Hi there,

I see you want to write a comment on my blog. That’s awesome! It’s really great that after reading the article you have interesting and constructive feedback to deliver.

…oh, you didn’t read the article? Just the title? That’s fine I suppose. I’m sure you’ve got something useful to contribute anyway.

Not so much, huh?

Every time I get a new email saying “There’s a new comment on your blog” I wince and think “I wonder what this guy has got wrong”. Maybe some of it is my fault – there have definitely been cases where people who I believe have actually read the article have been confused as to the point – but I’ve seen enough of it on other people’s blogs to know that actually for the most part this is just what people on the internet are like.

I don’t care if people disagree with me if they do so reasonably. Bob knows I’m often wrong. But every time I have to deal with someone who hasn’t read the article, or who is more interested in setting up strawmen, or who will come up with an entirely new “reason” why I am wrong which contradicts all the previous ones he gave that I have just spent time refuting, I feel sad inside and become a little less likely to write articles.

So, I’m done. New posts will not have comments enabled. Old posts will have comments disabled as and when I have reason to do so. If you want to respond to my post, you can do it by one of the myriad communication and discussion mediums the internet has to offer: I’m easily available on twitter or email, and you can show the entire world how clever you are in the discussions on reddit, ycombinator or the like if it’s there.

To those of you who wrote worthwhile comments, of which there were a small but vocal minority, thanks. It was appreciated. Sorry to take this away from you.

This entry was posted in Admin.

Removing silent tracks from a video: A bug and the hack that killed it

So, as you might have gathered, I now work for a company called Aframe (lower case f. Very fussy about that. As if I didn’t spend enough time correcting people’s usage of my name). We do video stuff.

Of course, muggins here gets to be the one responsible for an awful lot of that video stuff. We have in house expertise, but they’re largely people with a huge amount of video experience and no programming experience. So this has been a bit of a learning experience for me, and it’s far from over yet.

Anyway, one of the things we do is convert all the video we get into a version you can view on the website. It’s not really our major selling feature, but it’s the basic feature on which we build a hell of a lot of other stuff.

We’ve had this collection of video from a particular customer for a while with the embarrassing feature that the version we were producing for the website had no sound. I couldn’t figure it out at all – the original had 8 audio channels, which seemed likely to be the source of the problem, but no matter which one I played in vlc it was always silent.

Actually… turns out that was just vlc lying to us and not switching channels. Sigh. (I’d file a bug report, but channel switching clearly works in other contexts and we can’t share the video, so I don’t know how best to do that).

We revisited this yesterday and verified that actually there are perfectly good audio tracks on channels 3 and 4, but the other 6 are silent and we were defaulting to using tracks 1 and 2. The camera in question has a variety of different ways of taking audio in, and which tracks are non-silent depends on which audio inputs you use. Of course, all the channels are there regardless of what you do, and there’s no metadata (that I could find) marking which ones contain sound.

It’s easy enough for our transcode process to select the right tracks, but combining the audio tracks is harder (some transcoders support it, some don’t). So ideally what we would do is just remove the silent audio tracks (you probably guessed this from the title).

But how do you do that? There’s no metadata saying which are silent, so you have to figure out which audio tracks are silent and which contain sound. In an automated way that is – listening to it is no good – and with the audio tracks in an arbitrary codec.

I pondered this for a little while and came up with a solution. It’s duct tape programming in the extreme, but actually it works very nicely:

We have no metadata. So we have to use the data. We don’t know what the format is, so we have to convert it to a standard one. What standard format would make it really easy to detect whether a track is silent? Well… an uncompressed one. Pick a random simple uncompressed audio format? Let’s try wav.

It turns out that detecting a silent wav file is trivially easy: It consists overwhelmingly of 0 bytes. There’s a brief header at the beginning, and then the rest of the file is just a big long chain of 0s. So for each audio track we generate a wav, check what percentage of it is 0 bytes, and if it’s >= 99.9% we declare it to be silent.
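As a rough illustration, the check might look like the Python sketch below. This is not the "removesilent" code itself; the function names are made up, and the `pcm16_wav` helper exists only to build test input. One assumption worth flagging: the zero-byte trick relies on signed PCM such as 16-bit wav, since 8-bit wav samples are unsigned and silence there sits at 0x80 rather than 0.

```python
import io
import wave

SILENCE_THRESHOLD = 0.999  # the cut-off from the post: at least 99.9% zero bytes


def zero_byte_fraction(stream, chunk_size=1 << 16):
    """Fraction of bytes in a binary stream that are zero, header included."""
    zeros = total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        zeros += chunk.count(0)
        total += len(chunk)
    # An empty stream is treated as silent; that choice is arbitrary.
    return zeros / total if total else 1.0


def is_silent(stream, threshold=SILENCE_THRESHOLD):
    """True when the stream consists overwhelmingly of zero bytes."""
    return zero_byte_fraction(stream) >= threshold


def pcm16_wav(frames, rate=8000):
    """Build an in-memory mono 16-bit wav, only to exercise the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
    buf.seek(0)
    return buf


print(is_silent(pcm16_wav(b"\x00\x00" * 80000)))  # ten seconds of digital silence
print(is_silent(pcm16_wav(b"\x01\x02" * 80000)))  # ten seconds of non-zero samples
```

Note that the wav header counts against the percentage, so a very short silent track can fall below the threshold. For real tracks of any length that shouldn't matter.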

Once we’ve determined which tracks are silent it’s then a simple matter (admittedly a simple matter which took me a reasonable amount of wrestling with command line options and asking for help on IRC) to use ffmpeg to cut out the silent tracks. The invocation looks approximately like:

ffmpeg -i original.mov -y -ac 2 -map 0.0 -map 0.3 -map 0.4 -acodec copy -vcodec copy output.mov -newaudio

-ac tells ffmpeg the number of audio channels the output should have, the -map options say which channels from the source to use (the order is significant) and -newaudio says “Don’t fail in mysterious and incomprehensible ways as a result of my having tried to change the audio channels”.
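Gluing this to the silence check means building that command line from the list of tracks worth keeping. Here is a small Python sketch that only assembles the argument list, mirroring the invocation above; the function name and parameters are made up for illustration. It uses the same option spelling as the command in this post, and as far as I know newer ffmpeg releases write stream specifiers with a colon (0:3) and no longer accept -newaudio, so treat it as a sketch for the ffmpeg of the time.

```python
def remove_silent_command(source, dest, video_stream, audio_streams):
    """Build the ffmpeg argv that keeps one video stream and the given audio."""
    cmd = ["ffmpeg", "-i", source, "-y", "-ac", "2"]
    # Order is significant: the video stream first, then each audio stream to keep.
    for index in [video_stream, *audio_streams]:
        cmd += ["-map", f"0.{index}"]
    cmd += ["-acodec", "copy", "-vcodec", "copy", dest, "-newaudio"]
    return cmd


# Reproduces the example invocation: video on stream 0, sound on streams 3 and 4.
print(" ".join(remove_silent_command("original.mov", "output.mov", 0, [3, 4])))
```

The resulting list can be handed straight to something like `subprocess.run`, which avoids any shell quoting problems with awkward file names.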

In the unlikely event that anyone who isn’t us will find this useful, I’ve started collecting the results of experiments like this into a repo on github. The code for this post is there as “removesilent”. It’s not the exact code we’re using at work (as that’s better integrated into our system), but it’s pretty close.

This entry was posted in Uncategorized.