Category Archives: Uncategorized

What books have made you a better person?

This post is largely an aggregation of some conversation on twitter that I want to preserve for posterity and share with a wider audience.

Yesterday I asked:

Question: What books have you read that you’d say improved you as a human being? (Ethically, rationally, creatively, whatever).
I’m aware that’s a bloody hard question. I’m not sure I can answer it.

Here are the responses, in roughly chronological order.

  • @alexjs: 1984 taught me at an early age to question misinformation. HHGTG taught me that ignorance can be even better…
  • @mdreid: Singer’s How Are We To Live. DFW’s Infinite Jest. Feynman’s QED. Mandelbrot’s Fractal Geometry of Nature. Coetzee’s Youth.
  • @etorreborre: I think it was this one: http://amzn.to/kl9Mjv (psychology)
  • @davidpeto: American Psycho.
  • @riffraff: excluding others already suggested, “King Solomon’s Ring” by konrad lorenz, taught me “human” behaviour is not so human after all
  • @wgren: Stephen J Gould – “The Mismeasure of Man”. Charles Stross – “Accelerando”. Carl Sagan – “The Demon Haunted World”.
  • @newsmary: In that case, Faulkner’s Absalom, Absalom! tends to be the first one I recommend to storytellers.
  • @illyrica: I Fought the Law, Dan Kieran. Chinese Whispers, Hsiao-Hung Pai. Contingency Irony & Solidarity, Richard Rorty
  • @cemenzel: 1984; The Demon-Haunted World; The God Delusion. None really changed me much, but made me more aware.
  • @lordcope: Full Catastrophe Living; Getting Things Done; The Little Schemer; Crucial Confrontations; 7 Habits; Getting to Yes.
  • @reyhan: Neuromancer and The Adventures of Endill Swift
  • @channingwalton: zen and the art of motorcycle maintenance, Tao te Ching,Timeless Way of Building (look at things differently)
  • @jarhart: Working backward – Learn You A Haskell; Sex, Ecology Spirituality; No Limit Hold ‘Em Theory And Practice; Rich Dad, Poor Dad
  • @wgren: Also- Diamond: “Collapse”. Wright: “Remembering Satan”. And ofc 1984, Brave New World, Postman: “Amusing ourselves to Death”
  • @toluju: Not sure if it’s been mentioned, but Sophie’s World has a pretty big effect on me.
  • @kssreeram: “If you want to write” by Brenda Ueland. More than just writing, that book is about self-expression and creativity.
  • @DanielJMaxwell: Domain Driven Design by @ericevans0. It looks technical but really it’s about communication. Every human being should read it.
  • @dylanbeattie: ‘Chess for Young Beginners’ – the book that taught me that you can learn things from books. I can remember every page vividly.
  • @dylanbeattie: Chaos (Gleick). Every Rough Guide I ever read. Andrew Martin’s “How To Get Things Really Flat”, the first Dirk Gently novel…
  • @obs3sd: To Kill A Mockingbird has stuck with me for over 30 years
  • @dylanbeattie: …and Leonard da Vries Book of Experiments. Neuromancer. Neal Stephenson’s Baroque Cycle. Does the Jargon File count?
  • @paraseba: If “Les Misérables” doesn’t improve you, I bet you’re a statue
  • @runarorama: Ethically: Getting Things Done. The Virtue of Selfishness. Hávamál. And of course The Nicomachean Ethics.
  • @palfrey: Godel, Escher and Bach
  • @dcsobral: The Players of Null-A, Hellspark

Feel free to add more answers in the comments.

Myself, I still don’t have a good answer. But maybe when I’ve got through the above list I will.

This entry was posted in Books, Uncategorized on by .

Because ten just wasn’t enough

It was a slow day yesterday at the Aframe offices – no one was making much progress and we were all rather tired. So based on a conversation I had around lunchtime I decided to take a break and work on a little hack: A random commandment generator. I even had a good name for it: “The N Commandments”.

Well, I knocked together a quick prototype in about 20 minutes. Stef loved it and gave me a bunch of good style suggestions (and made me add a facebook like button. sigh), and before I knew it nearly the whole dev team was chiming in with suggested tweaks to the generator and avidly refreshing the page.

So the site grew pretty quickly from my initial black text on white screen and nothing else prototype, and is now online.

The commandments range from the utterly absurd to ones which should probably have been included up front.

Some of my favourites:

Oh, by the way, it also has an API of sorts: Add .json to any of the URLs to get a JSON representation.


Removing silent tracks from a video: A bug and the hack that killed it

So, as you might have gathered, I now work for a company called Aframe (lower case f. Very fussy about that. As if I didn’t spend enough time correcting people’s usage of my name). We do video stuff.

Of course, muggins here gets to be the one responsible for an awful lot of that video stuff. We have in house expertise, but they’re largely people with a huge amount of video experience and no programming experience. So this has been a bit of a learning experience for me, and it’s far from over yet.

Anyway, one of the things we do is convert all the video we get into a version you can view on the website. It’s not really our major selling feature, but it’s the basic feature on which we build a hell of a lot of other stuff.

We’ve had this collection of video from a particular customer for a while with the embarrassing feature that the version we were producing for the website had no sound. I couldn’t figure it out at all – the original had 8 audio channels, which seemed likely to be the source of the problem, but no matter which one I played in vlc it was always silent.

Actually… turns out that was just vlc lying to us and not switching channels. Sigh. (I’d file a bug report, but channel switching clearly works in other contexts and we can’t share the video, so I don’t know how best to do that).

We revisited this yesterday and verified that actually there are perfectly good audio tracks on channels 3 and 4, but the other 6 are silent and we were defaulting to using tracks 1 and 2. The camera in question has a variety of different ways of taking audio in, and which tracks are non-silent depends on which audio inputs you use. Of course, all the channels are there regardless of what you do, and there’s no metadata (that I could find) marking which ones contain sound.

It’s easy enough for our transcode process to select the right tracks, but combining the audio tracks is harder (some transcoders support it, some don’t). So ideally what we would do is just remove the silent audio tracks (you probably guessed this from the title).

But how do you do that? There’s no metadata saying which are silent, so you have to figure out which audio tracks are silent and which contain sound. In an automated way that is – listening to it is no good – and with the audio tracks in an arbitrary codec.

I pondered this for a little while and came up with a solution. It’s duct tape programming in the extreme, but actually it works very nicely:

We have no metadata. So we have to use the data. We don’t know what the format is, so we have to convert it to a standard one. What standard format would make it really easy to detect whether a track is silent? Well… an uncompressed one. Pick a random simple uncompressed audio format? Let’s try wav.

It turns out that detecting a silent wav file is trivially easy: It consists overwhelmingly of 0 bytes. There’s a brief header at the beginning, and then the rest of the file is just a big long chain of 0s. So for each audio track we generate a wav, check what percentage of it is 0 bytes, and if it’s >= 99.9% we declare it to be silent.
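A minimal Python sketch of that check, for illustration only (it is not the removesilent code, and the function names and chunk size are made up here). It assumes the wav holds signed PCM such as 16-bit, where silence really is zero bytes; 8-bit wav is unsigned and stores silence as 0x80, so it would need a different test.

```python
import io
import wave


def zero_byte_fraction(stream, chunk_size=1 << 20):
    """Return the fraction of bytes in a binary stream that are zero."""
    zeros = total = 0
    # Read in chunks so a large wav never has to fit in memory at once.
    while chunk := stream.read(chunk_size):
        zeros += chunk.count(0)
        total += len(chunk)
    # Treat an empty stream as all zeros: there is no sound in it.
    return zeros / total if total else 1.0


def looks_silent(stream, threshold=0.999):
    """Apply the rule from the text: at least 99.9% zero bytes means silent."""
    return zero_byte_fraction(stream) >= threshold


def make_wav(frames, rate=48000):
    """Build a mono 16-bit wav in memory so the sketch needs no real files."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(frames)
    buf.seek(0)
    return buf


silent_wav = make_wav(bytes(2 * 48000 * 2))        # two seconds of zero samples
noisy_wav = make_wav(bytes([0x10, 0x20]) * 96000)  # two seconds of non-zero samples

print(looks_silent(silent_wav))
print(looks_silent(noisy_wav))
```

For a real track you would pass open(path, "rb") instead of the in-memory buffers. The small header is why the threshold sits just below 100%.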

Once we’ve determined which tracks are silent it’s then a simple matter (admittedly a simple matter which took me a reasonable amount of wrestling with command line options and asking for help on IRC) to use ffmpeg to cut out the silent tracks. The invocation looks approximately like:

ffmpeg -i original.mov -y -ac 2 -map 0.0 -map 0.3 -map 0.4 -acodec copy -vcodec copy output.mov -newaudio

-ac sets the number of audio channels the output should have, the -map options say which channels from the source to use (the order is significant), and -newaudio says “Don’t fail in mysterious and incomprehensible ways as a result of my having tried to change the audio channels”.

In the unlikely event that anyone who isn’t us will find this useful, I’ve started collecting the results of experiments like this into a repo on github. The code for this post is there as “removesilent”. It’s not the exact code we’re using at work (as that’s better integrated into our system), but it’s pretty close.


When you have a hammer: Squish usage example

So I’m playing around with frequent item set mining, and the approach I’m trying (not that I expect it’s at all novel. I just want to get a feel for the problems, and maybe build a useful tool if I can’t find any that match my needs) needs a “getting started” step which works out the event list for all frequent one and two item sets.

So basically what I want is a map of tags to maps of tags to events. So my_map[t1] gives a map from each tag to the events in which it cooccurs with t1 (this should include t1 itself, so as a special case it also gives all the events t1 occurs in).
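To pin down the shape of that structure, here is a small in-memory Python sketch. The name cooccurrence_map and the sample events are just for illustration, and holding everything in a dict is exactly the approach that would not cope with gigabytes of data; it only shows what my_map[t1][t2] should contain.

```python
from collections import defaultdict


def cooccurrence_map(events):
    """Map tag -> tag -> list of event numbers in which both tags occur.

    Events are numbered from 1 in input order. Duplicate tags within one
    event are collapsed, which is an assumption of this sketch.
    """
    my_map = defaultdict(lambda: defaultdict(list))
    for event_id, tags in enumerate(events, start=1):
        unique = set(tags)
        for t1 in unique:
            for t2 in unique:
                my_map[t1][t2].append(event_id)
    return my_map


events = [
    ["foo"],
    ["bar"],
    ["foo", "bar"],
    ["foo", "bar", "baz"],
    ["baz"],
]
my_map = cooccurrence_map(events)

print(my_map["foo"]["foo"])  # every event that contains foo
print(my_map["foo"]["bar"])  # events that contain both foo and bar
```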

Further I want to do this in a way that can be expected to process huge amounts of data (certainly gigabytes) without falling over.

I was thinking about how to do the problem in a sensible way, and then I realised “Ah ha! I can use squish for this!”

In fact the solution ends up using squish twice. Here’s the code:

awk '{
  line_count++;
  for(i = 1; i <= NF; i++) 
    for(j = 1; j <= NF; j++)
      print $(i) " " $(j) ":" line_count
}' |  sort | squish -d ":" -s ", " | sed 's/ /:/' | squish -d ":" -s "; " | sed 's/:/: /'

It takes input like this:

foo 
bar
foo bar
foo bar baz
baz
foo bar baz bif bing bloop blorp
blorp

And outputs data like this:

bar: bar:2, 3, 4, 6; baz:4, 6; bif:6; bing:6; bloop:6; blorp:6; foo:3, 4, 6
baz: bar:4, 6; baz:4, 5, 6; bif:6; bing:6; bloop:6; blorp:6; foo:4, 6
bif: bar:6; baz:6; bif:6; bing:6; bloop:6; blorp:6; foo:6
bing: bar:6; baz:6; bif:6; bing:6; bloop:6; blorp:6; foo:6
bloop: bar:6; baz:6; bif:6; bing:6; bloop:6; blorp:6; foo:6
blorp: bar:6; baz:6; bif:6; bing:6; bloop:6; blorp:6, 7; foo:6
foo: bar:3, 4, 6; baz:4, 6; bif:6; bing:6; bloop:6; blorp:6; foo:1, 3, 4, 6

How does it work?

Well, we first use awk to transform it into data that looks like this:

foo foo:1
bar bar:2
foo foo:3
foo bar:3
bar foo:3

giving us cooccurring pairs followed by an event they cooccur in.

We then sort it to get all cooccurrences next to each other, and pass it through squish using : as the delimiter and , as the separator, which gives us something that looks like this:

bar bar:2, 3, 4, 6
bar baz:4, 6
bar bif:6
bar bing:6
bar bloop:6

A bit of sed (not really necessary, but helps for formatting) changes the character after the first tag, and now we can apply squish again with the new delimiter and use it to join lines together with the separator as ;. We then reformat slightly to get the desired end result.
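For anyone without squish to hand, the steps above can be sketched in Python. The squish function here is a stand-in whose behaviour is inferred purely from the examples in this post (merge adjacent lines that share the text before the first delimiter, joining the remainders with the separator); it is not the real tool's implementation, and cooccurrence_lines is a hypothetical name.

```python
from itertools import groupby


def squish(lines, delimiter, separator):
    """Merge runs of adjacent lines sharing the text before the first delimiter.

    Each run becomes key + delimiter + the remainders joined by separator.
    The input must already be sorted so that equal keys are adjacent.
    """
    split = (line.partition(delimiter) for line in lines)
    for key, run in groupby(split, key=lambda parts: parts[0]):
        yield key + delimiter + separator.join(rest for _, _, rest in run)


def cooccurrence_lines(events):
    """Python rendition of the awk | sort | squish | sed | squish | sed pipeline."""
    pairs = []
    for line_count, line in enumerate(events, start=1):
        fields = line.split()
        for a in fields:
            for b in fields:
                pairs.append(f"{a} {b}:{line_count}")
    pairs.sort()  # unlike sort(1), this sorts everything in memory
    by_pair = squish(pairs, ":", ", ")
    recut = (line.replace(" ", ":", 1) for line in by_pair)  # sed 's/ /:/'
    by_tag = squish(recut, ":", "; ")
    return [line.replace(":", ": ", 1) for line in by_tag]   # sed 's/:/: /'


sample = [
    "foo",
    "bar",
    "foo bar",
    "foo bar baz",
    "baz",
    "foo bar baz bif bing bloop blorp",
    "blorp",
]
result = cooccurrence_lines(sample)
for line in result:
    print(line)
```

Because the stand-in only ever looks at adjacent lines, it needs no more memory than one group at a time; the in-memory sort is the one part of this sketch that does not scale the way the shell pipeline does.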

This wouldn't be terribly complicated to do without squish, but squish made it basically trivial, and because sort scales reasonably well to large data and squish only ever processes data as a pipeline stage, this too should handle as much data as you want to throw at it.


An interesting idea for authentication

So one of my pet peeves is authentication. I absolutely hate having to remember a pile of different passwords. I would love everyone to use openid, but clearly it just isn’t happening. I can live with people using twitter and facebook and similar for auth, although I am rather disinclined to give you the ability to tweet as me so I can use your social cat tagging application (yes, I know about read only. No one is bloody using it), but even this doesn’t seem to get much uptake.

Mike and I were talking earlier, and we had an interesting idea. After talking it over with a few friends (Andy Bennett and Alaric Snell-Pym) we refined the idea a bit.

The idea is basically as follows:

  1. Login is done by dedicated personalised login links. It would be something like http(s)://mysite.com/login/some-big-random-string.
  2. When you sign up you’re given a page which says “this is your login link. Please bookmark it”
  3. You also provide your email address, and you can at any point say “I need a new login link. Please email it to me”. It will email you the new link, and the old one will be invalidated as soon as you click on it (it can’t be invalidated before that, because otherwise anyone can invalidate your link).
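The three steps above could be sketched roughly like this in Python. Everything here is hypothetical: the LoginLinks class, its method names and the in-memory dicts are stand-ins for whatever a real site would use, and actually sending the email is left out. The one rule it tries to capture is that a replacement link only kills the old one once it has been clicked.

```python
import secrets


class LoginLinks:
    """Toy in-memory model of per-user login links (illustration only)."""

    def __init__(self):
        self.active = {}   # email -> token the user has bookmarked
        self.pending = {}  # email -> replacement token emailed but not yet clicked
        self.owner = {}    # token -> email, for every token that still works

    def _new_token(self, email):
        token = secrets.token_urlsafe(32)  # the "big random string" in the URL
        self.owner[token] = email
        return token

    def sign_up(self, email):
        """Step 2: create the link the user is asked to bookmark."""
        self.active[email] = self._new_token(email)
        return self.active[email]

    def request_new_link(self, email):
        """Step 3: issue a replacement. The old link keeps working for now."""
        if email not in self.active:
            return None
        # Assumption: only the most recent unclicked replacement stays valid.
        stale = self.pending.pop(email, None)
        if stale is not None:
            del self.owner[stale]
        self.pending[email] = self._new_token(email)
        return self.pending[email]  # a real site would email this, not return it

    def login(self, token):
        """Step 1: resolve a login link to a user, or None if it is invalid."""
        email = self.owner.get(token)
        if email is None:
            return None
        if self.pending.get(email) == token:
            # First click on the replacement: only now is the old link invalidated.
            del self.owner[self.active[email]]
            self.active[email] = token
            del self.pending[email]
        return email


site = LoginLinks()
old_link = site.sign_up("me@example.com")
new_link = site.request_new_link("me@example.com")
print(site.login(old_link))  # old link still works until the new one is clicked
print(site.login(new_link))  # clicking the new link activates it
print(site.login(old_link))  # the old link is now dead
```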

It should be simple to use – you just bookmark links for your site that are specific to you, which is no worse than storing passwords locally, and it can easily be invalidated by email equivalently to “I forgot my password” links people are familiar with. The security is not really any worse than normal password based authentication (there’s the potential to see it in the URL bar, but you just redirect away from it quickly and make it a long random string, which renders this basically not an issue).

This seems so painfully simple it’s astonishing no one’s doing it if there’s not a major flaw I’m missing.

What do you think?
