Shaving yaks and finding feeds

So I had some interesting ideas I wanted to play with, all to do with keeping on top of streams of information.

Of course, I needed some streams of information to keep on top of in order to do this. I decided to go with my RSS feeds (the other obvious source being Twitter).

To do that I needed a database of feed entries, so I wrote a small program to build one (I really should just have used feed-bag, but there were some things I wanted to tweak and integrate, so I didn’t).

Unfortunately, for whatever reason, I ended up with a lot of URLs in my OPML that pointed to sites rather than feeds, or to something invalid. I’m not sure offhand if this was an import problem or a problem in the Google Reader export.

So, I thought, let’s do our damnedest to correct URLs: if it points to a site, do feed discovery, follow redirects, etc. It can’t be that hard.
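For the curious, here’s roughly what the well-behaved case looks like. This is a minimal Python sketch of standard feed autodiscovery, not the code from my library: it looks for `<link rel="alternate">` tags with an RSS or Atom type and resolves them against the page URL. The names `FeedLinkParser`, `discover_feeds` and `find_feeds` are made up for this example, and it assumes the page actually bothers to declare its feed, which is exactly the assumption that falls over in practice.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# MIME types conventionally used to advertise feeds in <link> tags.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects feed URLs advertised via <link rel="alternate" type=...>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").strip().lower()
        href = attr.get("href", "").strip()
        if "alternate" in rels and mime in FEED_TYPES and href:
            # Relative hrefs are resolved against the page's own URL.
            self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(html, base_url):
    """Return the feed URLs declared in an HTML document, in page order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


def find_feeds(url):
    """Fetch a page (urlopen follows redirects) and run discovery on it."""
    with urlopen(url, timeout=10) as response:
        final_url = response.geturl()  # URL after any redirects
        html = response.read().decode("utf-8", errors="replace")
    return discover_feeds(html, final_url)
```

Calling `find_feeds` on a site URL gives you a list of candidate feed URLs, which will be empty whenever the site doesn’t declare one. That empty list is where all the hacky tricks have to start.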

Cue me getting very angry. Suffice it to say, if you do what I did and foolishly expect people on the web to follow standards, you are very mistaken.

Anyway, after much hacking around trying to get this to work, I decided to codify the various tricks into a library so you don’t have to share my anger. I’ve called this library feedify. This is very rude of me, as there’s another Ruby library called feedify, but given that it hit 0.0.1 in January 2008 and hasn’t been updated since, I don’t feel too bad about stomping on its namespace.

Additionally, I’ve put up an HTTP interface to it. Give it a URL and it will try to find a feed associated with that URL and redirect you to it. You can also run this service yourself – it’s included in the GitHub project.

This is all very rough and liable to change at the moment. If you have any bug reports about URLs it misses or gets wrong, I’d be very interested to receive them.
