Warning!

This is still very much in beta - use at your own risk.

Today, I set about teaching myself the basics of web scraping, with the intention of putting it to some good use. By coincidence or providence, I had just read Kottke’s post about creating an iCal file for summer movie releases, and immediately thought of a personal itch I could scratch.

The Irish Film and Television Network provide a list of Irish Theatrical Releases, but this is just one big flat HTML file that is only marginally helpful. It still relies on me to remember to go to their page and see what’s out and when. It would be much more useful if this information was somewhere I tend to spend a lot of my day looking - say, my calendar program - and even more helpful if it was somewhere I could carry it around with me - say, my phone.

Well, now I can. Using various combinations of bash, sgrep, awk and sed, I created a script that will automatically grab the ‘releases’ page of IFTN.ie and export it as an .ics file, which can be read through iCal/Sunbird, and from there, synched to my phone.
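For the curious, here is a rough sketch of the "export it as an .ics file" half of the job. It is not the actual script (that one is bash, sgrep, awk and sed); it is a Python stand-in that assumes the scraping step has already produced (title, date) pairs. The function names, the PRODID string and the UID format are all made up for illustration, and the year on the sample date is a guess.

```python
from datetime import date


def escape_text(value: str) -> str:
    """Escape the characters that are special inside an iCalendar TEXT value."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def build_calendar(releases, prodid="-//example//irish-releases//EN"):
    """Turn (title, release_date) pairs into iCalendar text, one all-day event each.

    The prodid default and the UID scheme below are hypothetical placeholders.
    """
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for index, (title, release_date) in enumerate(releases):
        stamp = release_date.strftime("%Y%m%d")
        lines += [
            "BEGIN:VEVENT",
            f"UID:{stamp}-{index}@example.invalid",
            f"DTSTART;VALUE=DATE:{stamp}",  # a date with no time: an all-day event
            f"SUMMARY:{escape_text(title)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Sample entry only; the year is assumed.
    print(build_calendar([("Kicking and Screaming", date(2005, 6, 3))]))
```

This skips line folding for long titles and anything to do with fetching or parsing the IFTN page, which is where most of the real work is.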

You can grab the .ics file here: http://www.fuckcuntandbollocks.com/dorkus/irish_releases.ics

If you find this useful, please let me know.

And now the caveats:

  1. IFTN’s listing page is braindead. I can’t help this, and my script can’t predict its unusual behaviour. For example, why does it have two release dates for “Kicking and Screaming”, one on June 3rd, the second on July 29th? And why does it randomly have two “2005”s after “Fever Pitch”?
  2. This is my first real attempt at creating an .ics file. I ploughed through RFC 2445 for pointers, but I might have committed some mortal vCalendar sin without knowing it.
  3. Bug reports to the usual address.
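On the second caveat: a few of the more obvious sins can at least be checked mechanically. The sketch below is a hypothetical helper (the name check_calendar is mine, and it is written in Python rather than the shell tools the script uses) that looks for a handful of things RFC 2445 asks for: CRLF line endings, the VCALENDAR wrapper, the VERSION and PRODID properties, balanced VEVENT blocks, and lines no longer than 75 octets. It is nowhere near a full validator.

```python
def check_calendar(ics_text: str) -> list:
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []

    # Anything left after removing CRLF pairs that still contains LF is a bare LF.
    if "\n" in ics_text.replace("\r\n", ""):
        problems.append("bare LF line ending (RFC 2445 uses CRLF)")

    lines = [line for line in ics_text.splitlines() if line]

    if not lines or lines[0] != "BEGIN:VCALENDAR":
        problems.append("does not start with BEGIN:VCALENDAR")
    if not lines or lines[-1] != "END:VCALENDAR":
        problems.append("does not end with END:VCALENDAR")

    # VERSION and PRODID are the two required calendar properties.
    for required in ("VERSION:", "PRODID:"):
        if not any(line.startswith(required) for line in lines):
            problems.append(f"missing {required[:-1]} property")

    if lines.count("BEGIN:VEVENT") != lines.count("END:VEVENT"):
        problems.append("unbalanced BEGIN:VEVENT / END:VEVENT")

    # Content lines should not exceed 75 octets, not counting the line break.
    for number, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > 75:
            problems.append(f"line {number} is longer than 75 octets and should be folded")

    return problems
```

Reading the downloaded .ics file in binary mode, decoding it as UTF-8 and passing the text to check_calendar would give a quick list of anything worth putting in a bug report.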

Update: For my next trick, I did the same for videogames using Eurogamer’s release dates. Grab the calendar file here: http://www.fuckcuntandbollocks.com/dorkus/irish_game_releases.ics