Summer Projects

June 26, 2012

Both of the summer projects that I'm supervising with Walter Colombo have now started, with two students joining us for 8 weeks over summer to gain some experience of the research environment and work on some (hopefully) interesting projects.

One of our students (James) is investigating the ranking of content on Twitter by pairwise comparison. He's currently building a website to allow comparison of tweets and hopefully we'll be able to get enough people to have a go at it that we can gather a large number of ratings of tweets. From there we can investigate different ranking systems and do some analysis to see if we can pull out what makes a tweet "good" or "bad". The site should go live soon, so watch this space for an announcement.
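No ranking system has been chosen yet, but as a rough illustration of how pairwise ratings can be turned into an ordering, here is a minimal sketch using Elo-style updates. This is an assumption about one candidate method, not the project's actual approach; the tweet names, the `k` factor and the starting rating are all hypothetical.

```python
from collections import defaultdict


def elo_rank(comparisons, k=32.0, start=1000.0):
    """Rank items from (winner, loser) pairs using Elo-style updates."""
    ratings = defaultdict(lambda: start)
    for winner, loser in comparisons:
        # Expected score of the winner given the current ratings.
        gap = (ratings[loser] - ratings[winner]) / 400.0
        expected = 1.0 / (1.0 + 10 ** gap)
        # The winner gains what the loser gives up; upsets move ratings more.
        delta = k * (1.0 - expected)
        ratings[winner] += delta
        ratings[loser] -= delta
    # Highest rating first.
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: each pair is (preferred tweet, other tweet).
votes = [("tweet_a", "tweet_b"), ("tweet_a", "tweet_c"), ("tweet_b", "tweet_c")]
ranking = elo_rank(votes)
```

With these three votes `tweet_a` ends up on top and `tweet_c` at the bottom. Elo depends on the order in which votes arrive, which is one reason it would be worth comparing against other systems.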

The other student (Tim) started yesterday and is going to be analysing retweet propagation through the Twitter social network. Hopefully this will tie into some work being done by Will (a PhD student in the school). As he's only just started we have no firm plans yet, but we may end up looking at community detection or some other type of analysis to go along with this.

I'll try and post regularly on these and keep the project pages updated throughout the summer.

Update

May 15, 2012

So, I've been kind of ignoring this blog, haven't I?

Ah well, never mind. I have a good excuse as I've been very busy working on something else, which I'll explain in a post sometime soon. For now though I've made a decision to use this blog more and also to include more regular small updates on work things as well as the other stuff I've been doing outside of work.

So, what work stuff have I been doing lately?

Twitter Experiment

We recently launched our Twitter-based experiment here. Uptake has been very good, the initial results are quite interesting, and we're currently prepping a short paper for SocialCom which I personally think has a pretty good chance of being accepted. I'll post about some of the initial findings soon.

Our analysis has revealed a couple more pieces of data that we need to collect, so I've got a couple of modifications to make to the site then we can hopefully start spreading the link a bit further to get a few hundred more respondents for a nice journal article.

Foursquare Experiment

The initial plans for this have been scaled back and refined, but this should be online soon. I'll post more in the next couple of weeks as I finish off development on this and get it ready for deployment.

CUROP Summer Projects

Walter and I were successful in applying for some CUROP funding for a summer project looking at retweet flow in Twitter. We've had a lot of interest in the project, so much so that we've managed to convince a couple of lovely people to part with some more cash so we can take on another student for an additional summer project, this one based on Twitter ranking. I'll post about both projects in detail once we've made our decision on which students to take on for the summer.

Mobisoc Website

Over the course of about a day Matthew Williams and I recently hacked together a website for the MobiSoc researchers here at CU, which is now live and will hopefully start to flourish as more people get involved. It's not much at the moment, but it could be good...

As well as those projects there have been discussions with some friendly scientists from Cambridge about getting our hands on some personality data, the ever-present EU deliverables that have needed sorting, and the logistics for the next project partner meeting in Athens in June. It's been a busy couple of months.

Truckers of Husk + Gallops + Kutosis Review

March 10, 2012

Last Saturday I went along to The Globe in Roath, accompanied by my excellent musical companion Ms. Jones (and joined later on by our token northerner Mark) to watch a number of bands that we'd seen before, but in a new setting. This is a review of that what happened there.

Let's start with the venue, The Globe. The Globe is not the worst venue in Cardiff (Bogiez/Barfly, The SU Great Hall). It's tucked out of the way in the corner of Roath, so anyone going to a gig there is meant to be there, which is nice. It's an old cinema and has a lovely balcony, meaning a great view of the bands, and most importantly the sound in the place is not terrible. The only bad thing about the venue this week was that they'd sold out of tolerable ale, which is no great problem. We arrived about twenty minutes after doors were due to open, and unsurprisingly the doors were not open yet. For some reason The Globe exists in a timezone that is about 45 minutes behind the rest of Cardiff, so don't ever bother getting there on time, it's just not going to happen. Rubbish timekeeping and crappy beer selection aside, it's a decent venue in which to watch a band and on this night there were three on offer.

KUTOSIS

I'm fairly sure I've seen these guys a number of times before, at least once or twice at the SWN festival, maybe somewhere else as well. They're a decent band rocking a mid-late 90's vibe that marks them out as guys that grew up with grunge and late 90's indie and decided the best thing to do when they started a band was to emulate that. Unfortunately you sometimes feel that they forgot to really add anything to that era, and haven't quite managed to push things on. Don't get me wrong, it's not a totally bad thing - they play well, they put effort into what they do, and their songs have a catchy uptempo feel. I enjoyed watching them, they were having fun and playing well, with some good riffs, tight playing and a good overall feel to the music. There are some interesting things going on with their songs, but it just feels like they need to push things a little bit further out of a safe zone and do something a bit more interesting. That all being said, they're well worth a watch, especially if you're like me and you grew up with grunge and late 90's indie and enjoy that kind of thing.

This is a recent track that seems in parts to show more development, it's got more going on:

Overall they were enjoyable, played well, and I'll give them 4 out of some.

Gallops

Again, I've seen Gallops a number of times before, mainly at SWN. On paper they're not my thing, focusing fairly heavily on electronic sounds, but in reality I really love them. They're an awesome live act and every time I see them they have something new to show. The main driving force behind them is the thundering and perfect drums; their drummer is an absolute animal, arrogant and aggressive and with every right to sit behind the drumkit with the knowledge that he's almost certainly the most talented guy with a set of drumsticks in the room. He almost imbalances the band, as with his drumming there's not a lot the other band members can do to match his talent, which is probably one of their only weak points. The only other glaring weak point would be the guy with the mac. I've said it before and I'll say it again, a laptop is not an instrument. If your job could be replaced with someone pressing play on a tape deck you're superfluous, get the fuck off the stage. That aside, this band have grown massively in the two/three years I've been watching them, and I'm looking forward to their album in the next couple of months to see where they're going next.

Once again these guys were great live. They clearly spend a lot of time practising getting it right, and it pays off. I'll give them six out of some.

Truckers of Husk

Ah, the main event. A band that I first saw at the 2nd SWN Festival, in whatever Y Fuwch Goch was before it was that (and then failed to be that and became the Moon Bar). A band that have swiftly become one of my favourite live acts in Cardiff, a band who I love going to see. Their post-rock math vibes are great, and they are another band (like Gallops) who continue to grow and push stuff forward. Their first album (released last year) is on permanent rotation in my Spotify playlist, and I still can't get enough of it. In case you didn't get it, I love this band.

Today's gig was exceptional, the band presenting themselves along with 'The Great White Silence', the fairly harrowing 1924 documentary about Scott's ill-fated voyage to the South Pole. Dressed all in white the band played their songs along with the film, fitting the tracks to the projection behind them, timing things beautifully to coincide with the action. I'm fairly certain they'd put the effort in to trim songs here and there to make them fit, cutting some noise here, adding some there, and it all paid off. They presented themselves on stage as a band having fun and doing well doing it, but you could tell that they recognised this was (one of?) their biggest gig to date and had put the effort in to make it something special. The songs flew by and by the time the movie drew to a close, with our intrepid Antarctic explorers dead in the Antarctic wasteland and the band themselves moving the drumkit down into the crowd for the last flour-covered intense burst, we were all thoroughly sated. Plus the saxophonist kept his shirt on.

I give them 8 out of some.

All in all, a thoroughly excellent night of entertainment. Top marks.

100 Movies in the Cinema in 1 Year

March 4, 2012

After reading this blog post at the end of January I decided that watching at least 100 movies in the cinema in a year was the kind of challenge I could get behind. Last year's challenge to get fit and healthy was fairly successful and seemed like much harder work than going to the cinema a few times could ever be. After all, 100 movies a year is less than 2 a week, so it must be easy, right?

Unfortunately I started this project late, so at the time of reading the article and deciding to go ahead with it I was four weeks into the year and had only seen three films at the cinema. This is not the kind of progress needed for this challenge! So, I got myself a Cineworld Unlimited card and hit the cinema hard, aiming to get through February and be caught up to where I should be by the end of the month.

Of course, being a massive data geek I have been recording everything, making a note of every movie I have seen this year so far, which means I'm amassing a fairly large amount of data on my movie watching habits, which can mean only one thing. Crappy Excel Graphs! I'll be posting some throughout the year to mark how the challenge is going, and the first lot are here.

The first shows the number of movies seen in total for each day of the year against the number of movies I would need to see in order to hit the target of 100 by December 31st. As you can see, the late start in January did me no favours, but by the end of February I've almost caught up to be where I need to be.

Movies Seen against Target Number

The second crappy Excel graph shows the movie viewing rate (the number of movies seen divided by the number of weeks elapsed) against the target rate, along with the actual number of movies watched that week. It's clear to see that January was a washout, but that February was excellent and helped to bring the target rate down below 2 movies a week.

Movie viewing rate against target, presented with actual number of films seen in a week
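For anyone who would rather not do this in Excel, the arithmetic behind both graphs can be sketched in a few lines of Python. This is only an illustration: it assumes the target line is a straight-line pace across the year and that the "target rate" is the number of films per week still needed to reach 100, which may not be exactly how the spreadsheet works it out. The function names are made up for the example.

```python
from datetime import date

TARGET = 100  # films to see by December 31st


def required_by(day, target=TARGET):
    """Films needed by `day` to stay on a straight-line pace for the year."""
    days_in_year = (date(day.year, 12, 31) - date(day.year, 1, 1)).days + 1
    return target * day.timetuple().tm_yday / float(days_in_year)


def remaining_weekly_rate(seen, day, target=TARGET):
    """Films per week needed from `day` onwards to still hit the target."""
    days_left = (date(day.year, 12, 31) - day).days
    return (target - seen) / (days_left / 7.0)


today = date(2012, 3, 4)
on_pace = required_by(today)                # about 17.5 films
needed = remaining_weekly_rate(16, today)   # about 1.95 films a week
```

With 16 films seen by March 4 that puts the total roughly one and a half films behind the straight-line pace, with a little under 2 a week needed for the rest of the year.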

So, that's where we're at. It's March and I've seen 16 films at the cinema, at an average cost of £3.18 per movie. I'm well on the way to 100 and I'll keep you updated with more crappy excel graphs and rubbish averages along the way...

Django + Last.FM Authentication

February 27, 2012

Following a conversation with Jon last week, I've been having thoughts about playing around with Last.FM data again, having not looked at that API for a couple of years. We had discussed the possibility of using the Last.FM web services as part of a project at a hackathon that some of the undergraduates are running in a couple of weeks' time. Since I last did anything with Last.FM I've moved from primarily developing in Java to mainly using Python, so I thought it would be useful to develop some basic Python examples using the Last.FM API.

Yesterday I made a basic skeleton Django site that uses Last.FM for user authentication and includes a basic API wrapper for making queries. The code is all up on GitHub for those that are interested.
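The fiddly part of Last.FM authentication is signing requests. After the user authorises the app, Last.FM redirects back with a token, which is exchanged for a session key by calling `auth.getSession` with an `api_sig` parameter. The sketch below shows how that signature is built according to the Last.FM API documentation. It is not the code from the repository; the function name and the placeholder key, token and secret are assumptions for illustration.

```python
import hashlib


def lastfm_signature(params, secret):
    """Build the api_sig value Last.FM expects on signed calls."""
    # Parameters are sorted by name and joined as <name><value>, then the
    # shared secret is appended. 'format' and 'callback' are not signed.
    signed = "".join(
        name + str(params[name])
        for name in sorted(params)
        if name not in ("format", "callback")
    )
    return hashlib.md5((signed + secret).encode("utf-8")).hexdigest()


params = {
    "method": "auth.getSession",
    "api_key": "YOUR_API_KEY",        # placeholder
    "token": "TOKEN_FROM_CALLBACK",   # placeholder
}
params["api_sig"] = lastfm_signature(params, "YOUR_SHARED_SECRET")
```

The signed `params` would then be sent to the Last.FM web service endpoint, and the session key from the response stored against the Django user.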

Foursquare Category Icon Downloader (2)

January 31, 2012

UPDATE [01 August 2014]: Even more breaking! More recent version of the script here

Someone commented recently that the script I'd written to download the Foursquare category icons was broken. downloader.py is an update that works as of today (notice the URL param with today's date) that will fetch all available sizes of all the category icons and store them in folders by size.

matplotlib + mac os x 10.7 (Lion)

January 12, 2012

Today I spent a good long chunk of my day shouting at my computer and trying to install matplotlib, a very handy python library. I finally solved it, so figured it was worth posting about here, in case anyone else has the same problem.

I recently updated my system python to version 2.7, so I've had to go through reinstalling various libraries that I use on a semi-regular basis. Matplotlib is one of these that up until now I've had no call for. However, my experiment project that I'm working on needs some pretty graphs, so I needed to get matplotlib installed.

I tried to install using pip, I tried to install using easy_install, and I tried to install by downloading the source and compiling it. None of these methods worked and I was often getting compiler errors like this:

llvm-g++-4.2: error trying to exec '/usr/bin/../llvm-gcc-4.2/bin/powerpc-apple-darwin11-llvm-g++-4.2': execvp: No such file or directory

Which is very annoying. So I spent some time googling, and it seemed that the reason for these errors is that the compiler was trying to build a universal version of some library - a version that would work on both 32 and 64 bit Intel macs and older PowerPC based macs. However, Apple have now removed all traces of PowerPC support from OS X and Xcode, so the compiler was failing. There is a way to add ppc support back to Lion, but either I was doing it wrong or those instructions are slightly out of date, because it didn't solve my problem. Now, depending on how I'd tried to fix the problem, it would fail with a lot of errors in compiling "ft2font.cpp". Other times it would be accompanied by "i686-apple-darwin10-gcc-4.0.1: Internal error". After uninstalling and reinstalling Xcode for the third time I realised I was going about this all wrong. I was trying to fix the problem, rather than look for why the problem was occurring. Why would my laptop try and build a universal library with PowerPC support if it knew it couldn't compile or run ppc software?

The answer lay with my version of python. Running

<code>file $(which python)</code>

revealed that my version of python was also universal:

<code>
/usr/local/bin/python: Mach-O universal binary with 3 architectures
/usr/local/bin/python (for architecture i386): Mach-O executable i386
/usr/local/bin/python (for architecture ppc): Mach-O executable ppc
/usr/local/bin/python (for architecture x86_64): Mach-O 64-bit executable x86_64</code>

with support for ppc, i386 and x86_64 architectures. Because of this the compiler was trying to compile all the supporting libraries as universal as well.
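The same thing can be checked from inside Python. Extension builds reuse the compiler flags recorded when the interpreter itself was built, so looking for `-arch` options in those flags shows which architectures the compiler will be asked to target. This is a small sketch rather than part of the original debugging session; `build_archs` is a made-up helper name.

```python
import re
import sysconfig


def build_archs(flags):
    """Return the architectures named by '-arch X' options in a flag string."""
    return re.findall(r"-arch\s+(\S+)", flags or "")


# CFLAGS recorded when this interpreter was built (may be None on some platforms).
archs = build_archs(sysconfig.get_config_var("CFLAGS"))
if "ppc" in archs:
    print("This Python was built with ppc support; extensions will try to be too.")
else:
    print("Architectures requested at build time: %s" % (archs or "none listed"))
```

On a universal build like the one above this would be expected to list ppc alongside i386 and x86_64.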

I replaced python with the latest version (2.7.2), using the installer built only for i386/x86_64, and the problem was solved. Matplotlib built from source fine, and I'm pretty sure (though I didn't test it) that I could have installed using pip or easy_install too.

A few hours spent swearing at the computer and an important lesson learned. Worth it.

Introduction to Django talk

December 9, 2011

Last week I was asked by Stuart Allen to give a talk to the students in his first year "Problem Solving with Python" course on the topic of django, the python web framework. As I'm not lecturing at the moment I like to do these odd talks or cover the odd lecture just to keep up the skill set for when I start doing it again properly, so I agreed. Plus, I think django is ace and it's always nice to try and <del>brainwash students into using</del> educate students about your favourite technology :-)

The talk was actually really well timed as the students are currently working on a coursework in another module ("Web Applications") where they are required to develop a blog application using MySQL and PHP. This meant I was able to show off just how simple it would be to do it in django instead - so I gave a quick introductory talk then did some live coding and created a pretty functional blog application in about twenty five minutes. I think it annoyed quite a few of the students to see the difference in complexity between hand coding everything in PHP and having django take care of most of the boilerplate code. It may even have been a bit evil of me, but hey, that's what I'm here for.

The slides for the talk are attached to this post (Introduction to Django Talk), although they're of the "you need to be there to get the details" style of slide design, so they may not be that useful. Thanks to Mike Cantelon for his django keynote theme, and the lovely flickr users who uploaded photos with a CC license so I could reuse them.

BoxUK "For the Social Good" Hackday

November 22, 2011

Yesterday I attended the "For the Social Good" hackday, organised by BoxUK and held at the Students' Union at Cardiff University. As you may have gathered from the title, it was a hackday themed around creating apps that had some benefit to society. The event ran from 10am to 10pm, so it was a fairly short hack event compared to some, which had a big influence on what could be done during the time available. In total, given the time needed for introductions and getting started in the morning, then for presentations and judging at the end of the day, there were only actually about nine and a half hours of coding time, so it was quite a tough day. I went along with Matt Williams and Mark Greenwood as our team and over the course of the day we managed to put together a fairly functional web app.

We took what we learnt from hosting and participating in the Foursquare hackathon and used that to guide what we did at this hackday. At the last event we had been slow to get going, took too long to form a team and decide upon an idea. We had aimed too high to begin with, and then also let feature creep affect what we wanted to build. Finally, our teams were far too big, meaning it was hard to divide labour up so that everyone was participating and had something to do. We finished with a working app at the end of the event, but we were determined not to make the same mistakes this time so we could get a more successful outcome. (Not that FourCrawl wasn't a successful outcome of course, but more on that in the next couple of weeks.)

Going in to this hackday we solved the team size problem naturally, as there were only three of us able to attend which made forming a small team compulsory! To try and get a good start on the day of the event we met together on the Wednesday before to have a think about ideas and a bit of a brainstorming session. While we had a lot of ideas, most of them were too complex for a short day long hack event, so unfortunately we didn't really come up with anything. We did have a vague idea of doing something related to cold weather though, to make whatever 'it' was topical, given that we're heading into winter now. Fortunately at least one of us (not me) had obviously been thinking about it further after the meeting and on Saturday evening Matt sent round a message on G+ that he'd found a good data set we could use for an app that would be both easy to code within the time limit, and would hit the brief for the hackday pretty well. He'd found a dataset on data.gov.uk related to road accidents that included information on weather and road surface conditions that we could use to show people where to avoid in icy weather, or to give them an idea of previous incidents in their area. With an idea starting to form we went into Sunday in a pretty good state, and ready to start coding very quickly.

The day started well, everyone arrived pretty much on time, which made for a quick start. Our app idea solidified easily, and early on we realised that while we had historical data to display it would also be nice to have some current updates, so we added a social element to the idea, by scraping twitter for a certain hashtag so that people could report problems with icy roads in their area, along the lines of the '#uksnow' hashtag that has been used frequently the last couple of winters. We were able to jump in quickly to start getting things built. Matt was in charge of getting the data into a database and building an api to query to retrieve it, using django to get a backend up and running quickly. I was in charge of building a front end map to display the data on, primarily working with HTML and javascript (and the Google Maps API obviously) and then Mark was in charge of getting the twitter scraping and postcode geo-locating working using python (so it could be slotted into the django backend), accessing the twitter API through tweepy.

Things went really well with the implementation, I spent a while cannibalising elements from previous sites to get a frontend up and running quickly (and also thank you bootstrap!) and by lunch we were pretty close to starting to consolidate things together. We set a target of starting to deploy by 6, so that by the time dinner was served Matt was starting to battle his server and its stubborn refusal to allow easy deployment of django apps. Things went so smoothly in fact that we were all finished and ready with almost an hour to spare, meaning we were the only team able to slope off for a self-congratulatory pint before the presentations and judging. We headed to the Taf (a place I hadn't been in at least five years) for a swift one, then went back to the hackday for the final presentations.

Our presentation was first, and Matt demoed the app well while I played around on the laptop next to him. After us the rest of the teams went, and there were some really good apps shown off. My particular favourite was the 'occupy where' app which allowed you to search for an 'occupy' camp or demo near a location, then presented information pulled in from multiple sources about that particular 'occupy' movement. A nice idea, well executed. Following the presentations the judges spent some time talking, then announced the winners. The surprise of the day was that they'd decided our app was the best and that we'd won! I was pretty shocked, but it was a very nice surprise. A group of first years from COMSC were runners up, and then a third year COMSC student (Tom Hancocks) won the best individual developer prize, so COMSC actually walked away with all the prizes! After a quick tear down of the venue it was time to head to the Taf for some celebratory drinks, a satisfying end to a really great day.

UPDATE 25/11/11:

There are some excellent write-ups below by various other people involved in the event. I'll add to this list as and when I find any more:

http://www.boxuk.com/news/box-uks-public-hacking-event-a-resounding-success

http://johngreenaway.co.uk/hack-day-for-the-social-good

http://marvelley.com/2011/11/24/for-the-social-good-box-uk-hackday-1-recap/

http://whitehandkerchief.co.uk/blog/?p=53

http://www.cs.cf.ac.uk/newsandevents/hackday.html

http://www.cardiff.ac.uk/news/articles/diwrnod-hacio-7933.html

http://www.boxuk.com/blog/the-box-uk-hack-day-a-review 

Foursquare Hackathon Post Mortem

November 22, 2011

**Note: I wrote this post just after the hackathon, but then forgot to actually publish it!**

There's a fair amount I learnt from taking part in and helping to organise the foursquare hackathon here at the university that probably deserves its own post. I'll split it into two parts, the taking part and the organisation.

Organising a hack event

There were a couple of things I learnt from helping to organise the event that perhaps weren't clear to begin with:

  1. Expect dropouts. A lot of them. We ended up with about 50% of the people that said they would attend actually attending. Which is fine because we hadn't planned on providing catering etc. anyway, but if we had, we may have overspent massively.

  2. Get someone in to run the day that doesn't actually want to code. All three of us organising the event really wanted to get involved and create stuff, so there was no-one left to organise the other essential things like food. Too often people would talk about ordering food, or going out for dinner and then get sucked back into programming and it would get forgotten about. By the end of the weekend we'd all eaten really badly, often very late at night. Having someone there to do things like order pizza or make coffee runs would probably have helped things go a lot smoother.

  3. If you're going to be communicating with the world at large using twitter or something similar, make sure you communicate with each other within the organising team about posting updates. Often two people would try and reply to a query at the same time, which made us look a bit silly. Not a major issue, but it might help give a more professional feel, even if you're total amateurs.

Participating in a hack event

We also learnt a lot about participating in a hack event. My top tips for a successful project would be:

  1. Do your research. If you can, go into the event with an idea already, just in case no-one else has any. It'll help you get started quicker.

  2. Keep it small. Don't over reach. 24/48 hours is not a long time to code something, and by making it too complicated you'll be disappointed with the outcome. Simple is best. Add extra features later if you have time.

  3. Small, agile teams. These projects have to be small because of the time constraints, so if teams get too big there won't be enough for everyone to do. People will end up feeling useless or left out, which is never good. I would say a maximum of 4 people per team.

  4. If you want to learn something new, a hack event can be a great place to be forced to up your skills in a particular area very quickly. It can also be a great place to learn new skills and tools from others. However, this may lead to the end result not being ideal. If you can live with that, great, you'll walk away happy. I personally upped my javascript knowledge over the weekend from approximately 0 to something approaching a passable working knowledge, which was great.

  5. Know your tools. This is pretty much common sense, but if you're not after learning something new and you just want a successful app/outcome, pick and use the tools you know really well. We went with django on the back end because Matt and I know it pretty well, and we were able to get that part of the site running really really quickly. Had we gone with something else it may not have worked so well.

  6. Get coding. Screw the design, you can worry about that later. Start coding early, and code fast.

So those are the things I learnt from the weekend. Hopefully we'll be able to use this knowledge again in the future, both organising and participating in future events.