Google Web History - Wordle

June 7, 2011

So, you've downloaded your Google search history. What's the first thing you do? Split all the queries into individual words and make a Wordle, of course:

There we have it - my use of Google for five years. Turns out I do programming and live in Cardiff. Who'd have thought it?

**Edit:** why have I googled for google so much?
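As a rough sketch of the word-splitting step, assuming the queries are already available as a list of strings: the snippet below lowercases each query, splits it into words and counts them, which is the kind of input a word cloud generator works from. The function name and the sample queries are made up for illustration and are not taken from the real history.

```python
import re
from collections import Counter


def word_counts(queries):
    """Split each query into lowercase words and count how often each appears."""
    counts = Counter()
    for query in queries:
        # Keep runs of letters, digits and apostrophes as words
        counts.update(re.findall(r"[a-z0-9']+", query.lower()))
    return counts


# Hypothetical sample queries (illustrative only)
sample = ["python django tutorial", "cardiff weather", "Python wget example"]
counts = word_counts(sample)
print(counts.most_common(3))
```

The resulting counts (or simply the full list of words) can then be passed to whichever word cloud tool you prefer.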

Losing Weight in 2011 continued... My Fitness Pal

June 7, 2011

(The first in a series of posts on apps I've found useful under 'the new regime')

One of the best apps/services I've found for general fitness, nutrition and weight loss is MyFitnessPal. I'm fairly sure I wouldn't have made such good progress without it.

The main selling point of the service is that it allows you to track what you eat and what exercise you do, in order to monitor and help regulate your calorific intake. When you create an account you put in the usual details as well as your weight and height, tell it how much activity you do on a daily basis and how much weight you'd like to lose, and it works out how many calories you should eat each day to hit that target. All you have to do is enter the food you eat (by searching the food database) and the exercise you do (cardiovascular and strength training can be entered separately) and it calculates your net deficit or overspend each day.
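The daily arithmetic described above can be sketched in a few lines. This is only the calculation implied by 'net deficit or overspend', not MyFitnessPal's actual formula, and the function names, the daily goal and the calorie figures are all made-up assumptions.

```python
def net_calories(food_kcal, exercise_kcal):
    """Calories eaten minus calories burnt through exercise."""
    return sum(food_kcal) - sum(exercise_kcal)


def remaining_calories(goal_kcal, food_kcal, exercise_kcal):
    """Positive means under the daily goal (deficit), negative means over it."""
    return goal_kcal - net_calories(food_kcal, exercise_kcal)


# Hypothetical day: three meals and one run against an assumed 1800 kcal goal
meals = [400, 650, 700]
exercise = [300]
left = remaining_calories(1800, meals, exercise)
print(left)
```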

I'm a big sucker for life-logging, and logging each part of each meal takes this to the extreme. I now have almost 5 months of data on what I've been eating. Why I'd want this, I'm not sure, but it's there now! The service also handles logging stats such as weight, waist, neck and chest measurements, although the progress graphs for these leave a lot to be desired. Like most good services, this is a website with an associated mobile app, with Android, iPhone and BlackBerry versions available.

**Features:**

- Large database of foods with calorific and nutritional information
- Easy logging of food and exercise, weight and measurements
- Adjusts calorie allowance as weight changes
- Social features: friends, forums etc.

**Pros:**

- Free!
- Mobile app makes it easy to log food or exercise while out and about
- Easy to stay on top of calorific intake, which actually helps with weight loss
- Lots of support, encouragement and advice on forums
- Can contribute to database if food is missing
- Can report inaccurate data, up-vote correct data

**Cons:**

- Users can contribute to database and some users are stupid
- Progress graphs are pretty useless
- Mobile app and website sometimes disagree on calories burnt from exercise

The MyFitnessPal website is at http://www.myfitnesspal.com and the mobile apps are available here.

(thanks to my good friend Christopher for the initial heads up on this site!)

How to access and download your Google Web History with wget

June 5, 2011

Google Web History has been recording all of the searches I've made in Google since about 2005. Obviously 6 years of search queries and results is a phenomenal amount of data, and it would be nice to get hold of it all to see what I could make of it. Fortunately Google make the data available as an RSS feed, although it's not particularly well documented.

(caution - many 'ifs' coming up next)

If you're logged into your Google account the RSS feed can be accessed at:

https://www.google.com/history/?q=&output=rss&num=NUM&start=START

If you're using a *nix-based operating system (Linux, Mac OS X etc.) you can then use wget on the command line to get the data. The example below retrieves the 1000 most recent searches in your history:

wget --user=GOOGLE_USERNAME  \
--password=PASSWORD --no-check-certificate \
"https://www.google.com/history/?q=&output=rss&num=1000&start=0"

If you've enabled 2-factor authentication on your Google account you'll need to add an app-specific password for wget so it can access your account. The password in the example above should be this app-specific password, not your main account password. If you haven't enabled 2-factor authentication then you might be able to use your normal account password, but I haven't tested this.

A simple bash script will then allow you to download the entire search history:

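for ((START = 0; START <= 50000; START += 1000))  # one request per page of 1000 results
do
# -O saves each page under its own predictable filename
wget --user=GOOGLE_USERNAME \
--password=WGET_APP_SPECIFIC_PASSWORD --no-check-certificate \
-O "history_${START}.xml" \
"https://www.google.com/history/?output=rss&num=1000&start=$START"
done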

You may need to adjust the upper limit in the `for` line. I had to go up to 50000 to get my entire search history back to 2005; you may need to make fewer calls if your history is shorter, or more if it's longer.
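Once the pages are downloaded, the queries need pulling out of the RSS. The sketch below assumes each search appears as an `<item>` whose `<title>` holds the query text. That layout is a guess based on how RSS feeds usually look, not something documented by Google, so check it against a real downloaded file first. The function name and the sample feed are made up for illustration.

```python
import xml.etree.ElementTree as ET


def queries_from_root(root):
    """Collect the non-empty <title> text of every <item> in a parsed RSS tree."""
    titles = (item.findtext("title", default="") for item in root.iter("item"))
    return [title.strip() for title in titles if title.strip()]


# Hypothetical sample feed (assumed structure: one <item> per search)
SAMPLE = """<rss version="2.0"><channel>
<title>History</title>
<item><title>cardiff weather</title></item>
<item><title> python wget </title></item>
<item><title></title></item>
</channel></rss>"""

queries = queries_from_root(ET.fromstring(SAMPLE))
print(queries)
```

For a downloaded file, `ET.parse(path).getroot()` gives a root element that can be passed to the same function.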

Losing weight in 2011

June 5, 2011

Since the beginning of the year I've been living under what we've been calling 'the new regime'. This 'new regime' basically involves not living like a fat useless slob, so I've been getting fit, eating healthily and losing weight. So far I've lost over 10kg and can now run around the park a few times without collapsing to the floor clutching at my chest and screaming about ambulances, so I'd say it's going pretty well. The basic concept behind the new regime is:

Eat less + do more = lose weight.

This will be followed once some weight has been lost by:

_Eat a normal amount + do more = stay the same._

About a month ago, I came across someone somewhere on the internet recommending "The Hacker's Diet" as a guide for weight loss. Not having read such a guide before I started on 'the new regime', I skimmed it a bit; the tl;dr version is:

Eat less + do more = lose weight.

This doesn't exactly seem like rocket science to me, but lots of people seem to have a problem grasping this concept. The Hacker's Diet does a pretty decent job of describing the human body as a simple system with inputs and outputs and manages to explain that if you limit your input and increase your output, you get a deficit and lose weight. So if you find anyone that says 'Oh, I really struggle to lose weight', slap them round the back of the head and point them in that direction.

The whole point of this post is that the last couple of chapters of the book contain a lot of information about tracking the calories you eat and the calories you burn, analysing trends from daily weight figures and so on. There's a lot of detail on how to create spreadsheets to calculate weight trends, how to keep a daily log of calorie intake, and pages and pages of calorific information for food. The thing is, it's 2011 now so none of those chapters are necessary, because as with anything that's a pain in the rear end there are now loads of apps available to make life easier. As I've been using a number of them for 5 months, I figure I'll share the knowledge and review some of them over the next few days.
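To give a flavour of what a weight-trend calculation looks like, here is a minimal sketch that smooths noisy daily weigh-ins with an exponentially weighted moving average. It is not lifted from the book's spreadsheets: the function name, the smoothing factor of 0.1 and the sample weights are illustrative assumptions.

```python
def weight_trend(weights, smoothing=0.1):
    """Exponentially smoothed trend: each day moves the trend a fraction towards the new reading."""
    trend = []
    current = None
    for weight in weights:
        if current is None:
            current = weight  # start the trend at the first reading
        else:
            current += smoothing * (weight - current)
        trend.append(current)
    return trend


# Hypothetical daily weigh-ins in kg
daily = [90.0, 89.0, 90.0]
trend = weight_trend(daily)
print([round(value, 2) for value in trend])
```

A small smoothing factor means a single heavy or light day barely moves the trend line, which is the point of tracking the trend rather than the raw figures.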

Summer project

June 3, 2011

Over the summer, Ian and I are going to be supervising a summer project. We're going to get a second-year undergraduate student for an 8-week project. I've stuck up a page about it here, and will post whenever there's something interesting to see.

Paper published

June 2, 2011

Our latest paper ("Opportunistic social dissemination of micro-blogs") on some of the last work we did for the Socialnets project has finally been published, and can be viewed online here, or in preprint form from my publications page.

Unexpected

June 2, 2011

So I mentioned something about keeping busy? Yeah, well....

A while back I spent a couple of months playing around with an idea that we thought could make an interesting bit of research. It started with a simple modification to a protocol proposed by someone else (Gavidia et al., 'A gossip-based distributed news service for wireless mesh networks'), where we added some elements of self-adaptation and cooperation. As things sometimes go, the results were quite good but not outstanding and we had more pressing things to look at, so we dropped it and moved on to something else.

It's never nice to just drop work and not get anything from it though, so we wrote a technical report about the work we'd completed that we could stick in a deliverable somewhere. At the same time, we noticed a conference workshop where a paper on the work might fit, and decided it might be an idea to trim the report down for submission. Unfortunately, at that point things kicked off with a couple of journal papers that we'd been working on at the same time so we didn't have time to do the submission.

Fast forward to the SocialNets/Recognition meeting in mid-February and we learn that the deadline for the workshop has been extended. On the spur of the moment we decide to have a bash at a paper for it. We cut the tech report down, give it an edit, and submit it. Fast forward again to this week and we get the notification through that the paper has been accepted. Previously we had a bit of work that would never see the light of day, buried at the back of an EU deliverable. Now for very little effort we have a published bit of work, I've got another publication to add to the list, and a trip to a conference as well. In Italy. In June. Sometimes life is just too cruel :-)

Keeping Busy

April 7, 2011

I have found that the secret to forgetting that it's my viva in 13 days is to keep busy. Extraordinarily busy. Luckily, work is conspiring right along with me: coding on our first experiment is in full go-mode, there's a workshop and social dinner next week to help organise, reading and planning to do for our main 'project', a paper to edit and revise for a workshop, and a side project looking at 4sq to move from prototype 'buggy as hell, falls over if you look at it funny' mode to 'production, solid as a rock, leave it running and forget about it' mode. That's all before you remember that there are also the final deliverables to work on! Luckily it's just what I need to keep my mind off the 20th.

Just hoping I can get it all done before I do actually need to stop and remember what it was I did for my PhD of course....

Django + Tweepy + OAuth

April 4, 2011

There is a lot of information out there about putting Django and Tweepy together to make a Twitter web app. I read a lot of it recently and, although much of it is helpful, some of it is quite complicated or out of date.

To make things simple, I thought I'd create an example Django project that just contains the basics needed to get up and running.

The 'DjangoTweepy' project, hosted on GitHub here, contains all the code needed for a Twitter web app using Django and Tweepy. Download it and follow the instructions to get started quickly or, with a bit of hacking, add the 'twitter_auth' app to your existing projects.

Twitter Wordle

April 4, 2011

Next week we have the project partners coming over to Cardiff for a workshop. Stu and I were discussing how we should have some way to display information throughout the day, something that makes it easy to pick out the main themes of talks etc. I'm a big fan of the Wordle as a way of displaying text, so we thought it would be nice if we could have a dynamic Wordle displayed that people could add to throughout the workshop.

I found a Python library called pyTagCloud that will turn text into Wordle-like tag clouds, either as images or as HTML. As I mentioned previously, I already have a Django-based project on the go which interfaces with Twitter, so I already had code written that uses Tweepy to authenticate with Twitter via OAuth and search for keywords. Combining the two, I get an app that will continually search Twitter for a given keyword, extract the text from all the tweets and display a Wordle along with the latest tweets. People can contribute to the display by tweeting with a given hashtag, and as long as we search for that hashtag, their opinions and notes will be displayed.

For instance, here's the display running with one of today's trending topics: 'Alan Titchmarsh':

It's not perfect yet: I need to do some more filtering on the text to remove words that pyTagCloud's stop-word filtering misses, such as 'rt', 'http', 'bit', 'ly' and so on. It would also be good to remove things that aren't words, like 'xxxxxx'. I've been looking at the Natural Language Toolkit for a couple of days for the other project, so I'll probably re-use some of that code here too.
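The extra filtering could look something like the sketch below. It is a standalone illustration in plain Python rather than the pyTagCloud or NLTK code: the function names are made up, the extra stop words are just the ones listed above, and treating a word made of one repeated letter as a non-word is a crude assumption that would need refining.

```python
import re

# Twitter-specific noise words mentioned above
EXTRA_STOP_WORDS = {"rt", "http", "bit", "ly"}


def is_repeated_char(word):
    """True for strings such as 'xxxxxx' that are one letter repeated three or more times."""
    return len(word) > 2 and len(set(word)) == 1


def clean_tokens(text):
    """Lowercase the text, split it into alphabetic tokens and drop the noise."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [
        token
        for token in tokens
        if token not in EXTRA_STOP_WORDS and not is_repeated_char(token)
    ]


# Hypothetical tweet text
cleaned = clean_tokens("RT Alan Titchmarsh xxxxxx http://bit.ly")
print(cleaned)
```

The cleaned tokens can then be joined back into a string and handed to the tag cloud code in place of the raw tweet text.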

The only problem now is that I'm probably going to be the only person at the workshop tweeting...