Google Web History has now been recording all of the searches I made in Google since about 2005. Obviously 6 years of search queries and results is a phenomenal amount of data, and it would be nice to get hold of it all to see what I could make of it. Fortunately Google make the data available as an RSS feed, although it’s not particularly well documented.
(caution - many ‘ifs’ coming up next)
If you’re logged into your Google account the rss feed can be accessed at:
If you’re using a *nix based operating system (Linux, Mac OS X etc) you can then use wget on the command line to get the data. The below example works for retrieving the 1000 most recent searches in your history:
wget --user=GOOGLE_USERNAME \ --password=PASSWORD --no-check-certificate \ "https://www.google.com/history/?q=&output=rss&num=1000&start=0"
If you’ve enabled 2-factor authentication on your google account you’ll need to add an app-specific password for wget so it can access your account - the password in the example above should be this app-specific password, not your main account password. If you haven’t enabled 2 factor authentication then you might be able to use your normal account password, but I haven’t tested this.
A simple bash script will then allow you to download the entire search history:
for START in 0 1000 2000 3000 ... 50000 do wget --user=GOOGLE_USERNAME \ --password=WGET_APP_SPECIFIC_PASSWORD --no-check-certificate \ "https://www.google.com/history/?output=rss&num=1000&start=$START" done