Visualisation of 566,638 checkins at 7,048 venues in New York City. Data from the Foursquare network.

I've been analysing Foursquare data for Paris, London and New York. In case you don't know, Foursquare is a location-based social network which users interact with on their 3G mobile phones.

Visit the project page for a number of visualisations and remarks on the data. What follows here is a discussion of the data and analytical techniques behind the content on the project page, so have a look at that page before reading on.

Data

I've used Foursquare data relating social venues to checkins (activity) at those venues. The data was collected by a systematic crawl of the Foursquare Search API, which returns up to 50 nearby venues when supplied with a geolocation. The radius of this search is not explicitly documented by Foursquare. For each city, I constructed a lattice of search locations 2km apart and performed a search at each point of the grid. 2km was chosen as it produces some overlap in results, implying good coverage of the space between search locations. This resulted in 200-400 searches per city, the exact number varying with the surface area covered for each city.
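A minimal Python sketch of how such a search lattice could be generated, for illustration only. The function name, the bounding box and the flat metres-per-degree approximation are assumptions, not details taken from the crawl itself.

```python
import math

def search_lattice(south, west, north, east, spacing_m=2000.0):
    """Return (lat, lon) points on a regular grid roughly spacing_m apart."""
    # Approximate metres per degree of latitude; adequate at city scale.
    m_per_deg_lat = 111_320.0
    # A degree of longitude shrinks with the cosine of the latitude.
    mid_lat = math.radians((south + north) / 2.0)
    m_per_deg_lon = m_per_deg_lat * math.cos(mid_lat)

    dlat = spacing_m / m_per_deg_lat
    dlon = spacing_m / m_per_deg_lon
    rows = int((north - south) / dlat) + 1
    cols = int((east - west) / dlon) + 1
    return [(south + r * dlat, west + c * dlon)
            for r in range(rows) for c in range(cols)]

# Hypothetical bounding box, loosely around central Paris.
points = search_lattice(48.80, 2.25, 48.92, 2.42)
```

Each point in the list would then be passed to the venue search as its geolocation.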

The search API additionally takes keyword searches, and further passes of the grid were carried out using a number of keywords (bar, club, restaurant, cafe, museum, hall, food). The resulting venues list was then de-duplicated. As such, the data does not represent a comprehensive data dump, but sufficient venue data has been collected (in excess of 6,000 venues for each city) to assume a representative sample of Foursquare data for each city.
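The de-duplication step can be sketched in a few lines of Python. This assumes each venue record carries a unique identifier under an `"id"` key; the field name and the sample records are hypothetical.

```python
def dedupe_venues(passes):
    """Merge several lists of venue dicts, keeping one record per venue id."""
    unique = {}
    for results in passes:
        for venue in results:
            # Keep the first record seen for each id; later repeats are ignored.
            unique.setdefault(venue["id"], venue)
    return list(unique.values())

# Hypothetical results from a plain pass and a keyword pass over the grid.
plain_pass = [{"id": "a1", "name": "Cafe X"}, {"id": "b2", "name": "Bar Y"}]
keyword_pass = [{"id": "b2", "name": "Bar Y"}, {"id": "c3", "name": "Museum Z"}]
venues = dedupe_venues([plain_pass, keyword_pass])
```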

It should be noted that Foursquare produces data skewed towards the network demographic: the portion of the population that owns a 3G mobile phone and engages in online social networking (typically skewed towards under 35s).

Venues are classified into parks, arts, shops, food and nightlife according to Foursquare's own classification scheme.

Analytics

The claim that Paris has a more contiguously walkable structure is based on a density-based clustering of the venue data, using the DBSCAN algorithm. With a threshold distance of 400m (chosen as a comfortable walking distance) and a minimum cluster size of 3 venues, Paris breaks down into far fewer, larger clusters than the other two cities (PAR,NYC,LDN = 254,394,439 clusters), generating roughly a quarter of the noise (PAR,NYC,LDN = 401,1795,1599). Noise in this case represents isolated venues that cannot be assigned to a cluster.
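For readers unfamiliar with DBSCAN, here is a compact pure-Python sketch of the algorithm. It is an illustration rather than the code used for the figures above. It assumes venue coordinates are already in metres (for example after projection), and that the minimum size of 3 counts the venue itself; the sample points are made up.

```python
import math

NOISE = -1

def dbscan(points, eps=400.0, min_pts=3):
    """Label each (x, y) point with a cluster index, or NOISE (-1)."""
    def neighbours(i):
        xi, yi = points[i]
        # Includes point i itself, as in the usual DBSCAN definition.
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # core point: keep growing the cluster
    return labels

# Made-up venues: two tight groups and one isolated point.
venues = [(0, 0), (100, 0), (200, 100),
          (5000, 5000), (5100, 5000), (5200, 5050),
          (20000, 20000)]
labels = dbscan(venues)
n_clusters = len({l for l in labels if l != NOISE})
n_noise = labels.count(NOISE)
```

Counting distinct labels and noise points in this way gives the kind of per-city figures quoted above.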

The claim that activity is less spatially dispersed in Paris is based on dispersion calculated for the 100 highest activity walkable cells using a weighted standard distance measure [more] where venue popularity (total number of checkins) is used as a weighting factor and Euclidean distance is used as the distance measure. This gives us a measure of standard deviation in space of all the points taken into consideration, measured in metres (PAR,NYC,LDN = 4965,6473,8657).
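A weighted standard distance can be computed as the square root of the weighted mean squared Euclidean distance from the weighted mean centre. The Python sketch below follows that common definition; the function name and sample values are illustrative, and coordinates are assumed to be in metres.

```python
import math

def weighted_standard_distance(points, weights):
    """Weighted standard distance of (x, y) points around their weighted centre."""
    total = sum(weights)
    # Weighted mean centre.
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    # Weighted mean of squared Euclidean distances to that centre.
    var = sum(w * ((x - cx) ** 2 + (y - cy) ** 2)
              for (x, y), w in zip(points, weights)) / total
    return math.sqrt(var)

# Two made-up cells 2000m apart; checkin totals act as weights.
equal = weighted_standard_distance([(0, 0), (2000, 0)], [1, 1])
skewed = weighted_standard_distance([(0, 0), (2000, 0)], [3, 1])
```

With equal weights the result is 1000m; shifting weight towards one cell pulls the centre towards it and lowers the dispersion.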

Walkable links per venue are calculated by constructing a network representation in which venues are nodes and edges are produced when any two nodes are within a walkable distance of 400m. This creates undirected graphs with K edges (PAR,NYC,LDN = 55478,39277,44977). Edges per venue gives a rudimentary expression of global connectivity (PAR,NYC,LDN = 9.64,5.64,6.30). The degree distributions of these networks and further network characteristics are outside the scope of this article, but can follow from these representations.
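The network construction can be sketched in Python as a brute-force comparison of all venue pairs. This is an illustration only: coordinates are assumed to be in metres, "edges per venue" is assumed to mean edge count divided by node count, and the sample points are made up.

```python
import math
from itertools import combinations

def walkable_edges(points, max_dist=400.0):
    """Return undirected edges (i, j), i < j, between points within max_dist."""
    return [(i, j)
            for (i, (xi, yi)), (j, (xj, yj)) in combinations(enumerate(points), 2)
            if math.hypot(xi - xj, yi - yj) <= max_dist]

# Made-up venues along a street, plus one far away.
venues = [(0, 0), (300, 0), (600, 0), (5000, 0)]
edges = walkable_edges(venues)
edges_per_venue = len(edges) / len(venues)
```

Comparing every pair is quadratic in the number of venues, which is workable for a few thousand venues per city; a spatial index would be the obvious refinement for larger sets.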

Power law remarks relate to a regression analysis of venue popularity rank distributions for each city. Zipf's Law is only a fit for part of the distribution.
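One common way to test a rank distribution against Zipf's Law is an ordinary least squares fit of log(popularity) against log(rank); a slope near -1 is the Zipf case. The Python sketch below shows that approach under the assumption that this is the kind of regression meant here. The function name and the synthetic data are illustrative.

```python
import math

def rank_size_fit(checkins):
    """Fit log(checkins) = intercept + slope * log(rank) by least squares."""
    ranked = sorted((c for c in checkins if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Synthetic, perfectly Zipfian popularity: 1000 / rank.
synthetic = [1000.0 / r for r in range(1, 51)]
slope, intercept = rank_size_fit(synthetic)
```

Fitting the head and the tail of the ranking separately is one way to see where the straight-line fit stops holding.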

Tools

Processing and Flash were used for visualisation, Proj4 and Proj4js for coordinate conversion, Tom Taylor's Boundaries for neighbourhood names, the iGraph Ruby extension for network representations, R for statistical analysis, the Google Maps Geocoding API and, lastly, the Foursquare API for all the venue and checkin data. Everything else was done by hijacking a language designed for Hypertext Preprocessing.

Download

The data used to create the images is made available in full under the Foursquare API terms. The data consists of CSV files with aggregate checkin figures and geolocation data for each venue crawled. These files are broken down by city. Additionally, CSV files are included with venue locations converted to the so-called WebMercator/GoogleMercator (EPSG:900913) projection, to facilitate visualisation using metric coordinates.
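For reference, the spherical Mercator projection behind EPSG:900913 can be written directly in Python. This is a sketch of the standard formula rather than the Proj4 conversion used to produce the files; the function name and the sample coordinate are illustrative.

```python
import math

# Sphere radius used by the Web/Google Mercator projection, in metres.
EARTH_RADIUS_M = 6378137.0

def to_web_mercator(lon_deg, lat_deg):
    """Convert WGS84 longitude/latitude in degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# Approximate coordinates for central Paris (illustrative).
x, y = to_web_mercator(2.3522, 48.8566)
```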

This data represents a snapshot collected in mid-July 2010. In all there are over 800,000 checkins at over 20,000 venues. Checkins are expressed in summarised form for each venue as this is what is available via the public API. No raw checkins appear in the data.

I spent Spring in 17 & 18, was mostly to be found in 15, 31 & 34 during the Summer months and am now moving my activities over to 10 & 13 in time for Autumn. [GMap]







7 things I saw in Lisboa, 22/06-26/06.

Pencil & grid / Nasreen Mohamedi
Foam / Jorge Barbi
Space / Galeria Ze Dos Bois
Gesture / Manuel Mota & Margarida Garcia
Wordplay / Art & Language

On Air / Jim Ferraro
Violin Alap / Dr. L Subramaniam
SND / Florian Hecker

Title: Jean Baudrillard



Plans / Van Eyck
Fabrication / Driessens & Verstappen
Cubes / Piet Blom

Title: Benoit Mandelbrot.



Interventions: Thomas Demand/Olafur Eliasson.

Title: David Gissen, Subnature [read].

Control Space: Assembled images on urban cybernetics. Title: Norbert Wiener.



(a repetition reduced to two)

Space Syntax

Gerhard Richter: Overpainted Photographs, bound in a book by Hatje Cantz.

stro' phe' nome is a series of graphics produced using a gestural interface. A study of unseen architecture. The work is published as stdio.006.

Piotr Kamler & Bernard Parmegiani, Une Mission Ephemere (1993)








Minimal paths, pneumatics / Frei Otto
Cantenary Bifurcations / Thomas Wong
Artesanal Voronoi / Seven Six Five
Complex City / Lee Jang Sub
Three 3 / Kat Masback
Vector Fields / Biothing




We attain to dwelling, so it seems, only by means of building. The latter, building, has the former, dwelling, as its goal. Still, not every building is a dwelling.
The Old English and High German word for building, buan, means to dwell. This signifies: to remain, to stay in place. The real meaning of the verb bauen, namely, to dwell, has been lost to us. But a covert trace of it has been preserved in the German word Nachbar, neighbour. The neighbour is in Old English the neahgebur; neah, near, and gebur, dweller.
Building and thinking are, each in their own way, inescapable for dwelling.
Only if we are capable of dwelling, only then can we build.

- Martin Heidegger

Sennett's Corrosion of Character and Heidegger's Building, Dwelling, Thinking interacting on the site of SANAA's Moriyama House, Tokyo, which is arranged as a set of distinct housing components forming a network of compact structures[1][2]. Aside from its modularity (and flexibility) and play on house/garden public/private polarities, I'm drawn in particular to the proportions of the site. How the dimensions of both street and house are strictly related to the human body, despite it being a suburban location, how this kind of scale makes it seem all the more dwellable.

Photographs by Takashi Homma & Iwan Baan, from the books Tokyo and Single Story Urbanism.

A portrait of Eliane Radigue (2009) by Maxime Guitton.








Paris with H, 8/04-12/04

drawings / Claude Parent
horizons / Jan Dibbets
volumes / Charles-Édouard Jeanneret
cubes / Sol LeWitt

The full transcript of my first academic seminar is now online: Microplexes. It's housed at urbagram.net, which will be the home of my research into urban systems.


Papilio dardanus exhibits phenotypic polymorphism in its wing patterns. The forms can be reproduced mathematically using a variation of Turing's reaction-diffusion model with particular reaction kinetics (Sekimura et al.).

Philip Beesley's talk on living architectures, one of several highlights of mine at Sonic Acts XIII, along with J.P. Sonntag's low frequency standing waves and BJ Nilsen's multi-channel storm in a church.

Mark Wilson, PSC31, Digital Inkjet print (2003), @Room 90 V&A

Habitat pavilion, Montreal Expo '67
OMA Interlace residential complex, Singapore

Top: Habitat '67 (Montreal Expo). Bottom: OMA interlace city, Singapore