Drawing a Map from Pub Locations with the Matplotlib Basemap Toolkit
Author: Ramiro Gómez
In this notebook I show how you can draw a map of Britain and Ireland from location data using the Matplotlib Basemap Toolkit. The data points that will be drawn are pub locations extracted from OpenStreetMap and provided by osm-x-tractor.
When you download and extract the Points Of Interest (POI) dataset as a CSV file it has a file size of about 800 MB and more than 9 million entries for different types of locations. To filter out only the locations tagged as pubs you can use csvgrep, which is part of the csvkit toolkit.
csvgrep -c amenity -m pub POIWorld.csv > POIWorld-pub.csv
The resulting CSV file is much smaller with a size of 7.7 MB and contains a little more than 120,000 entries for pub locations all over the world. Since the coverage varies across countries and regions, I chose to limit the map to Britain and Ireland where POI coverage seems quite comprehensive and there are a lot of pubs. Who could have thought?
Next we load the required libraries and define a function that checks whether a given location tuple is within the given bounding box.
%load_ext signature
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
def within_bbox(bbox, loc):
"""Determine whether given location is within given bounding box.
Bounding box is a dict with ll_lon, ll_lat, ur_lon and ur_lat keys
that locate the lower left and upper right corners.
The loction argument is a tuple of longitude and latitude values.
"""
return bbox['ll_lon'] < loc[0] < bbox['ur_lon'] and bbox['ll_lat'] < loc[1] < bbox['ur_lat']
The next statements load the pub dataset into a Pandas DataFrame, remove all columns that have missing values and print the length and the first few entries.
df = pd.read_csv('csv/POIWorld-pub.csv')
df.dropna(axis=1, how='any', inplace=True)
print(len(df))
df.head()
123379
| "osmid" | amenity | Longitude | Latitude | |
|---|---|---|---|---|
| 0 | 105 | pub | -0.143209 | 51.523681 |
| 1 | 108042 | pub | -0.135599 | 51.523544 |
| 2 | 262706 | pub | -0.725178 | 51.035030 |
| 3 | 262707 | pub | -0.854808 | 51.043900 |
| 4 | 262708 | pub | -0.726258 | 51.024400 |
Restrict dataset to actual pubs
Reddit user kwwillett pointed out that the dataset not only contains pubs in a comment. The csvgrep call also matches strings containing it, see some examples below.
df.amenity.value_counts().head()
pub 94061
public_building 27544
public_hall 774
public_bookcase 465
public_bath 176
dtype: int64
This changes number of pubs significantly so we need to perform additional filtering.
df = df[df.amenity.str.match(r'\bpub\b')]
df.amenity.value_counts()
pub 94061
pub;restaurant 12
pub;bar 4
pub; restaurant 3
pub:demolished 2
pub;ktv 1
pub,brewery 1
pub;atm;restaurant 1
pub;nightclub 1
pub;cafe;restaurant 1
pub;cafe;bar;lounge 1
pub food=no 1
pub;lottery 1
pub;bicycle_parking 1
dtype: int64
The remaining location classifications are good to include, so now we have the actual pubs of the world as covered by OpenStreetMap in July 2015.
Limiting locations geographically
To limit the dataset and the displayed map to Britain and Ireland, we now create a dict with appropriate coordinates from the GeoPlanet Explorer and filter the longitude and latitude values from the DataFrame that are within the bounding box. We end up with roughly 28,000 records, meaning that almost a quarter of the pubs in the POI dataset are located on Britain and Ireland.
bbox = {
'lon': -5.23636,
'lat': 53.866772,
'll_lon': -10.65073,
'll_lat': 49.16209,
'ur_lon': 1.76334,
'ur_lat': 60.860699
}
locations = [loc for loc in zip(df.Longitude.values, df.Latitude.values) if within_bbox(bbox, loc)]
len(locations)
28139
Drawing the map
Now we get to the actual drawing part. We set the size of the image and some variables for marker diplay, create a Basemap object using the Miller Cylindrical Projection and the coordinates looked up before to center the map and limit the area. Also we make sure that the background is white and no border is printed around the map image.
The previously created list of location tuples is then unpacked into 2 lists of longitudes and latitudes, which are provided as arguments in the call to the Basemap instance. This converts the lon/lat values (in degrees) to x/y map projection coordinates (in meters).
The location dots are drawn with the scatter method passing the x and y coordinate lists, and the marker display variables. The map is then annotated with a title that says what the map shows, a footer with information about the author and data source and finally displayed.
fig = plt.figure(figsize=(20, 30))
markersize = 1
markertype = ',' # pixel
markercolor = '#325CA9' # blue
markeralpha = .8 # a bit of transparency
m = Basemap(
projection='mill', lon_0=bbox['lon'], lat_0=bbox['lat'],
llcrnrlon=bbox['ll_lon'], llcrnrlat=bbox['ll_lat'],
urcrnrlon=bbox['ur_lon'], urcrnrlat=bbox['ur_lat'])
# Avoid border around map.
m.drawmapboundary(fill_color='#ffffff', linewidth=.0)
# Convert locations to x/y coordinates and plot them as dots.
lons, lats = zip(*locations)
x, y = m(lons, lats)
m.scatter(x, y, markersize, marker=markertype, color=markercolor, alpha=markeralpha)
# Set the map title.
plt.annotate('Britain & Ireland\ndrawn from pubs',
xy=(0, .87),
size=120,
xycoords='axes fraction',
color='#888888',
family='Gloria')
# Set the map footer.
plt.annotate('Author: Ramiro Gómez - ramiro.org • Data: OpenStreetMap - openstreetmap.org',
xy=(0, 0),
size=14,
xycoords='axes fraction',
color='#666666',
family='Droid Sans')
plt.savefig('img/britain-ireland-drawn-from-pubs.png', bbox_inches='tight')

The resulting map of Britain and Ireland shows the contour of the islands and unsurprisingly which areas are more or less densely populated. A large version of the map is available as prints from Redbubble and Zazzle.
%signature
Author: Ramiro Gómez • Last edited: August 05, 2015
Published: July 24, 2015 by Ramiro Gómez.
If you want to be notified about new content, click here to subscribe to the newsletter or RSS feed.
Disclosure: External links on this page may contain affiliate IDs, which means that I earn a commission if you make a purchase via such a link. This allows me to operate this site and offer hopefully valuable content that is freely accessible. More information about affiliate programs.