Getting longitude-latitude coordinates for a (long) list of cities using Python and a free API
Today I've decided to expand the number of cities included on my murder rate map to everywhere with 100,000+ people.
In order to do that using the FBI data (which only includes the names of the cities), I need to find the longitude and latitude of each city in my data set and add them as new columns. This was not a big deal for the previous map, when I had 35 cities, but now my data set includes over 400, so I obviously won't be looking them up by hand.
Here is one way of doing it using Python:
First, you need to create a free account on OpenCage Geocoder, an API that can be used to look up the coordinates of places and also to find out which place a set of coordinates corresponds to. You can use any API you want, really; I just picked this one for simplicity and convenience. You will then get an API key (YOUR_API_KEY below) that you need to use every time you make a request for a location. You also need to install and import the corresponding Python package, opencage (here is a tutorial in case you want more info).
from opencage.geocoder import OpenCageGeocode
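Since the key is needed for every request, one option is to keep it out of the script and read it from an environment variable. This is only a minimal sketch: the variable name OPENCAGE_API_KEY and the helper load_api_key are names I made up for illustration, not something the opencage package requires.

```python
import os


def load_api_key(env_var='OPENCAGE_API_KEY'):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var)
    if not key:
        # Fail early with a clear message instead of sending requests without a key
        raise RuntimeError('Set the ' + env_var + ' environment variable first')
    return key


# Typical use, once the variable is set in your shell:
# key = load_api_key()
# geocoder = OpenCageGeocode(key)
```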
Let's start with a simple example, by looking for the coordinates of one single place. As an example, I'm gonna use Bijuesca, the village in Spain where I grew up, because it is awesome.
key = 'YOUR_API_KEY'  # replace with your own key, from: https://opencagedata.com
geocoder = OpenCageGeocode(key)
query = 'Bijuesca, Spain'
results = geocoder.geocode(query)
print(results)
The 'results' variable has a lot more information than we need right now, but you can access the fields that hold the coordinates in much the same way as you would access a Python dictionary:
lat = results[0]['geometry']['lat']
lng = results[0]['geometry']['lng']
print(lat, lng)
41.5405092 -1.9203562
Which are Bijuesca's coordinates!
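The lookup can also go the other way, from coordinates back to a place, as mentioned above. Here is a rough sketch of that. As far as I know the opencage package offers a reverse_geocode(lat, lng) method and each result carries a 'formatted' field with a readable name, but please check the package documentation before relying on either. The FakeGeocoder class and its return value are made up so that the example runs without an API key.

```python
def place_name(geocoder, lat, lng):
    """Return the readable name of the first match for (lat, lng), or None."""
    results = geocoder.reverse_geocode(lat, lng)
    if not results:  # nothing found for these coordinates
        return None
    return results[0]['formatted']


class FakeGeocoder:
    """Hypothetical stand-in for OpenCageGeocode; returns a hand-written result."""

    def reverse_geocode(self, lat, lng):
        return [{'formatted': 'Bijuesca, Spain'}]


print(place_name(FakeGeocoder(), 41.5405092, -1.9203562))
```

With a real key you would pass the geocoder object created earlier instead of FakeGeocoder().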
Ok, so now we are ready to get the coordinates for all the cities in my data set, which has a 'City' and a 'State' column. As the simplest, not-most-efficient approach, I am going to iterate over each row to get the city and state, then use the API to get the corresponding coordinates. I'll save the latitudes and longitudes in two separate lists, and add these two lists as new columns once I'm done:
list_lat = []   # create empty lists
list_long = []

for index, row in df_crime_more_cities.iterrows():  # iterate over rows in dataframe
    City = row['City']
    State = row['State']
    query = str(City) + ", " + str(State)
    results = geocoder.geocode(query)
    if results:  # use the first match when the API finds one
        list_lat.append(results[0]['geometry']['lat'])
        list_long.append(results[0]['geometry']['lng'])
    else:        # no match: keep the row and leave the coordinates empty
        list_lat.append(None)
        list_long.append(None)

# create new columns from lists
df_crime_more_cities['lat'] = list_lat
df_crime_more_cities['lon'] = list_long
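If you run this on a few hundred cities, two small additions may help: skip queries you have already looked up, and pause between requests so you stay within whatever rate limit your plan has. The sketch below is one way to do it, not the only one. The function name geocode_queries, the pause_seconds default of one second, and the FakeGeocoder class are all my own assumptions for illustration; check your account's actual limits.

```python
import time


def geocode_queries(geocoder, queries, pause_seconds=1.0):
    """Geocode each distinct query once; return a dict of query -> (lat, lng) or None."""
    coordinates = {}
    for query in queries:
        if query in coordinates:  # already looked up, so skip the extra request
            continue
        results = geocoder.geocode(query)
        if results:
            geometry = results[0]['geometry']
            coordinates[query] = (geometry['lat'], geometry['lng'])
        else:
            coordinates[query] = None  # no match for this query
        time.sleep(pause_seconds)      # be gentle with the API
    return coordinates


class FakeGeocoder:
    """Hypothetical stand-in for OpenCageGeocode that counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def geocode(self, query):
        self.calls += 1
        if query == 'Bijuesca, Spain':
            return [{'geometry': {'lat': 41.5405092, 'lng': -1.9203562}}]
        return []


fake = FakeGeocoder()
found = geocode_queries(
    fake,
    ['Bijuesca, Spain', 'Nowhere, XX', 'Bijuesca, Spain'],
    pause_seconds=0,  # no need to wait when nothing goes over the network
)
print(found)
print(fake.calls)
```

With the real data, the queries would be the "City, State" strings built from each row, and the resulting dictionary can then be used to fill the 'lat' and 'lon' columns.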
Here we have our dataframe with the new added columns:
Comments
thanks for the Blog. I was looking for an app to achieve the same and found your blog and simple python code. I will put this into use for my project.
Great to know that you are part of the NU.
Ramesh Natarajan
https://www.linkedin.com/in/rameshusa/
I wanted to do the same thing with the names of some localities in Asia, but the code failed: the output was only one coordinate, in the USA.
I would be grateful for any help.
Many thanks !
For example, if you want to find out the lat-long coordinates of Beijing, China, you simply run:
>>> from opencage.geocoder import OpenCageGeocode
>>> key = YOUR_API_KEY
#NOTE: get your API key at: https://opencagedata.com/dashboard#api-keys
>>> geocoder = OpenCageGeocode(key)
>>> query = 'beijing china'
>>> results = geocoder.geocode(query)
>>> lat = results[0]['geometry']['lat']
>>> lng = results[0]['geometry']['lng']
>>> print (lat, lng)
39.906217 116.3912757
It has been really appreciated.
Best wishes !