Getting longitude-latitude coordinates for a (long) list of cities using Python and a free API

Today I've decided to expand the number of cities included on my murder rate map to everywhere with 100,000+ people.
In order to do that using the FBI data (which only includes the names of the cities), I need to find the longitude-latitude for each city in my data set and add them as new columns. This was not a big deal for the previous case, when I had 35 cities, but now my data set includes over 400, so I obviously won't be looking them up by hand.

Here is one way of doing it using Python:

First, you need to create a free account on OpenCage Geocoder, which is an API that can be used to look up the coordinates of places, and also to find out which place a set of coordinates corresponds to. You can use any API you want, really. I just picked this one for simplicity and convenience. You will then get YOUR_API_KEY, which you need to use every time you make a request for a location. You also need to install and import the corresponding Python package, opencage (here is a tutorial in case you want more info).
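Since the key is needed for every request, one option is to keep it out of the script and read it from an environment variable. This is only a sketch: the variable name OPENCAGE_API_KEY and the helper load_api_key are my own choices for illustration, not something the opencage package requires.

```python
import os


def load_api_key(env_var="OPENCAGE_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your OpenCage API key"
        )
    return key
```

You could then write key = load_api_key() instead of pasting the key into the code.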

from opencage.geocoder import OpenCageGeocode

Let's start with a simple example, by looking for the coordinates of one single place. As an example, I'm gonna use Bijuesca, the village in Spain where I grew up, because it is awesome.

key = 'YOUR_API_KEY'  # replace with your own key; get api key from:

geocoder = OpenCageGeocode(key)

query = 'Bijuesca, Spain'  

results = geocoder.geocode(query)

print (results)

The 'results' variable is a list of matches, each with a lot more information than we need right now, but you can access the fields that hold the coordinates the same way you would access a Python dictionary:

lat = results[0]['geometry']['lat']
lng = results[0]['geometry']['lng']
print (lat, lng)

41.5405092 -1.9203562

Which are Bijuesca's coordinates!
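One thing to watch out for: if the API finds no match for a query, results[0] will fail because the list is empty. Here is a minimal sketch of a helper that guards against that. The name lookup_coordinates is hypothetical, and it only assumes that the geocoder object has a geocode(query) method that returns a list of matches, as in the example above.

```python
def lookup_coordinates(geocoder, query):
    """Return (lat, lng) for the first match, or None if nothing was found."""
    results = geocoder.geocode(query)
    if not results:  # empty list (or None) means no match
        return None
    geometry = results[0]['geometry']
    return geometry['lat'], geometry['lng']


# Example with the real client (needs the opencage package and a valid key):
# from opencage.geocoder import OpenCageGeocode
# geocoder = OpenCageGeocode(key)
# print(lookup_coordinates(geocoder, 'Bijuesca, Spain'))
```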

Ok, so now we are ready to get the coordinates for all the cities in my data set, a dataframe with a 'City' and a 'State' column for each row.

As the simplest, not-most-efficient approach, I am going to iterate over each row to get the city and state, then use the API to get the corresponding coordinates. I'll save longitudes and latitudes in two separate lists. Then I can add these two lists as new columns once I'm done:

list_lat = []   # create empty lists
list_long = []

for index, row in df_crime_more_cities.iterrows(): # iterate over rows in dataframe

    City = row['City']
    State = row['State']
    query = str(City)+", "+str(State)
    results = geocoder.geocode(query)
    lat = results[0]['geometry']['lat']
    long = results[0]['geometry']['lng']

    list_lat.append(lat)    # store the coordinates for this row
    list_long.append(long)

# create new columns from lists    
df_crime_more_cities['lat'] = list_lat   
df_crime_more_cities['lon'] = list_long
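With 400+ cities, a single city that the API cannot find would stop the whole loop, and repeated city names would cost extra requests. A slightly more defensive variant could look like the sketch below. The function name geocode_cities and the choice to store None for missing places are my own assumptions, and the geocoder is again any object with a geocode(query) method.

```python
def geocode_cities(geocoder, cities_and_states):
    """Geocode (city, state) pairs and return two parallel lists.

    Places that cannot be found get None in both lists, so the lists keep
    the same length as the input. Repeated queries are served from a cache.
    """
    cache = {}
    list_lat, list_long = [], []
    for city, state in cities_and_states:
        query = f"{city}, {state}"
        if query not in cache:
            results = geocoder.geocode(query)
            if results:
                geometry = results[0]['geometry']
                cache[query] = (geometry['lat'], geometry['lng'])
            else:
                cache[query] = (None, None)  # no match for this place
        lat, lng = cache[query]
        list_lat.append(lat)
        list_long.append(lng)
    return list_lat, list_long


# Possible use with the dataframe from this post (needs pandas and a real geocoder):
# pairs = zip(df_crime_more_cities['City'], df_crime_more_cities['State'])
# list_lat, list_long = geocode_cities(geocoder, pairs)
# df_crime_more_cities['lat'] = list_lat
# df_crime_more_cities['lon'] = list_long
```

Rows with None can then be checked by hand, which is a lot less work than looking up all of them.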

Now our dataframe has the two new columns, 'lat' and 'lon'.


Hi Julia,

Thanks for the blog. I was looking for an app to do the same thing and found your blog and its simple Python code. I will put this to use in my project.

Great to know that you are part of the NU.

Ramesh Natarajan
Unknown said…
Thank you very much for this useful code.

I wanted to do the same thing with the names of some localities in Asia, but the code failed: the output was only a single coordinate in the USA.

I would be grateful for any help.

Many thanks !
Julia Poncela-Casasnovas said…
You can use this for any location (it doesn't need to be from the US).
For example, if you want to find out the lat-long coordinates of Beijing, China, you simply run:

>>> from opencage.geocoder import OpenCageGeocode
>>> key = YOUR_API_KEY
#NOTE: get your API key at:
>>> geocoder = OpenCageGeocode(key)
>>> query = 'beijing china'
>>> results = geocoder.geocode(query)
>>> lat = results[0]['geometry']['lat']
>>> lng = results[0]['geometry']['lng']
>>> print (lat, lng)
39.906217 116.3912757

Unknown said…
Huge thanks to you for posting this, Julia. You really made it easy for a non-hacker like me to solve this problem. Greatly appreciated!
Anonymous said…
Many thanks Julia for your reply on August 31, 2020.
It has been really appreciated.

Best wishes !