Welcome to py-openaq’s documentation!
py-openaq provides easy access to the OpenAQ API.
Installation
You can install this package in the usual way using pip:
pip install py-openaq
You can upgrade this package using pip as well:
pip install py-openaq --upgrade
Requirements
The only requirement for this package is requests. If you are not limited
by memory or disk space, I highly recommend also installing pandas and seaborn,
which enable the visualization helpers that were released with
version 1.
Current Limitations
As of now, the only feature that is not built into the API wrapper is returning alternative output formats
from the openaq.OpenAQ.measurements call, because I don’t see any reason to use Python
to return a CSV. If CSV is your desired output, I recommend using pandas’ DataFrame.to_csv() method.
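As a minimal sketch of that recommendation, the snippet below converts a list of result records to CSV with pandas. The sample records are copied from the cities response documented on this page rather than fetched live, and the helper name results_to_csv is hypothetical, not part of py-openaq. The same approach should apply to resp['results'] from a measurements call, assuming the records are flat dictionaries.

```python
import pandas as pd

# Sample records shaped like resp['results'] from api.cities().
# In real use they would come from: status, resp = api.cities()
results = [
    {"city": "Amsterdam", "count": 71125, "country": "NL", "locations": 14},
    {"city": "Antofagasta", "count": 3416, "country": "CL", "locations": 1},
]


def results_to_csv(records, path=None):
    """Hypothetical helper: turn a list of result dicts into CSV.

    Returns the CSV text when path is None; otherwise writes to path.
    """
    frame = pd.DataFrame(records)
    return frame.to_csv(path, index=False)


csv_text = results_to_csv(results)
print(csv_text)
```

Passing a file name as the second argument writes the file instead of returning a string.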
Initialization
The following code example shows how to make your first API call:
import openaq
api = openaq.OpenAQ()
status, resp = api.cities()
Understanding the Response format
Each API call returns a tuple containing the status code and the response in JSON format. The three most common API status codes you will see are:
- 200: Success
- 40x: Error: Bad Request
- 500: Server Error
The JSON response looks something like the following, with both meta and results keys:
{
    'meta': {'license': 'CC By 4.0', 'name': 'openaq-api', 'website': 'https://docs.openaq.org/'},
    'results': [
        {'city': 'Amsterdam', 'count': 71125, 'country': 'NL', 'locations': 14},
        {'city': 'Antofagasta', 'count': 3416, 'country': 'CL', 'locations': 1},
        {'city': 'Arica', 'count': 1682, 'country': 'CL', 'locations': 1},
        {'city': 'Ayutthaya', 'count': 3880, 'country': 'TH', 'locations': 1},
        {'city': 'Badhoevedorp', 'count': 7862, 'country': 'NL', 'locations': 1},
        ...
    ]
}
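One way to work with the (status, response) tuple is to check the status code before touching the results. The sketch below uses a hard-coded stand-in for the tuple so that it runs offline; the helper name unwrap is hypothetical and not part of py-openaq.

```python
def unwrap(status, resp):
    """Hypothetical helper: return resp['results'], or raise on a non-200 status."""
    if status != 200:
        raise RuntimeError(f"OpenAQ request failed with status {status}")
    return resp["results"]


# Stand-in for: status, resp = api.cities()
status, resp = 200, {
    "meta": {
        "license": "CC By 4.0",
        "name": "openaq-api",
        "website": "https://docs.openaq.org/",
    },
    "results": [
        {"city": "Amsterdam", "count": 71125, "country": "NL", "locations": 14},
        {"city": "Arica", "count": 1682, "country": "CL", "locations": 1},
    ],
}

for row in unwrap(status, resp):
    print(row["city"], row["country"], row["count"])
```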
Coupling with Pandas DataFrame
The pandasize decorator was added to make it easy to read data directly into a DataFrame. To do so, add the argument df=True to your request.
The following API methods allow you to return your data as a DataFrame:
- cities
- countries
- latest
- locations
- measurements
- sources
With this keyword argument, the API call returns a pandas DataFrame rather than a JSON response.
Example:
>>> df = api.latest(df = True)
The results are parsed by the pandasize decorator, which tries to convert each field to the most suitable type. All datetime fields should therefore be converted to proper Python datetimes, which allows for easy slicing, manipulation, and plotting.
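To illustrate what datetime conversion makes possible, here is a small sketch that runs offline. The frame is built by hand as a stand-in for one returned with df=True; the column names (local, value) and the values are illustrative assumptions, not the exact output of py-openaq.

```python
import pandas as pd

# Hand-built stand-in for a frame returned by api.measurements(df=True).
# Column names and values are assumptions made for illustration only.
df = pd.DataFrame(
    {
        "local": pd.to_datetime(
            ["2015-07-16 18:30:00", "2015-07-16 19:30:00", "2015-07-17 18:30:00"]
        ),
        "value": [72.9, 70.1, 65.0],
    }
).set_index("local")

# With a datetime index, a date string selects every row on that day.
one_day = df.loc["2015-07-16"]

# Resampling to daily means also relies on the datetime index.
daily_mean = df["value"].resample("D").mean()

print(one_day)
print(daily_mean)
```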
API Reference
class openaq.OpenAQ(version='v1', **kwargs)
Create an instance of the OpenAQ API.
Parameters: - version (string) – API version.
- kwargs – API options.
cities(*args, **kwargs)
Returns a listing of cities within the platform.
Parameters: - country (2-letter ISO code) – limit results by a certain country
- limit (number) – limit the number of results returned. Default is 100. Max is 10000.
- page (number) – paginate through the results. Default is 1.
- df (boolean) – convert the output from JSON to a pandas DataFrame
- index (string) – if returning as a DataFrame, set index to one of ('utc', 'local', None). The default is local
Returns: dictionary containing the city, country, count, and number of locations
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.cities()
>>> resp['results']
[
    {
        "city": "Amsterdam",
        "country": "NL",
        "count": 21301,
        "locations": 14
    },
    {
        "city": "Badhoevedorp",
        "country": "NL",
        "count": 2326,
        "locations": 1
    },
    ...
]
countries(*args, **kwargs)
Returns a listing of all countries within the platform.
Parameters: - limit (int) – change the number of results returned. Max is 10000. Default is 100.
- page (int) – paginate through results. Default is 1.
- df (boolean) – return the results as a pandas DataFrame
- index (string) – if returning as a DataFrame, set index to one of ('utc', 'local', None). The default is local
Returns: dictionary containing the code, name, count, cities, and locations.
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.countries()
>>> resp['results']
[
    {
        "cities": 174,
        "code": "AT",
        "count": 121987,
        "locations": 174,
        "name": "Austria"
    },
    {
        "cities": 28,
        "code": "AU",
        "count": 1066179,
        "locations": 28,
        "name": "Australia"
    },
    ...
]
fetches(**kwargs)
Provides data about individual fetch operations that are used to populate data in the platform.
Parameters: - limit (int) – change the number of results returned. Max is 10000. Default is 100.
- page (int) – paginate through the results. Default is 1.
Returns: dictionary containing the timeStarted, timeEnded, count, and results
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.fetches()
>>> resp
{
    "meta": {
        "name": "openaq-api",
        "license": ...,
        "website": ...,
        "page": 1,
        "limit": 100,
        "found": 3,
        "pages": 1
    },
    "results": [
        {
            "count": 0,
            "results": [
                {
                    "message": "New measurements inserted for Mandir Marg: 1",
                    "failures": {},
                    "count": 0,
                    "duration": 0.153,
                    "sourceName": "Mandir Marg"
                },
                {
                    "message": "New measurements inserted for Sao Paulo: 1898",
                    "failures": {},
                    "count": 1898,
                    "duration": 16.918,
                    "sourceName": "Sao Paulo"
                },
                ...
            ],
            "timeStarted": "2016-02-07T15:25:04.603Z",
            "timeEnded": "2016-02-07T15:25:04.793Z"
        }
    ]
}
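The limit and page parameters, together with the pages count reported in meta, suggest a simple pagination loop. The following is a sketch under stated assumptions: fetch_all and fake_call are hypothetical names, fake_call is an offline stand-in for a method such as api.fetches, and the loop assumes meta carries a pages field as in the fetches example.

```python
def fetch_all(call, limit=100, **params):
    """Collect results from every page of a paginated endpoint.

    `call` is any callable returning (status, resp), e.g. api.fetches.
    """
    results = []
    page = 1
    while True:
        status, resp = call(limit=limit, page=page, **params)
        if status != 200:
            raise RuntimeError(f"page {page} failed with status {status}")
        results.extend(resp["results"])
        # If meta has no 'pages' field, stop after the first page.
        total_pages = resp.get("meta", {}).get("pages", page)
        if page >= total_pages or not resp["results"]:
            break
        page += 1
    return results


def fake_call(limit=100, page=1):
    """Offline stand-in for an API method: serves five items in pages."""
    items = list(range(5))
    start = (page - 1) * limit
    pages = -(-len(items) // limit)  # ceiling division
    meta = {"page": page, "limit": limit, "found": len(items), "pages": pages}
    return 200, {"meta": meta, "results": items[start:start + limit]}


print(fetch_all(fake_call, limit=2))  # prints [0, 1, 2, 3, 4]
```

With a live client the call would look like fetch_all(api.fetches, limit=1000), which requires network access.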
latest(*args, **kwargs)
Provides the latest value of each parameter for each location.
Parameters: - city (string) – limit results by a certain city. Defaults to None.
- country (string) – limit results by a certain country. Should be a 2-letter ISO country code. Defaults to None.
- location (string) – limit results by a certain location. Defaults to None.
- parameter (string) – limit results by a specific parameter. Options include [pm25, pm10, so2, co, no2, o3, bc]
- has_geo (boolean) – filter items that do or do not have geographic information.
- coordinates (string) – center point (lat, long) used to get measurements within a certain area. (Ex: coordinates=40.23,34.17)
- radius (int) – radius (in meters) used to get measurements. Must be used with coordinates. Default value is 2500.
- limit (int) – change the number of results returned. Max is 1000. Default is 100.
- page (int) – paginate through the results.
- df (boolean) – return results as a pandas DataFrame
- index (string) – if returning as a DataFrame, set index to one of ('utc', 'local', None). The default is local
Returns: dictionary containing the location, country, city, and number of measurements
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.latest()
>>> resp['results']
[
    {
        "location": "Punjabi Bagh",
        "city": "Delhi",
        "country": "IN",
        "measurements": [
            {
                "parameter": "so2",
                "value": 7.8,
                "unit": "ug/m3",
                "lastUpdated": "2015-07-24T11:30:00.000Z"
            },
            {
                "parameter": "co",
                "value": 1.3,
                "unit": "mg/m3",
                "lastUpdated": "2015-07-24T11:30:00.000Z"
            },
            ...
        ]
        ...
    }
]
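Because each entry returned by latest nests a list of measurements under a location, it can be convenient to flatten the structure into one row per location and parameter. The sketch below does this on a sample copied from the example response; flatten_latest is a hypothetical helper, not part of py-openaq.

```python
def flatten_latest(results):
    """Yield one flat dict per (location, parameter) pair."""
    for entry in results:
        for measurement in entry.get("measurements", []):
            yield {
                "location": entry["location"],
                "city": entry["city"],
                "country": entry["country"],
                **measurement,
            }


# Sample shaped like resp['results'] from api.latest()
results = [
    {
        "location": "Punjabi Bagh",
        "city": "Delhi",
        "country": "IN",
        "measurements": [
            {"parameter": "so2", "value": 7.8, "unit": "ug/m3",
             "lastUpdated": "2015-07-24T11:30:00.000Z"},
            {"parameter": "co", "value": 1.3, "unit": "mg/m3",
             "lastUpdated": "2015-07-24T11:30:00.000Z"},
        ],
    }
]

rows = list(flatten_latest(results))
for row in rows:
    print(row["location"], row["parameter"], row["value"], row["unit"])
```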
locations(*args, **kwargs)
Provides metadata about distinct measurement locations.
Parameters: - city (string, array, or tuple) – Limit results by one or more cities. Defaults to None. Can be given as a single city (ex. city='Delhi'), a list of cities (ex. city=['Delhi', 'Mumbai']), or a tuple (ex. city=('Delhi', 'Mumbai')).
- country (string, array, or tuple) – Limit results by one or more countries. Should be a 2-letter ISO country code as a string, a list, or a tuple. See city for details.
- location (string, array, or tuple) – Limit results by one or more locations.
- parameter (string, array, or tuple) – Limit results by one or more parameters. Options include [pm25, pm10, so2, co, no2, o3, bc]
- has_geo (boolean) – Filter items that do or do not have geographic information.
- coordinates (string) – center point (lat, long) used to get measurements within a certain area. (Ex: coordinates=40.23,34.17)
- nearest (int) – get the X nearest locations to coordinates. Must be used with coordinates. Wins over radius if both are present. Will add the distance property to locations.
- radius (int) – radius (in meters) used to get measurements. Must be used with coordinates. Default value is 2500.
- limit (int) – change the number of results returned. Max is 1000. Default is 100.
- page (int) – paginate through the results.
- df (boolean) – return results as a pandas DataFrame
- index (string) – if returning as a DataFrame, set index to one of ('utc', 'local', None). The default is local
Returns: a dictionary containing the location, country, city, count, sourceName, sourceNames, firstUpdated, lastUpdated, parameters, and coordinates
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.locations()
>>> resp['results']
[
    {
        "count": 4242,
        "sourceName": "Australia - New South Wales",
        "firstUpdated": "2015-07-24T11:30:00.000Z",
        "lastUpdated": "2015-07-24T11:30:00.000Z",
        "parameters": [
            "pm25",
            "pm10",
            "so2",
            "co",
            "no2",
            "o3"
        ],
        "country": "AU",
        "city": "Central Coast",
        "location": "wyong"
    },
    ...
]
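The coordinates parameter expects a "lat,long" string, and nearest and radius only make sense together with it. A small sketch of building those keyword arguments follows; nearby_query is a hypothetical helper, and the final API call is shown only as a comment because it needs network access.

```python
def nearby_query(lat, lon, radius=None, nearest=None):
    """Hypothetical helper: build keyword arguments for a search around a point.

    coordinates is always included, since radius and nearest must be used with it.
    """
    params = {"coordinates": f"{lat},{lon}"}
    if nearest is not None:
        params["nearest"] = int(nearest)
    if radius is not None:
        params["radius"] = int(radius)
    return params


kwargs = nearby_query(40.23, 34.17, nearest=5)
print(kwargs)

# Intended use with a live client:
# status, resp = api.locations(**kwargs)
```

If both radius and nearest are supplied, the helper passes both through and leaves it to the API, where nearest takes precedence according to the parameter description.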
measurements(*args, **kwargs)
Provides data about individual measurements.
Parameters: - city (string) – Limit results by a certain city. Defaults to None.
- country (string) – Limit results by a certain country. Should be a 2-letter ISO country code. Defaults to None.
- location (string) – Limit results by a certain location. Defaults to None.
- parameter (string, array, or tuple) – Limit results by one or more parameters. Options include [pm25, pm10, so2, co, no2, o3, bc]
- has_geo (boolean) – Filter items that do or do not have geographic information.
- coordinates (string) – center point (lat, long) used to get measurements within a certain area. (Ex: coordinates=40.23,34.17)
- radius (int) – radius (in meters) used to get measurements. Must be used with coordinates. Default value is 2500.
- value_from (number) – Show results above a value threshold. Must be used with parameter.
- value_to (number) – Show results below a value threshold. Must be used with parameter.
- date_from (date) – Show results after a certain date. Format should be Y-M-D.
- date_to (date) – Show results before a certain date. Format should be Y-M-D.
- sort (string) – The sort order (asc or desc). Must be used with order_by.
- order_by (string) – Field to sort by. Must be used with sort.
- include_fields (array) – Include additional fields in the output. Allowed values are: attribution, averagingPeriod, and sourceName.
- limit (number) – Change the number of results returned.
- page (number) – Paginate through the results
- df (boolean) – return the results as a pandas DataFrame
- index (string) – if returning as a DataFrame, set index to one of ('utc', 'local', None). The default is local
Returns: a dictionary containing the date, parameter, value, unit, location, country, city, coordinates, and sourceName.
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.measurements(city='Delhi')
>>> resp['results']
{
    "parameter": "Ammonia",
    "date": {
        "utc": "2015-07-16T20:30:00.000Z",
        "local": "2015-07-16T18:30:00.000-02:00"
    },
    "value": "72.9",
    "unit": "ug/m3",
    "location": "Anand Vihar",
    "country": "IN",
    "city": "Delhi",
    "coordinates": {
        "latitude": 43.34,
        "longitude": 23.04
    },
    "attribution": [
        {
            "name": "SINCA",
            "url": "http://sinca.mma.gob.cl/"
        },
        {
            "name": "Ministerio del Medio Ambiente"
        }
    ],
    ...
}
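Several measurements parameters come in pairs: value_from and value_to need parameter, and sort needs order_by. The sketch below builds a consistent set of keyword arguments from Python date objects. The helper name measurement_query is hypothetical, the Y-M-D format is assumed to mean ISO dates such as 2015-07-01, and sorting by the date field is an assumption about the API rather than something stated here.

```python
from datetime import date


def measurement_query(parameter, start, end, value_from=None, descending=True):
    """Hypothetical helper: build keyword arguments for api.measurements()."""
    params = {
        "parameter": parameter,
        # Assumption: Y-M-D means zero-padded ISO dates.
        "date_from": start.strftime("%Y-%m-%d"),
        "date_to": end.strftime("%Y-%m-%d"),
        # sort and order_by must be given together.
        # Assumption: 'date' is an accepted order_by field.
        "order_by": "date",
        "sort": "desc" if descending else "asc",
    }
    if value_from is not None:
        # Allowed here because parameter is always set.
        params["value_from"] = value_from
    return params


query = measurement_query("pm25", date(2015, 7, 1), date(2015, 7, 31), value_from=50)
print(query)

# Intended use with a live client:
# status, resp = api.measurements(city='Delhi', **query)
```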
parameters()
Provides a simple listing of parameters within the platform.
Returns: a dictionary containing the id, name, description, and preferredUnit.
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.parameters()
>>> resp['results']
[
    {
        "id": "pm25",
        "name": "PM2.5",
        "description": "Particulate matter less than 2.5 micrometers in diameter",
        "preferredUnit": "ug/m3"
    }
    ...
]
sources(*args, **kwargs)
Provides a list of data sources.
Parameters: - limit (number) – Change the number of results returned.
- page (number) – Paginate through the results
- df (boolean) – return the results as a pandas DataFrame
- index (string) – if returning as a DataFrame, set index to one of ('utc', 'local', None). The default is local
Returns: a dictionary containing the url, adapter, name, city, country, description, resolution, sourceURL, and contacts.
Example:
>>> import openaq
>>> api = openaq.OpenAQ()
>>> status, resp = api.sources()
>>> resp['results']
[
    {
        "url": "http://airquality.environment.nsw.gov.au/aquisnetnswphp/getPage.php?reportid=2",
        "adapter": "nsw",
        "name": "Australia - New South Wales",
        "city": "",
        "country": "AU",
        "description": "Measurements from the Office of Environment & Heritage of the New South Wales government.",
        "resolution": "1 hr",
        "sourceURL": "http://www.environment.nsw.gov.au/AQMS/hourlydata.htm",
        "contacts": [
            "olaf@developmentseed.org"
        ]
    }
    ...
]
Visualization Reference
openaq.viz.tsplot(*args, **kwargs)
Issues a warning if the data contain both multiple locations and multiple parameters.
Parameters: - data – DataFrame containing the data to plot
- time – name of the column containing time. Defaults to the index.
- ax – axis to plot on, if you would like.
- parameter – string naming the parameter to plot. Only one parameter can be plotted at a time.