Dasymetric Spatial Interpolation in BigQuery

Mapping census population to H3 hexagons

In my current job at Unacast, we transform large amounts of locational data and often times aggregate metrics based on administrative areas. But these areas vary in size and distribution of population, which makes them less suitable for normalizing mobility metrics or comparing areas with each other. Dasymetric spatial interpolation can mitigate this issue to a large degree, a and a spatial index gaining popularity is Uber's H3 hexagons.

  • A dasymetric interpolation using Open Street Map data
  • Dasymetric interpolation using parcel information
Census block groups in Orange county, FL; Dark red indicates higher population
WITH 
county_polyfill AS(
SELECT
jslibs.h3.ST_H3_POLYFILLFROMGEOG(county_geom ,7) as geo
FROM `bigquery-public-data.geo_us_boundaries.counties`
WHERE geo_id = '12095'
),
h3_hexagons as (
SELECT
h3,
jslibs.h3.ST_H3_BOUNDARY(h3) as h3geo
FROM county_polyfill, UNNEST(geo) as h3
),
geom_overlap AS(
SELECT
h3,
h3geo,
geo_id,
total_pop,
ST_AREA(ST_INTERSECTION(h3geo, blockgroup_geom)) as intersection_h3,
ST_AREA(blockgroup_geom) as cbg_area
FROM h3_hexagons
JOIN `bigquery-public-data.geo_census_blockgroups.us_blockgroups_national` cbg_geo
ON ST_INTERSECTS(h3geo, blockgroup_geom)
JOIN `bigquery-public-data.census_bureau_acs.blockgroup_2018_5yr` cbg_pop
USING(geo_id)
)
SELECT
h3,
ANY_VALUE(h3geo) as hex_geo,
SUM(SAFE_DIVIDE(intersection_h3,cbg_area) * total_pop) as h3_pop
FROM geom_overlap
GROUP BY h3
Population interpolated from CBG to H3 zoom level 7; simple overlap
WITH buildings_overlap AS (
SELECT
osm.osm_way_id,
osm.geometry,
cbg_geo.geo_id,
cbg_geo.blockgroup_geom,
ST_AREA(ST_INTERSECTION(blockgroup_geom,osm.geometry)) as intersection_area,
FROM `my-project.mydataset.buildings_osm` osm
JOIN `bigquery-public-data.geo_census_blockgroups.us_blockgroups_national` cbg_geo
ON ST_INTERSECTS(blockgroup_geom, geometry)
WHERE CONCAT(state_fips_code, county_fips_code) = '12095'
),
--total area of all residential polygons for each CBG
building_area_per_cbg AS (
SELECT
geo_id,
SUM(intersection_area) as building_area_total
FROM buildings_overlap
GROUP BY geo_id
),
--population for each overlapping area
share_population_building AS (
SELECT
osm_way_id,
geometry,
geo_id,
SAFE_DIVIDE(intersection_area, building_area_total) * total_pop as building_pop_overlap
FROM buildings_overlap
JOIN building_area_per_cbg
USING(geo_id)
JOIN `bigquery-public-data.census_bureau_acs.blockgroup_2018_5yr`
USING(geo_id)
),
population_per_building as (
SELECT
osm_way_id,
ANY_VALUE(geometry) as building_geo,
SUM(building_pop_overlap) as building_pop_total
FROM share_population_building
GROUP BY osm_way_id
),
county_polyfill as (
SELECT
jslibs.h3.ST_H3_POLYFILLFROMGEOG(county_geom ,7) as geo
FROM `bigquery-public-data.geo_us_boundaries.counties`
WHERE geo_id = '12095'
),
h3_hexagons as (
SELECT
h3,
jslibs.h3.ST_H3_BOUNDARY(h3) as h3geo
FROM county_polyfill, UNNEST(geo) as h3
),
--overlap of each H3 hexagon with each building
geom_overlap as (
SELECT
h3,
h3geo,
osm_way_id,
ST_AREA(ST_INTERSECTION(building_geo, h3geo)) as intersection_h3,
ST_AREA(building_geo) as building_area,
building_pop_total
FROM h3_hexagons
JOIN population_per_building
ON ST_INTERSECTS(building_geo, h3geo)
)
SELECT
h3,
ANY_VALUE(h3geo) as hex_geo,
SUM(SAFE_DIVIDE(intersection_h3,building_area) * building_pop_total) as h3_pop
FROM geom_overlap
GROUP BY h3
Population interpolated from CBG to H3 zoom level 7 using OSM data for refinement
Population interpolated from CBG to H3 zoom level 7 using parcel data for refinement
Top Left: CBG polygons and population; Top Right: spatial interpolation- simple overlap; BottomLeft: spatial interpolation using OSM data; Bottom Right: spatial interpolation using parcel data
  • OSM building polygon approach- assumption: population increases linearly with building area within CBG
  • Parcel polygon approach- assumption: population increases linearly with parcel area within CBG

Sources:

Dasymetric maps: https://en.wikipedia.org/wiki/Dasymetric_map
H3: https://eng.uber.com/h3/
JS library: https://github.com/CartoDB/bigquery-jslibs
Carto: https://spatial-data-science-conference.com/
Open Street Map: https://www.openstreetmap.org/
BigQuery Geography functions: https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions
Kepler: https://kepler.gl/
BigQuery public datasets: https://cloud.google.com/bigquery/public-data
Geospatial data science: https://geographicdata.science/book/intro.html

Economist turned Data Scientist. Creating human mobility insights at Unacast

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store