Title: | Areal Weighted Interpolation |
---|---|
Description: | A pipeable, transparent implementation of areal weighted interpolation with support for interpolating multiple variables in a single function call. These tools provide a full-featured workflow for validation and estimation that fits into both modern data management (e.g. tidyverse) and spatial data (e.g. sf) frameworks. |
Authors: | Christopher Prener [aut, cre], Charlie Revord [aut], Branson Fox [aut] |
Maintainer: | Christopher Prener <[email protected]> |
License: | GPL-3 |
Version: | 0.1.8.9000 |
Built: | 2025-01-07 04:15:17 UTC |
Source: | https://github.com/chris-prener/areal |
A simple features data set containing the geometry and asthma estimates from the Centers for Disease Control for St. Louis.
data(ar_stl_asthma)
A data frame with 106 rows and 24 variables:
full GEOID string
state FIPS code
county FIPS code
tract FIPS code
tract name
area of tract land, square meters
area of tract water, square meters
percent of residents with current asthma diagnosis, estimated
simple features geometry
Centers for Disease Control's 500 Cities Data
str(ar_stl_asthma)
head(ar_stl_asthma)
summary(ar_stl_asthma$ASTHMA)
A simple features data set containing the geometry and associated attributes for the 2013-2017 American Community Survey estimates for race in St. Louis.
data(ar_stl_race)
A data frame with 106 rows and 24 variables:
full GEOID string
state FIPS code
county FIPS code
tract FIPS code
tract name
area of tract land, square meters
area of tract water, square meters
total population count, estimated
total population count, margin of error
white population count, estimated
white population count, margin of error
black population count, estimated
black population count, margin of error
american indian and alaskan native population count, estimated
american indian and alaskan native population count, margin of error
asian population count, estimated
asian population count, margin of error
native hawaiian and pacific islander population count, estimated
native hawaiian and pacific islander population count, margin of error
other race population count, estimated
other race population count, margin of error
two or more races population count, estimated
two or more races population count, margin of error
simple features geometry
tidycensus package
str(ar_stl_race)
head(ar_stl_race)
summary(ar_stl_race$ALAND)
A simple features data set containing the 2010 Ward boundaries, which are used as districts for Alderpersons who serve as elected representatives. The OBJECTID and AREA columns are included to simulate "real" data that may have superfluous or unclear columns.
data(ar_stl_wards)
A data frame with 28 rows and 4 variables:
Artifact from ESRI data creation
Ward number
area of each ward
simple features geometry
City of St. Louis
str(ar_stl_wards)
head(ar_stl_wards)
summary(ar_stl_wards$AREA)
A simple features data set containing the 2010 Ward boundaries, which are used as districts for Alderpersons who serve as elected representatives. This version of the ward boundary has been modified so that the wards only extend to the Mississippi River shoreline.
data(ar_stl_wardsClipped)
A data frame with 28 rows and 2 variables:
Ward number
simple features geometry
City of St. Louis
str(ar_stl_wardsClipped)
head(ar_stl_wardsClipped)
Create Tessellations From SF Object
ar_tessellate(.data, shape = "square", size = 1)
.data | An object of class |
shape | One of 'square' or 'hexagon', the shape to make tessellations from |
size | Numeric multiplier for size of tessellations, default is one kilometer |
An sf object
ar_tessellate(ar_stl_wards)
ar_tessellate(ar_stl_wards, shape = "hexagon", size = .75)
ar_validate executes a series of logic tests for sf object status, a shared coordinate system between source and target data, an appropriate projection, and the absence of variable name conflicts.
ar_validate(source, target, varList, method = "aw", verbose = FALSE)
source | A |
target | A |
varList | A vector of variable names to be added to the |
method | The areal interpolation method validation is being performed for. This should be set to |
verbose | A logical scalar; if |
If verbose is FALSE, a logical scalar is returned that is TRUE if all tests are passed and FALSE if one or more tests fail. If verbose is TRUE, a tibble with detailed test results is returned.
ar_validate(source = ar_stl_asthma, target = ar_stl_wards, varList = "ASTHMA")
ar_validate(source = ar_stl_asthma, target = ar_stl_wards, varList = "ASTHMA", verbose = TRUE)
aw_aggregate sums the new estimates produced by aw_calculate based on the target id. These are then joined with the target data. This is the fifth step in the interpolation process after aw_calculate.
aw_aggregate(.data, target, tid, interVar, newVar)
.data | A given intersected dataset |
target | A |
tid | A unique identification number within |
interVar | A variable containing an interpolated value created by |
newVar | Optional; a new field name to store the interpolated value in. If not specified, the |
An sf object with the interpolated value added to it.
library(dplyr)

race <- select(ar_stl_race, GEOID, TOTAL_E)
wards <- select(ar_stl_wards, WARD)

wards %>%
  aw_intersect(source = race, areaVar = "area") %>%
  aw_total(source = race, id = GEOID, areaVar = "area", totalVar = "totalArea",
           weight = "sum", type = "extensive") %>%
  aw_weight(areaVar = "area", totalVar = "totalArea", areaWeight = "areaWeight") %>%
  aw_calculate(value = "TOTAL_E", areaWeight = "areaWeight") -> intersect

aw_aggregate(intersect, target = wards, tid = WARD, interVar = TOTAL_E)
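The aggregation step described for aw_aggregate can be pictured with a small base R sketch. It only illustrates the idea of summing per-piece estimates by target id and joining the sums back to the target table; it is not the package's implementation. The data frames and the column names tid, name and estimate are made up for the example, and plain data frames stand in for sf objects.

```r
# Toy stand-in for the intersected data after aw_calculate:
# one row per source/target overlap, with a hypothetical `estimate` column.
pieces <- data.frame(
  tid      = c(1, 2, 1, 2),
  estimate = c(300, 700, 100, 300)
)

# Toy stand-in for the target data (no geometry, for brevity).
target <- data.frame(tid = c(1, 2), name = c("Ward 1", "Ward 2"))

# Sum the per-piece estimates within each target id ...
sums <- aggregate(estimate ~ tid, data = pieces, FUN = sum)

# ... and join the sums back onto the target table.
result <- merge(target, sums, by = "tid", all.x = TRUE)
print(result)
```

Each target row ends up with the total of the pieces that fall inside it, which is the shape of output the Value section above describes.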
aw_calculate multiplies the given value by the area weight. This is the fourth step in the interpolation process after aw_weight.
aw_calculate(.data, value, areaWeight, newVar)
.data | A given intersected dataset |
value | A column within |
areaWeight | The name of the variable containing area weight per feature |
newVar | Optional; a new field name to store the interpolated value in. If not specified, the |
An intersected file of class sf with a new field of interest recalculated with area weight
library(dplyr)

race <- select(ar_stl_race, GEOID, TOTAL_E)
wards <- select(ar_stl_wards, WARD)

wards %>%
  aw_intersect(source = race, areaVar = "area") %>%
  aw_total(source = race, id = GEOID, areaVar = "area", totalVar = "totalArea",
           weight = "sum", type = "extensive") %>%
  aw_weight(areaVar = "area", totalVar = "totalArea", areaWeight = "areaWeight") -> intersect

aw_calculate(intersect, value = "TOTAL_E", areaWeight = "areaWeight")
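The multiplication that aw_calculate is described as performing is simple enough to show on a toy table. The sketch below uses base R only and is an illustration rather than the package's code; the data frame and the output column name estimate are hypothetical.

```r
# Toy intersected data: each row is one piece of a source feature.
# TOTAL_E is the source feature's value, areaWeight its share for this piece.
pieces <- data.frame(
  sid        = c("A", "A", "B", "B"),
  TOTAL_E    = c(1000, 1000, 400, 400),
  areaWeight = c(0.30, 0.70, 0.25, 0.75)
)

# Weighted value per piece: value multiplied by area weight.
pieces$estimate <- pieces$TOTAL_E * pieces$areaWeight
print(pieces)
```

Source feature A contributes 300 and 700 to its two pieces, and feature B contributes 100 and 300.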
This is the core function within the package for areal weighted interpolation. It validates both data sources before interpolating one or more listed values from the source data into the target data.
aw_interpolate(.data, tid, source, sid, weight = "sum", output = "sf", extensive, intensive)
.data | A |
tid | A unique identification number within |
source | A |
sid | A unique identification number within |
weight | For |
output | One of either |
extensive | A vector of quoted variable names to be treated as spatially extensive (e.g. population counts); optional if |
intensive | A vector of quoted variable names to be treated as spatially intensive (e.g. population density); optional if |
Areal weighted interpolation can be used for generating demographic estimates for overlapping but incongruent polygon features. It assumes that individual members of a population are evenly dispersed within the source features (an assumption not likely to hold in the real world). It also functions best when data are in a projected coordinate system, like the UTM coordinate system.
An sf object or a tibble with the value or values interpolated into the target data.
aw_interpolate(ar_stl_wards, tid = WARD, source = ar_stl_race, sid = GEOID,
               weight = "sum", output = "sf", extensive = "TOTAL_E")
aw_interpolate(ar_stl_wards, tid = WARD, source = ar_stl_asthma, sid = GEOID,
               weight = "sum", output = "tibble", intensive = "ASTHMA")
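The difference between spatially extensive and spatially intensive variables is easiest to see with numbers. The base R sketch below treats an intensive variable (a rate) as an area-weighted mean within each target feature, with weights taken relative to the target's overlapped area. That weighting rule is an assumption made for illustration and is not confirmed by the text above; the data frame and its column names are likewise hypothetical.

```r
# Toy intersected data: two source tracts (A, B) split across two targets (1, 2).
# `rate` is an intensive value, e.g. a percentage, attached to each source tract.
pieces <- data.frame(
  sid  = c("A", "A", "B", "B"),
  tid  = c(1, 2, 1, 2),
  area = c(30, 70, 50, 150),
  rate = c(10, 10, 20, 20)
)

# Assumed intensive weighting: each piece's share of its TARGET's overlapped area.
pieces$w <- pieces$area / ave(pieces$area, pieces$tid, FUN = sum)

# Area-weighted mean rate per target feature.
rate_by_target <- tapply(pieces$rate * pieces$w, pieces$tid, sum)
print(rate_by_target)
```

Under this assumption the target values stay between the source rates of 10 and 20, whereas an extensive count would instead be split up and summed.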
aw_intersect intersects the source and target datasets and computes a new area field for the intersected data using the units associated with whatever projection the data are currently in. This is the first step in the interpolation process after data validation and subsetting.
aw_intersect(.data, source, areaVar)
.data | A |
source | A |
areaVar | The name of the new area variable to be calculated. |
An sf object with the intersected data and new area field.
library(dplyr)

race <- select(ar_stl_race, GEOID, TOTAL_E)
wards <- select(ar_stl_wards, WARD)

aw_intersect(wards, source = race, areaVar = "area")
Provides a preview of the weight options for areal weighted interpolation.
This can be useful for selecting the final specification for aw_interpolate
without having to construct a pipeline of all of the subfunctions manually.
aw_preview_weights(.data, tid, source, sid, type)
.data | A |
tid | A unique identification number within |
source | A |
sid | A unique identification number within |
type | One of either |
A tibble with the areal weights that would be used for interpolation if type is either "extensive" or "intensive". If it is "mixed", two tibbles (one for "extensive" and one for "intensive") are returned as a list.
aw_preview_weights(ar_stl_wards, tid = WARD, source = ar_stl_race, sid = GEOID,
                   type = "extensive")
aw_preview_weights(ar_stl_wards, tid = WARD, source = ar_stl_asthma, sid = GEOID,
                   type = "intensive")
aw_total produces a new total area field that contains the total area by source id. This is the second step in the interpolation process after aw_intersect.
aw_total(.data, source, id, areaVar, totalVar, type, weight)
.data | A |
source | A |
id | A unique identification number |
areaVar | The name of the variable measuring a feature's area, which is created as part of aw_intersect |
totalVar | The name of a new total area field to be calculated |
type | One of |
weight | One of |
An sf object with the intersected data and new total area field.
library(dplyr)

race <- select(ar_stl_race, GEOID, TOTAL_E)
wards <- select(ar_stl_wards, WARD)

wards %>%
  aw_intersect(source = race, areaVar = "area") -> intersect

aw_total(intersect, source = race, id = GEOID, areaVar = "area", totalVar = "totalArea",
         weight = "sum", type = "extensive")
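As a rough picture of what "total area by source id" means, the base R sketch below sums the areas of the intersected pieces within each source id and repeats that total on every piece. It is a simplified illustration that covers only the summed-pieces case, not the package's implementation; the data frame and the column names sid, tid and area are made up.

```r
# Toy intersected data: each row is one source/target overlap piece.
pieces <- data.frame(
  sid  = c("A", "A", "B", "B", "B"),
  tid  = c(1, 2, 1, 2, 3),
  area = c(30, 70, 10, 40, 150)
)

# Total area per source id, repeated on every piece of that source feature.
pieces$totalArea <- ave(pieces$area, pieces$sid, FUN = sum)
print(pieces)
```

Every piece of source A carries a total of 100 and every piece of source B a total of 200, ready to be used as the denominator of an area weight.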
Verify Correct Extensive-Sum Interpolation
aw_verify(source, sourceValue, result, resultValue)
source | A |
sourceValue | A column within |
result | A |
resultValue | A column within |
aw_verify ensures that the sum of the resulting interpolated value is equal to the sum of the original source value. This functionality only works for interpolations that are extensive and use the sum approach to calculating areal weights.
A logical scalar; if TRUE, these two values are equal.
result <- aw_interpolate(ar_stl_wards, tid = WARD, source = ar_stl_race, sid = GEOID,
                         weight = "sum", output = "tibble", extensive = "TOTAL_E")

aw_verify(source = ar_stl_race, sourceValue = TOTAL_E, result = result, resultValue = TOTAL_E)
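A toy calculation may help show why this check is limited to the sum approach. The base R sketch below compares two weighting schemes on made-up data: weights based on the summed overlap area per source feature, and weights based on a hypothetical full source area (fullArea) for a feature that extends beyond the targets. The second scheme is an assumption about how a non-sum weight might behave, and the tolerance-based comparison with all.equal is a choice made for this example, not necessarily how aw_verify compares values.

```r
# Toy intersected data; source B has a full area of 250 but only 200 overlaps targets.
pieces <- data.frame(
  sid      = c("A", "A", "B", "B"),
  area     = c(30, 70, 50, 150),
  value    = c(1000, 1000, 400, 400),
  fullArea = c(100, 100, 250, 250)  # hypothetical full source areas
)
source_sum <- 1000 + 400

# Weights from summed overlap area versus weights from the full source area.
w_sum   <- pieces$area / ave(pieces$area, pieces$sid, FUN = sum)
w_total <- pieces$area / pieces$fullArea

# Only the summed-overlap weights reproduce the source total.
sum_ok   <- isTRUE(all.equal(sum(pieces$value * w_sum), source_sum))
total_ok <- isTRUE(all.equal(sum(pieces$value * w_total), source_sum))
print(c(sum_ok = sum_ok, total_ok = total_ok))
```

With summed-overlap weights the interpolated values add back up to 1400, while the full-area weights lose the share of B that lies outside every target.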
aw_weight creates an area weight field by dividing the area field by the total area field. This is the third step in the interpolation process after aw_total.
aw_weight(.data, areaVar, totalVar, areaWeight)
.data | A |
areaVar | The name of the variable measuring a feature's area |
totalVar | The name of the variable containing total area field by |
areaWeight | The name of a new area weight field to be calculated |
An sf object with the intersected data and new area weight field.
library(dplyr)

race <- select(ar_stl_race, GEOID, TOTAL_E)
wards <- select(ar_stl_wards, WARD)

wards %>%
  aw_intersect(source = race, areaVar = "area") %>%
  aw_total(source = race, id = GEOID, areaVar = "area", totalVar = "totalArea",
           weight = "sum", type = "extensive") -> intersect

aw_weight(intersect, areaVar = "area", totalVar = "totalArea", areaWeight = "areaWeight")
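The division that aw_weight is described as performing can be sketched in base R on a toy table. This is an illustration only, with a made-up data frame standing in for the intersected sf object; the column names mirror the arguments used in the example above.

```r
# Toy intersected data with a piece area and a per-source total area.
pieces <- data.frame(
  sid       = c("A", "A", "B", "B"),
  area      = c(30, 70, 50, 150),
  totalArea = c(100, 100, 200, 200)
)

# Area weight: the piece's area divided by the total area field.
pieces$areaWeight <- pieces$area / pieces$totalArea

# When totals are sums of the pieces, weights add up to 1 within each source id.
weight_sums <- tapply(pieces$areaWeight, pieces$sid, sum)
print(pieces)
print(weight_sums)
```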