7 Spatial Regression Models
7.1 Spatial Regression Models
Spatial autocorrelation violates the independence assumption of ordinary regression. Several models are used to address it, because when it is ignored:
- coefficient estimates may be biased
- standard errors may be incorrect
- significance tests may be misleading
- important spatial processes may be missed
7.2 Why Spatial Autocorrelation Happens
When regression residuals show spatial autocorrelation, it usually means something spatial is missing from the model, but the reason can vary. There are several common types of spatial autocorrelation problems, and each requires a different modeling approach.
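One common way to check residuals for spatial autocorrelation is global Moran's I. The sketch below is a minimal pure-Python version, assuming a simple binary neighbor matrix. The function names `morans_i` and `chain_weights` and the residual values are hypothetical and chosen only for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def chain_weights(n):
    """Binary weights for n locations in a row: each one neighbors the next."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


# Hypothetical residuals at five locations along a line
clustered = [2.0, 1.5, 0.5, -1.0, -3.0]    # similar values sit side by side
alternating = [1.0, -1.0, 1.0, -1.0, 1.0]  # neighbors take opposite signs

w = chain_weights(5)
i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
print(i_clustered, i_alternating)
```

Values well above zero suggest that similar residuals cluster together, and values below zero suggest that neighbors tend to differ. In practice a significance test (for example, a permutation test) would accompany the statistic.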
7.3 Four Types of Spatial Autocorrelation Issues
Spatial autocorrelation in regression often comes from one of four sources:
- Missing (confounding) variables that could be measured and added to the model
- A mix of confounding variables so complex that we cannot identify or measure them
- Observations that directly influence the ones next to them
- A relationship between the variables that itself changes as you move across space
7.3.1 Missing/Confounding Variables
Sometimes spatial autocorrelation appears because an important explanatory variable is missing. For example, suppose we model housing prices using:
- house size
- number of bedrooms
But we ignore:
- school quality
- neighborhood desirability
- access to parks
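These omitted factors are spatially clustered, so their effect ends up in the residuals, which then cluster too. The best solution is to add better explanatory variables, which improves the model directly. Sometimes spatial autocorrelation disappears once the missing variable is included.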
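The effect of an omitted, spatially clustered variable can be illustrated with a small made-up dataset. The sketch below assumes eight houses along one street, with invented sizes, school scores, and prices. It fits a model with and without the school variable and compares Moran's I of the residuals. All names and numbers are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / s0) * num / sum(d * d for d in dev)


def ols_residuals(columns, y):
    """Residuals from least squares with an intercept (normal equations)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented system [X'X | X'y]
    A = [[sum(X[r][a] * X[r][b] for r in range(n)) for b in range(k)]
         + [sum(X[r][a] * y[r] for r in range(n))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return [yi - sum(b * x for b, x in zip(beta, row))
            for yi, row in zip(y, X)]


# Hypothetical data: eight houses in a row along one street
size = [1.0, 2.0, 1.5, 2.5, 1.0, 2.0, 1.5, 2.5]      # house size (invented units)
school = [9.0, 9.0, 8.0, 8.0, 3.0, 3.0, 2.0, 2.0]    # clustered school score
noise = [1.0, -2.0, 0.5, 1.5, -1.0, 2.0, -0.5, -1.5]
price = [50 + 100 * s + 20 * q + e for s, q, e in zip(size, school, noise)]

# Each house neighbors the houses immediately before and after it
n = len(price)
w = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]

i_without = morans_i(ols_residuals([size], price), w)
i_with = morans_i(ols_residuals([size, school], price), w)
print(i_without, i_with)
```

In this constructed example the residuals from the size-only model should be clearly positively autocorrelated, and the statistic should drop once the school variable is added. Real data will rarely behave this cleanly.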
7.3.2 Spatial Error Model
Sometimes the missing influences are:
- numerous
- complex
- difficult to measure
- unknown
This creates spatially autocorrelated error. Essentially, nearby model errors become similar because unmeasured spatial processes are affecting observations.
A Spatial Error Model (SEM) handles this by modeling spatial structure in the residuals.
Key idea: the autocorrelation is in the error term, not in the dependent variable itself.
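In a commonly used notation (the symbols here are an assumption, since these notes do not define any), the spatial error model can be written as:

$$
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
$$

Here $W$ is a spatial weights matrix describing which observations are neighbors, $\lambda$ measures how strongly the errors of neighboring observations are related, and $\varepsilon$ is independent noise. When $\lambda = 0$ the model reduces to ordinary regression.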
7.3.3 Spatial Lag Model
Sometimes nearby locations directly affect one another, so that the outcome at one observation influences the outcomes of its neighbors.
Examples:
- housing prices affecting nearby housing prices
- disease transmission
- voting behavior influencing nearby communities
Here, the response variable itself has spatial dependence.
A Spatial Lag Model (SLM) includes the influence of neighboring values directly in the regression.
Key idea: nearby outcomes help explain local outcomes.
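The "influence of neighboring values" is usually expressed as a spatial lag: the weighted average of the response at each observation's neighbors. In a common notation (assumed here, not taken from these notes) the model is written `y = ρWy + Xβ + ε`, where `W` is a row-standardized weights matrix. The sketch below only computes the lag term `Wy`. The function names and the price values are hypothetical.

```python
def row_standardize(weights):
    """Scale each row of a weights matrix so that it sums to 1."""
    out = []
    for row in weights:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out


def spatial_lag(values, weights):
    """Weighted average of neighboring values for every observation."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Hypothetical example: four houses in a row, each neighboring the next
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
prices = [300.0, 320.0, 280.0, 200.0]

W = row_standardize(adjacency)
lagged_prices = spatial_lag(prices, W)
print(lagged_prices)
```

Because the lag term is built from the response variable itself, it is correlated with the error term. Spatial lag models are therefore normally estimated with maximum likelihood or instrumental variables rather than by adding `Wy` to an ordinary least squares fit.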
7.3.4 Geographically Weighted Regression (GWR)
Sometimes the relationship between variables is not constant across the study area.
For example:
- Rainfall may affect crop yield strongly in one region, but weakly in another.
- Population density may influence housing prices differently in cities versus rural areas.
This means that regression coefficients may vary across space.
Geographically Weighted Regression (GWR) allows model relationships to change by location.
Instead of estimating one global coefficient, it estimates local coefficients across space. This reveals spatial variation in relationships.
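The idea of local coefficients can be sketched as a weighted regression that is refit at each location, with nearby observations counting more than distant ones. The example below is a deliberately simplified illustration, assuming one predictor, locations along a single line, a Gaussian kernel, and a fixed bandwidth. The function `local_fit`, the bandwidth value, and the data are all hypothetical.

```python
import math


def local_fit(x, y, coords, target, bandwidth):
    """Weighted least-squares intercept and slope centered on `target`."""
    # Gaussian kernel: weight falls off with distance from the target
    w = [math.exp(-0.5 * ((c - target) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical transect of ten sites, numbered west (0) to east (9)
coords = [float(i) for i in range(10)]
rainfall = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
# Invented yields: rainfall matters a lot in the west, little in the east
true_slope = [3.0] * 5 + [0.5] * 5
crop_yield = [b * r for b, r in zip(true_slope, rainfall)]

local_slopes = [local_fit(rainfall, crop_yield, coords, c, bandwidth=1.5)[1]
                for c in coords]
slope_west, slope_east = local_slopes[0], local_slopes[-1]
print(local_slopes)
```

A single global regression would report one compromise slope, while the local fits should show a strong effect at the western end and a weak one at the eastern end. Full GWR implementations also handle multiple predictors, two-dimensional distances, and data-driven bandwidth selection.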
7.4 Choosing the Right Model
Different spatial problems require different solutions:
| Problem | Recommended approach |
|---|---|
| Quantifiable missing variable | Add explanatory variables |
| Spatially structured residual error | Spatial Error Model |
| Neighbor influence | Spatial Lag Model |
| Relationships vary across space | Geographically Weighted Regression |