spacereg

A Spatial Standard Errors Implementation for Several Commonly Used M-Estimators

Luis Calderon (University of Bonn) & Leander Heldring (Northwestern University / Briq)

Conley standard errors for OLS, Logit, Probit, Poisson, and Negative Binomial — unified through the M-estimator framework.

Python & STATA

The Unifying Idea: M-Estimators

An M-estimator minimizes (or solves the first-order conditions of) an objective function. OLS, Logit, Probit, Poisson, and Negative Binomial are all M-estimators. This means their variance can be estimated with a single framework: the sandwich estimator. spacereg implements Conley's (1999) spatial HAC correction within this framework for all five models.

$$\widehat{\text{Var}}(\hat{\beta}) \;=\; \underbrace{H^{-1}}_{\text{Bread}} \;\;\underbrace{\Omega}_{\text{Filling}}\;\; \underbrace{H^{-1}}_{\text{Bread}}$$

The models below show how each specific score vector and Hessian plug into this unified framework. The sandwich structure stays the same; only the ingredients change.

OLS

Score Vector  \( g_i \)
$$g_i = x_i \, \varepsilon_i$$
Bread Matrix  \( H^{-1} \)
$$\left(X^\top X\right)^{-1}$$

Linear regression estimated by ordinary least squares.
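As a concrete illustration of these two ingredients, here is a minimal NumPy sketch (simulated data; not spacereg's API):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([rng.normal(size=n), np.ones(n)])  # regressor + constant
y = X @ np.array([0.5, 1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # OLS coefficients
resid = y - X @ beta_hat                      # epsilon_i
scores = X * resid[:, None]                   # g_i = x_i * eps_i, shape (n, k)
bread = np.linalg.inv(X.T @ X)                # (X'X)^{-1}

# First-order condition: the scores sum to (numerically) zero
print(np.abs(scores.sum(axis=0)).max())
```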

Logit

Score Vector  \( g_i \)
$$g_i = \bigl(y_i - \Lambda(x_i^\top\beta)\bigr)\, x_i$$
Bread Matrix  \( H^{-1} \)
$$\left(-\frac{\partial^2 \ell}{\partial \beta \,\partial \beta^\top}\right)^{\!-1}$$

Binary outcomes with logistic link function, estimated by maximum likelihood.

Probit

Score Vector  \( g_i \)
$$g_i = \frac{\bigl(y_i - \Phi(x_i^\top\beta)\bigr)\,\phi(x_i^\top\beta)}{\Phi(x_i^\top\beta)\bigl(1-\Phi(x_i^\top\beta)\bigr)}\, x_i$$
Bread Matrix  \( H^{-1} \)
$$\left(-\frac{\partial^2 \ell}{\partial \beta \,\partial \beta^\top}\right)^{\!-1}$$

Binary outcomes with normal CDF link, estimated by maximum likelihood.

Poisson

Score Vector  \( g_i \)
$$g_i = \bigl(y_i - \exp(x_i^\top\beta)\bigr)\, x_i$$
Bread Matrix  \( H^{-1} \)
$$\left(-\frac{\partial^2 \ell}{\partial \beta \,\partial \beta^\top}\right)^{\!-1}$$

Count data with log-linear mean, estimated by maximum likelihood.

Negative Binomial

Score Vector  \( g_i \)
$$g_i = \frac{y_i - \mu_i}{1 + \alpha\,\mu_i}\, x_i$$
Bread Matrix  \( H^{-1} \)
$$\left(-\frac{\partial^2 \ell}{\partial \beta \,\partial \beta^\top}\right)^{\!-1}$$

Overdispersed count data with NB2 parameterization, estimated by maximum likelihood.
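For the maximum likelihood models, the score formulas above can be evaluated directly. A minimal NumPy/SciPy sketch (the simulated x, y, beta, and alpha are illustrative placeholders, not spacereg's API):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([rng.normal(size=n), np.ones(n)])
beta = np.array([0.5, -0.2])
xb = X @ beta

# Logit: g_i = (y_i - Lambda(x_i'beta)) x_i
p = 1.0 / (1.0 + np.exp(-xb))
y_bin = (rng.uniform(size=n) < p).astype(float)
g_logit = (y_bin - p)[:, None] * X

# Probit: g_i = (y_i - Phi) phi / (Phi (1 - Phi)) x_i
Phi, phi = norm.cdf(xb), norm.pdf(xb)
g_probit = ((y_bin - Phi) * phi / (Phi * (1.0 - Phi)))[:, None] * X

# Poisson: g_i = (y_i - exp(x_i'beta)) x_i
mu = np.exp(xb)
y_cnt = rng.poisson(mu)
g_poisson = (y_cnt - mu)[:, None] * X

# NB2 (overdispersion alpha): g_i = (y_i - mu_i) / (1 + alpha mu_i) x_i
alpha = 0.5
g_nb = ((y_cnt - mu) / (1.0 + alpha * mu))[:, None] * X

print(g_logit.shape, g_probit.shape, g_poisson.shape, g_nb.shape)
```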


Spatial Dependence and Kernel Weights

When observations are spatially correlated, standard errors that ignore this dependence are too small, and confidence intervals under-cover. The Conley (1999) estimator corrects for this by weighting the outer product of score vectors by a spatial kernel. Observations close together receive higher weight; those far apart receive zero weight beyond a cutoff distance.


In the figure, each dot is an observation \( j \), shaded by its kernel weight \( w_{ij} \): how much observation \( j \) contributes to the spatial correction for observation \( i \). The dashed circle marks the cutoff \( c_d \) beyond which \( w_{ij} = 0 \), and \( |s_{id} - s_{jd}| \) is the separation between \( i \) and \( j \) along dimension \( d \).

Bartlett Kernel

$$w_{ij} = \prod_{d=1}^{D} \max\!\left(0,\; 1 - \frac{|s_{id} - s_{jd}|}{c_d}\right)$$

Uniform Kernel

$$w_{ij} = \begin{cases} 1 & \text{if } |s_{id} - s_{jd}| < c_d \;\;\forall\, d \\ 0 & \text{otherwise} \end{cases}$$
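Both kernels can be sketched as a pairwise weight matrix built from coordinate-wise distances, as in the product formula above (a minimal illustration; spacereg's internal computation may differ):

```python
import numpy as np

def kernel_weights(coords, cutoffs, kernel="bartlett"):
    """Pairwise weights w_ij from per-dimension distances |s_id - s_jd|."""
    coords = np.asarray(coords, dtype=float)    # shape (n, D)
    cutoffs = np.asarray(cutoffs, dtype=float)  # shape (D,)
    # |s_id - s_jd| / c_d for every pair, per dimension: shape (n, n, D)
    scaled = np.abs(coords[:, None, :] - coords[None, :, :]) / cutoffs
    if kernel == "bartlett":
        return np.prod(np.maximum(0.0, 1.0 - scaled), axis=2)
    # uniform: 1 iff the pair is within the cutoff in every dimension
    return np.all(scaled < 1.0, axis=2).astype(float)

coords = [[0, 0], [1, 0], [5, 5]]
W_b = kernel_weights(coords, cutoffs=[2, 2], kernel="bartlett")
W_u = kernel_weights(coords, cutoffs=[2, 2], kernel="uniform")
print(W_b[0, 1], W_u[0, 1], W_u[0, 2])  # 0.5 1.0 0.0
```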

Methodology

The core of the Conley estimator is the spatially weighted outer product of score vectors, the "filling" of the sandwich:

$$\Omega = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, g_i \, g_j^\top$$

Combined with the "bread" (the inverse Hessian), this gives the full variance estimator:

$$\widehat{\text{Var}}(\hat{\beta}) = H^{-1} \, \Omega \, H^{-1}$$

Here \(\Omega\) is the spatially weighted filling and \(H\) the Hessian of the objective. For OLS, \(H^{-1} = (X^\top X)^{-1}\). For the maximum likelihood estimators (logit, probit, Poisson, negative binomial), \(H\) is the negative Hessian of the log-likelihood evaluated at \(\hat{\beta}\). spacereg computes both components and assembles the sandwich for each supported model.
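Putting the two formulas together for OLS, as a hedged NumPy sketch (spacereg's internals may differ; with the identity weight matrix the estimator collapses to the usual heteroskedasticity-robust sandwich):

```python
import numpy as np

def conley_sandwich(scores, H_inv, W):
    """Var(beta_hat) = H^{-1} (sum_ij w_ij g_i g_j') H^{-1}."""
    omega = scores.T @ W @ scores  # spatially weighted filling
    return H_inv @ omega @ H_inv

rng = np.random.default_rng(2)
n = 30
X = np.column_stack([rng.normal(size=n), np.ones(n)])
y = X @ np.array([0.5, 1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
scores = X * (y - X @ beta_hat)[:, None]  # OLS scores g_i
H_inv = np.linalg.inv(X.T @ X)            # OLS bread

V = conley_sandwich(scores, H_inv, np.eye(n))  # W = I: no spatial weighting
se = np.sqrt(np.diag(V))
print(se)
```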


Usage

Python

import pandas as pd
from spacereg import SpatialStandardErrorsComputer

# Load data (the calls below assume it contains a constant column "const")
data = pd.read_stata("spatial_data.dta")

# Initialize with coordinates and cutoffs
coordinates = ["C1", "C2"]
cutoffs = ["cutoff1", "cutoff2"]
base = SpatialStandardErrorsComputer(data, coordinates, cutoffs)

# OLS with Bartlett kernel (default)
ols_se = base.compute_conley_standard_errors_all_models(
    "OLS", y="dep", x=["indep1", "const"]
)

# Logit
logit_se = base.compute_conley_standard_errors_all_models(
    "logit", y="binarydep", x=["indep1", "const"]
)

# Poisson with Uniform kernel
poisson_se = base.compute_conley_standard_errors_all_models(
    "poisson", y="poissondep", x=["indep1", "const"],
    kernel="uniform"
)

STATA

* Load data
use "spatial_data.dta", clear
gen byte const = 1

* OLS with Bartlett kernel (default)
spacereg dep indep1 const, coords(C1 C2) cutoffs(100 100) model(ols)

* OLS with Uniform kernel
spacereg dep indep1 const, coords(C1 C2) cutoffs(100 100) ///
    model(ols) kernel(uniform)

* Logit
spacereg binarydep indep1 const, coords(C1 C2) cutoffs(100 100) ///
    model(logit)

* Poisson
spacereg poissondep indep1 const, coords(C1 C2) cutoffs(100 100) ///
    model(poisson)

* Fixed effects with reghdfe
spacereg dep indep1, coords(C1 C2) cutoffs(100 100) ///
    model(reghdfe, fe1 fe2)

Sample Output

OLS Conley Standard Errors (Bartlett):
    indep1   0.2145
    const    1.3311

Logit Conley Standard Errors:
    indep1   0.0533
    const    0.2793

Probit Conley Standard Errors:
    indep1   0.0328
    const    0.1720

Finite-Sample Performance

We evaluate finite-sample performance via Monte Carlo simulation. The DGP uses a Gaussian copula to introduce spatial dependence while preserving correct marginal distributions. Setup: 200 repetitions, 10×10 grid (\(N=100\)), true \(\beta = 0.5\), spatial correlation length \(\varphi = 1.0\), Conley cutoff = 3.0 grid units.
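A hedged sketch of such a Gaussian-copula DGP (the actual simulation code may differ; the 10×10 grid, \(\varphi = 1.0\), and \(\beta = 0.5\) mirror the setup above): a latent Gaussian field with exponential spatial correlation is pushed through \(\Phi\) to uniforms, which are then mapped to the desired marginal, here Poisson.

```python
import numpy as np
from scipy.stats import norm, poisson

rng = np.random.default_rng(3)

# 10x10 grid of locations, as in the simulation setup
g = 10
xs, ys = np.meshgrid(np.arange(g), np.arange(g))
coords = np.column_stack([xs.ravel(), ys.ravel()])  # (100, 2)

# Latent Gaussian field with exponential correlation, length phi = 1.0
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
Sigma = np.exp(-d / 1.0)
z = rng.multivariate_normal(np.zeros(g * g), Sigma)

# Gaussian copula: Phi(z) gives spatially dependent uniforms, then the
# inverse CDF of the target marginal gives correctly distributed outcomes
u = norm.cdf(z)
x = rng.normal(size=g * g)
y = poisson.ppf(u, mu=np.exp(0.5 * x))  # Poisson marginal, beta = 0.5
print(y.shape)
```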

95% Coverage Rates

Coverage of nominal 95% confidence intervals across 200 repetitions, together with the mean coefficient estimate, the empirical standard deviation of \(\hat{\beta}\), and the mean estimated standard error under each variance estimator:

Model     Mean β̂   Emp. SD   SE (HC1)   SE (Bartlett)   SE (Uniform)   Cov. HC1   Cov. Bartlett   Cov. Uniform
Logit     0.552     0.227     0.228      0.220           0.203          0.950      0.940           0.865
Probit    0.521     0.145     0.147      0.140           0.130          0.970      0.960           0.875
Poisson   0.500     0.105     0.094      0.092           0.088          0.900      0.900           0.830
NB        0.494     0.148     0.135      0.128           0.114          0.935      0.880           0.815

Nominal: 0.95
In the presence of spatial dependence, naive robust standard errors can produce confidence intervals that substantially under-cover. For Poisson, HC1 achieves only 90% coverage at the nominal 95% level. The Bartlett kernel generally provides the best coverage among the spatial estimators.

Installation

Python

pip install pandas numpy statsmodels

Download spacereg.py and place it in your working directory.

Dependencies: Python ≥ 3.9, Pandas, NumPy, Statsmodels

STATA

1. Download spacereg.ado and spacereg.sthlp

2. Place in your ado folder (find with sysdir in STATA)

3. Install dependency:

ssc install reghdfe
Download Paper (PDF)

Citation

Calderon, L. and Heldring, L. (2026). “A Spatial Standard Errors Implementation for Several Commonly Used M-Estimators.” University of Bonn / Northwestern University.

BibTeX

@techreport{calderon2026spatial,
  title       = {A Spatial Standard Errors Implementation
                 for Several Commonly Used {M}-Estimators},
  author      = {Calderon, Luis and Heldring, Leander},
  year        = {2026},
  institution = {University of Bonn and
                 Northwestern University}
}