GEO Dataset

A harmonized, geocoded school-level database

The GEO Dataset provides open, standardized, school-level administrative census data spanning geocoordinates, personnel, infrastructure resources, and educational outcomes across low- and middle-income countries — built for GeoAI, spatial data science, and development research.

Data dimensions: 4
Schools: 80k+
Current version: V1

Four joinable dimensions

Each country's data is delivered as four flat CSV files that share a common oedc_id key. Each of the four tables has the same column structure in every country.

📍
Geo
{ISO3}_geo.csv

School identity, geocoordinates with provenance tier, administrative hierarchy (adm0–adm3), GHSL urban/rural classification, and operational status. One row per school — no year column.

👩‍🏫
Personnel
{ISO3}_personnel.csv

Enrollment (total and sex-disaggregated), teaching staff counts, qualified teachers, classrooms, and pupil-teacher ratio. Aligned to UIS headcount definitions. One row per school × year.

🏗️
Resources
{ISO3}_resources.csv

WASH access (JMP basic tier), electricity, internet connectivity and type, computer lab, library, and permanent building status. All binary. One row per school × year.

📊
Outcomes
{ISO3}_outcomes.csv

Promotion, repetition, and dropout rates (UIS reconstructed cohort method), completion rate, and gross intake ratio where available. Expressed as proportions. One row per school × year.
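
Because the geo table has one row per school and the other three tables have one row per school × year, a typical workflow joins the yearly tables to each other on the school key plus year, then attaches the geo attributes on the school key alone. The following is a minimal sketch using only Python's standard library, not an official loader. Only oedc_id comes from the description above; the year, adm1, enrollment_total, and electricity column names, the XXX country code, and the sample rows are invented placeholders that should be replaced with the names given in the dataset schema.

```python
import csv
import io

# Hypothetical stand-ins for XXX_geo.csv, XXX_personnel.csv and XXX_resources.csv.
# Only the oedc_id key is taken from the dataset description; every other
# column name and value here is an illustrative assumption.
GEO_CSV = """oedc_id,adm1
XXX-0001,North
XXX-0002,South
"""
PERSONNEL_CSV = """oedc_id,year,enrollment_total
XXX-0001,2021,340
XXX-0001,2022,355
XXX-0002,2022,120
"""
RESOURCES_CSV = """oedc_id,year,electricity
XXX-0001,2021,0
XXX-0001,2022,1
XXX-0002,2022,1
"""


def read_rows(text):
    """Parse CSV text into a list of dicts (use open(path, newline='') for real files)."""
    return list(csv.DictReader(io.StringIO(text)))


def join_dimensions(geo_rows, personnel_rows, resources_rows):
    """Left-join personnel to resources on (oedc_id, year), then to geo on oedc_id."""
    geo_by_school = {row["oedc_id"]: row for row in geo_rows}
    resources_by_key = {(row["oedc_id"], row["year"]): row for row in resources_rows}

    joined = []
    for person in personnel_rows:
        key = (person["oedc_id"], person["year"])
        merged = dict(person)
        # Rows without a match simply contribute no extra columns.
        merged.update(resources_by_key.get(key, {}))
        merged.update(geo_by_school.get(person["oedc_id"], {}))
        joined.append(merged)
    return joined


panel = join_dimensions(
    read_rows(GEO_CSV), read_rows(PERSONNEL_CSV), read_rows(RESOURCES_CSV)
)
for row in panel:
    print(row["oedc_id"], row["year"], row["adm1"], row["enrollment_total"], row["electricity"])
```

The result keeps one row per school × year, with the time-invariant geo attributes repeated on each yearly row. The csv module reads every value as a string, so numeric columns need explicit conversion before analysis.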

Access the data

All data is released under CC BY 4.0. Please cite the accompanying data descriptor when using the GEO Dataset in published work.

Schema documentation: Full variable definitions, NA rules, and standards anchors for all four dimensions are available in the dataset schema. Per-country harmonization notes documenting source provenance, variable mappings, and known quality issues are in /metadata/{ISO3}_metadata.md.
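
The file naming pattern can be turned into expected file names programmatically. The helper below is a hypothetical sketch and not part of the dataset's tooling: the dataset_files function name is invented for illustration, XXX is a placeholder country code, and the metadata path is assumed to be relative to the repository root. The directory that holds the CSV files is not stated on this page, so only file names are returned for them.

```python
from pathlib import PurePosixPath

# The four dimensions named on this page.
DIMENSIONS = ("geo", "personnel", "resources", "outcomes")


def dataset_files(iso3):
    """Return the expected file names for one country, keyed by dimension."""
    code = iso3.strip().upper()
    if len(code) != 3 or not code.isalpha():
        raise ValueError(f"expected a three-letter ISO3 code, got {iso3!r}")
    files = {dim: f"{code}_{dim}.csv" for dim in DIMENSIONS}
    # Metadata location as written on this page; treating it as relative to
    # the repository root is an assumption.
    files["metadata"] = str(PurePosixPath("metadata") / f"{code}_metadata.md")
    return files


print(dataset_files("xxx"))
```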

The country table lists, for each available country: Country, ISO3, Schools, Dimensions, Year(s), Metadata, and Download.

The full dataset bundle and harmonization notes will also be available on Harvard Dataverse upon V1 publication.

How to cite

Baier, H. (2025). GEO Dataset: A Multi-Country Geocoded School-Level Administrative Dataset for Low- and Middle-Income Countries [Data set]. GitHub. https://github.com/global-education-observatory/geo-dataset