These are the public datasets powering the v0 prototype. The USGS 3DEP coverage spans Brian's full licensure footprint (AL, GA, TN, MS, FL): 247 projects and 30B+ public-domain points. The Hobu COPC samples are browser-streamable today. The academic benchmarks anchor our model training plan.
| Name | Size | Classified | Notes | Stream |
|---|---|---|---|---|
| Autzen Stadium | ~80 MB | ✓ | RGB + ASPRS classified. The canonical COPC demo. | View ↗ |
| Millsite, UT | medium | ✓ | Open terrain. Good for fly-through demos. | View ↗ |
| SoFi Stadium | ~2.3 GB | — | Very large. Proves streaming works at GB scale. | View ↗ |
These projects are public domain and hosted on AWS Open Data (s3://usgs-lidar-public, us-west-2). The data is in EPT format, which is native to Potree 2.0 and transcodes to COPC via untwine.
| Project | State | Notes | EPT manifest |
|---|---|---|---|
| USGS_LPC_AL_JeffersonCo_2013_LAS_2015 | AL | Jefferson County (Birmingham metro). Direct hit for Weygand HQ. | ept.json ↗ |
| AL_NorthAL_2019 | AL | Madison / Limestone / Jackson / Morgan / Marshall. High-density 2019. | ept.json ↗ |
| AL_17Co_1_2020 | AL | 17-county central Alabama, 2020. Broadest recent coverage. | ept.json ↗ |
| AL_SWCentral_1_B22 | AL | Southwest Central Alabama, 2022. Newest available. | ept.json ↗ |
| USGS_LPC_AL_MobileCo_2014_LAS_2016 | AL | Mobile County. Coastal AL. | ept.json ↗ |
| USGS_LPC_TN_ShelbyCo_2017_LAS_2019 | TN | Memphis metro. | ept.json ↗ |
| USGS_LPC_GA_Georgia_A1_2016_LAS_2018 | GA | Georgia statewide block A1. | ept.json ↗ |
| USGS_LPC_MS_Madison_Yazoo_2012_LAS_2016 | MS | Madison / Yazoo, MS. | ept.json ↗ |
| USGS_LPC_FL_Panhandle_B1_2018_LAS_2019 | FL | Florida panhandle (Pensacola area). | ept.json ↗ |
Full list: 247 projects rendered on the /viewer map. Boundary index: github.com/hobuinc/usgs-lidar.
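
To show how a project name from the table maps to something usable, here is a minimal sketch that builds a PDAL pipeline for pulling a small window of one project into a local COPC file. It makes several assumptions that are not stated above: that each manifest is reachable at `https://s3-us-west-2.amazonaws.com/usgs-lidar-public/<project>/ept.json`, that bounds are given in the dataset's own coordinate system (commonly Web Mercator for this bucket), and that PDAL's `readers.ept` and `writers.copc` stages are used rather than the untwine route. The helper names, the bounds values, and the output filename are hypothetical.

```python
import json

# Assumed public HTTPS endpoint for the s3://usgs-lidar-public bucket (us-west-2).
BUCKET_URL = "https://s3-us-west-2.amazonaws.com/usgs-lidar-public"


def ept_manifest_url(project: str) -> str:
    """Return the assumed ept.json URL for a 3DEP project name."""
    return f"{BUCKET_URL}/{project}/ept.json"


def ept_to_copc_pipeline(project: str, bounds: tuple, out_path: str) -> dict:
    """Build a PDAL pipeline dict: read a bounded window of an EPT, write COPC.

    bounds is (xmin, ymin, xmax, ymax) in the EPT dataset's own CRS.
    """
    xmin, ymin, xmax, ymax = bounds
    return {
        "pipeline": [
            {
                "type": "readers.ept",
                "filename": ept_manifest_url(project),
                # readers.ept expects bounds as "([xmin, xmax], [ymin, ymax])".
                "bounds": f"([{xmin}, {xmax}], [{ymin}, {ymax}])",
            },
            {"type": "writers.copc", "filename": out_path},
        ]
    }


# Hypothetical placeholder window; replace with a real area of interest.
pipeline = ept_to_copc_pipeline(
    "USGS_LPC_AL_JeffersonCo_2013_LAS_2015",
    bounds=(-9663000, 3962000, -9662000, 3963000),
    out_path="jefferson_sample.copc.laz",
)
print(json.dumps(pipeline, indent=2))
```

Saving the printed JSON to a file and running `pdal pipeline <file>` should execute it, provided the installed PDAL build includes the EPT reader and COPC writer. Keeping the window small avoids downloading an entire county.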
These are the canonical benchmarks for ground-classification ML. We pretrain on these (especially DALES, OpenGF, and Hessigheim) before fine-tuning on Weygand-labeled tracts.
| Dataset | Size | License | Notes | Link |
|---|---|---|---|---|
| DALES | ~500M pts | Academic (request) | Largest public ALS benchmark, 8 semantic classes. Dayton, OH. | ↗ |
| Hessigheim 3D (H3D) | ~10 GB | Research (registration) | UAV LiDAR ~800 pts/m² + textured mesh. Closest match for our use case. | ↗ |
| OpenGF | ~500M pts / 47 km² | Academic | Ground filtering benchmark. Direct relevance. | ↗ |
| Toronto3D | 78M pts | Academic | Mobile mapping LiDAR, 8 classes. | ↗ |
| SensatUrban | 2.85B pts | Academic | UK urban UAV. Recent benchmark. | ↗ |
OpenTopography hosts raw LAZ tiles per dataset with shapefile tile indexes. The intended workflow: query Catalog API → download tile index → PDAL merges selected tiles into a COPC you self-host. This is not a one-URL drop-in, but it is the highest-quality source for academic and research demos.
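
The last step of that workflow, merging selected tiles into one COPC, can be sketched as a PDAL pipeline. This is a sketch under stated assumptions: the tiles have already been downloaded locally, the tile filenames and the function name are hypothetical, and the Catalog API query and tile-index selection are not shown. It uses PDAL's `readers.las` (which also reads LAZ), `filters.merge`, and `writers.copc` stages.

```python
import json


def merge_tiles_pipeline(tile_paths, out_path):
    """Build a PDAL pipeline dict that merges LAZ tiles into one COPC file."""
    if not tile_paths:
        raise ValueError("no tiles selected")
    # One reader stage per downloaded tile.
    stages = [{"type": "readers.las", "filename": str(p)} for p in tile_paths]
    # filters.merge combines the point views from all preceding readers.
    stages.append({"type": "filters.merge"})
    # writers.copc writes the merged points as a single COPC file.
    stages.append({"type": "writers.copc", "filename": str(out_path)})
    return {"pipeline": stages}


# Hypothetical tile names standing in for the tiles picked from the index.
selected = ["tile_0001.laz", "tile_0002.laz", "tile_0003.laz"]
print(json.dumps(merge_tiles_pipeline(selected, "merged.copc.laz"), indent=2))
```

The printed JSON can be saved and run with `pdal pipeline <file>`. The resulting file can then be placed on any static host that supports HTTP range requests.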