SRTM and LiDAR Data Acquisition

Reliable elevation data forms the geometric backbone of every hydrologic simulation, from overland flow routing to floodplain delineation. In modern Python-driven hydrology, reproducible, API-driven pipelines are replacing manual tile selection and GUI-based downloads. This guide details programmatic SRTM and LiDAR data acquisition, establishing a foundation for the broader Hydrology Data Preparation & DEM Processing workflow. By automating data retrieval, engineering teams reduce version drift, keep spatial inputs consistent, and accelerate the transition from raw elevation grids to calibrated watershed models.

Prerequisites & Environment Configuration

Before implementing acquisition scripts, ensure your Python environment meets the following specifications:

  • Python 3.9+ with venv or conda isolation
  • Core GIS stack: rasterio>=1.3, geopandas>=0.12, shapely>=2.0, numpy>=1.23
  • Remote data clients: earthaccess>=0.8 (NASA Earthdata), requests>=2.31, laspy>=2.4 (point cloud I/O)
  • Authentication credentials: NASA Earthdata login account, OpenTopography API key (if accessing regional LiDAR)
  • Storage: Minimum 50 GB SSD for staging raw tiles, point clouds, and intermediate mosaics

Install dependencies via pip:

bash
# The lazrs extra gives laspy a backend for reading compressed .laz files
pip install earthaccess rasterio geopandas "laspy[lazrs]" requests tqdm

Configure Earthdata credentials securely using environment variables or a .netrc file. Avoid hardcoding tokens in version-controlled scripts. For enterprise deployments, integrate with a secrets manager or CI/CD vault to inject credentials at runtime. Detailed authentication guidance is available through the NASA Earthdata Login documentation.
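One way to keep the credential lookup explicit is to decide up front which login strategy applies. The sketch below assumes the `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD` environment variables used by earthaccess's "environment" strategy. `pick_login_strategy` is a hypothetical helper written for this guide, not part of earthaccess.

```python
import os
from pathlib import Path


def pick_login_strategy(env=None, home=None) -> str:
    """Return the earthaccess login strategy matching the credentials found."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Environment variables suit CI runners and containers.
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        return "environment"
    # A .netrc file suits workstations (_netrc on Windows).
    if (home / ".netrc").is_file() or (home / "_netrc").is_file():
        return "netrc"
    # Nothing found: fall back to a prompt, which will stall unattended jobs.
    return "interactive"


strategy = pick_login_strategy()
print(f"Earthdata login strategy: {strategy}")
# Then, in the acquisition script: earthaccess.login(strategy=strategy)
```

In a scheduled job it is usually safer to treat the "interactive" result as a configuration error and stop, since no one is present to answer the prompt.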

Strategic Dataset Selection: SRTM vs. LiDAR

Choosing between SRTM and LiDAR depends on watershed scale, required hydraulic precision, and computational constraints. SRTM provides near-global coverage (roughly 60°N to 56°S) at 1-arc-second resolution, about 30 meters at the equator, making it well suited to regional basin analysis, continental-scale runoff modeling, and baseline terrain characterization. LiDAR delivers sub-meter vertical accuracy and dense point clouds, capturing the microtopography, channel morphology, and engineered structures that urban drainage design and high-resolution 1D/2D hydraulic modeling depend on.

When planning acquisition, consider spatial resolution tradeoffs early. Coarser DEMs smooth channel networks and flatten local slopes, which distorts derived flow paths and flow accumulation, while ultra-high-resolution LiDAR introduces significant computational overhead and requires aggressive filtering to remove vegetation and infrastructure artifacts. Align your dataset choice with the intended model physics and available compute resources before initiating downloads.
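A quick back-of-the-envelope calculation makes the compute side of this tradeoff concrete. The sketch below estimates cell counts and uncompressed size for a single-band float32 raster; the 60 km by 40 km extent is a hypothetical watershed, and `estimate_grid` is an illustrative helper.

```python
import math


def estimate_grid(width_km: float, height_km: float, cell_m: float, bytes_per_cell: int = 4):
    """Return (cell_count, size_in_GiB) for an uncompressed single-band raster."""
    cols = math.ceil(width_km * 1000 / cell_m)
    rows = math.ceil(height_km * 1000 / cell_m)
    cells = rows * cols
    return cells, cells * bytes_per_cell / 1024**3


# Hypothetical 60 km x 40 km watershed extent at three candidate resolutions
for cell_m in (30, 10, 1):
    cells, gib = estimate_grid(60, 40, cell_m)
    print(f"{cell_m:>2} m cells: {cells:,} cells, ~{gib:.2f} GiB as float32")
```

Halving the cell size quadruples the cell count, so the step from a 30 m to a 1 m grid multiplies memory and per-cell compute by roughly 900.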

Additionally, anticipate downstream coordinate transformations. Raw SRTM tiles typically arrive in WGS84 (EPSG:4326), while LiDAR datasets often use local projected systems or UTM zones. Establishing a consistent spatial reference early prevents geometric distortion during rasterization and ensures seamless integration with Coordinate Reference System Alignment protocols later in the pipeline.
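For watersheds that fit inside one UTM zone, the target projected CRS can be derived from the boundary centroid. The sketch below uses the standard 6-degree zone arithmetic; `utm_epsg_for` is a hypothetical helper and ignores the special zones around Norway and Svalbard. GeoPandas offers `GeoDataFrame.estimate_utm_crs()` for the same purpose when the boundary is already loaded.

```python
import math


def utm_epsg_for(lon: float, lat: float) -> int:
    """Return the EPSG code of the WGS84 UTM zone containing a lon/lat point."""
    if not (-180.0 <= lon <= 180.0 and -80.0 <= lat <= 84.0):
        raise ValueError("point lies outside the UTM domain")
    # Zones are 6 degrees wide, numbered 1..60 eastward from 180 degrees W
    zone = min(int(math.floor((lon + 180.0) / 6.0)) + 1, 60)
    # 326xx = northern hemisphere, 327xx = southern hemisphere
    return (32600 if lat >= 0 else 32700) + zone


# Hypothetical bounding box (minx, miny, maxx, maxy) in EPSG:4326
minx, miny, maxx, maxy = -119.9, 34.1, -119.1, 34.9
epsg = utm_epsg_for((minx + maxx) / 2, (miny + maxy) / 2)
print(f"Suggested projected CRS: EPSG:{epsg}")
```

Basins that straddle a zone boundary generally need a deliberate choice (one zone for the whole basin, or a regional equal-area projection) rather than this automatic pick.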

Automated Acquisition Pipeline

A production-ready acquisition pipeline follows a deterministic sequence: spatial query, metadata resolution, authenticated retrieval, and structural validation. The workflow below is designed for idempotency: repeated executions fetch only files that are not yet staged locally, without duplicating storage.
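The idempotency requirement can be sketched independently of any particular data source. The example below skips files that already exist and writes new ones through a temporary file, so an interrupted run never leaves a truncated file under the final name. `fetch_once` and the `fetch` callable are hypothetical stand-ins for a real download call, and the sketch does not detect files that were updated upstream.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def fetch_once(dest: Path, fetch: Callable[[], bytes]) -> bool:
    """Write fetch() to dest unless dest already exists. Return True if fetched."""
    dest = Path(dest)
    if dest.exists() and dest.stat().st_size > 0:
        return False  # already staged; skip the network call
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same directory, then rename atomically
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(fetch())
        os.replace(tmp_name, dest)
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)  # clean up the partial file
        raise
    return True


# Demo with in-memory bytes standing in for a downloaded tile
demo = Path(tempfile.mkdtemp()) / "raw" / "tile.bin"
first = fetch_once(demo, lambda: b"elevation bytes")
second = fetch_once(demo, lambda: b"elevation bytes")
print(f"first run fetched: {first}, second run fetched: {second}")
```

For large point clouds the payload would be streamed to the temporary file in chunks rather than returned as a single bytes object; the skip-then-rename structure stays the same.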

1. Spatial Boundary Definition & Catalog Query

Begin by loading your watershed boundary as a GeoDataFrame. Reproject to EPSG:4326 (WGS84) for compatibility with most global catalog APIs, then extract bounding coordinates for spatial filtering.

python
import geopandas as gpd
from shapely.geometry import box

# Load and reproject watershed boundary
watershed = gpd.read_file("data/watersheds/basin_04.shp")
watershed_4326 = watershed.to_crs(epsg=4326)

# Extract bounding box
minx, miny, maxx, maxy = watershed_4326.total_bounds
bbox = (minx, miny, maxx, maxy)
print(f"Querying catalog within: {bbox}")

For teams requiring granular control over tile selection, a dedicated implementation covering chunked requests and metadata caching is documented in How to download SRTM DEMs for Python hydrology workflows.
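As a cross-check on catalog results, the expected tile set can be computed directly from the bounding box. SRTM tiles cover 1 by 1 degree and are named after their south-west corner (for example `N34W119`). The sketch below lists the tile IDs a box touches; `srtm_tile_ids` is a hypothetical helper, and tiles that fall entirely over ocean may not exist in the archive.

```python
import math


def srtm_tile_ids(minx: float, miny: float, maxx: float, maxy: float) -> list:
    """List the 1x1 degree SRTM tile IDs (south-west corner naming) touching a bbox."""
    # Subtract a tiny epsilon so a bbox ending exactly on a degree line
    # does not pull in the neighbouring tile.
    eps = 1e-9
    lat_lo, lon_lo = math.floor(miny), math.floor(minx)
    lat_hi = max(lat_lo, math.floor(maxy - eps))
    lon_hi = max(lon_lo, math.floor(maxx - eps))

    ids = []
    for lat in range(lat_lo, lat_hi + 1):
        for lon in range(lon_lo, lon_hi + 1):
            ns = "N" if lat >= 0 else "S"
            ew = "E" if lon >= 0 else "W"
            ids.append(f"{ns}{abs(lat):02d}{ew}{abs(lon):03d}")
    return ids


# Hypothetical watershed bounding box in EPSG:4326
tiles = srtm_tile_ids(-119.6, 34.2, -117.9, 35.0)
print(f"{len(tiles)} tiles expected: {tiles}")
```

Comparing this list against the granules returned by the catalog search is a cheap way to spot missing tiles before any download starts.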

2. Authenticated SRTM Tile Retrieval

NASA distributes SRTM data through Earthdata. The earthaccess library handles authentication and result pagination, and its download call skips files that already exist in the target directory.

python
import earthaccess
import os
from pathlib import Path

# Authenticate (reads .netrc or prompts if missing)
auth = earthaccess.login()

# Search for SRTMGL1 tiles (1 arc-second, ~30 m).
# SRTMGL30 is the 30 arc-second (~1 km) product and is too coarse here.
results = earthaccess.search_data(
    short_name="SRTMGL1",
    bounding_box=bbox,
    temporal=("2000-02-01", "2000-02-23")  # SRTM acquisition window
)

# Download with idempotent behavior
output_dir = Path("data/raw/srtm")
output_dir.mkdir(parents=True, exist_ok=True)

files = earthaccess.download(results, str(output_dir))
print(f"Retrieved {len(files)} SRTM tiles")

This approach avoids manual cookie management, and because files already on disk are skipped, a rerun after a network interruption fetches only what is missing. For large watersheds spanning dozens of tiles, consider wrapping the download call in a retry decorator with exponential backoff to respect API rate limits.
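A retry decorator of that kind can be sketched with the standard library alone. The names `retry` and `flaky_download` are hypothetical, and the `sleep` parameter exists only so the demo and tests can run without waiting. In practice the decorator would wrap whichever download call raises on failure.

```python
import functools
import random
import time


def retry(max_attempts=5, base_delay=1.0, max_delay=60.0,
          exceptions=(Exception,), sleep=time.sleep):
    """Retry the wrapped function with exponential backoff and jitter."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Delay doubles each attempt (base, 2x, 4x, ...) up to max_delay
                    delay = min(base_delay * 2 ** (attempt - 1), max_delay)
                    # Jitter keeps parallel workers from retrying in lockstep
                    sleep(delay + random.uniform(0, delay / 2))
        return wrapper
    return decorator


# Demo: a function that fails twice before succeeding (sleep disabled)
calls = {"n": 0}


@retry(max_attempts=4, base_delay=0.1, exceptions=(ConnectionError,), sleep=lambda s: None)
def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated timeout")
    return "ok"


print(flaky_download(), "after", calls["n"], "attempts")
```

Restricting `exceptions` to transient errors (timeouts, connection resets, HTTP 429 or 5xx) avoids retrying failures that will never succeed, such as bad credentials.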

3. Regional LiDAR Point Cloud Fetching

LiDAR acquisition typically requires querying regional portals. OpenTopography provides a robust REST API for discovering and downloading airborne laser scanning datasets.

python
import os
from pathlib import Path

import requests
from tqdm import tqdm

OPENTOP_API = "https://portal.opentopography.org/API/otCatalog"
API_KEY = os.getenv("OPENTOP_API_KEY")

# Query available LiDAR datasets within bounds (minx..maxy come from step 1).
# The OpenTopography catalog endpoint uses lowercase bbox keys and a single
# productFormat token; an API key is required only for download endpoints.
params = {
    "productFormat": "PointCloud",
    "minx": minx,
    "miny": miny,
    "maxx": maxx,
    "maxy": maxy,
    "detail": "true",
    "outputFormat": "json"
}
if API_KEY:
    params["API_Key"] = API_KEY

response = requests.get(OPENTOP_API, params=params, timeout=60)
response.raise_for_status()
datasets = response.json().get("Datasets", [])

# Download first matching dataset (LAS/LAZ).
# The "downloadURL" and "datasetID" keys are assumed here; inspect the catalog
# response and adjust them to the schema it actually returns.
if datasets:
    target = datasets[0]
    download_url = target.get("downloadURL")
    if not download_url:
        raise KeyError("Catalog entry has no 'downloadURL' field")
    las_path = Path("data/raw/lidar") / f"{target['datasetID']}.laz"
    las_path.parent.mkdir(parents=True, exist_ok=True)

    if las_path.exists():
        # Keep reruns idempotent: do not fetch a file that is already staged
        print(f"Already staged: {las_path}")
    else:
        with requests.get(download_url, stream=True, timeout=60) as r:
            r.raise_for_status()
            with open(las_path, "wb") as f:
                for chunk in tqdm(r.iter_content(chunk_size=8192), desc="Downloading LiDAR"):
                    f.write(chunk)
        print(f"Saved LiDAR to {las_path}")

Always verify dataset licensing before commercial deployment. The OpenTopography Developer Portal maintains current endpoint specifications, rate limits, and dataset metadata schemas.

4. Structural Validation & Staging

Raw elevation data frequently contains voids, inconsistent metadata, or unexpected CRS definitions. Implement a lightweight validation routine before passing files to downstream processors.

python
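from pathlib import Path

import laspy
import rasterio

def validate_srtm_tile(filepath: Path) -> bool:
    """Check that a tile opens as a single-band raster with nodata and CRS set."""
    try:
        with rasterio.open(filepath) as src:
            return src.count == 1 and src.nodata is not None and src.crs is not None
    except Exception:
        return False

def validate_lidar(filepath: Path) -> bool:
    """Check that a point cloud is non-empty and carries a CRS record."""
    try:
        with laspy.open(filepath) as las:
            header = las.header
            if header.point_count == 0:
                return False
            # parse_crs() needs pyproj and returns None when no CRS record exists.
            # Many files omit the vertical datum, so confirm it from the metadata.
            return header.parse_crs() is not None
    except Exception:
        return False

# Run validation. SRTMGL1 granules arrive as zipped .hgt files, so match those
# alongside any GeoTIFFs; a file GDAL cannot open simply fails validation.
srtm_dir = Path("data/raw/srtm")
srtm_files = [f for pattern in ("*.tif", "*.hgt", "*.hgt.zip") for f in srtm_dir.glob(pattern)]
srtm_valid = [f for f in srtm_files if validate_srtm_tile(f)]
lidar_valid = [f for f in Path("data/raw/lidar").glob("*.laz") if validate_lidar(f)]

print(f"Validated {len(srtm_valid)} SRTM tiles and {len(lidar_valid)} LiDAR files")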

Validation failures should trigger automated alerts rather than silent skips. Missing tiles or corrupted point clouds will propagate errors into flow direction matrices and hydraulic roughness calculations.
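One way to make failures loud is to collect them explicitly instead of filtering them out. The sketch below logs every rejected file and raises when anything fails; `validate_all` and `ValidationError` are hypothetical names, and the lambda stands in for a real validator such as the SRTM or LiDAR checks.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable

logger = logging.getLogger("acquisition.validation")


class ValidationError(RuntimeError):
    """Raised when one or more staged files fail validation."""


def validate_all(paths: Iterable[Path], validator: Callable[[Path], bool],
                 strict: bool = True):
    """Split paths into (valid, invalid), log every failure, optionally raise."""
    valid, invalid = [], []
    for path in paths:
        (valid if validator(path) else invalid).append(path)
    for path in invalid:
        logger.error("Validation failed: %s", path)
    if invalid and strict:
        names = ", ".join(str(p) for p in invalid)
        raise ValidationError(f"{len(invalid)} file(s) failed validation: {names}")
    return valid, invalid


# Demo with a stand-in validator that rejects one file by name
paths = [Path("N34W119.hgt"), Path("empty.hgt")]
valid, invalid = validate_all(paths, lambda p: p.name != "empty.hgt", strict=False)
print(f"{len(valid)} valid, {len(invalid)} invalid")
```

With `strict=True` the pipeline stops before conditioning begins; routing the logger to an alerting handler (email, chat webhook, or the CI system's failure notifications) covers the alerting side.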

Transitioning to Hydrologic Conditioning

Once raw datasets pass structural validation, the pipeline shifts toward terrain conditioning. SRTM mosaics require seamless stitching, void filling, and hydrological enforcement to remove artificial sinks that disrupt flow routing. LiDAR point clouds must be classified, ground-filtered, and rasterized at a resolution compatible with your hydraulic solver.

At this stage, automated DEM Pit Filling Algorithms become critical for preserving natural drainage networks while eliminating numerical artifacts. Teams should also standardize vertical datums (e.g., NAVD88 vs. EGM96) before merging multi-source elevation products.
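The arithmetic behind a vertical datum shift is simple once the geoid undulations are known. Ellipsoidal height is h = H + N, so a height referenced to one geoid model converts to another as H_target = H_source + N_source - N_target. The sketch below applies that relation to a single point. All numeric values are hypothetical, and it assumes both undulations are expressed relative to the same ellipsoid and reference frame, which real workflows typically handle through a grid-based transformation library rather than by hand.

```python
def convert_orthometric_height(h_source: float, n_source: float, n_target: float) -> float:
    """Shift an orthometric height from one geoid model to another.

    Ellipsoidal height h = H + N, so H_target = H_source + N_source - N_target.
    """
    return h_source + n_source - n_target


# Hypothetical values at one point, in meters (not real geoid lookups)
srtm_height_egm96 = 412.30  # elevation above the EGM96 geoid
n_egm96 = -31.85            # EGM96 undulation at the point
n_target_geoid = -32.40     # undulation of the geoid model realizing the target datum

converted = convert_orthometric_height(srtm_height_egm96, n_egm96, n_target_geoid)
print(f"Height in target datum: {converted:.2f} m")
```

Because the offset varies across a watershed, it should be evaluated per cell or per point rather than applied as a single constant to a whole mosaic.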

By codifying SRTM and LiDAR Data Acquisition into a version-controlled, API-driven workflow, organizations establish a repeatable foundation for watershed modeling, flood risk assessment, and climate resilience planning. The next phase focuses on terrain preprocessing, where raw grids are transformed into hydrologically sound digital elevation models ready for simulation.