Removing flat area artifacts from flow direction grids
Removing flat area artifacts from flow direction grids requires preprocessing the digital elevation model (DEM) to eliminate zero-gradient cells before computing routing matrices. In Python, this is reliably achieved by applying a depression-filling algorithm followed by a flat-resolution step that imposes a minimal, deterministic gradient (typically 0.001–0.01 m) across contiguous flat zones. The pysheds library provides a memory-efficient, array-native pipeline for this workflow, while whitebox and richdem offer C+±backed alternatives for production-scale environments.
Why zero-gradient cells break hydrologic routing
Flat areas originate from LiDAR interpolation artifacts, DEM resampling, or genuine topographic features like floodplains, playas, and lake beds. When a flow direction algorithm encounters a cell with no downslope neighbor, routing halts, creating artificial sinks that break watershed delineation and corrupt accumulation matrices. This is particularly disruptive when implementing Multiple Flow Direction Methods, where proportional flow partitioning assumes continuous slope gradients. Unresolved flats propagate errors downstream, inflate computational overhead due to iterative sink checks, and ultimately compromise Flow Routing Algorithms & Stream Network Extraction pipelines.
Key failure modes include:
- Routing dead-ends: D8/D∞ algorithms return null directions, leaving cells unassigned.
- Accumulation underestimation: Flow paths terminate prematurely, skewing discharge calculations.
- Network fragmentation: Stream extraction thresholds fail to trigger, producing disconnected channel heads.
Core algorithmic workflow
The standard resolution sequence follows three deterministic stages:
- Depression filling: Breaches or fills closed basins to ensure hydrologic connectivity. Modern implementations use priority-queue flood algorithms (e.g., Wang & Liu, 2006) that guarantee O(n log n) performance and preserve natural drainage divides.
- Flat resolution: Identifies contiguous zero-slope regions and imposes a micro-gradient toward the nearest downslope outlet. The gradient magnitude must be small enough to avoid distorting real topography, but large enough to survive floating-point precision limits.
- Direction computation: Calculates flow vectors on the modified surface. The output raster encodes direction codes (1–8 for D8, continuous angles for D∞, or fractional weights for MFD).
For authoritative implementation details, consult the pysheds API reference and the WhiteboxTools hydrological analysis documentation.
Production-ready Python implementation
The following function chains depression filling, flat resolution, and flow direction computation while preserving geospatial metadata. It uses rasterio for robust I/O and pysheds for array-native processing.
import rasterio
import numpy as np
from pysheds.grid import Grid
from pathlib import Path
def remove_flat_artifacts(
dem_path: str,
output_dem_path: str,
output_fd_path: str,
flat_gradient: float = 0.001,
fd_method: str = 'D8'
) -> tuple[str, str]:
"""
Fills depressions, resolves flat areas, and computes a clean flow direction grid.
Returns paths to the resolved DEM and flow direction raster.
"""
dem_path = Path(dem_path)
if not dem_path.exists():
raise FileNotFoundError(f"DEM not found: {dem_path}")
# Load DEM with explicit float32 casting to prevent gradient clipping
with rasterio.open(dem_path) as src:
dem = src.read(1).astype(np.float32)
transform = src.transform
crs = src.crs
nodata = src.nodata
# Initialize pysheds grid
grid = Grid.from_array(dem, affine=transform, crs=crs)
# 1. Fill depressions (carves/fills based on topological context)
grid.fill_depressions(dem, out_name='dem_filled')
# 2. Resolve flat areas by imposing minimal gradient
# flat_gradient controls the slope imposed across zero-gradient cells
grid.resolve_flats('dem_filled', out_name='dem_resolved', flat_gradient=flat_gradient)
# 3. Compute flow direction on the resolved surface
grid.flowdir('dem_resolved', out_name='flow_dir', method=fd_method)
# Extract processed arrays
resolved_dem = grid.view('dem_resolved')
flow_dir = grid.view('flow_dir')
# Write resolved DEM
with rasterio.open(
output_dem_path, 'w',
driver='GTiff',
height=resolved_dem.shape[0],
width=resolved_dem.shape[1],
count=1,
dtype=resolved_dem.dtype,
crs=crs,
transform=transform,
nodata=nodata
) as dst:
dst.write(resolved_dem, 1)
# Write flow direction raster
with rasterio.open(
output_fd_path, 'w',
driver='GTiff',
height=flow_dir.shape[0],
width=flow_dir.shape[1],
count=1,
dtype=flow_dir.dtype,
crs=crs,
transform=transform,
nodata=0
) as dst:
dst.write(flow_dir, 1)
return output_dem_path, output_fd_path
Validation & troubleshooting checklist
Before integrating resolved grids into hydrologic models, verify output integrity:
- Residual flat count:
np.sum(flow_dir == 0)should approach zero. Values >0.1% indicate unresolved lakes or masked boundaries requiring pre-processing. - Direction code distribution: D8 outputs should contain values 1–8 (or 255 for nodata). Unexpected zeros or negatives signal array misalignment.
- Gradient magnitude check: Compute
np.nanmax(np.abs(np.gradient(resolved_dem))). Values should not exceedflat_gradient * 1.5in non-flat zones. - Lake masking: Genuine water bodies should be excluded from flat resolution. Apply a permanent water mask (e.g., JRC Global Surface Water) before running
resolve_flatsto prevent artificial drainage through lakes. - Edge handling: Pysheds pads arrays internally. If your DEM touches the raster boundary, ensure
nodatais correctly propagated to avoid directional artifacts at the edges.
Scaling to large rasters
Array-native Python workflows excel at rapid prototyping but encounter memory ceilings with continental-scale DEMs (>10 GB). For production pipelines:
- Windowed processing: Use
rasterio’sblock_shapesandwindowedreading to process tiles sequentially. Stitch outputs with overlapping buffers to prevent edge discontinuities. - C++ backends: Switch to
richdemorwhiteboxCLI tools for multi-threaded, out-of-core processing. Both implement priority-queue flood algorithms optimized for cache locality. - Dask integration: Wrap raster I/O in
dask.arrayto parallelize tile processing across distributed workers. Maintain spatial indexing to preserve watershed topology across chunk boundaries. - Precision management: Always cast DEMs to
float32before resolution.float64doubles memory overhead without improving hydrologic accuracy, whileint16clips micro-gradients during flat resolution.
When flat resolution is correctly applied, downstream routing matrices stabilize, accumulation surfaces become monotonic, and stream networks align with observed hydrography. This preprocessing step is non-negotiable for reproducible hydrologic modeling and automated watershed extraction.