Geospatial data models

Helena Mitasova

GIS/MEA582 Geospatial Modeling and Analysis NCSU

Learning objectives

  • Define raster and vector data models
  • Distinguish between continuous and discrete phenomena
  • Understand and use data model transformations
  • Recognize geospatial data formats
  • Identify and use on-line geospatial data repositories and services

Geospatial data models

Mapped data and the results of modeling or analysis are represented in GIS using

  • raster (regular grid) data model
  • vector (feature) data model
  • specialized representations: meshes

Geospatial phenomena

  • Continuous fields
    • elevation surfaces
    • temperature, precipitation
    • concentration of chemicals in soil or water bodies

  • Discrete features: lines, points or areas with attributes
    • roads, buildings, cell towers
    • land use types, administrative units

  • Some phenomena can be treated as both types
    • agricultural fields (crop type vs. crop height), soil properties
    • population densities

Continuous fields

  • each point in space is assigned a value; the change in values between neighboring points is small
  • mathematical representation: bi-variate or multi-variate continuous functions $w=f(x,y)$, $w=f(x,y,z)$, $w=f(x,y,z,t)$
  • often represented by a raster data model
  • vector model is also used: isolines, meshes, or points.

Discrete objects / features

  • points, lines, or areas (polygons) with attributes
  • represented by vector data model as geometry (shape) with an attribute table
  • raster representation is also used

Raster data model: 2D

  • header: spatial extent and resolution, followed by a matrix of values stored as integer (INT), floating point (FP), or double precision (DP)
  • continuous field: value assigned to a grid point
  • discrete object: category value assigned to a pixel (area)
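
The header-plus-matrix structure can be sketched in a few lines of Python. This is a minimal illustration, not the layout of any particular format: the `Raster` class, its field names, and the sample values are assumptions, and rows are assumed to be stored from north to south.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Raster:
    """Hypothetical 2D raster: header (extent, resolution) plus a matrix of values."""
    north: float
    south: float
    west: float
    east: float
    res: float                  # cell size in map units
    values: List[List[float]]   # rows ordered from north to south

    @property
    def rows(self) -> int:
        return round((self.north - self.south) / self.res)

    @property
    def cols(self) -> int:
        return round((self.east - self.west) / self.res)

    def cell_center(self, row: int, col: int) -> Tuple[float, float]:
        """Map coordinates (x, y) of the center of cell (row, col)."""
        x = self.west + (col + 0.5) * self.res
        y = self.north - (row + 0.5) * self.res
        return x, y

    def value_at(self, x: float, y: float) -> float:
        """Value of the cell that contains the map coordinates (x, y)."""
        col = int((x - self.west) // self.res)
        row = int((self.north - y) // self.res)
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise ValueError("coordinates fall outside the raster extent")
        return self.values[row][col]


# Illustrative 3 x 2 elevation raster at 10 m resolution
dem = Raster(north=30.0, south=0.0, west=0.0, east=20.0, res=10.0,
             values=[[101.0, 102.5],
                     [99.0, 100.0],
                     [97.5, 98.0]])
center = dem.cell_center(0, 0)
elevation = dem.value_at(15.0, 5.0)
```

The number of rows and columns is not stored; it follows from the extent and resolution in the header.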

Raster data model: continuous fields

Elevation, 10m resolution (combined with shaded relief)
Precipitation, 500m resolution (color map draped over shaded relief)

Raster data model: discrete features

Land use classes at 30m resolution, qualitative data

Raster data model: discrete features

Roads: Speed limits for roads and walking speed for off-road areas, 30m resolution, quantitative data

Raster data model: 3D hybrid

  • vertical stack of 2D raster layers
  • can be used to represent soil horizons or geological layers
  • combined representation:
    • continuous (horizontally)
    • discrete (vertically)

Cross-sections through 3D model of soil horizons

Raster data model: 3D grid

  • header + 3D matrix of values, voxel model
  • spatial extent: N, S, E, W, Top, Bottom
  • vertical resolution is usually much finer than horizontal
  • often used for 3D continuous representation $w=f(x,y,z)$

Soil properties: Percent organic carbon, soil pH reaction

Vector data model

Abstract representation of complex features
school – point, road – centerline, park – polygon

  • Geometry:
    • Points [x,y,(z)] represent point, line, or polygon (area) features
    • Sets of points create the elements of feature geometry: nodes and vertices for lines; centroid and boundary for polygons
    • Geospatial topology defines how the elements of feature geometry are interrelated and organized (don't confuse with topography!)
    • Topology ensures integrity of features and efficiency by defining shared coincident geometry (shared boundaries, nodes)

  • Attributes are stored in data management systems
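
The geometry-plus-attributes idea can be sketched as follows. This is a minimal illustration assuming a simple non-topological structure (no shared nodes or boundaries); the `Feature` class, its field names, and the sample attribute values are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Coord = Tuple[float, float]


@dataclass
class Feature:
    """Hypothetical vector feature: geometry type, coordinates, attribute record."""
    geom_type: str                       # "point", "line", or "polygon"
    coords: List[Coord]                  # polygon ring is closed implicitly
    attributes: Dict[str, object] = field(default_factory=dict)


def line_length(coords: List[Coord]) -> float:
    """Sum of the segment lengths between consecutive vertices."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(coords, coords[1:]))


def polygon_area(coords: List[Coord]) -> float:
    """Area of a simple polygon computed with the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(coords):
        x2, y2 = coords[(i + 1) % len(coords)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


school = Feature("point", [(10.0, 20.0)], {"name": "School A"})
road = Feature("line", [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)], {"speed_limit": 55})
park = Feature("polygon", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
               {"name": "Park B"})

road_length = line_length(road.coords)
park_area = polygon_area(park.coords)
```

In a GIS the attribute records would typically live in a database table linked to the geometry by a feature identifier rather than in a Python dictionary.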

Vector data model geometry

Point data: no topology

Elements of line data: vertices, nodes

Vector data model geometry

Elements: vertices (red), nodes (blue), centroids (green)

Polygons: vertices + nodes = boundaries, centroids

Vector data model examples

Geometry and attributes for point, line and polygon data:

Vector data: 3D models

Full 3D meshes: representation of structures

Modifications of data representation

  • Changing raster resolution, e.g., when model inputs are rasters at different resolutions

  • Changing vector geometry type, e.g., when a model input requires a different geometry than the given data (points instead of lines)

Raster data - changing resolution

Resolution: size of the grid cell (pixel) in map units (m)

  • continuous fields: interpolation
    • the higher resolution raster values are interpolated using the values of the neighboring lower resolution cells
    • methods: bi-linear, bi-cubic, spline.

  • discrete raster data: nearest neighbor resampling
    • assigns the higher resolution cell the same value as the nearest lower resolution cell
    • resulting raster has only the values present in the input raster
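
Nearest neighbor resampling to a finer grid can be sketched in pure Python. The function name `resample_nearest` and the sample class values are illustrative, and the sketch assumes the resolution changes by an integer factor (e.g., 30 m to 10 m is a factor of 3).

```python
def resample_nearest(grid, factor):
    """Refine a raster by an integer factor using nearest neighbor resampling.

    Each fine cell copies the value of the coarse cell that contains its center,
    so no new values are created.
    """
    fine = []
    for r in range(len(grid) * factor):
        coarse_row = grid[r // factor]
        fine.append([coarse_row[c // factor]
                     for c in range(len(coarse_row) * factor)])
    return fine


# Illustrative 2 x 2 raster of class codes at 30 m, resampled to 10 m
geology_30m = [[1, 2],
               [3, 3]]
geology_10m = resample_nearest(geology_30m, 3)
```

The set of values in `geology_10m` is identical to the set in the input, which is why this method is safe for class codes.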

Increasing resolution: continuous

Elevation at 30m resolution resampled to 10m resolution

Nearest neighbor creates "flats" in the resampled DEM, while interpolation preserves a smooth surface.
See equations for bi-linear interpolation
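
For reference, bi-linear interpolation inside the rectangle formed by four neighboring cell centers can be written as $z = (1-s)(1-t)z_{00} + (1-s)t\,z_{01} + s(1-t)z_{10} + st\,z_{11}$, where $s$ and $t$ are the fractional row and column offsets (between 0 and 1). The sketch below applies it to a whole raster. It assumes an integer refinement factor and cell-centered values; the function name `bilinear_resample` is illustrative, and cells along the edges are clamped to the nearest input value rather than extrapolated.

```python
import math


def bilinear_resample(grid, factor):
    """Refine a raster by an integer factor using bi-linear interpolation."""
    nrows, ncols = len(grid), len(grid[0])
    fine = []
    for r in range(nrows * factor):
        # center of the fine cell in coarse cell-center coordinates, clamped to the grid
        v = min(max((r + 0.5) / factor - 0.5, 0.0), nrows - 1.0)
        r0 = int(math.floor(v))
        r1 = min(r0 + 1, nrows - 1)
        s = v - r0
        row = []
        for c in range(ncols * factor):
            u = min(max((c + 0.5) / factor - 0.5, 0.0), ncols - 1.0)
            c0 = int(math.floor(u))
            c1 = min(c0 + 1, ncols - 1)
            t = u - c0
            z = ((1 - s) * (1 - t) * grid[r0][c0] + (1 - s) * t * grid[r0][c1]
                 + s * (1 - t) * grid[r1][c0] + s * t * grid[r1][c1])
            row.append(z)
        fine.append(row)
    return fine


# Illustrative 30 m elevations resampled to 10 m
dem_30m = [[100.0, 130.0],
           [100.0, 130.0]]
dem_10m = bilinear_resample(dem_30m, 3)
```

Unlike nearest neighbor, the output contains intermediate elevations that are not present in the input, which is what removes the "flats".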

Increasing resolution: discrete

Geology at 30m resolution resampled to 10m resolution

Raster values are classes of observed geology

Increasing resolution: compare

Effect of resampling / reinterpolation on the results
More complex downscaling techniques using additional variables and machine learning may be needed if the difference in resolution is large

Decreasing resolution

Continuous data: nearest neighbor, average, min, max, or re-interpolation is used

Nearest neighbor resampling of 10m DEM to 30m and 20m DEMs
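
Aggregation by average, min, or max can be sketched as a block reduction. This is a minimal sketch assuming the coarse resolution is an integer multiple of the fine one; the function name `aggregate` and the sample elevations are illustrative, and incomplete blocks along the edges are dropped.

```python
from statistics import mean


def aggregate(grid, factor, reducer):
    """Coarsen a raster: reduce each factor x factor block of cells to one value."""
    coarse = []
    for r in range(0, len(grid) - factor + 1, factor):
        row = []
        for c in range(0, len(grid[0]) - factor + 1, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(reducer(block))
        coarse.append(row)
    return coarse


# Illustrative 4 x 4 DEM at 10 m, coarsened to 2 x 2 at 20 m
dem_10m = [[10.0, 12.0, 20.0, 22.0],
           [14.0, 16.0, 24.0, 26.0],
           [30.0, 32.0, 40.0, 42.0],
           [34.0, 36.0, 44.0, 46.0]]
dem_avg = aggregate(dem_10m, 2, mean)
dem_min = aggregate(dem_10m, 2, min)
dem_max = aggregate(dem_10m, 2, max)
```

The choice of reducer depends on the application: the average smooths the surface, while min and max preserve valley bottoms or peaks.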

Decreasing resolution

Discrete data: nearest neighbor resampling, mode (most common or majority class)

Nearest neighbor resampling of 10m soil type map to 30m and 20m maps
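
The mode (majority class) option can be sketched the same way. This assumes an integer resolution ratio; the function name `aggregate_mode` and the class codes are illustrative, and ties are resolved in favor of the class encountered first in the block.

```python
from collections import Counter


def aggregate_mode(grid, factor):
    """Coarsen a categorical raster: each block gets its most common class."""
    coarse = []
    for r in range(0, len(grid) - factor + 1, factor):
        row = []
        for c in range(0, len(grid[0]) - factor + 1, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(Counter(block).most_common(1)[0][0])
        coarse.append(row)
    return coarse


# Illustrative 4 x 4 soil type map at 10 m, coarsened to 20 m
soils_10m = [[1, 1, 2, 2],
             [1, 3, 2, 2],
             [4, 4, 5, 6],
             [4, 4, 5, 5]]
soils_20m = aggregate_mode(soils_10m, 2)
```

Classes that occupy only a small share of each block (3 and 6 here) disappear from the coarser map.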

Modifying vector data

  • Converting vector data type
    • lines to points, areas to lines or points
    • points to lines: network building or interpolation may be needed
    • usually preserves the shape

  • Generalization
    • simplifying geometry while preserving important information
    • both data geometry and type can be modified
    • line to simplified line, polygon to simplified polygon or point
    • selecting subset of features
    • important when combining local, state and national scale data
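
One widely used line simplification method is the Douglas-Peucker algorithm; the slides do not name a specific method, so it is shown here only as an example of generalization. The sketch is pure Python, uses the distance to the segment between the end points, and the function names, coordinates, and tolerance are illustrative.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # position of the projection of p along the segment, clamped to its ends
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop vertices that lie within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # keep the farthest vertex and simplify both halves recursively
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


road = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
road_simplified = simplify(road, tolerance=1.0)
```

A larger tolerance removes more vertices, which is the kind of adjustment needed when local-scale data are combined with state or national scale data.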

Changing vector data type

Data geometry is not modified, but a subset is extracted and stored in a different data structure

Topology building is required for point-to-line and line-to-polygon conversions

Conversion between data models

  • vector to raster
    • continuous: spatial interpolation (covered by a separate topic)
    • discrete: nearest neighbor

  • raster to vector
    • continuous: point sampling, isolines
    • discrete: nearest neighbor, grid center or boundary

Continuous: vector to raster data

Spatial interpolation is used to compute raster representation from point measurements

Discrete: vector to raster data

  • lines, areas: nearest neighbor
  • areas: attribute value applies to the entire polygon
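
Area-to-raster conversion can be sketched by testing whether each cell center falls inside the polygon and, if so, assigning the polygon's attribute value to the cell. This is a minimal sketch using a ray-casting point-in-polygon test; the function names, the grid definition, and the class value 7 are assumptions made for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count boundary crossings of a ray running east from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):          # edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_polygon(polygon, value, north, west, res, rows, cols, nodata=0):
    """Assign `value` to every cell whose center lies inside the polygon."""
    grid = []
    for r in range(rows):
        yc = north - (r + 0.5) * res
        grid.append([value if point_in_polygon(west + (c + 0.5) * res, yc, polygon)
                     else nodata
                     for c in range(cols)])
    return grid


# Illustrative rectangular land use polygon with class 7 on a 10 m grid
field_polygon = [(10.0, 10.0), (40.0, 10.0), (40.0, 30.0), (10.0, 30.0)]
landuse = rasterize_polygon(field_polygon, 7, north=40.0, west=0.0,
                            res=10.0, rows=4, cols=5)
```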

Raster to vector data

Continuous data: isolines, sampling points

Raster to vector data

  • points – centers of grid cells
  • lines, polygon border lines: connected grid cell centers
  • thinning and smoothing are often performed for lines
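
The point case can be sketched directly: every cell that holds a value becomes a point at the cell center, with the cell value carried along as an attribute. The function name `raster_to_points`, the use of `None` as the no-data marker, and the sample grid are assumptions made for illustration.

```python
def raster_to_points(grid, north, west, res, nodata=None):
    """Convert raster cells to (x, y, value) points located at the cell centers."""
    points = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value != nodata:
                x = west + (c + 0.5) * res
                y = north - (r + 0.5) * res
                points.append((x, y, value))
    return points


# Illustrative 2 x 2 raster at 10 m with two cells holding values
towers = [[None, 3],
          [5, None]]
tower_points = raster_to_points(towers, north=20.0, west=0.0, res=10.0)
```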

Raster to vector data

  • areas – boundary, centroid, requires building topology
  • connects points on grid cell boundary

Common geospatial data formats

Format: a specific implementation of a data model,
either an open standard or proprietary

Raster

  • GIS (ascii and binary): GeoTIFF, ArcGRID, GRASS, SURFER
  • Imagery: MrSID, GeoTIFF, BIN, JPEG2000, IMG
  • Graphics: GIF, JPG, PNG, Bitmap
  • HDF, NetCDF

Vector

  • GeoPackage exchange format, KML, Shapefile, ArcSDE, GML, MapInfo, TIGER
  • PostGIS, Oracle Spatial

Geospatial data format conversion

Format description is usually stored with data: automated format recognition and conversion

General library for geospatial raster and vector format conversions:
Geospatial Data Abstraction Library (GDAL/OGR)
gdal.org

given format → single abstract model → new format

GDAL includes command line utilities for data processing

Coupled with the PROJ library, it also provides coordinate system transformations

Data repositories

Data repositories and services: WMS, WPS, WebGIS
  • WMS: Web Map Service
  • WPS: Web Processing Service

See Webpage with links to relevant services
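
A WMS request is an ordinary URL with standardized query parameters. The sketch below builds a GetMap request for WMS version 1.3.0 using only the Python standard library; the endpoint `https://example.com/wms`, the layer name `elevation`, and the bounding box are hypothetical placeholders, not a real service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def wms_getmap_url(endpoint, layer, bbox, width, height,
                   crs="EPSG:3857", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL; bbox is (minx, miny, maxx, maxy) in `crs` units."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only
url = wms_getmap_url("https://example.com/wms", "elevation",
                     bbox=(0, 0, 1000, 1000), width=512, height=512)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
```

The server responds with a rendered map image, not the underlying data values, so WMS is suited to display rather than analysis.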

Summary

  • raster and vector data models
  • modifying raster and vector data representation
  • converting between raster and vector data models
  • geospatial data formats
  • data repositories, WMS services