QGIS - Fixing Geometry Errors in Habitat Polygons

The devil is in the detail: you will never get perfect data, no matter the source. Even state-provided parcel or land use data will contain geometry issues that can heavily affect your entire spatial analysis.


The Challenge

Habitat or land use data is used for a variety of tasks. Looking at certain areas over time shows how forests have grown, which farmland has turned into residential development, or where land needs general improvement.

Comparing data from different dates or sources is key. You might assume that land cover polygons should align, especially where nothing has changed. Unfortunately, this is not always true, and the differences are usually very small.
But no matter how tiny those overlaps, gaps, dangles or self-intersections are, they can cause errors or unwanted results in geoprocessing and analysis.

Validating the input data is essential, and fixing such geometry errors helps to get clean results. The output must align with parcel boundaries, as they set the official perimeter for all upcoming analyses.

In cooperation with Maplango, we developed a QGIS tool that fixes such geometry errors and lets the user adjust certain thresholds to optimise the output.


Our Approach

First, we looked at the data and tried to understand what type of common geometry errors occur when comparing different datasets.
The most common and visible geometry errors are:

  • Overlapping Features    > results in counting parts of the area twice
  • Gaps between Features   > results in areas not being counted at all
  • Misaligned Features     > not aligned with higher-level boundaries, e.g. borders
  • Multipart Geometries    > might cause issues in geoprocessing
  • Sliver Polygons         > often the result of polygon splitting
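
As an illustration of how sliver polygons could be flagged, here is a minimal pure-Python sketch. It is not the tool's actual implementation: the function names, the compactness measure (4πA/P², sometimes called the Polsby-Popper score) and both default thresholds are assumptions chosen for this example.

```python
import math


def ring_area(ring):
    """Unsigned area of a ring of (x, y) vertices via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def ring_perimeter(ring):
    """Length of the ring boundary, including the closing edge."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))


def compactness(ring):
    """4*pi*A / P**2: 1.0 for a circle, close to 0 for long thin shapes."""
    perimeter = ring_perimeter(ring)
    if perimeter == 0:
        return 0.0
    return 4 * math.pi * ring_area(ring) / perimeter ** 2


def is_sliver(ring, min_area=0.5, min_compactness=0.05):
    """Flag rings that are very small or very thin (hypothetical thresholds)."""
    return ring_area(ring) < min_area or compactness(ring) < min_compactness
```

A 10 x 10 square passes this check, while a 100 x 0.01 strip is flagged because of its very low compactness, even though its area is above the minimum.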

Other errors that are not always visible are:

  • Self Intersections
  • Duplicate Vertices
  • Vertices within the Polygon

Common problems in polygon geometries.
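
Two of the less visible errors, duplicate vertices and self-intersections, can be checked with a few lines of plain Python. The sketch below is only an illustration under simplifying assumptions: rings are lists of (x, y) tuples, the function names are made up for this example, and the intersection test only finds edges that properly cross each other (touching or overlapping edges are not detected).

```python
import math


def remove_duplicate_vertices(ring, tolerance=0.0):
    """Drop vertices that repeat the previous vertex (within a tolerance)."""
    cleaned = []
    for point in ring:
        if not cleaned or math.dist(cleaned[-1], point) > tolerance:
            cleaned.append(point)
    return cleaned


def _orientation(a, b, c):
    """Positive if a->b->c turns left, negative if right, zero if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1, p2, q1, q2):
    """True only if the two segments properly cross at an interior point."""
    d1 = _orientation(q1, q2, p1)
    d2 = _orientation(q1, q2, p2)
    d3 = _orientation(p1, p2, q1)
    d4 = _orientation(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def has_self_intersection(ring):
    """Check every pair of edges of the ring for a proper crossing."""
    # Ignore the closing vertex if the ring repeats its first point.
    pts = ring[:-1] if len(ring) > 1 and ring[0] == ring[-1] else ring
    n = len(pts)
    edges = [(pts[i], pts[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Adjacent edges share an endpoint and never count as crossing.
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False
```

The pairwise check is quadratic in the number of vertices, which is fine for inspecting single polygons but too slow for large layers; there a spatial index or a library validity check would be the better choice.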

Knowing these common errors showed us what we needed to fix.

Thought process

Not only do we need to fix the geometries, but we also need to compare them with another dataset of the same structure. The outcome is a new dataset that includes split and new polygons. These show the changes that occurred in the land use.

Adjustability

The user should be able to adjust the parameters. The size of the gaps varies from project to project, so it makes sense for users to investigate some gaps first and then set the parameters themselves.
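
The idea of inspecting gap sizes first and then choosing a threshold could look roughly like the sketch below. The class name, the `max_gap_area` parameter and its default are hypothetical and only stand in for whatever thresholds a tool exposes; gap areas are assumed to be given in squared map units.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class CleaningThresholds:
    """User-adjustable settings (hypothetical name and default)."""
    max_gap_area: float = 1.0  # gaps up to this area are filled automatically

    def __post_init__(self):
        if self.max_gap_area < 0:
            raise ValueError("max_gap_area must not be negative")


def summarise_gap_areas(areas):
    """Basic statistics that help to pick a sensible threshold."""
    return {"count": len(areas), "min": min(areas),
            "median": median(areas), "max": max(areas)}


def split_gaps(areas, thresholds):
    """Separate gaps to fill automatically from gaps to review manually."""
    to_fill = [a for a in areas if a <= thresholds.max_gap_area]
    to_review = [a for a in areas if a > thresholds.max_gap_area]
    return to_fill, to_review


# Example: inspect the measured gaps first, then apply a chosen threshold.
gap_areas = [0.02, 0.5, 3.0, 40.0]
summary = summarise_gap_areas(gap_areas)
to_fill, to_review = split_gaps(gap_areas, CleaningThresholds(max_gap_area=1.0))
```

Keeping the thresholds in one small object makes it easy to rerun the same cleaning with different values per project.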

Trial and Error

With an idea of how to tackle the challenges ahead, it was time to figure out how to make it work. At first we tried everything manually, running geoprocess after geoprocess, and later tied all those steps together. The result is a very complex chain of geoprocesses. The tool was first built in the QGIS model builder and later converted into a Python script.

Screenshot of the tool's interface in QGIS

Data Output

The data output should contain the feature values from both sources, meaning the current and the proposed habitat types. The geometries should also be cleaned up and, in particular, align with parcel boundaries.

Output geometries with fixed errors and resulting attribute fields.


Impact

The tool helps to quickly eliminate 95% of all geometry issues and delivers a data output that contains the values of both input datasets, ready to showcase the newly proposed land use or habitat types.


Lessons Learned

Geometry errors of this kind are nothing new, but fixing them before processing the data is very time consuming.
While piecing the steps together, however, we learned that some invalid geometries are generated during processing, which caused issues later on. By checking the results of almost every step, we could eliminate those issues.
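
The habit of checking the result of each step can be expressed as a small wrapper around a chain of processing steps. This is a generic sketch rather than the tool's code: `run_checked`, the step names and the validity function are all hypothetical, and inside QGIS the validity test could be something like `QgsGeometry.isGeosValid()`.

```python
def run_checked(features, steps, is_valid):
    """Run named steps in order and stop at the first one that
    produces invalid features, so the cause is easy to locate."""
    for name, step in steps:
        features = step(features)
        invalid = [f for f in features if not is_valid(f)]
        if invalid:
            raise ValueError(
                f"step '{name}' produced {len(invalid)} invalid feature(s)"
            )
    return features


# Toy example: features are vertex lists, and a ring needs at least 3 vertices.
def at_least_three_vertices(ring):
    return len(ring) >= 3


steps = [
    ("identity", lambda rings: rings),
    ("drop_last_vertex", lambda rings: [ring[:-1] for ring in rings]),
]
```

Calling `run_checked` with these steps on a triangle raises an error that names `drop_last_vertex`, which is the kind of early signal that saves a search through the whole chain later.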
