QGIS - Fixing Geometry Errors in Habitat Polygons
The Challenge
Habitat and land use data is used for a variety of tasks. Looking at an area over time shows how forests have grown, which farmland has turned into residential development, or where land needs general improvement.
Comparing data from different points in time or from different sources is key. You might assume that land cover polygons typically align, especially where nothing has changed. Unfortunately, this is not always true, and the differences are usually very small.
But no matter how small those overlaps, gaps, dangles or self-intersections are, they can cause errors or unwanted results in geoprocessing and analysis.
Validating the input data is essential, and fixing such geometry errors helps to get clean results. The output must align with parcel boundaries, as they set the official perimeter for all subsequent analyses.
In cooperation with Maplango, we developed a QGIS tool that helps to fix such geometry errors and also lets the user adjust certain thresholds to optimise the output.
Our Approach
First, we looked at the data and tried to understand which types of geometry errors commonly occur when comparing different datasets.
The most common and visible geometry errors are:
- Overlapping Features > parts of the area are counted twice
- Gaps between Features > parts of the area are not counted at all
- Misaligned Features > not aligned with higher hierarchies, e.g. borders
- Multipart Geometries > might cause issues in geoprocessing
- Sliver Polygons > often the result of polygon splitting
Other errors that are not always visible are:
- Self Intersections
- Duplicate Vertices
- Vertices within the Polygon
Understanding these common errors showed us what we needed to fix.
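Two of the less visible errors can be illustrated without any GIS library. The following is a minimal sketch in plain Python, not the code of the tool itself: it assumes a polygon ring is a closed list of (x, y) tuples, and the function names are hypothetical. It flags consecutive duplicate vertices and detects segments that properly cross each other; rings that merely touch themselves or overlap along a line are not covered.

```python
from itertools import combinations


def duplicate_vertices(ring, tol=0.0):
    """Return indices of vertices that repeat the previous vertex (within tol)."""
    dups = []
    for i in range(1, len(ring)):
        (x0, y0), (x1, y1) = ring[i - 1], ring[i]
        if abs(x1 - x0) <= tol and abs(y1 - y0) <= tol:
            dups.append(i)
    return dups


def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, 0 or -1."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 cross at a single interior point."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
            and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)


def has_self_intersection(ring):
    """Check every pair of ring segments for a proper crossing (O(n^2))."""
    segments = [(ring[i], ring[i + 1]) for i in range(len(ring) - 1)]
    return any(segments_cross(*a, *b) for a, b in combinations(segments, 2))


# Illustrative rings (assumed data): a valid square and a "bow-tie".
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]
```

Neighbouring segments share an endpoint, so they are never reported as crossing. The pairwise check is quadratic in the number of vertices, which is fine for illustration but would likely be too slow for large datasets.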
Thought process
Not only do we need to fix the geometries, but we also need to compare them with another dataset of the same structure. The outcome is a new dataset that includes split and new polygons. These show the changes that occurred in the land use.
Adjustability
The user should be able to adjust the parameters. The size of gaps varies from project to project, so it makes sense to inspect some gaps first and then let users set the parameters themselves.
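As a rough illustration of such adjustable parameters, the sketch below keeps the thresholds in one small settings object and uses it to decide which gaps are small enough to fill. The names Thresholds, max_gap_area and min_sliver_area, as well as the default values, are assumptions made for this example and are not the tool's actual parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical user-adjustable settings, in squared map units."""
    max_gap_area: float = 1.0     # gaps up to this area are filled
    min_sliver_area: float = 0.5  # polygons below this area count as slivers


def ring_area(ring):
    """Unsigned area of a closed ring of (x, y) tuples (shoelace formula)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def gaps_to_fill(gap_rings, thresholds):
    """Keep only the gaps that are small enough to be filled automatically."""
    return [g for g in gap_rings if ring_area(g) <= thresholds.max_gap_area]


# Illustrative gaps (assumed data): a thin strip and a large hole.
small_gap = [(0, 0), (1, 0), (1, 0.5), (0, 0.5), (0, 0)]
large_gap = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
gaps = [small_gap, large_gap]
```

With the default settings only the thin strip would be filled; a project with coarser data could pass something like Thresholds(max_gap_area=200.0) instead.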
Trial and Error
With an idea of how to tackle the challenges ahead, it was time to figure out how to make it work. At first we tried everything manually, running geoprocess after geoprocess, and later tied all those steps together. The result is a very complex chain of geoprocesses. The tool was built in the QGIS model builder and later converted into a Python script.
Data Output
The data output should contain the feature values from both sources, meaning the current and the proposed habitat types. The geometries should also be cleaned up and, in particular, aligned with parcel boundaries.
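One way to picture such an output record is sketched below: for each output polygon, the value from the current and the proposed dataset are stored side by side, together with a flag that marks a change. The field name "habitat", the "current_" and "proposed_" prefixes, and the "changed" flag are assumptions for illustration only, not the tool's actual schema.

```python
def merge_attributes(current, proposed, key="habitat"):
    """Combine one attribute from two source records into one output record."""
    current_value = current.get(key)
    proposed_value = proposed.get(key)
    return {
        f"current_{key}": current_value,
        f"proposed_{key}": proposed_value,
        # True when the two sources disagree, i.e. the land use changes.
        "changed": current_value != proposed_value,
    }


# Illustrative records (assumed data) for the same piece of land.
record = merge_attributes({"habitat": "farmland"}, {"habitat": "woodland"})
```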
Impact
The tool helps to quickly eliminate 95% of all geometry issues and delivers a data output that contains values from both input datasets, ready to showcase the newly proposed land use or habitat types.
Lessons Learned
Seeing this kind of geometry error is nothing new, but fixing such errors before processing the data is very time consuming.
While piecing the steps together, however, we learned that some invalid geometries are generated during processing, which caused issues later on. By checking the results of almost every step, we could eliminate those issues.
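The habit of checking each intermediate result can be sketched as a small runner that validates the features after every step and stops at the first step that produces invalid output. This is a simplified illustration in plain Python and not the tool's code: run_with_checks and is_closed_ring are hypothetical names, and the validity check here only tests that a ring is closed and has enough vertices.

```python
def is_closed_ring(ring):
    """Very basic validity check: at least 4 vertices, first equals last."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def run_with_checks(features, steps, is_valid):
    """Run (name, function) steps in order, validating after each one."""
    for name, step in steps:
        features = step(features)
        invalid = [f for f in features if not is_valid(f)]
        if invalid:
            raise ValueError(
                f"step {name!r} produced {len(invalid)} invalid feature(s)"
            )
    return features


# Illustrative input (assumed data): one closed triangle.
triangles = [[(0, 0), (1, 0), (1, 1), (0, 0)]]
steps = [("noop", lambda features: features)]
result = run_with_checks(triangles, steps, is_closed_ring)
```

Failing at the step that introduced the problem, rather than at the end of the chain, makes it much easier to see which geoprocess needs attention.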