Journal of Earth Sciences & Environmental Studies

ISSN: 2472-6397

Impact Factor: 1.235

VOLUME: 4 ISSUE: 1

Page No: 502-515

Exploiting low-cost and commonly shared aerial photographs and LiDAR data for detailed vegetation structure mapping of the Wadden Sea island of Ameland


Affiliation

Mücher, C.A.a,*, Kramer, H.a, Najafabadi, M.R. b, Kooistra, L.b, Kuiters, A.T.a, Slim, P.A.a

aWageningen Environmental Research, Wageningen University & Research, Droevendaalsesteeg 3, NL-6708 PB  Wageningen, The Netherlands.

bLaboratory of Geo-Information Science and Remote Sensing, Wageningen University & Research, Droevendaalsesteeg 3, NL-6700 AA  Wageningen, The Netherlands

Citation

Mücher, C.A. et al., Exploiting low-cost and commonly shared aerial photographs and LiDAR data for detailed vegetation structure mapping of the Wadden Sea island of Ameland (2019) SDRP Journal of Earth Sciences & Environmental Studies 4(1)

Abstract

Regular mapping of vegetation structure is important for biodiversity monitoring, and increasingly for tracking compliance with nature policy mandates. As such, the Netherlands uses vegetation structure mapping to monitor the Natura 2000 site on the Dutch Wadden Sea island of Ameland. Three decades of natural gas extraction here has caused soil subsidence, impacting vegetation and habitats on the island. In the Netherlands, vegetation structure mapping is typically done using conventional techniques, primarily field surveys combined with visual interpretation of aerial photographs. This procedure is time-consuming and often too inconsistent and inefficient for large areas. In the current study, we exploited commonly shared and low-cost aerial photographs and LiDAR data for detailed vegetation structure mapping. Aerial photographs are not always easy to use in automatic classification procedures, as they often lack calibrated spectral reflectance values. Furthermore, pre-processing of aerial photographs to render them more attractive may skew the image so that it no longer accurately depicts the original scene. Our aim was to determine if automatic or semi-automatic classification techniques could be applied to these readily available Dutch data to support mapping and monitoring of the vegetation structure of larger areas. We compared the effectiveness of two well-known classification methods, namely rule-based (RB) and random forest (RF). The RF algorithm was applied with its default settings, as supplied by eCognition software. Both classification methods performed well, with overall accuracies of 84.1% (RB) and 86.4% (RF). Each method, however, has its advantages and disadvantages, which are discussed. Overall, RF classification was preferred over RB classification, as it was better able to handle the complexity of the rules needed for distinguishing more classes.
Provision of in situ training data, such as vegetation relevés, was not really a problem in the Dutch context. Nevertheless, exploitation of new digital aerial photographs produced each year in a semi-automatic process remains a challenge. We therefore still prefer commercial high resolution satellite imagery (~0.5 m resolution). The latter, unfortunately, is more costly than aerial photographs which, while not always ideal, are readily available at no additional cost to the organisations involved.

Highlights

  • Both the RB and RF classification methods performed well in vegetation structure mapping, with overall accuracies of 84.1% and 86.4%, respectively.
  • RF was preferred over RB, since the former was better able to handle the complexity of the rules needed to distinguish many classes.
  • Exploitation of digital aerial photographs in semi-automatic classification processes remains challenging, due to inaccurate calibration of the reflectance values and the limited number of spectral bands in aerial photographs.
  • High resolution satellite imagery is a good alternative if aerial photographs are not available.

 

Keywords

LiDAR; aerial photographs; open data; rule-based; random forest; vegetation mapping; Ameland

Introduction

Regular mapping of vegetation structure is important for biodiversity monitoring, especially in areas with dynamic landscapes. For instance, the Habitats Directive (Art. 6, 12, 16 and 17) requires European Union (EU) member states to report on the status of protected habitats and species. For each Natura 2000 site within their boundaries, member states must assess once every six years the conservation status of the habitat types concerned, in particular, their area, structure and function (ETC, 2016). In addition, site-level information on vegetation structure is required for management plans and local impact assessments.

In the Netherlands, vegetation structure monitoring is required for the island of Ameland, situated in the Wadden Sea just north of the Dutch mainland. Here, three decades of natural gas extraction has resulted in soil subsidence, which has impacted vegetation structure and habitats. In addition to vegetation plot recordings to track changes in species composition (Van Dobben and Slim, 2012; Brus et al., 2014, 2016), wider spatial changes in vegetation structure must be monitored. Mapping of vegetation structure is also important for species identification and distribution modelling, since fauna and flora often have strong preferences for specific vegetation niches (Bunce et al., 2013).

Vegetation species and structure mapping in the Netherlands is typically done by field surveys combined with visual interpretation of aerial photographs. Yet, this procedure is time-consuming and often too inconsistent and inefficient to be feasible for large areas. Airborne and spaceborne imagery for the Netherlands are increasingly available at affordable cost or as open data. It would therefore seem fruitful to explore alternative and semi-automatic classification techniques using these readily available imagery sources.

Semi-automatic classification systems such as EODHaM (Earth Observation Data for Habitat Monitoring) have proven suitable for mapping land cover and habitats using commercial very high resolution satellite imagery such as Worldview-2 (Lucas et al., 2015; Mücher et al., 2015). However, such imagery remains relatively expensive, while for the Netherlands at least, very detailed aerial photographs are often readily available at little or no cost. In the Netherlands, in particular, aerial photographs with spatial resolutions of 25 cm and often including NIR (near-infrared) are produced every year covering the entire country (and since 2012 twice a year with a winter and summer acquisition). Due to their higher spatial resolution, aerial photographs are normally preferred over very high resolution commercial imagery. Aerial photographs generally offer sufficient resolution to detect individual trees and shrubs, which makes them suitable for the needs of conservation site managers and many other users. Examples of commonly shared and recent aerial photographs available for non-commercial applications for the Netherlands can be found at http://pdokviewer.pdok.nl/.

Also freely available for the Netherlands, next to aerial photographs, are high spatial resolution LiDAR (Light Detection and Ranging) point cloud data (~15 point measurements per m²) and the derived digital elevation models on a 50 cm grid, with heights given in centimetres (source: www.ahn.nl). Digital elevation model data for the Netherlands (AHN, Actueel Hoogtebestand Nederland) has been available as open data since 2003. It is produced about once every six years (AHN1 is from 2003, AHN2 is from 2007–2012 and AHN3 is from 2014–2019 and expected to be available around 2020).

This paper explores the extent to which these two readily available high-resolution data sources (aerial photographs and LiDAR) can be exploited to support more efficient vegetation structure mapping and monitoring. If they can, field surveys could be focused more efficiently on other aspects that cannot be derived from airborne imagery. Previous studies have shown that LiDAR data used alone or in combination with high-resolution multi-spectral or hyperspectral satellite imagery can perform quite well in vegetation studies (Hill and Thomson, 2005; Lucas et al., 2015; Mason et al., 2003; Mücher et al., 2015; Vierling et al., 2008). Furthermore, vegetation structure and canopy metrics ranging from grassland to forest have been shown to be strong predictors of species richness (Bunce et al., 2013; Hill and Thomson, 2005; Hyde et al., 2005; Mason et al., 2003; Vierling et al., 2008) and bird distribution patterns (Bradbury et al., 2005; Ficetola et al., 2014; Hinsley et al., 2006). Within habitats, too, LiDAR data have been successfully employed to quantify variation and dynamics in vegetation structure (Bradbury et al., 2005; Hantson et al., 2012; Korpela et al., 2009; Weishampel et al., 2007). Some studies suggest that the results obtained in habitat analyses using LiDAR may be enhanced by combining LiDAR data with spectral data (Bergen et al., 2009; Clawges et al., 2008; Hyde et al., 2006). LiDAR data appear to have performed well in characterizing tree species using canopy height as the main explanatory variable (Geerling et al., 2007; Hantson et al., 2012; Korpela et al., 2009). By integrating spectral data and canopy height data generated from LiDAR, Hill and Thomson (2005) produced an ecologically relevant thematic classification for a complex woodland environment, and Hantson et al. (2012) were able to identify invasive woody species.

In view of these results, LiDAR datasets would seem to offer a promising alternative for mapping habitats in fine detail across large areas. They may eventually be able to replace labour-intensive, field-based measurements, and offer a means of characterizing habitats in novel ways (Vierling et al., 2008). In this sense, LiDAR could be an efficient tool for short-term and long-term monitoring of changes in vegetation structure.

The best way to exploit aerial photography is with object-oriented classification, which is not dependent on the value of individual pixels and is able to take texture into account. In this sense, it is similar to the process of visual image interpretation, in which different vegetation mapping units are identified based on tone, texture, pattern and contrast. Object-based image analysis (OBIA) separates the identification and delineation of objects from the classification of objects, in line with the traditional, manual approach of delineating boundaries and assigning labels in the field (Fu and Mui, 1981; Liu and Xia, 2010).

This investigation employed an OBIA-based approach, with a general segmentation followed by classification. Two main image classification methods were compared: rule-based (RB) and random forest (RF). These were applied to homogeneous vegetation units identified from recurring datasets derived from low-cost or public domain aerial photographs and LiDAR remote sensing data.

Materials & Methods

2. Study area and Materials

2.1 Study area

The study area for this research encompassed the eastern part of the barrier island of Ameland. The total area was approximately 1,600 ha, which is too large to be covered easily by traditional vegetation structure mapping. The island of Ameland (57 km², 53°27'43"N, 5°54'12"E, Fig. 1) is located in the Wadden Sea, north of the Dutch mainland. The Wadden Sea and its islands are of great ecological significance, as underlined by their designation as a Natura 2000 site and a United Nations Educational, Scientific and Cultural Organization (UNESCO) world heritage site. Ameland has a large variety of habitats, ranging from dry dunes and dune slacks to tidal salt marshes, heathlands and fresh water (Roelofsen et al., 2014). Soil subsidence here due to natural gas extraction affects an area some 14 km in diameter. The maximum subsidence in 2017 was around 34 cm (Piening et al., 2017). Since 1986 vegetation monitoring has been executed here, mainly by recording vegetation characteristics on permanent plots along a gradient within the subsidence basin. Research has particularly sought to discover whether subsidence has had consequences for habitat quality (Van Dobben and Slim, 2012). In ecological terms, subsidence has the same effect as sea level rise. Therefore, the monitoring of soil subsidence also serves as a model for investigating the impact of accelerated sea level rise due to climate change.

2.2 Materials

This study used aerial photographs and LiDAR data from 2008 (AHN2 LiDAR data), as well as existing field data. The aerial photographs were made by EUROSENSE on 11 October 2008. Since the most recent LiDAR data (AHN2) for Ameland were from 15 February 2008, aerial photographs were selected from the same reference year. The available aerial photographs have four spectral bands: red (R), green (G), blue (B) and infrared (I), with a 25 cm spatial resolution. The addition of the infrared channel makes the aerial photographs more suitable for discriminating vegetation, since this channel is very sensitive to differences in the amount of photosynthetically active biomass. The aerial photographs were available as orthorectified mosaics, but no radiometric calibration was done. Fig. 1 shows the location of the study area and an ortho-mosaic of the aerial photographs used.

Figure 1

Fig. 1. Above: Location of the study area at the eastern end of the Wadden Sea island of Ameland, north of the Dutch mainland. Below: Ortho-mosaic of aerial photographs showing the study area.

We processed Object Heights for the Netherlands (OHN), a product derived from AHN2 based on the LiDAR data. The processed OHN has 50 cm by 50 cm grid cells, with object height given in centimetres, and can be applied for operational use at any location in the Netherlands (Kramer et al., 2014). Fig. 2 (top) shows the original LiDAR cloud points with the aerial photograph false-infrared colours draped over the LiDAR points. Fig. 2 (below) shows in brown all LiDAR points classified as ground, and in green all points classified as above-ground. The object height is defined as the height of an object above ground level and indicated by the real height within the specific grid cell. The OHN was in fact obtained by subtracting the Digital Terrain Model (DTM) from the Digital Surface Model (DSM), both available as open data (www.ahn.nl). The original DTM from AHN2 contains many no-data areas caused by gaps in the original data acquisition. For the OHN, these data gaps in the DTM were filled using ancillary data sources and extrapolation methods (Kramer et al., 2014).
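The subtraction that underlies the OHN can be sketched in a few lines of Python. This is only a minimal illustration, not the processing chain of Kramer et al. (2014): the function name `object_height`, the use of `None` as a no-data marker, the clamping of negative differences to zero and the sample values are all assumptions made for the example, and the gap filling of the DTM is not reproduced.

```python
from typing import List, Optional

# Heights in cm; None marks a no-data cell (an assumed convention for this sketch).
Grid = List[List[Optional[float]]]


def object_height(dsm: Grid, dtm: Grid) -> Grid:
    """Return per-cell object height as DSM minus DTM.

    Negative differences are clamped to zero (an assumption), and cells
    where either input is missing stay None; no gap filling is attempted.
    """
    if len(dsm) != len(dtm) or any(len(a) != len(b) for a, b in zip(dsm, dtm)):
        raise ValueError("DSM and DTM must have identical shapes")
    result: Grid = []
    for surface_row, terrain_row in zip(dsm, dtm):
        row: List[Optional[float]] = []
        for surface, terrain in zip(surface_row, terrain_row):
            if surface is None or terrain is None:
                row.append(None)
            else:
                row.append(max(0.0, surface - terrain))
        result.append(row)
    return result


# Hypothetical 2 x 2 tile of 50 cm cells, heights in cm above the same datum.
dsm = [[520.0, 310.0], [305.0, None]]
dtm = [[300.0, 300.0], [310.0, 290.0]]
ohn = object_height(dsm, dtm)
```

In this toy tile the upper-left cell carries an object of 220 cm, while the lower-right cell remains undefined because the surface model has no value there.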

https://www.siftdesk.org/articles/images/448/2.png

Fig. 2. Above: Detailed 3D perspective of a bicycle and unpaved road in the dunes of eastern Ameland with LiDAR cloud points (~15 point measurements/m²) as available for the entire Netherlands about once every six years since 2003; the 3D LiDAR cloud points (obtained 15 February 2008) are draped with RGB false colours (11 October 2008) at 25 cm resolution. The segment along the black dotted line is presented as a transect in the figure below. Below: Ground is represented by brown dots, while all objects are represented by green dots. These are in fact commonly shared data for the whole of the Netherlands.

 

Field work was done from 10 to 24 June 2014. Twelve vegetation structure classes were distinguished, including non-vegetated classes, and sampled by measuring their geographic locations. The classes were as follows: high thicket (>5 m height), medium thicket (2–5 m height), low thicket (0.5–2 m height), salt marsh vegetation, salt marsh sparsely vegetated, dune vegetation, dune sparsely vegetated, reed, sand, water, salt inland water and sea water (see Tables 1 and 2). A handheld GPS (Garmin eTrex 30) was used and at least one photograph was taken at each location. For each location, the dominant species were recorded alongside the heights measured in centimetres.

It merits noting that the time difference of six years between the acquisition date of the imagery (2008) and the field work (2014) could have produced small errors in the validation process. But the interpreters did not encounter these to any significant extent.

3. Methods

3.1 Object segmentation

We implemented the OBIA method, involving object segmentation followed by classification, both performed in eCognition. As observed earlier, instead of analysing individual pixels, the OBIA method groups pixels into meaningful objects and then those objects are analysed for classification, in our case, into vegetation structure classes. OBIA gives the user control over the mapping scale and can handle the implicit variability that comes with very high resolution imagery (Liu and Xia, 2010).

Image segmentation has its roots in the machine vision advances of the 1980s (Blaschke et al., 2004; Fu and Mui, 1981). The segmentation method used here is based on the fractal net evolution approach (FNEA) (Baatz and Schäpe, 2000). FNEA is applied using commercially available software, such as eCognition (Meinel and Neuber, 2004), and is widely employed in scientific studies (Benz et al., 2004; Geerling et al., 2009; Hantson et al., 2012; Myint et al., 2011; Zhang and Huang, 2010). Objects contain more information, such as texture, than the disparate digital image pixels alone. Hundreds of additional object features may be used, including texture metrics but also geometric characteristics of objects and spatial relations between objects, to provide additional means to differentiate classes.

We used two semi-automatic methods with a proven track record in scientific studies to classify the vegetation structure of eastern Ameland. The first is the rule-based (RB) classification method, in which a set of classification rules is formulated based on expert judgement. The second classification method, random forest (RF), explores segmented objects for specific object features on the basis of a training set. Optimizing the RF parameters was not part of this study. We used the default settings of the RF classifier supplied by eCognition software. The exact same segmentation was used for both the RB and the RF classifications (so the identified objects were exactly the same).

In situ measurements were recorded to train the classification (997 locations) and to validate it (301 locations); see Fig. 3 for their locations. The training and validation points were separate sets. The study area was stratified into major physio-geographic expanses (dunes, salt marsh, polder and sea) on the basis of elevation data (AHN) and topographic information (see Fig. 3). The stratification was used as contextual information in the post-processing of both classifications, in order to distinguish some of the vegetation structure classes and water types (see also Annex I). The strata were mainly defined as follows: (i) salt marsh concerns areas lower than 2.1 m NAP (Amsterdam Ordnance Datum) and connected to the Wadden Sea; (ii) dunes concern areas higher than 2.1 m NAP; (iii) polder concerns areas lower than 2.1 m NAP, but not connected to the Wadden Sea; (iv) sea water is non-terrestrial area.
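The stratification rules can be written down as a small decision function. The sketch below is illustrative only: it assumes that the salt marsh criterion refers to land below the 2.1 m NAP threshold that is connected to the Wadden Sea, it assigns land at exactly 2.1 m to the lower strata, and the function and argument names are hypothetical. How connectivity to the Wadden Sea was actually derived from the topographic data is not covered.

```python
def assign_stratum(
    elevation_m_nap: float,
    is_land: bool,
    connected_to_wadden_sea: bool,
    threshold_m: float = 2.1,
) -> str:
    """Assign one of the four strata from elevation and context flags."""
    if not is_land:
        return "sea water"  # (iv) non-terrestrial area
    if elevation_m_nap > threshold_m:
        return "dunes"  # (ii) higher than 2.1 m NAP
    if connected_to_wadden_sea:
        return "salt marsh"  # (i) low-lying and tidally connected
    return "polder"  # (iii) low-lying but not connected


# Hypothetical locations: (elevation in m NAP, is land, connected to the sea).
examples = [(6.5, True, False), (1.4, True, True), (0.8, True, False), (-3.0, False, True)]
strata = [assign_stratum(*example) for example in examples]
```

Keeping the threshold as a parameter makes it straightforward to test how sensitive the strata are to the chosen elevation limit.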

https://www.siftdesk.org/articles/images/448/3.png

Fig. 3. Stratification with training samples per vegetation structure class (coloured dots) and validation points (black triangles).

3.2 Rule-based (RB) classification

The segmentation and RB classification were performed with Trimble eCognition Developer 9.1 (Trimble, 2015). The segmentation itself involved two basic steps: a multi-resolution segmentation and, if needed, a spectral merge. The eCognition parameter settings are given in section 4. Rule sets were defined for classification of the identified and segmented objects. The classification domain was a set of image objects, and every process looped through the set of image objects, applying the rule sets to each. If an object satisfied all the rules for a class, it was labelled as that class. Complex conditions were defined to narrow down the classes of interest. The rule sets were fine-tuned by investigating individual object features and their values, to determine if these could help identify the vegetation structure class of the object. Finding the appropriate rule sets, including setting thresholds for specific object features, was often time-consuming. A big advantage of RB classification compared to RF is that little training data is needed. The rules give a mechanistic understanding of the relevant factors influencing the classification procedure.

3.3 Random forest (RF) classification

Random forest (RF) is a modern machine-learning classifier that uses an ensemble of decision trees (Breiman, 2001) and is also efficient for cloud computing. Decision tree ensembles are one of several popular alternatives to the traditional maximum likelihood classification. RF makes no assumptions on the distributional characteristics of the independent variables or on the response variables (Cutler et al., 2007). This makes RF suitable for an object-based classification with sufficient training data and relevant object features. RF builds trees by taking a random subset of measured variables from the imagery and a random subset of training data. The number of decision trees, the number of variables and the number of training objects for each tree are all parameters in the RF algorithm (Breiman, 2001; Gislason et al., 2006; Liaw and Wiener, 2002). In eCognition, the default maximum tree number is 50. The learning termination criteria determine how training should be stopped, namely, by a maximum number of trees, by forest accuracy (0.01% is the default) or by both (both is the default) (see Trimble, 2015). In the running process, the data are recursively split into increasingly homogeneous regions. At each step, the variable and value are selected that produce subgroups of the data with the lowest impurity. The splitting continues until a maximum number of trees is reached or until all of the variance is explained to an acceptable level (Cutler et al., 2007). During classification, all of the objects are pushed through the trees, and each tree casts a vote according to the class of the terminal node. The objects are assigned to a class on the basis of a majority vote. For a more detailed review of RF, see Hastie et al. (2001).
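Two of the ingredients described above, impurity-based splitting and majority voting, can be illustrated with a short Python sketch. It is not the eCognition implementation and leaves out the random sampling of variables and training objects. Gini impurity is used as one common impurity measure (an assumption, since the measure is not named here), only a single feature is searched, and all function names and sample values are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values: Sequence[float], labels: Sequence[str]) -> Tuple[float, float]:
    """Find the split 'value <= threshold' with the lowest weighted impurity.

    Returns (threshold, impurity); both are infinite if no split is possible.
    """
    n = len(values)
    best = (float("inf"), float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


def majority_vote(votes: Sequence[str]) -> str:
    """Class with the most tree votes; ties go to the class seen first."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical training objects: maximum object height (cm) and class label.
heights = [20, 35, 60, 150, 180]
classes = ["dune vegetation"] * 3 + ["low thicket"] * 2
split = best_threshold(heights, classes)

# Hypothetical votes cast by three trees for one image object.
winner = majority_vote(["reed", "reed", "water"])
```

In a full random forest this split search is repeated recursively within each tree, on a random subset of the object features and of the training objects.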

Results

The same segmented vegetation objects were used as input for both classification methods. These were derived from segmentation techniques applied to the aerial photographs. To identify the spatial vegetation units (objects), the following parameter settings were used in eCognition Developer 9.1 (Trimble, 2015): scale 50, shape 0.1 and compactness 0.9. The scale parameter was selected by testing the values 10, 25, 50 and 75. The spatial objects created with the value 50 had the best visual match to the vegetation structure classes on the aerial photographs, while also best matching the requirements of the vegetation experts. Using the aerial photograph RGB colour bands only (without the LiDAR data) gave the best results for the segmentation. All parameters in eCognition were in fact set by trial and error, and in consultation with a vegetation specialist.

The RB classification was based on a limited set of object features, namely, (i) maximum value of OHN, (ii) Normalized Difference Vegetation Index (NDVI), (iii) brightness, (iv) grey-level co-occurrence matrix (GLCM) homogeneity of OHN in all directions and (v) red ratio. Annex I presents the thresholds for all specified object features in the RB classification. After a first run, some objects still remained unclassified. These were classified in a second round using additional thresholds (see Annex I). In a final step, the preliminary vegetation structure classes were recoded to the final classes per stratum. This rule-based process used four main strata: (1) dunes, (2) salt marsh, (3) polder and (4) sea water. The final RB classification resulted in a vegetation structure map with 12 classes (minus polder and gas extraction location; Fig. 4).
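To make the spectral object features and the idea of a threshold rule concrete, a hedged Python sketch follows. The NDVI formula is the standard one; brightness as the mean of the band means and red ratio as the red mean divided by the sum of all band means are assumed definitions that should be checked against the eCognition documentation. The height limits follow the thicket class definitions in section 2.2, with boundary handling assumed, whereas the NDVI threshold of 0.2 is purely hypothetical and is not taken from Annex I.

```python
from typing import Optional, Sequence


def ndvi(red: float, nir: float) -> float:
    """Normalized Difference Vegetation Index from mean red and near-infrared."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def brightness(band_means: Sequence[float]) -> float:
    """Mean of the per-band object means (assumed definition)."""
    return sum(band_means) / len(band_means)


def red_ratio(red: float, green: float, blue: float, nir: float) -> float:
    """Red mean divided by the sum of all band means (assumed definition)."""
    total = red + green + blue + nir
    return 0.0 if total == 0 else red / total


def thicket_class(max_ohn_cm: float, ndvi_value: float, ndvi_min: float = 0.2) -> Optional[str]:
    """Toy rule: a vegetated object is labelled by its maximum object height.

    ndvi_min is a hypothetical threshold; None means that no thicket rule fired.
    """
    if ndvi_value < ndvi_min:
        return None
    if max_ohn_cm > 500:
        return "high thicket"
    if max_ohn_cm >= 200:
        return "medium thicket"
    if max_ohn_cm >= 50:
        return "low thicket"
    return None


# Hypothetical object: mean band values (R, G, B, NIR) and maximum OHN in cm.
red, green, blue, nir = 50.0, 60.0, 40.0, 150.0
label = thicket_class(320.0, ndvi(red, nir))
```

A complete rule set would chain many such conditions per class and per stratum, which is why tuning the thresholds is the time-consuming part of the RB approach.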

https://www.siftdesk.org/articles/images/448/4.png

Fig. 4. Vegetation structure map resulting from rule-based (RB) classification using aerial photographs and LiDAR data from 2008.

As mentioned, the RF classification (Breiman, 2001) used the same segmentation as the RB classification. Both classification methods also targeted the same 12 vegetation structure classes, in order to enable a good comparison (see Fig. 5). Although an unlimited number of object features could be used for the RF classification, a careful selection of nine object features gave a better result for the classes of interest. The RF classifier was run in eCognition on the basis of these nine object features: (i) NDVI, (ii) spectral brightness, (iii) max pixel object height OHN, (iv) mean blue, (v) mean green, (vi) mean red, (vii) mean infrared, (viii) ratio red and (ix) GLCM homogeneity OHN in all directions. The classifier was trained on the 997 training points.

For a comparison of some alternative classification methods, including RF and the popular support vector machines, see Meyer et al. (2003). For a comparison between decision tree ensemble methods such as RF and a method with the individual decision tree CART, see Gislason et al. (2006). For a comparison of RF and other decision tree ensemble methods using bagging and boosting, tested on land cover datasets, see Chan and Paelinckx (2008). All authors agree that RF generally ranks high in classification accuracy and that RF is relatively insensitive to its parameters and computationally fast. However, there is some debate about the tendency of decision trees to overfit (Segal, 2003), and RF is said to sometimes perform poorly with high class imbalance (Blagus and Lusa, 2010). Breiman (2001) disputes these weaknesses, stating that RF is insusceptible to these issues, which are commonly associated with decision trees, because of the double randomness of the subsets in tree construction and the law of large numbers.

https://www.siftdesk.org/articles/images/448/5.png

Fig. 5. Vegetation structure map resulting from random forest (RF) classification based on nine object features derived from aerial photographs and LiDAR data from 2008.

Fig. 6 presents a comparison of the classification results (RB and RF) for a small area. Overall, the vegetation patterns observed are very similar, although many minor differences in structure can be detected. Since the two classification methods provide visually quite similar results, a quantitative validation was performed.

https://www.siftdesk.org/articles/images/448/6.png

Fig. 6. Details for a small area for the two classifications. Above: Rule-based (RB) classification. Below: Random forest (RF) classification.

A total of 301 validation points were available (see Fig. 3 for their locations). Using these, an error or contingency matrix was produced for the outputs of both classification methods (Tables 1 and 2).

Table 1. Validation matrix for the rule-based (RB) classification (counts) (veg. = vegetated).

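Rows: RB classification. Columns: field reference (column numbers 1–12 correspond to classes 01–12).

| RB classification | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Total | Reliability (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01_High thicket (wood) | 15 | 1 |  |  |  |  |  |  |  |  |  |  | 16 | 93.8 |
| 02_Medium thicket |  | 54 |  |  |  | 1 |  | 1 |  | 1 |  |  | 57 | 94.7 |
| 03_Low thicket |  |  | 24 | 1 |  | 4 |  | 1 |  |  |  |  | 30 | 80.0 |
| 04_Salt marsh vegetation |  |  | 1 | 70 | 5 |  |  | 1 |  |  | 2 |  | 79 | 88.6 |
| 05_Salt marsh sparsely veg. |  |  |  |  | 4 |  |  | 2 | 1 |  | 2 |  | 9 | 44.4 |
| 06_Dune vegetation |  |  | 1 |  |  | 24 | 3 | 2 |  |  |  |  | 30 | 80.0 |
| 07_Dune sparsely vegetated |  |  |  |  |  | 1 | 7 | 2 |  |  |  |  | 10 | 70.0 |
| 08_Reed |  | 1 |  |  |  | 1 |  | 15 |  |  |  |  | 17 | 88.2 |
| 09_Sand |  |  |  |  | 1 |  | 5 |  | 13 |  |  |  | 19 | 68.4 |
| 10_Water |  |  |  |  |  | 2 |  | 1 |  | 10 |  |  | 13 | 76.9 |
| 11_Salt water |  |  |  | 1 |  |  |  |  |  |  | 16 |  | 17 | 94.1 |
| 12_Sea water |  |  |  |  |  |  |  |  | 1 |  |  | 1 | 2 | 50.0 |
| Unclassified |  |  |  |  |  |  |  | 1 |  | 1 |  |  | 2 | 100.0 |
| Total | 15 | 56 | 26 | 72 | 10 | 33 | 15 | 26 | 15 | 12 | 20 | 1 | 301 |  |
| Accuracy (%) | 100.0 | 96.4 | 92.3 | 97.2 | 40.0 | 72.7 | 46.7 | 57.7 | 86.7 | 83.3 | 80.0 | 100.0 |  |  |

Overall accuracy (%): 84.1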

 

Table 2. Validation matrix for the random forest (RF) classification (counts) (veg. = vegetated).

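Rows: RF classification. Columns: field reference (column numbers 1–12 correspond to classes 01–12).

| RF classification | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Total | Reliability (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01_High thicket (wood) | 13 |  |  |  |  |  |  |  |  |  |  |  | 13 | 100.0 |
| 02_Medium thicket |  | 54 |  |  |  | 1 |  |  |  | 1 |  |  | 56 | 96.4 |
| 03_Low thicket |  |  | 26 | 1 |  | 3 |  |  |  |  | 1 |  | 31 | 83.9 |
| 04_Salt marsh vegetation |  |  |  | 71 | 2 |  |  | 1 |  |  |  |  | 74 | 95.9 |
| 05_Salt marsh sparsely veg. |  | 1 |  |  | 7 |  |  | 2 | 1 |  | 1 |  | 12 | 58.3 |
| 06_Dune vegetation |  |  |  |  |  | 26 | 3 | 1 |  |  |  |  | 30 | 86.7 |
| 07_Dune sparsely vegetated |  |  |  |  |  | 2 | 7 | 3 |  |  |  |  | 12 | 58.3 |
| 08_Reed |  |  |  |  |  |  |  | 18 |  |  |  |  | 18 | 100.0 |
| 09_Sand |  |  |  |  | 1 |  | 5 |  | 9 |  |  |  | 15 | 60.0 |
| 10_Water |  |  |  |  |  | 1 |  | 1 |  | 11 |  |  | 13 | 84.6 |
| 11_Salt water |  |  |  |  |  |  |  |  | 4 |  | 17 |  | 21 | 81.0 |
| 12_Sea water |  |  |  |  |  |  |  |  | 1 |  |  | 1 | 2 | 50.0 |
| Unclassified | 2 | 1 |  |  |  |  |  |  |  |  | 1 |  | 4 | 100.0 |
| Total | 15 | 56 | 26 | 72 | 10 | 33 | 15 | 26 | 15 | 12 | 20 | 1 | 301 |  |
| Accuracy (%) | 86.7 | 96.4 | 100.0 | 98.6 | 70.0 | 78.8 | 46.7 | 69.2 | 60.0 | 91.7 | 85.0 | 100.0 |  |  |

Overall accuracy (%): 86.4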

The impression given by Fig. 6 is confirmed by Tables 1 and 2: the classification accuracies of the RB and RF classification algorithms are quite similar, at 84.1% and 86.4% overall, respectively. Though these classification accuracies are high, it must be noted that the number of vegetation structure classes (12) was more limited than in most traditional vegetation studies. Nonetheless, small differences in the classification accuracies could favour a choice for one method or the other, along the lines discussed below.
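The accuracy figures in Tables 1 and 2 can be recomputed from the counts. In the sketch below, the lists hold the correctly classified points per class (the matrix diagonals), the RB row totals and the field reference column totals, all taken from the two tables. Reading "reliability" as the row-wise share and "accuracy" as the column-wise share (often called user's and producer's accuracy) is an interpretation of the table layout, and the function and variable names are hypothetical.

```python
from typing import List, Sequence


def percentages(correct: Sequence[int], totals: Sequence[int]) -> List[float]:
    """Per-class share of correctly classified points, in percent (one decimal)."""
    return [round(100.0 * c / t, 1) for c, t in zip(correct, totals)]


def overall_accuracy(correct: Sequence[int], n_points: int) -> float:
    """Share of all validation points on the matrix diagonal, in percent."""
    return round(100.0 * sum(correct) / n_points, 1)


# Diagonal counts for classes 01-12, read from Tables 1 (RB) and 2 (RF).
rb_correct = [15, 54, 24, 70, 4, 24, 7, 15, 13, 10, 16, 1]
rf_correct = [13, 54, 26, 71, 7, 26, 7, 18, 9, 11, 17, 1]

# Row totals of Table 1 (points assigned to each class by the RB method).
rb_row_totals = [16, 57, 30, 79, 9, 30, 10, 17, 19, 13, 17, 2]
# Column totals (field reference points per class), identical in both tables.
reference_totals = [15, 56, 26, 72, 10, 33, 15, 26, 15, 12, 20, 1]

n_validation = 301
rb_overall = overall_accuracy(rb_correct, n_validation)  # 253 of 301 points
rf_overall = overall_accuracy(rf_correct, n_validation)  # 260 of 301 points

rb_reliability = percentages(rb_correct, rb_row_totals)    # row-wise
rb_accuracy = percentages(rb_correct, reference_totals)    # column-wise
```

The difference between the two methods thus amounts to seven of the 301 validation points, which is worth keeping in mind when interpreting the 2.3 percentage point gap in overall accuracy.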

With the RF classifier it is easier to distinguish more vegetation structure classes if sufficient training data is available. RF is better able to handle the complexity of the rules needed to distinguish many classes. In other words, RF is more suitable than RB classification for handling numerous different attributes for the class definitions. Nonetheless, RF needs training data, the collection of which requires substantial effort, and the classification itself is much more of a black box than RB classification, with its transparent decision rules (see Annex I). The straightforwardness of RB classification means that it can be more easily extended to other areas. But for other areas the thresholds still need to be fine-tuned, which means additional work in most cases. Site managers and vegetation experts cannot help in providing rule sets or in fine-tuning the thresholds, but in most cases they are able to provide ground truth data. The latter is a great advantage for the RF classification method. If provision of in situ data is not a problem, the RF method is preferred over RB classification. The current trend toward increased availability of harmonized and consistent in situ data in the public domain or as open data (e.g., see www.GBIF.org) would favour the RF classification method.

A major disadvantage of both classification methods is that the aerial photographs are not radiometrically calibrated. Calibrated, very high resolution multi-spectral or hyper-spectral satellite imagery improves semi-automatic classifications, but the spatial resolution of this imagery is still limited compared to aerial photographs and it is quite expensive. This is why exploitation of readily available, low-cost or public domain data, such as aerial photographs and LiDAR data, remains such an interesting proposition. But this does limit the number of classes that can be distinguished.

A major gain for both classification methods, when based on very high spatial resolution imagery, would come from improving the positional accuracy of the field measurements. In most cases, field measurements are still made with simple handheld GPS devices with an accuracy of about 3 m. Use of very high spatial resolution imagery (25–50 cm) requires that the field measurements be done at a 10–25 cm positional accuracy, with RTK (real-time kinematic) GPS techniques. Positional errors in the training data produce confusion in the RF classification, which reduces classification accuracies. Investment in high-accuracy handheld GPS systems for field surveyors is therefore a prerequisite for reliable semi-automatic vegetation mapping.
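The scale of the problem becomes clear when the positional error is expressed in image pixels. The short calculation below uses only the figures quoted above (an error of about 3 m for handheld GPS, 10–25 cm for RTK, and pixel sizes of 25–50 cm); the function name is a hypothetical helper.

```python
def offset_in_pixels(position_error_m, pixel_size_m):
    """Express a positional error on the ground as a number of image pixels."""
    return position_error_m / pixel_size_m


for device, error_m in (("handheld GPS", 3.0), ("RTK GPS", 0.25), ("RTK GPS", 0.10)):
    for pixel_m in (0.25, 0.50):
        pixels = offset_in_pixels(error_m, pixel_m)
        print(f"{device}: {error_m} m error at {pixel_m} m pixels = {pixels:.1f} pixels")
```

A 3 m error thus corresponds to 6 to 12 pixels, enough to place a field observation in a neighbouring image object, whereas an RTK measurement stays within about one pixel.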

An important next step for the evaluated methods is extrapolation of the models to other areas for which training and testing data were not included in the modelling phase. Earlier studies (e.g., Wenger and Olden, 2012) showed that model transferability gives better results for simpler models than for more complex approaches, like RF. In the case of RF models, the fitting should focus on modelling the general relations within the case study area while not including local characteristics that cannot be transferred to other sites away from the training area (Juel et al., 2015). Although this approach reduces the overall accuracy for the training area, it increases the potential transferability of the developed model to other kinds of areas. In addition, local vegetation observations are increasingly available in open-source databases (e.g., www.GBIF.org). These could be adopted to localize the general transferable RF model. This site-specific approach would also allow inclusion of artefacts in the aerial imagery related to variations in illumination, time of day, weather and observation angle.

Conclusion

This study compared the performance of two well-known classification methods on segmented homogeneous vegetation units exploiting very detailed, readily available data for the Netherlands. Specifically, aerial photographs and LiDAR data were employed to explore semi-automatic mapping of vegetation structure. This kind of mapping is very relevant, particularly in the framework of Natura 2000, as all EU member states have an obligation to assess once every six years the conservation status of the habitat types within their Natura 2000 sites (ETC, 2016). Vegetation structure is an important indicator of habitat quality.

We implemented two classification algorithms for a large part of the Wadden Sea island of Ameland, namely rule-based (RB) and random forest (RF) classification. Both methods produced comparable results for the 12 vegetation structure classes of interest. Overall accuracies were 84.1% (RB) and 86.4% (RF). Both methods have their advantages and disadvantages. RF can incorporate more object features in complex rules than a normal RB classification, so it can produce better results. In principle an RB classification can use as many object features as an RF classification, but this is not practicable. All sets of rules in RB classification are made by the expert in control, which is time-consuming (especially the fine-tuning of thresholds for specific classes). This limits the number of rules and feature classes that can be handled.

Supplementary Data

Annex I. Overview of features and thresholds used in rule-based (RB) classification.

Class number and name             OHN          Index         Spectral    GLCM homogeneity  Ratio   Stratum
                                  Max OHN in m NDVI          brightness  OHN               red

First classification round

01 High thicket (wood)            > 5          > 0.16        -           < 0.45            -       -
02 Medium thicket                 2 – 5        > 0.16        -           < 0.45            < 0.31  -
03 Low thicket                    0.5 – 2      > 0.16        -           < 0.65            < 0.31  -
04a Salt marsh vegetation         -            > 0.25        55 – 110    > 0.95            -       Salt marsh
04b Salt marsh vegetation         -            0.14 – 0.25   65 – 110    > 0.95            -       Salt marsh
06a Dune vegetation               -            > 0.25        55 – 110    > 0.95            -       Dunes
06b Dune vegetation               -            0.14 – 0.25   65 – 110    > 0.95            -       Dunes
08 Reed                           < 3.5        0.05 – 0.31   < 110       < 0.95            > 0.31  -
09 Sand                           -            -0.25 – 0.1   > 100       -                 -       -
10a Water                         -            < 0           < 110       > 0.95            -       Dunes
10b Water                         -            < 0.3         < 50        > 0.95            -       Dunes
10c Water                         -            < 0.45        < 60        > 0.98            -       Dunes
10d Water                         -            < -0.1        < 110       -                 -       Dunes
11a Salt water                    -            < 0           < 110       > 0.95            -       Salt marsh
11b Salt water                    -            < 0.3         < 50        > 0.95            -       Salt marsh
11c Salt water                    -            < 0.45        < 60        > 0.98            -       Salt marsh
11d Salt water                    -            < -0.1        < 110       -                 -       Salt marsh

Second classification round (applied to objects left unclassified after the first round)

01 High thicket (wood)            > 5          > 0.16        -           > 0.45            -       -
02 Medium thicket                 2 – 5        -             -           -                 -       -
03 Low thicket                    0.5 – 2      > 0.16        -           > 0.45            -       -
08 Reed                           < 3.5        < 0.31        < 110       < 0.45            -       -
05 Salt marsh sparsely vegetated  < 0.5        > 0.16        -           -                 -       Salt marsh
07 Dune sparsely vegetated        < 0.5        > 0.16        -           -                 -       Dunes
09 Sand                           -            < 0.16        -           -                 -       -

Classes based only on stratum

12 Sea water                      -            -             -           -                 -       Sea
13 Polder                         -            -             -           -                 -       Polder
14 Gas extraction location        -            -             -           -                 -       Gas extraction location

A dash (-) indicates that the feature is not used for that class.
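As an illustration of how transparent such a rule set is, the sketch below encodes a few of the first-round rules from Annex I as plain threshold checks on image objects. It is not the eCognition rule set itself: the feature names, the order in which rules are tested, the treatment of range limits as inclusive at the upper bound, and the example objects are all assumptions made for this illustration, and only a subset of the classes is included.

```python
import math

INF = math.inf

# Subset of the first-round rules from Annex I.
# A tuple (low, high) is read as low < value <= high (boundary handling is an
# assumption); a string must match the object's stratum exactly.
FIRST_ROUND = [
    ("01 High thicket (wood)",
     {"ohn_max": (5, INF), "ndvi": (0.16, INF), "glcm_hom": (-INF, 0.45)}),
    ("02 Medium thicket",
     {"ohn_max": (2, 5), "ndvi": (0.16, INF), "glcm_hom": (-INF, 0.45),
      "ratio_red": (-INF, 0.31)}),
    ("03 Low thicket",
     {"ohn_max": (0.5, 2), "ndvi": (0.16, INF), "glcm_hom": (-INF, 0.65),
      "ratio_red": (-INF, 0.31)}),
    ("04a Salt marsh vegetation",
     {"ndvi": (0.25, INF), "brightness": (55, 110), "glcm_hom": (0.95, INF),
      "stratum": "Salt marsh"}),
    ("06a Dune vegetation",
     {"ndvi": (0.25, INF), "brightness": (55, 110), "glcm_hom": (0.95, INF),
      "stratum": "Dunes"}),
    ("08 Reed",
     {"ohn_max": (-INF, 3.5), "ndvi": (0.05, 0.31), "brightness": (-INF, 110),
      "glcm_hom": (-INF, 0.95), "ratio_red": (0.31, INF)}),
    ("09 Sand",
     {"ndvi": (-0.25, 0.1), "brightness": (100, INF)}),
]


def matches(obj, conditions):
    """True if the object satisfies every condition of one rule."""
    for feature, wanted in conditions.items():
        value = obj[feature]
        if isinstance(wanted, tuple):
            low, high = wanted
            if not (low < value <= high):
                return False
        elif value != wanted:
            return False
    return True


def classify(obj, rules=FIRST_ROUND):
    """Return the first matching class, or 'unclassified' if no rule fires."""
    for class_name, conditions in rules:
        if matches(obj, conditions):
            return class_name
    return "unclassified"


# Hypothetical image objects (feature values invented for illustration).
WOOD = {"ohn_max": 7.2, "ndvi": 0.40, "brightness": 80, "glcm_hom": 0.30,
        "ratio_red": 0.20, "stratum": "Dunes"}
MARSH = {"ohn_max": 0.2, "ndvi": 0.30, "brightness": 90, "glcm_hom": 0.97,
         "ratio_red": 0.30, "stratum": "Salt marsh"}
BARE = {"ohn_max": 0.1, "ndvi": 0.00, "brightness": 140, "glcm_hom": 0.97,
        "ratio_red": 0.35, "stratum": "Dunes"}
OTHER = {"ohn_max": 1.0, "ndvi": 0.12, "brightness": 90, "glcm_hom": 0.50,
         "ratio_red": 0.20, "stratum": "Dunes"}

for example in (WOOD, MARSH, BARE, OTHER):
    print(classify(example))
```

Objects that remain unclassified would be passed to a second list of rules in the same way, mirroring the second classification round of the Annex.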

Acknowledgement

This research was part of the strategic research programme KBIV ‘Sustainable Spatial Development of Ecosystems, Landscapes, Seas and Regions’ (KB-24-002-005), funded by the Dutch Ministry of Economic Affairs, Agriculture and Innovation, and carried out by Wageningen University & Research. Special thanks to Johan Krol and the late Frits Oud from the Nature Centre Ameland and It Fryske Gea, respectively, for their local information and logistics support during the fieldwork. We also thank George Wintermans and Jeroen Jansen for their inspiring and facilitating contributions to the research. We are grateful for the permission granted by It Fryske Gea and ‘De Vennoot’ (Neerlands Reid) to undertake our research on their property. The authors also thank Michelle Luijben for her comments and corrections, which helped to improve the manuscript.

References

  1. Vierling, K.T., Vierling, L.A., Gould, W.A., Martinuzzi, S., Clawges, R.M., 2008. Lidar: shedding new light on habitat characterization and modeling. Frontiers in Ecology and the Environment 6 (2), 90–98.

  2. Segal, M.R., 2003. Machine Learning Benchmarks and Random Forest Regression. Kluwer, Dordrecht, the Netherlands.

  3. Trimble, 2015. eCognition Developer 9.1 User Guide. Trimble, Munich, Germany.

  4. Van Dobben, H.F., Slim, P.A., 2012. Past and future plant diversity of a coastal wetland driven by soil subsidence and climate change. Climatic Change 110, 597–618, doi:10.1007/s10584-011-0118-5.

  5. Piening, H., Van der Veen, W., Van Eijs, R., 2017. Bodemdaling. In: De Vlas, J. (Ed.). Monitoring effecten van bodemdaling op Oost-Ameland, pp. 9–25.

  6. Roelofsen, H.D., Kooistra, L., Van Bodegom, P.M., Verrelst, J., Krol, J., Witte, J.-P.M., 2014. Mapping a priori defined plant associations using remotely sensed vegetation characteristics. Remote Sensing of Environment 140, 639–651.

  7. Myint, S.W., Gober, P., Brazel, A., Grossman-Clarke, S., Weng, Q., 2011. Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery. Remote Sensing of Environment 115 (5), 1145–1161.

  8. Mücher, C.A., Roupioz, L., Kramer, H., Bogers, M.M.B., Jongman, R.H.G., Lucas, R.M., Kosmidou, V.E., Petrou, Z., Manakos, I., Padoa-Schioppa, E., Adamo, M., Blonda, P., 2015. Synergy of airborne LiDAR and Worldview-2 satellite imagery for land cover and habitat mapping: A BIO_SOS-EODHaM case study for the Netherlands. International Journal of Applied Earth Observation and Geoinformation 37, 48–55.

  9. Meyer, D., Leisch, F., Hornik, K., 2003. The support vector machine under test. Neurocomputing 55 (1–2), 169–186.

  10. Meinel, G., Neuber, M., 2004. A comparison of segmentation programs for high resolution remote sensing data. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 35-B4, 1097–1102.

  11. Mason, D.C., Anderson, G.Q.A., Bradbury, R.B., Cobby, D.M., Davenport, I.J., Vandepoll, M., Wilson, J.D., 2003. Measurement of habitat predictor variables for organism-habitat models using remote sensing and image segmentation. International Journal of Remote Sensing 24 (12), 2515–2532.

  12. Lucas, R., Blonda, P., Bunting, P., Jones, G., Inglada, J., Arias, M., Kosmidou, V., Petrou, Z.I., Manakos, I., Adamo, M., Charnock, R., Tarantino, C., Mücher, C.A., Jongman, R.H.G., Kramer, H., Arvor, D., Honrado, J.P., Mairota, P., 2015. The Earth Observation Data for Habitat Monitoring (EODHaM) System. International Journal of Applied Earth Observation and Geoinformation 37, 17–28.

  13. Liu, D., Xia, F., 2010. Assessing object-based classification: advantages and limitations. Remote Sensing Letters 1 (4), 187–194.

  14. Kramer, H., Clement, J., Mücher, C.A., 2014. OHN: Object Hoogten Nederland; De hoogte van alles wat boven het maaiveld uitsteekt. Geo-Informatie Nederland 3, 18–21.

  15. Liaw, A., Wiener, M., 2002. Classification and Regression by randomForest. R News 2/3, 18–22.

  16. Korpela, I., Koskinen, M., Vasander, H., Holopainen, M., Minkkinen, K., 2009. Airborne small-footprint discrete-return LiDAR data in the assessment of boreal mire surface patterns, vegetation, and habitats. Forest Ecology and Management 258 (7), 1549–1566.

  17. Juel, A., Groom, G.B., Svenning, J.-C., Ejrnæs, R., 2015. Spatial application of Random Forest models for fine-scale coastal vegetation classification using object-based analysis of aerial orthophoto and DEM data. International Journal of Applied Earth Observation and Geoinformation 42, 106–114.

  18. Hyde, P., Dubayah, R., Walker, W., Blair, J.B., Hofton, M., Hunsaker, C., 2006. Mapping forest structure for wildlife habitat analysis using multi-sensor (LiDAR, SAR/InSAR, ETM+, Quickbird) synergy. Remote Sensing of Environment 102 (1–2), 63–73.

  19. Hyde, P., Dubayah, R., Peterson, B., Blair, J.B., Hofton, M., Hunsaker, C., Knox, R., Walker, W., 2005. Mapping forest structure for wildlife habitat analysis using waveform lidar: Validation of montane ecosystems. Remote Sensing of Environment 96 (3–4), 427–437.

  20. Hinsley, S.A., Hill, R.A., Bellamy, P.E., Balzter, H., 2006. The Application of Lidar in Woodland Bird Ecology: Climate, Canopy Structure, and Habitat Quality. Photogrammetric Engineering and Remote Sensing 72 (12), 1399–1406.

  21. Hill, R.A., Thomson, A.G., 2005. Mapping woodland species composition and structure using airborne spectral and LiDAR data. International Journal of Remote Sensing 26 (17), 3763–3779.

  22. Hastie, T., Friedman, J., Tibshirani, R., 2001. The Elements of Statistical Learning (Volume 1). Springer, New York, NY.

  23. Hantson, W.P.R., Kooistra, L., Slim, P.A., 2012. Mapping invasive woody species in coastal dunes in the Netherlands: a remote sensing approach using LIDAR and high-resolution aerial photographs. Applied Vegetation Science 15 (4), 536–547.

  24. Gislason, P.O., Benediktsson, J.A., Sveinsson, J.R., 2006. Random Forests for land cover classification. Pattern Recognition Letters 27 (4), 294–300.

  25. Geerling, G.W., Vreeken-Buijs, M.J., Jesse, P., Ragas, A.M.J., Smits, A.J.M., 2009. Mapping river floodplain ecotopes by segmentation of spectral (CASI) and structural (LiDAR) remote sensing data. River Research and Applications 25 (7), 795–813.

  26. Fu, K.S., Mui, J., 1981. A survey on image segmentation. Pattern Recognition 13 (1), 3–16.

  27. Geerling, G.W., Labrador-Garcia, M., Clevers, J.G.P.W., Ragas, A.M.J., Smits, A.J.M., 2007. Classification of floodplain vegetation by data fusion of spectral (CASI) and LiDAR data. International Journal of Remote Sensing 28 (19), 4263–4284.

  28. Ficetola, G.F., Bonardi, A., Mücher, C.A., Gilissen, N.L.M., Padoa-Schioppa, E., 2014. How many predictors in species distribution models at the landscape scale? Land use versus LiDAR-derived canopy height. International Journal of Geographical Information Science Special Issue: Spatial Ecology 28 (8), 1723–1739.

  29. Cutler, D.R., Edwards, T.C., Beard, K.H., Cutler, A., Hess, K.T., Gibson, J., Lawler, J.J., 2007. Random forests for classification in ecology. Ecology 88 (11), 2783–2792. PMid:18051647

  30. ETC (European Topic Centre on Biological Diversity), 2016. Assessment and reporting under Article 17 of the Habitats Directive: Explanatory Notes & Guidelines for the period 2013–2018. Final Draft November 2016. European Topic Centre on Biological Diversity, Paris, France.

  31. Clawges, R., Vierling, K., Vierling, L., Rowell, E., 2008. The use of airborne lidar to assess avian species diversity, density, and occurrence in a pine/aspen forest. Remote Sensing of Environment 112 (5), 2064–2073, doi: 10.1016/j.rse.2007.08.023.

  32. Chan, J.C.W., Paelinckx, D., 2008. Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sensing of Environment 112 (6), 2999–3011.

  33. Bunce, R.G.H., Bogers, M.M.B., Evans, D., Halada, L., Jongman, R.H.G., Mucher, C.A., Bauch, B., De Blust, G., Parr, T.W., Olsvig-Whittaker, L., 2013. The significance of habitats as indicators of biodiversity and their links to species. Ecological Indicators Special Issue: Biodiversity Monitoring 33, 19–25.

  34. Brus, D.J., Slim, P.A., Heidema, A.H., Van Dobben, H.F., 2014. Trend monitoring of the areal extent of habitats in a subsiding coastal area by spatial probability sampling. Ecological Indicators 45, 313–319.

  35. Brus, D.J., Slim, P.A., Gort, G., Heidema, A.H., Van Dobben, H., 2016. Monitoring habitat types by the mixed multinomial logit model using panel data. Ecological Indicators 67, 108–116.

  36. Breiman, L., 2001. Random forests. Machine Learning 45 (1), 5–32.

  37. Bradbury, R.B., Hill, R.A., Mason, D.C., Hinsley, S.A., Wilson, J.D., Balzter, H., Anderson, G.Q.A., Whittingham, M.J., Davenport, I.J., Bellamy, P.E., 2005. Modelling relationships between birds and vegetation structure using airborne LiDAR data: a review with case studies from agricultural and woodland environments. Ibis 147 (3), 443–452.

  38. Blaschke, T., Burnett, C., Pekkarinen, A., 2004. Image segmentation methods for object-based analysis and classification, in: De Jong, S.M., Van der Meer, F.D. (Eds.), Remote Sensing Image Analysis: Including the Spatial Domain. Springer, Dordrecht, Netherlands, pp. 211–236.

  39. Blagus, R., Lusa, L., 2010. Class prediction for high-dimensional class-imbalanced data. BMC Bioinformatics 11: 523, doi: 10.1186/1471-2105-11-523.

  40. Bergen, K.M., Goetz, S.J., Dubayah, R.O., Henebry, G.M., Hunsaker, C.T., Imhoff, M.L., Nelson, R.F., Parker, G.G., Radeloff, V.C., 2009. Remote sensing of vegetation 3-D structure for biodiversity and habitat: Review and implications for lidar and radar spaceborne missions. Journal of Geophysical Research: Biogeosciences 114 (G2), 1–13.

  41. Benz, U.C., Hofmann, P., Willhauck, G., Lingenfelder, I., Heynen, M., 2004. Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS Journal of Photogrammetry and Remote Sensing 58 (3–4), 239–258.

  42. Baatz, M., Schäpe, A., 2000. Multiresolution Segmentation: an optimization approach for high quality multi-scale image segmentation. Wichmann, Heidelberg, pp. 12–23.

  43. Wenger, S.J., Olden, J.D., 2012. Assessing transferability of ecological models: an underappreciated aspect of statistical validation. Methods in Ecology and Evolution 3 (2), 260–267.

  44. Weishampel, J.F., Drake, J.B., Cooper, A., Blair, J.B., Hofton, M., 2007. Forest canopy recovery from the 1938 hurricane and subsequent salvage damage measured with airborne LiDAR. Remote Sensing of Environment 109 (2), 142–153.

  45. Zhang, L., Huang, X., 2010. Object-oriented subspace analysis for airborne hyperspectral remote sensing imagery. Neurocomputing 73 (4–6), 927–936.
