GeoLens Catalog Federation Use Case Scenario

Cliff Behrens, PI

USDAC/Bellcore

October 24, 1997



1. Proposed Application Domain

This scenario presents the use of cartographic and topographic models in a geographic database to improve vegetation/land cover classification derived from satellite imagery for park management. The objective is to discover, evaluate and retrieve Landsat multispectral imagery together with USGS 1 degree Digital Elevation Models and USGS 1:100K Transportation Digital Line Graphs to build a land use/land cover classification for Morris County, New Jersey.

2. Background

Computer-aided classification of digital satellite data has been shown to be an effective tool for conducting land resource inventories. Acceptable accuracies for lower levels in land use/land cover classification hierarchies have been obtained using satellite imagery acquired during different seasons. However, in some situations such as in mountainous terrain, seasonally variable sun angles produce different intensities of shadowing that can obscure otherwise clear distinctions between land cover classes.

Previous studies have used a GIS approach to land use/land cover classification in which satellite imagery and topographic data in digital formats are geographically referenced to a common base so that a digital elevation model and derivative layers (e.g., slope and aspect) can be used to refine classification results. Moreover, digital line graphs yield other useful information about cultural features, e.g., transportation network, and political boundaries, e.g., for masking, not easily obtained from imagery or digital elevation models.
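The derivative layers mentioned above can be illustrated with a minimal sketch of slope and aspect computed from a gridded elevation model by central differences. It assumes a north-up grid with square cells whose size is given in the same units as the elevations; the function name and the sample grid are hypothetical, and a production GIS may use a different estimator.

```python
import math


def slope_aspect(dem, row, col, cell_size):
    """Return (slope, aspect) in degrees for an interior cell of a north-up grid.

    Assumption: cells are square and cell_size uses the same units as elevation.
    Aspect is the downslope direction, measured clockwise from north; it is
    not meaningful where the slope is zero.
    """
    # Elevation change per unit distance toward the east and toward the north.
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2 * cell_size)
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect


# Hypothetical 3 x 3 grid that rises 10 units per 10-unit cell toward the east.
east_rising = [
    [0.0, 10.0, 20.0],
    [0.0, 10.0, 20.0],
    [0.0, 10.0, 20.0],
]
slope_deg, aspect_deg = slope_aspect(east_rising, 1, 1, 10.0)
```

For this sample grid the terrain climbs toward the east, so the downslope direction faces west.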

3. Problem

The GIS approach above requires satellite imagery, topographic information and other cartographic data. New Landsat 7 Enhanced Thematic Mapper Plus (ETM+) data will soon become available from the USGS EROS Data Center; USGS 1 degree and 7.5 minute Digital Elevation Models (DEMs) and USGS 1:100K Transportation Digital Line Graphs (DLGs) are already available. However, the catalogued metadata for these data types implement different standardized metadata schemas. The ETM+ collection is managed by the EOSDIS Core System (ECS), which organizes its metadata in compliance with the ECS Metadata Standard, while both the USGS DEM and DLG metadata are catalogued in compliance with the FGDC Content Standard for Digital Geospatial Metadata. Moreover, the ETM+ imagery is of type raster stored in HDF-EOS containers, the DEM data are of type grid stored in DEM format (as flat ASCII files), and the DLG data are of type vector stored in SDTS-VP (Spatial Data Transfer Standard-Vector Profile) distributions. In addition, the DLG data are attributed with the DLG-3 feature schema.
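One way to picture the schema mismatch is a small adapter that maps each catalog's footprint fields onto a common structure, so that a single spatial query can be evaluated against both. The sketch below is illustrative only: the field names in FIELD_MAPS, the Footprint class and the sample records are assumptions made for this example and should be checked against the ECS and FGDC standards themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Common bounding rectangle in decimal degrees."""
    west: float
    south: float
    east: float
    north: float


# Assumed field names for illustration; verify against each metadata standard.
FIELD_MAPS = {
    "ECS": {
        "west": "WestBoundingCoordinate",
        "south": "SouthBoundingCoordinate",
        "east": "EastBoundingCoordinate",
        "north": "NorthBoundingCoordinate",
    },
    "FGDC": {
        "west": "westbc",
        "south": "southbc",
        "east": "eastbc",
        "north": "northbc",
    },
}


def normalize_footprint(record, standard):
    """Translate a catalog-specific record into the common Footprint."""
    mapping = FIELD_MAPS[standard]
    return Footprint(**{common: float(record[native])
                        for common, native in mapping.items()})


# Made-up sample records describing the same 1 degree cell.
ecs_record = {
    "WestBoundingCoordinate": "-75.0",
    "SouthBoundingCoordinate": "40.0",
    "EastBoundingCoordinate": "-74.0",
    "NorthBoundingCoordinate": "41.0",
}
fgdc_record = {"westbc": -75.0, "southbc": 40.0, "eastbc": -74.0, "northbc": 41.0}

ecs_footprint = normalize_footprint(ecs_record, "ECS")
fgdc_footprint = normalize_footprint(fgdc_record, "FGDC")
```

Once both records are expressed as Footprint values, they can be compared or queried without regard to the catalog they came from.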

Many county-level planning applications require only a subset of the Landsat imagery to cover an area of interest. There is therefore also a need to efficiently discover all of the current image, topographic and cartographic data available for an area, browse the metadata for these data to determine their usefulness, and retrieve only the highest quality data required to cover the study area, in a form and with sufficiently accurate georegistration that they can be used together by a classifier.

4. Data

This scenario involves Landsat 7 ETM+ imagery, USGS 1 degree DEM, and USGS 1:100K Transportation DLG data. These data repositories, and the catalogs containing metadata that reference them, are located on different network servers (but this is hidden from the planner). That is to say, these catalogs are heterogeneous and distributed on the Internet, but linked logically in an earth science data federation.

5. Actors

Two actors are identified for this scenario. The primary actor is the Planning Engineer who wants to use the data described above to generate a current land use/land cover thematic map for Morris County, NJ. The Engineer must locate these data, determine their suitability for classification purposes, and retrieve a useful subset of these ready for ingest by a local Geographical Information System. Secondary actors in this scenario are the Data Providers who are responsible for archiving Landsat ETM+, DEM and DLG data on network servers, and for updating digital catalogs with metadata that reference these new data as they are added to local archives.

6. Use Case

A Planning Engineer in the Morris County Park Commission (MCPC) wants to create a USGS Level I land use/land cover classification for Morris County, NJ. Currently, the Park Commission has only historical black-and-white aerial photography for the county, so this engineer wishes to locate all available multispectral satellite imagery that covers Morris County. Since the county contains areas of relatively steep relief in the northwest and southeast, this engineer believes that digital elevation data for the same area could be used to improve the classification of image pixels obtained from an unsupervised classification. This engineer has ISDN access to the WWW/Internet and a workstation with a WWW browser, and so wants to use the Web to obtain the satellite imagery and topographic data. Since Morris County forms part of the New York Metropolitan region and is heavily trafficked, he also wants to acquire county boundary and transportation features to overlay on his land use/land cover layer. Once these data are obtained, the engineer can better determine how much of the forested park land managed by the MCPC is located in mountainous areas.

(1) Because the engineer does not have a map handy from which to determine coordinates, he begins his search for useful data by drawing a bounding box around a boundary line graph of Morris County presented in his client. Since he is also interested in identifying any data objects whose content relates to "land cover classification," he also enters this character string in a form provided by his client. The bounding box is used to query the footprint metadata attributes, and the character string is used to query the full-text content indices, of all geospatial catalogs in the earth science data federation.

(2) An Object ID (OID) is returned for each geospatial data object whose footprint intersects the one drawn by the engineer. In this scenario, four OIDs are listed: two ETM+ objects, a DEM object, and a DLG object.
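Steps (1) and (2) can be sketched as a footprint intersection test run over every catalog in the federation. The scenario does not say how the bounding box and the character string are combined, so this sketch assumes the footprint decides which OIDs are returned and the phrase is only reported as a flag. The catalog names, OIDs, footprints and the rough bounding box for Morris County are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        """True when the two rectangles share at least one point."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class CatalogEntry:
    oid: str
    footprint: BBox
    text: str


def search_federation(catalogs, query_box, phrase):
    """Return {oid: phrase_found} for every entry whose footprint intersects."""
    hits = {}
    for entries in catalogs.values():
        for entry in entries:
            if entry.footprint.intersects(query_box):
                hits[entry.oid] = phrase.lower() in entry.text.lower()
    return hits


# Hypothetical federation with two catalogs.
catalogs = {
    "imagery": [
        CatalogEntry("ETM-001", BBox(-75.6, 40.0, -73.4, 41.7),
                     "Landsat 7 ETM+ scene for land cover classification"),
        CatalogEntry("ETM-002", BBox(-75.9, 39.8, -73.7, 41.5),
                     "Landsat 7 ETM+ scene"),
        CatalogEntry("ETM-003", BBox(-120.0, 35.0, -118.0, 37.0),
                     "Landsat 7 ETM+ scene"),
    ],
    "cartography": [
        CatalogEntry("DEM-001", BBox(-75.0, 40.0, -74.0, 41.0),
                     "USGS 1 degree digital elevation model"),
        CatalogEntry("DLG-001", BBox(-75.0, 40.5, -74.0, 41.0),
                     "USGS 1:100K transportation digital line graph"),
    ],
}

# Rough, illustrative box around Morris County (west, south, east, north).
morris_box = BBox(-74.9, 40.6, -74.2, 41.1)
hits = search_federation(catalogs, morris_box, "land cover classification")
```

With this sample data the query returns four OIDs, matching the two ETM+ objects, one DEM object and one DLG object listed in the scenario, and skips the scene that lies elsewhere.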

(3) The engineer browses the metadata for one of the ETM+ objects and discovers that it has 90% cloud cover. While browsing metadata for the other, he learns that only 20% of it is covered by clouds. He inspects a browse image for this second ETM+ object and finds that most of the clouds are located outside the area containing Morris County.

(4) Satisfied that the second ETM+ image is useful for making his classification, he creates an order to extract bands 3, 4, 5 and PAN, limited to the portion of each raster needed to cover his bounding box. This ETM+ data access order is stored in a "shopping cart" object.
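Limiting the extract to the bounding box amounts to converting the box into a row and column window for each band. The sketch below assumes a north-up raster, a bounding box already expressed in the raster's projected coordinates, and square pixels; the function name and the sample scene geometry are hypothetical. The same function would be called with a 30 m pixel size for bands 3, 4 and 5 and a 15 m pixel size for the panchromatic band.

```python
import math


def pixel_window(origin_x, origin_y, pixel_size, n_cols, n_rows,
                 xmin, ymin, xmax, ymax):
    """Return (row_start, row_stop, col_start, col_stop) covering the box.

    origin_x, origin_y is the upper-left corner of the raster. Stop values
    are exclusive. Returns None when the box lies outside the raster.
    """
    col_start = math.floor((xmin - origin_x) / pixel_size)
    col_stop = math.ceil((xmax - origin_x) / pixel_size)
    # Rows count downward from the northern edge of the raster.
    row_start = math.floor((origin_y - ymax) / pixel_size)
    row_stop = math.ceil((origin_y - ymin) / pixel_size)

    # Clip the window to the extent of the raster.
    col_start, col_stop = max(col_start, 0), min(col_stop, n_cols)
    row_start, row_stop = max(row_start, 0), min(row_stop, n_rows)
    if col_start >= col_stop or row_start >= row_stop:
        return None
    return row_start, row_stop, col_start, col_stop


# Hypothetical 30 m scene: 7000 columns by 6000 rows.
window = pixel_window(500000.0, 4600000.0, 30.0, 7000, 6000,
                      xmin=530000.0, ymin=4500000.0,
                      xmax=560000.0, ymax=4530000.0)
```

Rounding the start down and the stop up ensures that every pixel touched by the bounding box is included in the extract.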

(5) In a similar manner, the engineer browses metadata for the DEM and DLG objects returned from his catalog query to confirm that they also cover his area of interest. Then he creates orders to extract data from these, again storing with each order his bounding box coordinates so that he obtains only the data required for his application. The DEM and DLG data access orders are also added to the shopping cart.

(6) The ETM+, DEM and DLG data access information stored in the shopping cart is sent to networked Data Access Servers. The three extracts are retrieved and stored locally for use by the engineer's image classifier and GIS.
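Steps (4) through (6) can be sketched as a cart of data access orders that is split by data type and handed to the responsible Data Access Server. This is an illustration under stated assumptions, not the GeoLens design: the Order and ShoppingCart classes, the OIDs, the bounding box values and the server names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Order:
    oid: str
    data_type: str      # "ETM+", "DEM" or "DLG"
    bbox: tuple         # (west, south, east, north), stored with each order
    bands: tuple = ()   # only meaningful for imagery


@dataclass
class ShoppingCart:
    orders: list = field(default_factory=list)

    def add(self, order):
        self.orders.append(order)

    def dispatch(self, routes):
        """Group the orders by the server responsible for each data type."""
        batches = defaultdict(list)
        for order in self.orders:
            batches[routes[order.data_type]].append(order)
        return dict(batches)


# Hypothetical server names, one per data type.
routes = {
    "ETM+": "etm-access.example",
    "DEM": "dem-access.example",
    "DLG": "dlg-access.example",
}

box = (-74.9, 40.6, -74.2, 41.1)  # rough, illustrative area of interest
cart = ShoppingCart()
cart.add(Order("ETM-002", "ETM+", box, bands=("3", "4", "5", "PAN")))
cart.add(Order("DEM-001", "DEM", box))
cart.add(Order("DLG-001", "DLG", box))
batches = cart.dispatch(routes)
```

Storing the bounding box with each order keeps every extract limited to the same area, so the three results can be used together once they are retrieved.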