Creating a STAC of Landsat data#

In this tutorial we create a STAC of Landsat 8 data from Amazon Web Service’s Open Data program. There’s a lot of Landsat scenes, so we’ll only take a subset of scenes that are from a specific year and over a specific location. We’ll translate existing metadata about each scene to STAC information, utilizing the eo, view, and proj extensions. Finally we’ll write out the STAC catalog to our local machine, allowing us to use stac-browser to preview the images.

Requirements#

To run this tutorial you’ll have needed to install PySTAC with the validation extra. To do this, use:

pip install pystac[validation]

Also to run this notebook you’ll need jupyter installed locally as well. If you’re running in a docker container, make sure that port 5555 is exposed if you want to run the server at the end of the notebook.

[1]:
from typing import Any, Dict, List, Optional, cast
from typing_extensions import TypedDict

import pystac
from pystac.extensions.eo import EOExtension
from pystac.extensions.projection import ProjectionExtension
from pystac.extensions.view import ViewExtension

Reading Landsat 8 scene data#

AWS keeps a scene list that includes information about where the scene is located and how to access it’s information. Landsat 8 was reorganized into a Collection system in the past, and so there’s two places to read a scene list from (as mentioned on the aws page). We’ll pull from the data that is organized as Collection 1 data, which is where all new processed data is added since 2017.

[2]:
import csv
from datetime import datetime
import gzip
from io import StringIO
from urllib.request import urlopen

# Here we use a TypedDict to define the structure of the Landsat scene dictionaries
# we will be working with. You can read more about the TypedDict class here:
# https://docs.python.org/3/library/typing.html#typing.TypedDict
class Scene(TypedDict):
    productId: str
    entityId: str
    acquisitionDate: str
    cloudCover: str
    processingLevel: str
    path: str
    row: str
    min_lat: str
    min_lon: str
    max_lat: str
    max_lon: str
    download_url: str

# Collection 1 data
url = 'https://landsat-pds.s3.amazonaws.com/c1/L8/scene_list.gz'

# Read and unzip the content
response = urlopen(url)
gunzip_response = gzip.GzipFile(fileobj=response)
content = gunzip_response.read()

# Read the scenes in as dictionaries
scenes: List[Scene] = list(map(lambda s: cast(Scene, s), csv.DictReader(StringIO(content.decode("utf-8")))))
[3]:
len(scenes)
[3]:
2390885

As you can see, there are a lot of scenes! Even though STAC items contain just the metadata for the scenes (and not the bulky raster image), this would still be a lot of data and files to work with for this tutorial.

Let’s see what one of the scenes looks like, and then filter on those properties to scope things down.

[4]:
scenes[0]
[4]:
{'productId': 'LC08_L1TP_149039_20170411_20170415_01_T1',
 'entityId': 'LC81490392017101LGN00',
 'acquisitionDate': '2017-04-11 05:36:29.349932',
 'cloudCover': '0.0',
 'processingLevel': 'L1TP',
 'path': '149',
 'row': '39',
 'min_lat': '29.22165',
 'min_lon': '72.41205',
 'max_lat': '31.34742',
 'max_lon': '74.84666',
 'download_url': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/149/039/LC08_L1TP_149039_20170411_20170415_01_T1/index.html'}

As you can see, we have both a date and a location that we can filter on.

Filter scenes by a location#

Let’s pick a location to filter the scenes by. Here we choose Philly, but feel free to change the location by modifying the latitude and longitude coordinates below and changing the location name:

[5]:
lat, lon = 39.9526, -75.1652
location_name = "Philly"
filter_year = '2020'

We’ll use the coordinates and the year to filter out any unwanted scenes:

[6]:
def keep_scene(scene: Scene) -> bool:
    contains_location = float(scene['min_lat']) < lat and float(scene['max_lat']) > lat and \
                        float(scene['min_lon']) < lon and float(scene['max_lon']) > lon
    is_correct_year = '{}-'.format(filter_year) in scene['acquisitionDate']
    return contains_location and is_correct_year

location_scenes = [scene for scene in scenes if keep_scene(scene)]

This should leave us with a much more manageable subset of scenes:

[7]:
len(location_scenes)
[7]:
77

We’ll be working with a single scene through the next few sections, so let’s use the first scene in our list:

[8]:
scene = location_scenes[0]
scene
[8]:
{'productId': 'LC08_L1GT_015032_20200103_20200103_01_RT',
 'entityId': 'LC80150322020003LGN00',
 'acquisitionDate': '2020-01-03 15:46:15.625235',
 'cloudCover': '100.0',
 'processingLevel': 'L1GT',
 'path': '15',
 'row': '32',
 'min_lat': '39.23189',
 'min_lon': '-77.83282',
 'max_lat': '41.37536',
 'max_lon': '-75.02666',
 'download_url': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/index.html'}

Read metadata from the MTL file#

Landsat 8 metadata is contained in an MTL file that is alongside of the download_url file specified in the scene data. Let’s read the MTL file for the first scene and see what it looks like.

First we define a function that reads a file based on the download_url location - we’ll be using this a lot to get file URLs related to a scene:

[9]:
def get_asset_url(scene: Scene, suffix: str) -> str:
    product_id = scene['productId']
    download_url = scene['download_url']
    asset_filename = '{}_{}'.format(product_id, suffix)
    return download_url.replace('index.html', asset_filename)

We can then use function to get the MTL file for our scene. Notice we use pystac.STAC_IO.read_text - this is the method that PySTAC uses to read text as it crawls a STAC. It can read from the local filesystem or HTTP/HTTPS by default. Also, it can be extended to read from other sources such as cloud providers - see the documentation here. For now we’ll use it directly as an easy way to read a text file from an HTTPS source:

[10]:
stac_io = pystac.StacIO.default()
mtl_url = get_asset_url(scene, 'MTL.txt')
print(stac_io.read_text(mtl_url))
GROUP = L1_METADATA_FILE
  GROUP = METADATA_FILE_INFO
    ORIGIN = "Image courtesy of the U.S. Geological Survey"
    REQUEST_ID = "0502001033167_00013"
    LANDSAT_SCENE_ID = "LC80150322020003LGN00"
    LANDSAT_PRODUCT_ID = "LC08_L1GT_015032_20200103_20200103_01_RT"
    COLLECTION_NUMBER = 01
    FILE_DATE = 2020-01-03T21:39:14Z
    STATION_ID = "LGN"
    PROCESSING_SOFTWARE_VERSION = "LPGS_13.1.0"
  END_GROUP = METADATA_FILE_INFO
  GROUP = PRODUCT_METADATA
    DATA_TYPE = "L1GT"
    COLLECTION_CATEGORY = "RT"
    ELEVATION_SOURCE = "GLS2000"
    OUTPUT_FORMAT = "GEOTIFF"
    SPACECRAFT_ID = "LANDSAT_8"
    SENSOR_ID = "OLI_TIRS"
    WRS_PATH = 15
    WRS_ROW = 32
    NADIR_OFFNADIR = "NADIR"
    TARGET_WRS_PATH = 15
    TARGET_WRS_ROW = 32
    DATE_ACQUIRED = 2020-01-03
    SCENE_CENTER_TIME = "15:46:15.6252350Z"
    CORNER_UL_LAT_PRODUCT = 41.37536
    CORNER_UL_LON_PRODUCT = -77.83282
    CORNER_UR_LAT_PRODUCT = 41.41024
    CORNER_UR_LON_PRODUCT = -75.02752
    CORNER_LL_LAT_PRODUCT = 39.23189
    CORNER_LL_LON_PRODUCT = -77.74460
    CORNER_LR_LAT_PRODUCT = 39.26423
    CORNER_LR_LON_PRODUCT = -75.02666
    CORNER_UL_PROJECTION_X_PRODUCT = 263100.000
    CORNER_UL_PROJECTION_Y_PRODUCT = 4584300.000
    CORNER_UR_PROJECTION_X_PRODUCT = 497700.000
    CORNER_UR_PROJECTION_Y_PRODUCT = 4584300.000
    CORNER_LL_PROJECTION_X_PRODUCT = 263100.000
    CORNER_LL_PROJECTION_Y_PRODUCT = 4346100.000
    CORNER_LR_PROJECTION_X_PRODUCT = 497700.000
    CORNER_LR_PROJECTION_Y_PRODUCT = 4346100.000
    PANCHROMATIC_LINES = 15881
    PANCHROMATIC_SAMPLES = 15641
    REFLECTIVE_LINES = 7941
    REFLECTIVE_SAMPLES = 7821
    THERMAL_LINES = 7941
    THERMAL_SAMPLES = 7821
    FILE_NAME_BAND_1 = "LC08_L1GT_015032_20200103_20200103_01_RT_B1.TIF"
    FILE_NAME_BAND_2 = "LC08_L1GT_015032_20200103_20200103_01_RT_B2.TIF"
    FILE_NAME_BAND_3 = "LC08_L1GT_015032_20200103_20200103_01_RT_B3.TIF"
    FILE_NAME_BAND_4 = "LC08_L1GT_015032_20200103_20200103_01_RT_B4.TIF"
    FILE_NAME_BAND_5 = "LC08_L1GT_015032_20200103_20200103_01_RT_B5.TIF"
    FILE_NAME_BAND_6 = "LC08_L1GT_015032_20200103_20200103_01_RT_B6.TIF"
    FILE_NAME_BAND_7 = "LC08_L1GT_015032_20200103_20200103_01_RT_B7.TIF"
    FILE_NAME_BAND_8 = "LC08_L1GT_015032_20200103_20200103_01_RT_B8.TIF"
    FILE_NAME_BAND_9 = "LC08_L1GT_015032_20200103_20200103_01_RT_B9.TIF"
    FILE_NAME_BAND_10 = "LC08_L1GT_015032_20200103_20200103_01_RT_B10.TIF"
    FILE_NAME_BAND_11 = "LC08_L1GT_015032_20200103_20200103_01_RT_B11.TIF"
    FILE_NAME_BAND_QUALITY = "LC08_L1GT_015032_20200103_20200103_01_RT_BQA.TIF"
    ANGLE_COEFFICIENT_FILE_NAME = "LC08_L1GT_015032_20200103_20200103_01_RT_ANG.txt"
    METADATA_FILE_NAME = "LC08_L1GT_015032_20200103_20200103_01_RT_MTL.txt"
    CPF_NAME = "LC08CPF_20200101_20200331_01.02"
    BPF_NAME_OLI = "LO8BPF20200103153915_20200103162306.02"
    BPF_NAME_TIRS = "LT8BPF20200101032539_20200101033839.01"
    RLUT_FILE_NAME = "LC08RLUT_20150303_20431231_01_12.h5"
  END_GROUP = PRODUCT_METADATA
  GROUP = IMAGE_ATTRIBUTES
    CLOUD_COVER = 100.00
    CLOUD_COVER_LAND = 100.00
    IMAGE_QUALITY_OLI = 9
    IMAGE_QUALITY_TIRS = 7
    TIRS_SSM_MODEL = "PRELIMINARY"
    TIRS_SSM_POSITION_STATUS = "ESTIMATED"
    TIRS_STRAY_LIGHT_CORRECTION_SOURCE = "TIRS"
    ROLL_ANGLE = -0.001
    SUN_AZIMUTH = 158.89720783
    SUN_ELEVATION = 23.89789093
    EARTH_SUN_DISTANCE = 0.9832512
    SATURATION_BAND_1 = "N"
    SATURATION_BAND_2 = "N"
    SATURATION_BAND_3 = "N"
    SATURATION_BAND_4 = "N"
    SATURATION_BAND_5 = "N"
    SATURATION_BAND_6 = "N"
    SATURATION_BAND_7 = "N"
    SATURATION_BAND_8 = "N"
    SATURATION_BAND_9 = "N"
    TRUNCATION_OLI = "UPPER"
  END_GROUP = IMAGE_ATTRIBUTES
  GROUP = MIN_MAX_RADIANCE
    RADIANCE_MAXIMUM_BAND_1 = 786.17712
    RADIANCE_MINIMUM_BAND_1 = -64.92276
    RADIANCE_MAXIMUM_BAND_2 = 805.05499
    RADIANCE_MINIMUM_BAND_2 = -66.48170
    RADIANCE_MAXIMUM_BAND_3 = 741.85126
    RADIANCE_MINIMUM_BAND_3 = -61.26232
    RADIANCE_MAXIMUM_BAND_4 = 625.57080
    RADIANCE_MINIMUM_BAND_4 = -51.65984
    RADIANCE_MAXIMUM_BAND_5 = 382.81812
    RADIANCE_MINIMUM_BAND_5 = -31.61325
    RADIANCE_MAXIMUM_BAND_6 = 95.20338
    RADIANCE_MINIMUM_BAND_6 = -7.86193
    RADIANCE_MAXIMUM_BAND_7 = 32.08864
    RADIANCE_MINIMUM_BAND_7 = -2.64989
    RADIANCE_MAXIMUM_BAND_8 = 707.97394
    RADIANCE_MINIMUM_BAND_8 = -58.46472
    RADIANCE_MAXIMUM_BAND_9 = 149.61400
    RADIANCE_MINIMUM_BAND_9 = -12.35517
    RADIANCE_MAXIMUM_BAND_10 = 22.00180
    RADIANCE_MINIMUM_BAND_10 = 0.10033
    RADIANCE_MAXIMUM_BAND_11 = 22.00180
    RADIANCE_MINIMUM_BAND_11 = 0.10033
  END_GROUP = MIN_MAX_RADIANCE
  GROUP = MIN_MAX_REFLECTANCE
    REFLECTANCE_MAXIMUM_BAND_1 = 1.210700
    REFLECTANCE_MINIMUM_BAND_1 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_2 = 1.210700
    REFLECTANCE_MINIMUM_BAND_2 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_3 = 1.210700
    REFLECTANCE_MINIMUM_BAND_3 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_4 = 1.210700
    REFLECTANCE_MINIMUM_BAND_4 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_5 = 1.210700
    REFLECTANCE_MINIMUM_BAND_5 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_6 = 1.210700
    REFLECTANCE_MINIMUM_BAND_6 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_7 = 1.210700
    REFLECTANCE_MINIMUM_BAND_7 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_8 = 1.210700
    REFLECTANCE_MINIMUM_BAND_8 = -0.099980
    REFLECTANCE_MAXIMUM_BAND_9 = 1.210700
    REFLECTANCE_MINIMUM_BAND_9 = -0.099980
  END_GROUP = MIN_MAX_REFLECTANCE
  GROUP = MIN_MAX_PIXEL_VALUE
    QUANTIZE_CAL_MAX_BAND_1 = 65535
    QUANTIZE_CAL_MIN_BAND_1 = 1
    QUANTIZE_CAL_MAX_BAND_2 = 65535
    QUANTIZE_CAL_MIN_BAND_2 = 1
    QUANTIZE_CAL_MAX_BAND_3 = 65535
    QUANTIZE_CAL_MIN_BAND_3 = 1
    QUANTIZE_CAL_MAX_BAND_4 = 65535
    QUANTIZE_CAL_MIN_BAND_4 = 1
    QUANTIZE_CAL_MAX_BAND_5 = 65535
    QUANTIZE_CAL_MIN_BAND_5 = 1
    QUANTIZE_CAL_MAX_BAND_6 = 65535
    QUANTIZE_CAL_MIN_BAND_6 = 1
    QUANTIZE_CAL_MAX_BAND_7 = 65535
    QUANTIZE_CAL_MIN_BAND_7 = 1
    QUANTIZE_CAL_MAX_BAND_8 = 65535
    QUANTIZE_CAL_MIN_BAND_8 = 1
    QUANTIZE_CAL_MAX_BAND_9 = 65535
    QUANTIZE_CAL_MIN_BAND_9 = 1
    QUANTIZE_CAL_MAX_BAND_10 = 65535
    QUANTIZE_CAL_MIN_BAND_10 = 1
    QUANTIZE_CAL_MAX_BAND_11 = 65535
    QUANTIZE_CAL_MIN_BAND_11 = 1
  END_GROUP = MIN_MAX_PIXEL_VALUE
  GROUP = RADIOMETRIC_RESCALING
    RADIANCE_MULT_BAND_1 = 1.2987E-02
    RADIANCE_MULT_BAND_2 = 1.3299E-02
    RADIANCE_MULT_BAND_3 = 1.2255E-02
    RADIANCE_MULT_BAND_4 = 1.0334E-02
    RADIANCE_MULT_BAND_5 = 6.3239E-03
    RADIANCE_MULT_BAND_6 = 1.5727E-03
    RADIANCE_MULT_BAND_7 = 5.3008E-04
    RADIANCE_MULT_BAND_8 = 1.1695E-02
    RADIANCE_MULT_BAND_9 = 2.4715E-03
    RADIANCE_MULT_BAND_10 = 3.3420E-04
    RADIANCE_MULT_BAND_11 = 3.3420E-04
    RADIANCE_ADD_BAND_1 = -64.93575
    RADIANCE_ADD_BAND_2 = -66.49500
    RADIANCE_ADD_BAND_3 = -61.27457
    RADIANCE_ADD_BAND_4 = -51.67017
    RADIANCE_ADD_BAND_5 = -31.61957
    RADIANCE_ADD_BAND_6 = -7.86350
    RADIANCE_ADD_BAND_7 = -2.65042
    RADIANCE_ADD_BAND_8 = -58.47642
    RADIANCE_ADD_BAND_9 = -12.35764
    RADIANCE_ADD_BAND_10 = 0.10000
    RADIANCE_ADD_BAND_11 = 0.10000
    REFLECTANCE_MULT_BAND_1 = 2.0000E-05
    REFLECTANCE_MULT_BAND_2 = 2.0000E-05
    REFLECTANCE_MULT_BAND_3 = 2.0000E-05
    REFLECTANCE_MULT_BAND_4 = 2.0000E-05
    REFLECTANCE_MULT_BAND_5 = 2.0000E-05
    REFLECTANCE_MULT_BAND_6 = 2.0000E-05
    REFLECTANCE_MULT_BAND_7 = 2.0000E-05
    REFLECTANCE_MULT_BAND_8 = 2.0000E-05
    REFLECTANCE_MULT_BAND_9 = 2.0000E-05
    REFLECTANCE_ADD_BAND_1 = -0.100000
    REFLECTANCE_ADD_BAND_2 = -0.100000
    REFLECTANCE_ADD_BAND_3 = -0.100000
    REFLECTANCE_ADD_BAND_4 = -0.100000
    REFLECTANCE_ADD_BAND_5 = -0.100000
    REFLECTANCE_ADD_BAND_6 = -0.100000
    REFLECTANCE_ADD_BAND_7 = -0.100000
    REFLECTANCE_ADD_BAND_8 = -0.100000
    REFLECTANCE_ADD_BAND_9 = -0.100000
  END_GROUP = RADIOMETRIC_RESCALING
  GROUP = TIRS_THERMAL_CONSTANTS
    K1_CONSTANT_BAND_10 = 774.8853
    K2_CONSTANT_BAND_10 = 1321.0789
    K1_CONSTANT_BAND_11 = 480.8883
    K2_CONSTANT_BAND_11 = 1201.1442
  END_GROUP = TIRS_THERMAL_CONSTANTS
  GROUP = PROJECTION_PARAMETERS
    MAP_PROJECTION = "UTM"
    DATUM = "WGS84"
    ELLIPSOID = "WGS84"
    UTM_ZONE = 18
    GRID_CELL_SIZE_PANCHROMATIC = 15.00
    GRID_CELL_SIZE_REFLECTIVE = 30.00
    GRID_CELL_SIZE_THERMAL = 30.00
    ORIENTATION = "NORTH_UP"
    RESAMPLING_OPTION = "CUBIC_CONVOLUTION"
  END_GROUP = PROJECTION_PARAMETERS
END_GROUP = L1_METADATA_FILE
END

The MTL file contains metadata in a text format that’s a bit hard to use as-is; we can parse things out to a dict for easier access:

[11]:
def get_metadata(url: str) -> Dict[str, Any]:
    """ Convert Landsat MTL file to dictionary of metadata values """
    mtl = {}
    mtl_text = stac_io.read_text(url)
    for line in mtl_text.split('\n'):
        meta = line.replace('\"', "").strip().split('=')
        if len(meta) > 1:
            key = meta[0].strip()
            item = meta[1].strip()
            if key != "GROUP" and key != "END_GROUP":
                mtl[key] = item
    return mtl
[12]:
metadata = get_metadata(mtl_url)
metadata
[12]:
{'ORIGIN': 'Image courtesy of the U.S. Geological Survey',
 'REQUEST_ID': '0502001033167_00013',
 'LANDSAT_SCENE_ID': 'LC80150322020003LGN00',
 'LANDSAT_PRODUCT_ID': 'LC08_L1GT_015032_20200103_20200103_01_RT',
 'COLLECTION_NUMBER': '01',
 'FILE_DATE': '2020-01-03T21:39:14Z',
 'STATION_ID': 'LGN',
 'PROCESSING_SOFTWARE_VERSION': 'LPGS_13.1.0',
 'DATA_TYPE': 'L1GT',
 'COLLECTION_CATEGORY': 'RT',
 'ELEVATION_SOURCE': 'GLS2000',
 'OUTPUT_FORMAT': 'GEOTIFF',
 'SPACECRAFT_ID': 'LANDSAT_8',
 'SENSOR_ID': 'OLI_TIRS',
 'WRS_PATH': '15',
 'WRS_ROW': '32',
 'NADIR_OFFNADIR': 'NADIR',
 'TARGET_WRS_PATH': '15',
 'TARGET_WRS_ROW': '32',
 'DATE_ACQUIRED': '2020-01-03',
 'SCENE_CENTER_TIME': '15:46:15.6252350Z',
 'CORNER_UL_LAT_PRODUCT': '41.37536',
 'CORNER_UL_LON_PRODUCT': '-77.83282',
 'CORNER_UR_LAT_PRODUCT': '41.41024',
 'CORNER_UR_LON_PRODUCT': '-75.02752',
 'CORNER_LL_LAT_PRODUCT': '39.23189',
 'CORNER_LL_LON_PRODUCT': '-77.74460',
 'CORNER_LR_LAT_PRODUCT': '39.26423',
 'CORNER_LR_LON_PRODUCT': '-75.02666',
 'CORNER_UL_PROJECTION_X_PRODUCT': '263100.000',
 'CORNER_UL_PROJECTION_Y_PRODUCT': '4584300.000',
 'CORNER_UR_PROJECTION_X_PRODUCT': '497700.000',
 'CORNER_UR_PROJECTION_Y_PRODUCT': '4584300.000',
 'CORNER_LL_PROJECTION_X_PRODUCT': '263100.000',
 'CORNER_LL_PROJECTION_Y_PRODUCT': '4346100.000',
 'CORNER_LR_PROJECTION_X_PRODUCT': '497700.000',
 'CORNER_LR_PROJECTION_Y_PRODUCT': '4346100.000',
 'PANCHROMATIC_LINES': '15881',
 'PANCHROMATIC_SAMPLES': '15641',
 'REFLECTIVE_LINES': '7941',
 'REFLECTIVE_SAMPLES': '7821',
 'THERMAL_LINES': '7941',
 'THERMAL_SAMPLES': '7821',
 'FILE_NAME_BAND_1': 'LC08_L1GT_015032_20200103_20200103_01_RT_B1.TIF',
 'FILE_NAME_BAND_2': 'LC08_L1GT_015032_20200103_20200103_01_RT_B2.TIF',
 'FILE_NAME_BAND_3': 'LC08_L1GT_015032_20200103_20200103_01_RT_B3.TIF',
 'FILE_NAME_BAND_4': 'LC08_L1GT_015032_20200103_20200103_01_RT_B4.TIF',
 'FILE_NAME_BAND_5': 'LC08_L1GT_015032_20200103_20200103_01_RT_B5.TIF',
 'FILE_NAME_BAND_6': 'LC08_L1GT_015032_20200103_20200103_01_RT_B6.TIF',
 'FILE_NAME_BAND_7': 'LC08_L1GT_015032_20200103_20200103_01_RT_B7.TIF',
 'FILE_NAME_BAND_8': 'LC08_L1GT_015032_20200103_20200103_01_RT_B8.TIF',
 'FILE_NAME_BAND_9': 'LC08_L1GT_015032_20200103_20200103_01_RT_B9.TIF',
 'FILE_NAME_BAND_10': 'LC08_L1GT_015032_20200103_20200103_01_RT_B10.TIF',
 'FILE_NAME_BAND_11': 'LC08_L1GT_015032_20200103_20200103_01_RT_B11.TIF',
 'FILE_NAME_BAND_QUALITY': 'LC08_L1GT_015032_20200103_20200103_01_RT_BQA.TIF',
 'ANGLE_COEFFICIENT_FILE_NAME': 'LC08_L1GT_015032_20200103_20200103_01_RT_ANG.txt',
 'METADATA_FILE_NAME': 'LC08_L1GT_015032_20200103_20200103_01_RT_MTL.txt',
 'CPF_NAME': 'LC08CPF_20200101_20200331_01.02',
 'BPF_NAME_OLI': 'LO8BPF20200103153915_20200103162306.02',
 'BPF_NAME_TIRS': 'LT8BPF20200101032539_20200101033839.01',
 'RLUT_FILE_NAME': 'LC08RLUT_20150303_20431231_01_12.h5',
 'CLOUD_COVER': '100.00',
 'CLOUD_COVER_LAND': '100.00',
 'IMAGE_QUALITY_OLI': '9',
 'IMAGE_QUALITY_TIRS': '7',
 'TIRS_SSM_MODEL': 'PRELIMINARY',
 'TIRS_SSM_POSITION_STATUS': 'ESTIMATED',
 'TIRS_STRAY_LIGHT_CORRECTION_SOURCE': 'TIRS',
 'ROLL_ANGLE': '-0.001',
 'SUN_AZIMUTH': '158.89720783',
 'SUN_ELEVATION': '23.89789093',
 'EARTH_SUN_DISTANCE': '0.9832512',
 'SATURATION_BAND_1': 'N',
 'SATURATION_BAND_2': 'N',
 'SATURATION_BAND_3': 'N',
 'SATURATION_BAND_4': 'N',
 'SATURATION_BAND_5': 'N',
 'SATURATION_BAND_6': 'N',
 'SATURATION_BAND_7': 'N',
 'SATURATION_BAND_8': 'N',
 'SATURATION_BAND_9': 'N',
 'TRUNCATION_OLI': 'UPPER',
 'RADIANCE_MAXIMUM_BAND_1': '786.17712',
 'RADIANCE_MINIMUM_BAND_1': '-64.92276',
 'RADIANCE_MAXIMUM_BAND_2': '805.05499',
 'RADIANCE_MINIMUM_BAND_2': '-66.48170',
 'RADIANCE_MAXIMUM_BAND_3': '741.85126',
 'RADIANCE_MINIMUM_BAND_3': '-61.26232',
 'RADIANCE_MAXIMUM_BAND_4': '625.57080',
 'RADIANCE_MINIMUM_BAND_4': '-51.65984',
 'RADIANCE_MAXIMUM_BAND_5': '382.81812',
 'RADIANCE_MINIMUM_BAND_5': '-31.61325',
 'RADIANCE_MAXIMUM_BAND_6': '95.20338',
 'RADIANCE_MINIMUM_BAND_6': '-7.86193',
 'RADIANCE_MAXIMUM_BAND_7': '32.08864',
 'RADIANCE_MINIMUM_BAND_7': '-2.64989',
 'RADIANCE_MAXIMUM_BAND_8': '707.97394',
 'RADIANCE_MINIMUM_BAND_8': '-58.46472',
 'RADIANCE_MAXIMUM_BAND_9': '149.61400',
 'RADIANCE_MINIMUM_BAND_9': '-12.35517',
 'RADIANCE_MAXIMUM_BAND_10': '22.00180',
 'RADIANCE_MINIMUM_BAND_10': '0.10033',
 'RADIANCE_MAXIMUM_BAND_11': '22.00180',
 'RADIANCE_MINIMUM_BAND_11': '0.10033',
 'REFLECTANCE_MAXIMUM_BAND_1': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_1': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_2': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_2': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_3': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_3': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_4': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_4': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_5': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_5': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_6': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_6': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_7': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_7': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_8': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_8': '-0.099980',
 'REFLECTANCE_MAXIMUM_BAND_9': '1.210700',
 'REFLECTANCE_MINIMUM_BAND_9': '-0.099980',
 'QUANTIZE_CAL_MAX_BAND_1': '65535',
 'QUANTIZE_CAL_MIN_BAND_1': '1',
 'QUANTIZE_CAL_MAX_BAND_2': '65535',
 'QUANTIZE_CAL_MIN_BAND_2': '1',
 'QUANTIZE_CAL_MAX_BAND_3': '65535',
 'QUANTIZE_CAL_MIN_BAND_3': '1',
 'QUANTIZE_CAL_MAX_BAND_4': '65535',
 'QUANTIZE_CAL_MIN_BAND_4': '1',
 'QUANTIZE_CAL_MAX_BAND_5': '65535',
 'QUANTIZE_CAL_MIN_BAND_5': '1',
 'QUANTIZE_CAL_MAX_BAND_6': '65535',
 'QUANTIZE_CAL_MIN_BAND_6': '1',
 'QUANTIZE_CAL_MAX_BAND_7': '65535',
 'QUANTIZE_CAL_MIN_BAND_7': '1',
 'QUANTIZE_CAL_MAX_BAND_8': '65535',
 'QUANTIZE_CAL_MIN_BAND_8': '1',
 'QUANTIZE_CAL_MAX_BAND_9': '65535',
 'QUANTIZE_CAL_MIN_BAND_9': '1',
 'QUANTIZE_CAL_MAX_BAND_10': '65535',
 'QUANTIZE_CAL_MIN_BAND_10': '1',
 'QUANTIZE_CAL_MAX_BAND_11': '65535',
 'QUANTIZE_CAL_MIN_BAND_11': '1',
 'RADIANCE_MULT_BAND_1': '1.2987E-02',
 'RADIANCE_MULT_BAND_2': '1.3299E-02',
 'RADIANCE_MULT_BAND_3': '1.2255E-02',
 'RADIANCE_MULT_BAND_4': '1.0334E-02',
 'RADIANCE_MULT_BAND_5': '6.3239E-03',
 'RADIANCE_MULT_BAND_6': '1.5727E-03',
 'RADIANCE_MULT_BAND_7': '5.3008E-04',
 'RADIANCE_MULT_BAND_8': '1.1695E-02',
 'RADIANCE_MULT_BAND_9': '2.4715E-03',
 'RADIANCE_MULT_BAND_10': '3.3420E-04',
 'RADIANCE_MULT_BAND_11': '3.3420E-04',
 'RADIANCE_ADD_BAND_1': '-64.93575',
 'RADIANCE_ADD_BAND_2': '-66.49500',
 'RADIANCE_ADD_BAND_3': '-61.27457',
 'RADIANCE_ADD_BAND_4': '-51.67017',
 'RADIANCE_ADD_BAND_5': '-31.61957',
 'RADIANCE_ADD_BAND_6': '-7.86350',
 'RADIANCE_ADD_BAND_7': '-2.65042',
 'RADIANCE_ADD_BAND_8': '-58.47642',
 'RADIANCE_ADD_BAND_9': '-12.35764',
 'RADIANCE_ADD_BAND_10': '0.10000',
 'RADIANCE_ADD_BAND_11': '0.10000',
 'REFLECTANCE_MULT_BAND_1': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_2': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_3': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_4': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_5': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_6': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_7': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_8': '2.0000E-05',
 'REFLECTANCE_MULT_BAND_9': '2.0000E-05',
 'REFLECTANCE_ADD_BAND_1': '-0.100000',
 'REFLECTANCE_ADD_BAND_2': '-0.100000',
 'REFLECTANCE_ADD_BAND_3': '-0.100000',
 'REFLECTANCE_ADD_BAND_4': '-0.100000',
 'REFLECTANCE_ADD_BAND_5': '-0.100000',
 'REFLECTANCE_ADD_BAND_6': '-0.100000',
 'REFLECTANCE_ADD_BAND_7': '-0.100000',
 'REFLECTANCE_ADD_BAND_8': '-0.100000',
 'REFLECTANCE_ADD_BAND_9': '-0.100000',
 'K1_CONSTANT_BAND_10': '774.8853',
 'K2_CONSTANT_BAND_10': '1321.0789',
 'K1_CONSTANT_BAND_11': '480.8883',
 'K2_CONSTANT_BAND_11': '1201.1442',
 'MAP_PROJECTION': 'UTM',
 'DATUM': 'WGS84',
 'ELLIPSOID': 'WGS84',
 'UTM_ZONE': '18',
 'GRID_CELL_SIZE_PANCHROMATIC': '15.00',
 'GRID_CELL_SIZE_REFLECTIVE': '30.00',
 'GRID_CELL_SIZE_THERMAL': '30.00',
 'ORIENTATION': 'NORTH_UP',
 'RESAMPLING_OPTION': 'CUBIC_CONVOLUTION'}

Create a STAC Item from a scene#

Now that we have metadata for the scene let’s use it to create a STAC Item.

We can use the help method to see the signature of the __init__ method on pystac.Item. You can also call help directly on pystac.Item for broader documentation, or check the API docs for Item here.

[13]:
help(pystac.Item.__init__)
Help on function __init__ in module pystac.item:

__init__(self, id: str, geometry: Union[Dict[str, Any], NoneType], bbox: Union[List[float], NoneType], datetime: Union[datetime.datetime, NoneType], properties: Dict[str, Any], stac_extensions: Union[List[str], NoneType] = None, href: Union[str, NoneType] = None, collection: Union[str, pystac.collection.Collection, NoneType] = None, extra_fields: Union[Dict[str, Any], NoneType] = None)
    Initialize self.  See help(type(self)) for accurate signature.

We can see we’ll need at least an id, geometry, bbox, and datetime. Properties is required, but can be an empty dictionary that we fill out on the Item once it’s created.

Item id#

For the Item’s id, we’ll use the scene ID. We’ll chop off the last 5 characters as they are repeated for each ID and so aren’t necessary:

[14]:
def get_item_id(metadata: Dict[str, Any]) -> str:
    return cast(str, metadata['LANDSAT_SCENE_ID'][:-5])
[15]:
item_id = get_item_id(metadata)
item_id
[15]:
'LC80150322020003'

Item datetime#

Here we parse the datetime of the Item from two metadata fields that describe the date and time the scene was captured:

[16]:
from dateutil.parser import parse

def get_datetime(metadata: Dict[str, Any]) -> datetime:
    return parse('%sT%s' % (metadata['DATE_ACQUIRED'], metadata['SCENE_CENTER_TIME']))
[17]:
item_datetime = get_datetime(metadata)
item_datetime
[17]:
datetime.datetime(2020, 1, 3, 15, 46, 15, 625235, tzinfo=tzutc())

Item bbox#

Here we read in the bounding box information from the scene and transform it into the format of the Item’s bbox property:

[18]:
def get_bbox(metadata: Dict[str, Any]) -> List[float]:
    coords = [[
        [float(metadata['CORNER_UL_LON_PRODUCT']), float(metadata['CORNER_UL_LAT_PRODUCT'])],
        [float(metadata['CORNER_UR_LON_PRODUCT']), float(metadata['CORNER_UR_LAT_PRODUCT'])],
        [float(metadata['CORNER_LR_LON_PRODUCT']), float(metadata['CORNER_LR_LAT_PRODUCT'])],
        [float(metadata['CORNER_LL_LON_PRODUCT']), float(metadata['CORNER_LL_LAT_PRODUCT'])],
        [float(metadata['CORNER_UL_LON_PRODUCT']), float(metadata['CORNER_UL_LAT_PRODUCT'])]
    ]]
    lats = [c[1] for c in coords[0]]
    lons = [c[0] for c in coords[0]]
    return [min(lons), min(lats), max(lons), max(lats)]
[19]:
item_bbox = get_bbox(metadata)
item_bbox
[19]:
[-77.83282, 39.23189, -75.02666, 41.41024]

Item geometry#

Getting the geometry of the scene is a little more tricky. The bounding box will be a axis-aligned rectangle of the area the scene occupies, but will not represent the true footprint of the image - Landsat 8 scenes are “tilted” according the the coordinate reference system, so there will be areas in the corner where no image data exists. When constructing a STAC Item it’s best if you have the Item geometry represent the true footprint of the assets.

To get the footprint of the scene we’ll read in another metadata file that lives alongside the MTL - the ANG.txt file. This function uses the ANG file and the bbox to construct the GeoJSON polygon that represents the footprint of the scene:

[20]:
def get_geometry(scene: Scene, bbox: List[float]) -> Dict[str, Any]:
    url = get_asset_url(scene, 'ANG.txt')
    sz = []
    coords = []
    ang_text = stac_io.read_text(url)
    for line in ang_text.split('\n'):
        if 'BAND01_NUM_L1T_LINES' in line or 'BAND01_NUM_L1T_SAMPS' in line:
            sz.append(float(line.split('=')[1]))
        if 'BAND01_L1T_IMAGE_CORNER_LINES' in line or 'BAND01_L1T_IMAGE_CORNER_SAMPS' in line:
            coords.append([float(l) for l in line.split('=')[1].strip().strip('()').split(',')])
        if len(coords) == 2:
            break
    dlon = bbox[2] - bbox[0]
    dlat = bbox[3] - bbox[1]
    lons = [c/sz[1] * dlon + bbox[0] for c in coords[1]]
    lats = [((sz[0] - c)/sz[0]) * dlat + bbox[1] for c in coords[0]]
    coordinates = [[
        [lons[0], lats[0]], [lons[1], lats[1]], [lons[2], lats[2]], [lons[3], lats[3]], [lons[0], lats[0]]
    ]]

    return {'type': 'Polygon', 'coordinates': coordinates}
[21]:
item_geometry = get_geometry(scene, item_bbox)
item_geometry
[21]:
{'type': 'Polygon',
 'coordinates': [[[-77.24189342621209, 41.40875922921183],
   [-75.02888361862159, 40.97145077644256],
   [-75.62135920205826, 39.23299959871144],
   [-77.83174166996123, 39.67752690559828],
   [-77.24189342621209, 41.40875922921183]]]}

This would be a good time to check our work - we can print out the GeoJSON and use geojson.io to check and make sure we’re using scenes that overlap our location. If this footprint is somewhere unexpected in the world, make sure the Lat/Long coordinates are correct and in the right order!

[22]:
import json

print(json.dumps(item_geometry, indent=2))
{
  "type": "Polygon",
  "coordinates": [
    [
      [
        -77.24189342621209,
        41.40875922921183
      ],
      [
        -75.02888361862159,
        40.97145077644256
      ],
      [
        -75.62135920205826,
        39.23299959871144
      ],
      [
        -77.83174166996123,
        39.67752690559828
      ],
      [
        -77.24189342621209,
        41.40875922921183
      ]
    ]
  ]
}

Create the item#

Now that we have the required attributes for an Item we can create it:

[23]:
item = pystac.Item(id=item_id,
                  datetime=item_datetime,
                  geometry=item_geometry,
                  bbox=item_bbox,
                  properties={})

PySTAC has a validate method on STAC objects, which you can use to make sure you’re constructing things correctly. If there’s an issue the following line will throw an exception:

[24]:
item.validate()
[24]:
['https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json']

Add Ground Sample Distance to common metadata#

We’ll add the Ground Sample Distance that is defined as part of the Item Common Metadata. We define this on the Item level as 30 meters, which is the GSD for most of the bands of Landsat 8. However, there are some bands that have a different resolution; we will account for this by setting the GSD explicitly for each of those bands below.

[25]:
item.common_metadata.gsd = 30.0

Adding the EO extension#

STAC has a rich set of extensions that allow STAC objects to encode information that is not part of the core spec but is used widely and standardized. An example of this is the eo extension, which encapsulates data that that represents a snapshot of the earth for a single date and time.

We can enable the eo extension for this item by using the ext property that exists on all STAC objects:

[26]:
eo_ext = EOExtension.ext(item, add_if_missing=True)

Add cloud cover#

Here we add cloud cover from the metadata as part of the eo extension.

[27]:
def get_cloud_cover(metadata: Dict[str, Any]) -> float:
    return float(metadata['CLOUD_COVER'])
[28]:
eo_ext.cloud_cover = get_cloud_cover(metadata)
eo_ext.cloud_cover
[28]:
100.0

Adding assets#

STAC Items contain a list of Assets, which are a list of files that relate to the Item. In our case we’ll be cataloging each file related to the scene, including the Landsat 8 band files as well as the metadata files associated with the scene.

Here we define a dictionary that describes the band assets. We use the eo extension’s Band class to encapsulate information about the band each file represents, and also specify the Ground Sample Distance of each band:

[29]:
from pystac.extensions.eo import Band

class BandInfo(TypedDict):
    band: Band
    gsd: float


landsat_band_info: Dict[str, BandInfo] = {
    'B1': {
        'band': Band.create(name="B1", common_name="coastal", center_wavelength=0.48, full_width_half_max=0.02),
        'gsd': 30.0
    },
    'B2': {
        'band': Band.create(name="B2", common_name="blue", center_wavelength=0.44, full_width_half_max=0.06),
        'gsd': 30.0
    },
    'B3': {
        'band': Band.create(name="B3", common_name="green", center_wavelength=0.56, full_width_half_max=0.06),
        'gsd': 30.0
    },
    'B4': {
        'band': Band.create(name="B4", common_name="red", center_wavelength=0.65, full_width_half_max=0.04),
        'gsd': 30.0
    },
    'B5': {
        'band': Band.create(name="B5", common_name="nir", center_wavelength=0.86, full_width_half_max=0.03),
        'gsd': 30.0
    },
    'B6': {
        'band': Band.create(name="B6", common_name="swir16", center_wavelength=1.6, full_width_half_max=0.08),
        'gsd': 30.0
    },
    'B7': {
        'band': Band.create(name="B7", common_name="swir22", center_wavelength=2.2, full_width_half_max=0.2),
        'gsd': 30.0
    },
    'B8': {
        'band': Band.create(name="B8", common_name="pan", center_wavelength=0.59, full_width_half_max=0.18),
        'gsd': 15.0
    },
    'B9': {
        'band': Band.create(name="B9", common_name="cirrus", center_wavelength=1.37, full_width_half_max=0.02),
        'gsd': 30.0
    },
    'B10': {
        'band': Band.create(name="B10", common_name="lwir11", center_wavelength=10.9, full_width_half_max=0.8),
        'gsd': 100.0
    },
    'B11': {
        'band': Band.create(name="B11", common_name="lwir12", center_wavelength=12.0, full_width_half_max=1.0),
        'gsd': 100.0
    }
}

There are also other non-band assets associated with a scene, and we specify the Asset’s URL and media type here, along with the key we will refer to each asset by:

[30]:
def get_other_assets(scene: Scene) -> Dict[str, Dict[str, Any]]:
    return {
    'thumbnail': {
        'href': get_asset_url(scene, 'thumb_large.jpg'),
        'media_type': pystac.MediaType.JPEG
    },
    'index': {
        'href': get_asset_url(scene, 'index.html'),
        'media_type': 'application/html'
    },
    'ANG': {
        'href': get_asset_url(scene, 'ANG.txt'),
        'media_type': 'text/plain'
    },
    'MTL': {
        'href': get_asset_url(scene, 'MTL.txt'),
        'media_type': 'text/plain'
    },
    'BQA': {
        'href': get_asset_url(scene, 'BQA.TIF'),
        'media_type': pystac.MediaType.GEOTIFF
    }
}

With this information we can now define a method that adds all the relevant assets for a scene and add them to an item:

[31]:
def add_assets(item: pystac.Item, scene: Scene) -> None:
    # Add bands
    for band_id, band_info in landsat_band_info.items():
        band_url = get_asset_url(scene, '{}.TIF'.format(band_id))
        asset = pystac.Asset(href=band_url, media_type=pystac.MediaType.COG)
        bands = [band_info['band']]
        asset_eo_ext = EOExtension.ext(asset)
        asset_eo_ext.bands = bands
        item.add_asset(band_id, asset)

        # If this asset has a different GSD than the item, set it on the asset
        if band_info['gsd'] != item.common_metadata.gsd:
            asset.common_metadata.gsd = band_info["gsd"]

    # Add other assets
    for asset_id, asset_info in get_other_assets(scene).items():
        item.add_asset(asset_id,
                       pystac.Asset(href=asset_info['href'], media_type=asset_info['media_type']))
[32]:
add_assets(item, scene)

The logic for the Assets is such that if the gsd of the Asset is different from the Item’s GSD (30 meters), the Asset’s GSD will be specified directly on the Asset. We can see this by comparing the dict encoding of the Assets for band 10 and band 3:

[33]:
item.assets['B10'].to_dict()
[33]:
{'href': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/LC08_L1GT_015032_20200103_20200103_01_RT_B10.TIF',
 'type': <MediaType.COG: 'image/tiff; application=geotiff; profile=cloud-optimized'>,
 'eo:bands': [{'name': 'B10',
   'common_name': 'lwir11',
   'center_wavelength': 10.9,
   'full_width_half_max': 0.8}],
 'gsd': 100.0}
[34]:
item.assets['B3'].to_dict()
[34]:
{'href': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/LC08_L1GT_015032_20200103_20200103_01_RT_B3.TIF',
 'type': <MediaType.COG: 'image/tiff; application=geotiff; profile=cloud-optimized'>,
 'eo:bands': [{'name': 'B3',
   'common_name': 'green',
   'center_wavelength': 0.56,
   'full_width_half_max': 0.06}]}

Here we see the thumbnail asset, which does not include the band information for the eo extension as it does not represent any of the Item’s bands:

[35]:
item.assets['thumbnail'].to_dict()
[35]:
{'href': 'https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/015/032/LC08_L1GT_015032_20200103_20200103_01_RT/LC08_L1GT_015032_20200103_20200103_01_RT_thumb_large.jpg',
 'type': <MediaType.JPEG: 'image/jpeg'>}

Add projection information#

We can specify the EPSG code for the scene as part of the projection extension. The below method figures out the correct UTM Zone EPSG code based on the center latitude of the scene:

[36]:
def get_epsg(metadata: Dict[str, Any], min_lat: float, max_lat: float) -> Optional[int]:
    if 'UTM_ZONE' in metadata:
        center_lat = (min_lat + max_lat)/2.0
        return int(('326' if center_lat > 0 else '327') + metadata['UTM_ZONE'])
    else:
        return None
[ ]:
proj_ext = ProjectionExtension.ext(item, add_if_missing=True)
assert item.bbox is not None
proj_ext.epsg = get_epsg(metadata, item.bbox[1], item.bbox[3])
proj_ext.epsg

Add view geometry information#

The View Geometry extension specifies information related to angles of sensors and other radiance angles that affect the view of resulting data. The Landsat 8 metadata specifies two of these parameters, so we add them to our Item:

[37]:
def get_view_info(metadata: Dict[str, Any]) -> Dict[str, float]:
    return { 'sun_azimuth': float(metadata['SUN_AZIMUTH']),
             'sun_elevation': float(metadata['SUN_ELEVATION']) }
[38]:
view_ext = ViewExtension.ext(item, add_if_missing=True)
view_info = get_view_info(metadata)
view_ext.sun_azimuth = view_info['sun_azimuth']
view_ext.sun_elevation = view_info['sun_elevation']
item.properties
[38]:
{'datetime': '2020-01-03T15:46:15.625235Z',
 'gsd': 30.0,
 'eo:cloud_cover': 100.0,
 'view:sun_azimuth': 158.89720783,
 'view:sun_elevation': 23.89789093}

Now that we’ve added all the metadata to the item, let’s check the validator to make sure we’ve specified everything correctly. The validation logic will take into account the new extensions that have been enabled and validate against the proper schemas for those extensions.

[39]:
item.validate()
[39]:
['https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json',
 'https://stac-extensions.github.io/eo/v1.0.0/schema.json',
 'https://stac-extensions.github.io/view/v1.0.0/schema.json']

Building the Collection#

Now that we know how to build an item for a scene, let’s build the collection that will contain all the Items.

If we look at the __init__ method for pystac.Collection, we can see what properties are required:

[40]:
help(pystac.Collection.__init__)
Help on function __init__ in module pystac.collection:

__init__(self, id: str, description: str, extent: pystac.collection.Extent, title: Union[str, NoneType] = None, stac_extensions: Union[List[str], NoneType] = None, href: Union[str, NoneType] = None, extra_fields: Union[Dict[str, Any], NoneType] = None, catalog_type: Union[pystac.catalog.CatalogType, NoneType] = None, license: str = 'proprietary', keywords: Union[List[str], NoneType] = None, providers: Union[List[ForwardRef('Provider_Type')], NoneType] = None, summaries: Union[pystac.summaries.Summaries, NoneType] = None)
    Initialize self.  See help(type(self)) for accurate signature.

Collection id#

We’ll use the location name we defined above in the ID for our Collection:

[41]:
collection_id = '{}-landsat-8'.format(location_name)
collection_id
[41]:
'Philly-landsat-8'

Collection title#

Here we set a simple title for our collection.

[42]:
collection_title = '2020 Landsat 8 images over {}'.format(location_name)
collection_title
[42]:
'2020 Landsat 8 images over Philly'

Collection description#

Here we give a brief description of the Collection. If this were a real Collection that were being published, I’d recommend putting a much more detailed description to ensure anyone using your STAC knows what they are working with!

Notice we are using Markdown to write the description. The description field can be Markdown to help tools that render information about STAC to display the information in a more readable way.

[43]:
collection_description = '''### {} Landsat 8

A collection of Landsat 8 scenes around {} in 2020.
'''.format(location_name, location_name)
print(collection_description)
### Philly Landsat 8

A collection of Landsat 8 scenes around Philly in 2020.

Collection extent#

A Collection specifies the spatial and temporal extent of all the item it contains. Since Landsat 8 spans the globe, we’ll simply put a global extent here. We’ll also specify an open-ended time interval that starts with the first datetime for scenes hosted by AWS.

Towards the end of the notebook, we’ll use a method to easily scope this down to cover the times and space the Items occupy once we’ve added all the items.

[44]:
from datetime import datetime

spatial_extent = pystac.SpatialExtent([[-180., -90., 180., 90.]])
temporal_extent = pystac.TemporalExtent([[datetime(2013, 6, 1), None]])
collection_extent = pystac.Extent(spatial_extent, temporal_extent)
[45]:
collection = pystac.Collection(id=collection_id,
                     title=collection_title,
                     description=collection_description,
                     extent=collection_extent)

We can now look at our Collection as a dict to check our values.

[46]:
collection.to_dict()
[46]:
{'type': 'Collection',
 'id': 'Philly-landsat-8',
 'stac_version': '1.0.0',
 'description': '### Philly Landsat 8\n\nA collection of Landsat 8 scenes around Philly in 2020.\n',
 'links': [{'rel': <RelType.ROOT: 'root'>,
   'href': None,
   'type': <MediaType.JSON: 'application/json'>,
   'title': '2020 Landsat 8 images over Philly'}],
 'stac_extensions': [],
 'title': '2020 Landsat 8 images over Philly',
 'extent': {'spatial': {'bbox': [[-180, -90, 180, 90]]},
  'temporal': {'interval': [['2013-06-01T00:00:00Z', None]]}},
 'license': 'proprietary'}

Set the license#

Notice the license above is proprietary. This is the default in PySTAC if no license is specified; however Landsat 8 is certainly not proprietary (thankfully!), so let’s change the license to the correct SPDX string for public domain data:

[47]:
collection_license = 'PDDL-1.0'

Set the providers#

A collection will specify the providers of the data, including what role they have played. We can set our provider information by instantiating pystac.Provider objects:

[48]:
collection.providers = [
    pystac.Provider(name='USGS', roles=[pystac.ProviderRole.PRODUCER], url='https://landsat.usgs.gov/'),
    pystac.Provider(name='Planet Labs', roles=[pystac.ProviderRole.PROCESSOR], url='https://github.com/landsat-pds/landsat_ingestor'),
    pystac.Provider(name='AWS', roles=[pystac.ProviderRole.HOST], url='https://landsatonaws.com/')
]

Create items for each scene#

We created an Item for a single scene above. This method consolidates that logic into a single method that can construct an Item from a scene, so we can create an Item for every scene in our subset:

[49]:
def item_from_scene(scene: Scene) -> pystac.Item:
    mtl_url = get_asset_url(scene, 'MTL.txt')
    metadata = get_metadata(mtl_url)

    bbox = get_bbox(metadata)
    item = pystac.Item(id=get_item_id(metadata),
                       datetime=get_datetime(metadata),
                       geometry=get_geometry(scene, bbox),
                       bbox=bbox,
                       properties={})

    item.common_metadata.gsd = 30.0

    item_eo_ext = EOExtension.ext(item, add_if_missing=True)
    item_eo_ext.cloud_cover = get_cloud_cover(metadata)

    add_assets(item, scene)

    item_proj_ext = ProjectionExtension.ext(item, add_if_missing=True)
    assert item.bbox is not None
    item_proj_ext.epsg = get_epsg(metadata, item.bbox[1], item.bbox[3])

    item_view_ext = ViewExtension.ext(item, add_if_missing=True)
    view_info = get_view_info(metadata)
    item_view_ext.sun_azimuth = view_info['sun_azimuth']
    item_view_ext.sun_elevation = view_info['sun_elevation']

    item.validate()

    return item

Here we create an item per scene and add it to our collection. Since this is reading multiple metadata files per scene from the internet, it may take a little bit to run:

[50]:
for scene in location_scenes:
    item = item_from_scene(scene)
    collection.add_item(item)

Reset collection extent based on items#

Now that we’ve added all the item we can use the update_extent_from_items method on the Collection to set the extent based on the contained items:

[51]:
collection.update_extent_from_items()
collection.extent.to_dict()
[51]:
{'spatial': {'bbox': [[-77.88298, 39.23073, -73.46322, 41.41025]]},
 'temporal': {'interval': [['2020-01-03T15:46:15.625235Z',
    '2020-12-29T15:40:09.950514Z']]}}

Set the HREFs for everything in the catalog#

We’ve been building up our Collection and Items in memory. This has been convenient as it allows us not to think about file paths as we construct our Catalog. However, a STAC is not valid without any HREFs!

We can use the normalize_hrefs method to set all the HREFs in the entire STAC based on a root directory. This will use the STAC Best Practices recommendations for STAC file layout for each Catalog, Collection and Item in the STAC.

Here we use that method and set the root directory to a subdirectory of our user’s home directory:

[52]:
from pathlib import Path

root_path = str(Path.home() / '{}-landsat-stac'.format(location_name))

collection.normalize_hrefs(root_path)

Now that we have all the Collection’s data set and HREFs in place we can validate the entire STAC using validate_all, which recursively crawls through a catalog and validates every STAC object in the catalog:

[53]:
collection.validate_all()

Write the catalog locally#

Now that we have our complete, validated STAC in memory, let’s write it out. This is as simple as calling save on the Collection. We need to specify the type of catalog in order to property write out links - these types are described again in the STAC Best Practices documentation.

We’ll use the “self contained” type, which uses relative paths and does not specify absolute “self” links to any object. This makes the catalog more portable, as it remains valid even if you copy the STAC to new locations.

[54]:
collection.save(pystac.CatalogType.SELF_CONTAINED)

Now that we’ve written our STAC out we probably want to view it. We can use the describe method to print out a simple representation of the catalog:

[55]:
collection.describe()
* <Collection id=Philly-landsat-8>
  * <Item id=LC80150322020003>
  * <Item id=LC80140322020012>
  * <Item id=LC80150322020019>
  * <Item id=LC80140322020012>
  * <Item id=LC80150322020019>
  * <Item id=LC80140322020028>
  * <Item id=LC80150322020035>
  * <Item id=LC80140322020028>
  * <Item id=LC80140322020044>
  * <Item id=LC80150322020051>
  * <Item id=LC80150322020051>
  * <Item id=LC80150322020051>
  * <Item id=LC80150322020051>
  * <Item id=LC80140322020060>
  * <Item id=LC80150322020067>
  * <Item id=LC80140322020060>
  * <Item id=LC80150322020067>
  * <Item id=LC80140322020076>
  * <Item id=LC80150322020083>
  * <Item id=LC80140322020076>
  * <Item id=LC80140322020092>
  * <Item id=LC80150322020099>
  * <Item id=LC80140322020092>
  * <Item id=LC80140322020108>
  * <Item id=LC80140322020108>
  * <Item id=LC80150322020115>
  * <Item id=LC80150322020099>
  * <Item id=LC80140322020124>
  * <Item id=LC80150322020131>
  * <Item id=LC80140322020124>
  * <Item id=LC80140322020140>
  * <Item id=LC80150322020147>
  * <Item id=LC80150322020131>
  * <Item id=LC80140322020140>
  * <Item id=LC80140322020156>
  * <Item id=LC80150322020147>
  * <Item id=LC80140322020156>
  * <Item id=LC80150322020163>
  * <Item id=LC80140322020172>
  * <Item id=LC80150322020179>
  * <Item id=LC80150322020163>
  * <Item id=LC80140322020188>
  * <Item id=LC80140322020172>
  * <Item id=LC80150322020179>
  * <Item id=LC80150322020195>
  * <Item id=LC80150322020211>
  * <Item id=LC80150322020195>
  * <Item id=LC80140322020188>
  * <Item id=LC80140322020204>
  * <Item id=LC80150322020227>
  * <Item id=LC80140322020220>
  * <Item id=LC80150322020243>
  * <Item id=LC80140322020236>
  * <Item id=LC80140322020252>
  * <Item id=LC80150322020243>
  * <Item id=LC80150322020243>
  * <Item id=LC80140322020236>
  * <Item id=LC80150322020259>
  * <Item id=LC80150322020259>
  * <Item id=LC80140322020252>
  * <Item id=LC80140322020268>
  * <Item id=LC80150322020275>
  * <Item id=LC80150322020275>
  * <Item id=LC80140322020268>
  * <Item id=LC80140322020284>
  * <Item id=LC80140322020284>
  * <Item id=LC80150322020291>
  * <Item id=LC80140322020300>
  * <Item id=LC80140322020332>
  * <Item id=LC80150322020339>
  * <Item id=LC80150322020323>
  * <Item id=LC80140322020332>
  * <Item id=LC80140322020348>
  * <Item id=LC80150322020339>
  * <Item id=LC80140322020348>
  * <Item id=LC80150322020355>
  * <Item id=LC80140322020364>

We can also use the to_dict method on individual STAC objects in order to see the data, as we’ve been doing in the tutorial:

[56]:
collection.to_dict()
[56]:
{'type': 'Collection',
 'id': 'Philly-landsat-8',
 'stac_version': '1.0.0',
 'description': '### Philly Landsat 8\n\nA collection of Landsat 8 scenes around Philly in 2020.\n',
 'links': [{'rel': <RelType.ROOT: 'root'>,
   'href': './collection.json',
   'type': <MediaType.JSON: 'application/json'>,
   'title': '2020 Landsat 8 images over Philly'},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020003/LC80150322020003.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020012/LC80140322020012.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020019/LC80150322020019.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020012/LC80140322020012.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020019/LC80150322020019.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020028/LC80140322020028.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020035/LC80150322020035.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020028/LC80140322020028.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020044/LC80140322020044.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020051/LC80150322020051.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020051/LC80150322020051.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020051/LC80150322020051.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020051/LC80150322020051.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020060/LC80140322020060.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020067/LC80150322020067.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020060/LC80140322020060.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020067/LC80150322020067.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020076/LC80140322020076.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020083/LC80150322020083.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020076/LC80140322020076.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020092/LC80140322020092.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020099/LC80150322020099.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020092/LC80140322020092.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020108/LC80140322020108.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020108/LC80140322020108.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020115/LC80150322020115.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020099/LC80150322020099.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020124/LC80140322020124.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020131/LC80150322020131.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020124/LC80140322020124.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020140/LC80140322020140.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020147/LC80150322020147.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020131/LC80150322020131.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020140/LC80140322020140.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020156/LC80140322020156.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020147/LC80150322020147.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020156/LC80140322020156.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020163/LC80150322020163.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020172/LC80140322020172.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020179/LC80150322020179.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020163/LC80150322020163.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020188/LC80140322020188.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020172/LC80140322020172.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020179/LC80150322020179.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020195/LC80150322020195.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020211/LC80150322020211.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020195/LC80150322020195.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020188/LC80140322020188.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020204/LC80140322020204.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020227/LC80150322020227.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020220/LC80140322020220.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020243/LC80150322020243.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020236/LC80140322020236.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020252/LC80140322020252.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020243/LC80150322020243.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020243/LC80150322020243.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020236/LC80140322020236.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020259/LC80150322020259.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020259/LC80150322020259.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020252/LC80140322020252.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020268/LC80140322020268.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020275/LC80150322020275.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020275/LC80150322020275.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020268/LC80140322020268.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020284/LC80140322020284.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020284/LC80140322020284.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020291/LC80150322020291.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020300/LC80140322020300.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020332/LC80140322020332.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020339/LC80150322020339.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020323/LC80150322020323.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020332/LC80140322020332.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020348/LC80140322020348.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020339/LC80150322020339.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020348/LC80140322020348.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80150322020355/LC80150322020355.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.ITEM: 'item'>,
   'href': './LC80140322020364/LC80140322020364.json',
   'type': <MediaType.JSON: 'application/json'>},
  {'rel': <RelType.SELF: 'self'>,
   'href': '/Users/jduckworth/Philly-landsat-stac/collection.json',
   'type': <MediaType.JSON: 'application/json'>}],
 'stac_extensions': [],
 'title': '2020 Landsat 8 images over Philly',
 'extent': {'spatial': {'bbox': [[-77.88298, 39.23073, -73.46322, 41.41025]]},
  'temporal': {'interval': [['2020-01-03T15:46:15.625235Z',
     '2020-12-29T15:40:09.950514Z']]}},
 'license': 'proprietary',
 'providers': [{'name': 'USGS',
   'roles': ['producer'],
   'url': 'https://landsat.usgs.gov/'},
  {'name': 'Planet Labs',
   'roles': ['processor'],
   'url': 'https://github.com/landsat-pds/landsat_ingestor'},
  {'name': 'AWS', 'roles': ['host'], 'url': 'https://landsatonaws.com/'}]}

However, if we want to browse our STAC more interactively, we can use the stac-browser tool to read our local STAC.

We can use this simple Python server (copied from this gist) to serve our our directory at port 5555:

[57]:
import os
from http.server import HTTPServer, SimpleHTTPRequestHandler

os.chdir(root_path)

class CORSRequestHandler(SimpleHTTPRequestHandler):
    def end_headers(self) -> None:
        self.send_header('Access-Control-Allow-Origin', '*')
        self.send_header('Access-Control-Allow-Methods', 'GET')
        self.send_header('Cache-Control', 'no-store, no-cache, must-revalidate')
        return super(CORSRequestHandler, self).end_headers()


with HTTPServer(('localhost', 5555), CORSRequestHandler) as httpd:
    httpd.serve_forever()
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
/var/folders/td/xzlhhvqn6gn7_zmrjc0zfb6c0000gn/T/ipykernel_13162/2713200350.py in <module>
     13
     14 with HTTPServer(('localhost', 5555), CORSRequestHandler) as httpd:
---> 15     httpd.serve_forever()

~/.pyenv/versions/3.8.9/lib/python3.8/socketserver.py in serve_forever(self, poll_interval)
    230
    231                 while not self.__shutdown_request:
--> 232                     ready = selector.select(poll_interval)
    233                     # bpo-35017: shutdown() called during select(), exit immediately.
    234                     if self.__shutdown_request:

~/.pyenv/versions/3.8.9/lib/python3.8/selectors.py in select(self, timeout)
    413         ready = []
    414         try:
--> 415             fd_event_list = self._selector.poll(timeout)
    416         except InterruptedError:
    417             return ready

KeyboardInterrupt:

Now you can follow the stac-browser instructions for starting a stac-browser instance and point it at http://localhost:5555/collection.json to serve out the STAC!

To quit the server, use the Kernel -> Interrupt menu option.

Acknowledgements#

Credit to sat-stac-landsat from which a lot of this code was based.