Create and manipulate SpaceNet Vegas STAC

This tutorial shows how to create and manipulate STACs using pystac.

  • Create (in memory) a pystac catalog of SpaceNet 2 imagery from the Las Vegas AOI, using data hosted in a public S3 bucket

  • Set relative paths for all STAC objects

  • Normalize links from a root directory and save the STAC there

[12]:
import sys

sys.path.append("..")

You may need to install the following packages, which are not included in the Python 3 standard library. If you do not have any of these installed, you can install them with pip:

boto3: pip install boto3
botocore: pip install botocore
rasterio: pip install rasterio
shapely: pip install Shapely
rio-cogeo: pip install rio-cogeo
[13]:
from datetime import datetime
from os.path import basename, join

import boto3
import rasterio
import pystac
from shapely.geometry import GeometryCollection, box, shape, mapping

Create SpaceNet Vegas STAC

Initialize a STAC for the SpaceNet 2 dataset

[14]:
spacenet = pystac.Catalog(id="spacenet", description="SpaceNet 2 STAC")

We do not yet know the spatial extent of the Vegas AOI. We will determine it once we have read the bounds of all of the images. As a placeholder, we create a spatial extent of null values.

[15]:
sp_extent = pystac.SpatialExtent([None, None, None, None])

The capture date for SpaceNet 2 Vegas imagery is October 22, 2015. Create a Python datetime object from that date, and use it as the start of an open-ended temporal extent.

[16]:
capture_date = datetime.strptime("2015-10-22", "%Y-%m-%d")
tmp_extent = pystac.TemporalExtent([(capture_date, None)])
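
The STAC specification expects timestamps in UTC, while `strptime` with a date-only format returns a naive datetime at midnight. The following is a minimal standard-library sketch of what the parsed value looks like and how an explicit UTC timezone could be attached. The name `capture_date_utc` is introduced here for illustration only and is not used elsewhere in this tutorial.

```python
from datetime import datetime, timezone

# Same parsing call as in the tutorial: date only, so the time is midnight.
capture_date = datetime.strptime("2015-10-22", "%Y-%m-%d")
print(capture_date.tzinfo)  # None: the parsed datetime is timezone-naive

# Illustrative assumption: attach UTC explicitly instead of leaving it naive.
capture_date_utc = capture_date.replace(tzinfo=timezone.utc)
print(capture_date_utc.isoformat())  # 2015-10-22T00:00:00+00:00
```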

Create an Extent object that will define both the spatial and temporal extents of the Vegas collection

[17]:
extent = pystac.Extent(sp_extent, tmp_extent)

Create a collection that will encompass the Vegas data and add it to the spacenet catalog

[18]:
vegas = pystac.Collection(
    id="vegas", description="Vegas SpaceNet 2 dataset", extent=extent
)
spacenet.add_child(vegas)
[19]:
spacenet.describe()
* <Catalog id=spacenet>
    * <Collection id=vegas>

Find the locations of SpaceNet images. In order to make this example quicker, we will limit the number of scenes that we use to 10.

[20]:
client = boto3.client("s3")
scenes = client.list_objects(
    Bucket="spacenet-dataset",
    Prefix="spacenet/SN2_buildings/train/AOI_2_Vegas/PS-RGB/",
    MaxKeys=20,
)
scenes = [s["Key"] for s in scenes["Contents"] if s["Key"].endswith(".tif")][0:10]
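
The listing above requests up to 20 keys but keeps only 10, because the `.tif` filter runs on the client after the response arrives, so any non-image keys under the prefix also count toward `MaxKeys`. The sketch below replays that filtering step on a hand-written dictionary so it can run offline. The `response` contents and the key names are hypothetical stand-ins, not real bucket contents.

```python
# Hypothetical stand-in for the dictionary returned by list_objects.
# The keys are invented for illustration and do not come from the bucket.
response = {
    "Contents": [
        {"Key": "example-prefix/PS-RGB/"},
        {"Key": "example-prefix/PS-RGB/img_1.tif"},
        {"Key": "example-prefix/PS-RGB/img_1.tif.aux.xml"},
        {"Key": "example-prefix/PS-RGB/img_2.tif"},
    ]
}

limit = 10  # same cap as the tutorial's [0:10] slice

# Keep only GeoTIFF keys, then truncate to the requested number of scenes.
tif_keys = [
    obj["Key"] for obj in response["Contents"] if obj["Key"].endswith(".tif")
][:limit]

print(tif_keys)
```

If fewer than `limit` keys survive the filter, the slice simply returns what is available rather than raising an error.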

For each scene, create an item with a defined bounding box. Each item will include the GeoTIFF as an asset. We will add labels in the next section.

[21]:
for scene in scenes:
    uri = join("s3://spacenet-dataset/", scene)
    params = {}
    params["id"] = basename(uri).split(".")[0]
    with rasterio.open(uri) as src:
        params["bbox"] = list(src.bounds)
        params["geometry"] = mapping(box(*params["bbox"]))
    params["datetime"] = capture_date
    params["properties"] = {}
    i = pystac.Item(**params)
    i.add_asset(
        key="image",
        asset=pystac.Asset(
            href=uri, title="Geotiff", media_type=pystac.MediaType.GEOTIFF
        ),
    )
    vegas.add_item(i)
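
To make the two derived fields in the loop concrete, here is a standard-library sketch of how an item ID falls out of a URI and how a bounding box maps to a GeoJSON polygon. The helper names `item_id_from_uri` and `bbox_to_polygon`, the example URI, and the coordinates are all hypothetical. The ring start point and orientation may also differ from what `shapely.geometry.box` produces, although the footprint is the same rectangle.

```python
from posixpath import basename


def item_id_from_uri(uri):
    """Return the file name up to its first dot, as the tutorial loop does."""
    return basename(uri).split(".")[0]


def bbox_to_polygon(bbox):
    """Build a GeoJSON Polygon dict from [minx, miny, maxx, maxy]."""
    minx, miny, maxx, maxy = bbox
    ring = [
        [minx, miny],
        [maxx, miny],
        [maxx, maxy],
        [minx, maxy],
        [minx, miny],  # a GeoJSON ring must close on its first vertex
    ]
    return {"type": "Polygon", "coordinates": [ring]}


# Hypothetical inputs, for illustration only.
example_uri = "s3://example-bucket/some/prefix/example_img1.tif"
example_bbox = [-115.31, 36.12, -115.30, 36.13]

example_id = item_id_from_uri(example_uri)
example_geometry = bbox_to_polygon(example_bbox)
print(example_id)
print(example_geometry["coordinates"][0])
```

Because the ID is cut at the first dot, a file name that contains additional dots would be truncated earlier than its extension.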

Now reset the spatial extent of the Vegas collection using the geometry objects from the items we just added.

[22]:
bounds = [
    list(
        GeometryCollection(
            [shape(s.geometry) for s in spacenet.get_items(recursive=True)]
        ).bounds
    )
]
vegas.extent.spatial = pystac.SpatialExtent(bounds)
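
The `bounds` of a geometry collection are the minimum and maximum coordinates over all of its members. For rectangular footprints, the same result can be sketched without shapely by taking the minimum and maximum over the item bounding boxes directly. The helper `union_bbox` and the sample boxes below are hypothetical and serve only to illustrate the arithmetic.

```python
def union_bbox(bboxes):
    """Return [minx, miny, maxx, maxy] covering every bbox in the input."""
    minxs, minys, maxxs, maxys = zip(*bboxes)
    return [min(minxs), min(minys), max(maxxs), max(maxys)]


# Hypothetical item bounding boxes, for illustration only.
item_bboxes = [
    [-115.31, 36.12, -115.30, 36.13],
    [-115.29, 36.10, -115.28, 36.11],
    [-115.33, 36.14, -115.32, 36.15],
]

overall = union_bbox(item_bboxes)
print(overall)  # [-115.33, 36.1, -115.28, 36.15]
```

Wrapping the result in a list, as in `[overall]`, gives the list-of-bounding-boxes shape that the cell above passes to `pystac.SpatialExtent`.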

Currently, this STAC exists only in memory. We need to set all of the paths relative to the root directory where we want to save the catalog, and then save it as a “self contained” catalog, in which all links are relative and there are no ‘self’ links. The normalize_hrefs method sets the HREFs of all of our STAC objects. We then validate the catalog and save it:

[23]:
spacenet.normalize_hrefs("spacenet-stac")
[24]:
spacenet.validate_all()
[24]:
10
[25]:
spacenet.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)
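
As a rough way to reason about what “self contained” means, the sketch below checks a list of link dictionaries for the two properties described above: no ‘self’ link and only relative HREFs. The helper `is_self_contained` and both link lists are hypothetical and hand-written. They are not read from the saved catalog, and asset HREFs are not checked here.

```python
from urllib.parse import urlparse


def is_self_contained(links):
    """Return True if no link is a 'self' link and every href is relative."""
    for link in links:
        if link["rel"] == "self":
            return False
        parsed = urlparse(link["href"])
        # A scheme, a network location, or a leading slash marks an absolute href.
        if parsed.scheme or parsed.netloc or parsed.path.startswith("/"):
            return False
    return True


# Hypothetical link lists, for illustration only.
relative_links = [
    {"rel": "root", "href": "./catalog.json"},
    {"rel": "child", "href": "./vegas/collection.json"},
]
absolute_links = relative_links + [
    {"rel": "self", "href": "/home/user/spacenet-stac/catalog.json"},
]

print(is_self_contained(relative_links))  # True
print(is_self_contained(absolute_links))  # False
```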