{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# How to create STAC Catalogs \n", "## STAC Community Sprint, Arlington, November 7th 2019" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook runs through some of the basics of using PySTAC to create a static STAC. It was part of a 30-minute presentation at the [community STAC sprint](https://github.com/radiantearth/community-sprints/tree/master/11052019-arlignton-va) in Arlington, VA in November 2019, updated to work with current PySTAC." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial will require the `boto3`, `rasterio`, and `shapely` libraries:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install boto3 rasterio shapely pystac --quiet" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can import pystac and access most of the functionality we need with a single import:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import pystac" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a catalog from a local file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To give us some material to work with, let's download a single image from the [SpaceNet 5 challenge](https://www.topcoder.com/challenges/30099956). We'll use a temporary directory to save off our single-item STAC." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import os\n", "import urllib.request\n", "from tempfile import TemporaryDirectory\n", "\n", "tmp_dir = TemporaryDirectory()\n", "img_path = os.path.join(tmp_dir.name, \"image.tif\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('/tmp/tmpdsdpun_y/image.tif', )" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url = (\n", " \"https://spacenet-dataset.s3.amazonaws.com/\"\n", " \"spacenet/SN5_roads/train/AOI_7_Moscow/MS/\"\n", " \"SN5_roads_train_AOI_7_Moscow_MS_chip996.tif\"\n", ")\n", "urllib.request.urlretrieve(url, img_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want to create a Catalog. Let's check the docs for `Catalog` to see what information we'll need." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0;31mInit signature:\u001b[0m\n", "\u001b[0mpystac\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mCatalog\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mid\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdescription\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mtitle\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mstac_extensions\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[str]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mextra_fields\u001b[0m\u001b[0;34m:\u001b[0m 
\u001b[0;34m'Optional[Dict[str, Any]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mhref\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mcatalog_type\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'CatalogType'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mABSOLUTE_PUBLISHED\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m \n", "A PySTAC Catalog represents a STAC catalog in memory.\n", "\n", "A Catalog is a :class:`~pystac.STACObject` that may contain children,\n", "which are instances of :class:`~pystac.Catalog` or :class:`~pystac.Collection`,\n", "as well as :class:`~pystac.Item` s.\n", "\n", "Args:\n", " id : Identifier for the catalog. Must be unique within the STAC.\n", " description : Detailed multi-line description to fully explain the catalog.\n", " `CommonMark 0.29 syntax `_ MAY be used for rich\n", " text representation.\n", " title : Optional short descriptive one-line title for the catalog.\n", " stac_extensions : Optional list of extensions the Catalog implements.\n", " href : Optional HREF for this catalog, which be set as the\n", " catalog's self link's HREF.\n", " catalog_type : Optional catalog type for this catalog. Must\n", " be one of the values in :class:`~pystac.CatalogType`.\n", "\u001b[0;31mFile:\u001b[0m ~/pystac/pystac/catalog.py\n", "\u001b[0;31mType:\u001b[0m ABCMeta\n", "\u001b[0;31mSubclasses:\u001b[0m Collection" ] } ], "source": [ "?pystac.Catalog" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's just give an ID and a description. We don't have to worry about the HREF right now; that will be set later." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "catalog = pystac.Catalog(id=\"test-catalog\", description=\"Tutorial catalog.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are no children or items in the catalog, since we haven't added anything yet." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "print(list(catalog.get_children()))\n", "print(list(catalog.get_items()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll now create an Item to represent the image. Check the pydocs to see what you need to supply:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0;31mInit signature:\u001b[0m\n", "\u001b[0mpystac\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mItem\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mid\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mgeometry\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Dict[str, Any]]'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mbbox\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[float]]'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdatetime\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Datetime]'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mproperties\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Dict[str, Any]'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mstart_datetime\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Datetime]'\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mend_datetime\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Datetime]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mstac_extensions\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[str]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mhref\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mcollection\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Union[str, Collection]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mextra_fields\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Dict[str, Any]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0massets\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Dict[str, Asset]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m \n", "An Item is the core granular entity in a STAC, containing the core metadata\n", "that enables any client to search or crawl online catalogs of spatial 'assets' -\n", "satellite imagery, derived data, DEM's, etc.\n", "\n", "Args:\n", " id : Provider identifier. 
Must be unique within the STAC.\n", " geometry : Defines the full footprint of the asset represented by this\n", " item, formatted according to\n", " `RFC 7946, section 3.1 (GeoJSON) `_.\n", " bbox : Bounding Box of the asset represented by this item\n", " using either 2D or 3D geometries. The length of the array must be 2*n\n", " where n is the number of dimensions. Could also be None in the case of a\n", " null geometry.\n", " datetime : datetime associated with this item. If None,\n", " a start_datetime and end_datetime must be supplied.\n", " properties : A dictionary of additional metadata for the item.\n", " start_datetime : Optional start datetime, part of common metadata. This value\n", " will override any `start_datetime` key in properties.\n", " end_datetime : Optional end datetime, part of common metadata. This value\n", " will override any `end_datetime` key in properties.\n", " stac_extensions : Optional list of extensions the Item implements.\n", " href : Optional HREF for this item, which be set as the item's\n", " self link's HREF.\n", " collection : The Collection or Collection ID that this item\n", " belongs to.\n", " extra_fields : Extra fields that are part of the top-level JSON\n", " properties of the Item.\n", " assets : A dictionary mapping string keys to :class:`~pystac.Asset` objects. All\n", " :class:`~pystac.Asset` values in the dictionary will have their\n", " :attr:`~pystac.Asset.owner` attribute set to the created Item.\n", "\u001b[0;31mFile:\u001b[0m ~/pystac/pystac/item.py\n", "\u001b[0;31mType:\u001b[0m ABCMeta\n", "\u001b[0;31mSubclasses:\u001b[0m " ] } ], "source": [ "?pystac.Item" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using [rasterio](https://rasterio.readthedocs.io/en/stable/), we can pull out the bounding box of the image to use for the image metadata. 
If the image contained a NoData border, we would ideally pull out the footprint and save it as the geometry; in this case, we're working with a small chip that most likely has no NoData values." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "import rasterio\n", "from shapely.geometry import Polygon, mapping\n", "\n", "\n", "def get_bbox_and_footprint(raster_uri):\n", "    with rasterio.open(raster_uri) as ds:\n", "        bounds = ds.bounds\n", "        bbox = [bounds.left, bounds.bottom, bounds.right, bounds.top]\n", "        footprint = Polygon(\n", "            [\n", "                [bounds.left, bounds.bottom],\n", "                [bounds.left, bounds.top],\n", "                [bounds.right, bounds.top],\n", "                [bounds.right, bounds.bottom],\n", "            ]\n", "        )\n", "\n", "        return (bbox, mapping(footprint))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011]\n", "{'type': 'Polygon', 'coordinates': (((37.6616853489879, 55.73478197572927), (37.6616853489879, 55.73882710285011), (37.66573047610874, 55.73882710285011), (37.66573047610874, 55.73478197572927), (37.6616853489879, 55.73478197572927)),)}\n" ] } ], "source": [ "bbox, footprint = get_bbox_and_footprint(img_path)\n", "print(bbox)\n", "print(footprint)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We're also using `datetime.utcnow()` to supply the required datetime property for our Item. Since this is a required property, you might often find yourself making up a time to fill in if you don't know the exact capture time." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "from datetime import datetime\n", "\n", "item = pystac.Item(\n", "    id=\"local-image\",\n", "    geometry=footprint,\n", "    bbox=bbox,\n", "    datetime=datetime.utcnow(),\n", "    properties={},\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We haven't added it to a catalog yet, so its parent isn't set. Once we add it to the catalog, we can see it correctly links to its parent." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "assert item.get_parent() is None" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
rel: \"item\"\n", "href: None\n", "type: \"application/json\"\n", "
" ], "text/plain": [ ">" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "catalog.add_item(item)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
type: \"Catalog\"\n", "id: \"test-catalog\"\n", "stac_version: \"1.0.0\"\n", "description: \"Tutorial catalog.\"\n", "links: [] 1 items\n", "  0: rel: \"item\", href: None, type: \"application/json\"\n", "
" ], "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item.get_parent()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`describe()` is a useful method on `Catalog` - but be careful when using it on large catalogs, as it will walk the entire tree of the STAC." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* \n", " * \n" ] } ], "source": [ "catalog.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding Assets\n", "\n", "We've created an Item, but there aren't any assets associated with it. Let's create one:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0;31mInit signature:\u001b[0m\n", "\u001b[0mpystac\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mAsset\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mhref\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mtitle\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdescription\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mmedia_type\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mroles\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[str]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m 
\u001b[0mextra_fields\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Dict[str, Any]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m->\u001b[0m \u001b[0;34m'None'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m \n", "An object that contains a link to data associated with an Item or Collection that\n", "can be downloaded or streamed.\n", "\n", "Args:\n", " href : Link to the asset object. Relative and absolute links are both\n", " allowed.\n", " title : Optional displayed title for clients and users.\n", " description : A description of the Asset providing additional details,\n", " such as how it was processed or created. CommonMark 0.29 syntax MAY be used\n", " for rich text representation.\n", " media_type : Optional description of the media type. Registered Media Types\n", " are preferred. See :class:`~pystac.MediaType` for common media types.\n", " roles : Optional, Semantic roles (i.e. thumbnail, overview,\n", " data, metadata) of the asset.\n", " extra_fields : Optional, additional fields for this asset. This is used\n", " by extensions as a way to serialize and deserialize properties on asset\n", " object JSON.\n", "\u001b[0;31mFile:\u001b[0m ~/pystac/pystac/asset.py\n", "\u001b[0;31mType:\u001b[0m type\n", "\u001b[0;31mSubclasses:\u001b[0m " ] } ], "source": [ "?pystac.Asset" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "item.add_asset(\n", " key=\"image\", asset=pystac.Asset(href=img_path, media_type=pystac.MediaType.GEOTIFF)\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At any time we can call `to_dict()` on STAC objects to see how the STAC JSON is shaping up. 
Notice the asset is now set:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"type\": \"Feature\",\n", " \"stac_version\": \"1.0.0\",\n", " \"id\": \"local-image\",\n", " \"properties\": {\n", " \"datetime\": \"2023-10-12T15:35:17.290343Z\"\n", " },\n", " \"geometry\": {\n", " \"type\": \"Polygon\",\n", " \"coordinates\": [\n", " [\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ]\n", " ]\n", " ]\n", " },\n", " \"links\": [\n", " {\n", " \"rel\": \"root\",\n", " \"href\": null,\n", " \"type\": \"application/json\"\n", " },\n", " {\n", " \"rel\": \"parent\",\n", " \"href\": null,\n", " \"type\": \"application/json\"\n", " }\n", " ],\n", " \"assets\": {\n", " \"image\": {\n", " \"href\": \"/tmp/tmpdsdpun_y/image.tif\",\n", " \"type\": \"image/tiff; application=geotiff\"\n", " }\n", " },\n", " \"bbox\": [\n", " 37.6616853489879,\n", " 55.73478197572927,\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " \"stac_extensions\": []\n", "}\n" ] } ], "source": [ "import json\n", "\n", "print(json.dumps(item.to_dict(), indent=4))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the link `href` properties are `null`. This is OK, as we're working with the STAC in memory. Next, we'll talk about writing the catalog out, and how to set those HREFs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saving the catalog" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the JSON above indicates, there are no HREFs set on these in-memory items. PySTAC uses the `self` link on STAC objects to track where the file lives. 
Because we haven't set them, they evaluate to `None`:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "True\n" ] } ], "source": [ "print(catalog.get_self_href() is None)\n", "print(item.get_self_href() is None)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to set them, we can use `normalize_hrefs`. This method will create a normalized set of HREFs for each STAC object in the catalog, according to the [best practices document](https://github.com/radiantearth/stac-spec/blob/v0.8.1/best-practices.md#catalog-layout)'s recommendations on how to lay out a catalog." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "catalog.normalize_hrefs(os.path.join(tmp_dir.name, \"stac\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we've normalized to a root directory (the temporary directory), we see that the `self` links are set:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/tmp/tmpdsdpun_y/stac/catalog.json\n", "/tmp/tmpdsdpun_y/stac/local-image/local-image.json\n" ] } ], "source": [ "print(catalog.get_self_href())\n", "print(item.get_self_href())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now call `save` on the catalog, which will recursively save all the STAC objects to their respective self HREFs.\n", "\n", "Save requires a `CatalogType` to be set. You can review the [API docs](https://pystac.readthedocs.io/en/stable/api.html#catalogtype) on `CatalogType` to see what each type means (unfortunately `help` doesn't show docstrings for attributes)." 
] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "catalog.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/tmp/tmpdsdpun_y/stac/catalog.json\n", "\n", "/tmp/tmpdsdpun_y/stac/local-image:\n", "local-image.json\n" ] } ], "source": [ "!ls {tmp_dir.name}/stac/*" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"type\": \"Catalog\",\n", " \"id\": \"test-catalog\",\n", " \"stac_version\": \"1.0.0\",\n", " \"description\": \"Tutorial catalog.\",\n", " \"links\": [\n", " {\n", " \"rel\": \"root\",\n", " \"href\": \"./catalog.json\",\n", " \"type\": \"application/json\"\n", " },\n", " {\n", " \"rel\": \"item\",\n", " \"href\": \"./local-image/local-image.json\",\n", " \"type\": \"application/json\"\n", " }\n", " ]\n", "}\n" ] } ], "source": [ "with open(catalog.self_href) as f:\n", " print(f.read())" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"type\": \"Feature\",\n", " \"stac_version\": \"1.0.0\",\n", " \"id\": \"local-image\",\n", " \"properties\": {\n", " \"datetime\": \"2023-10-12T15:35:17.290343Z\"\n", " },\n", " \"geometry\": {\n", " \"type\": \"Polygon\",\n", " \"coordinates\": [\n", " [\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ]\n", " ]\n", " ]\n", " },\n", " \"links\": [\n", " {\n", " \"rel\": \"root\",\n", " \"href\": \"../catalog.json\",\n", " \"type\": \"application/json\"\n", " },\n", " {\n", " \"rel\": 
\"parent\",\n", " \"href\": \"../catalog.json\",\n", " \"type\": \"application/json\"\n", " }\n", " ],\n", " \"assets\": {\n", " \"image\": {\n", " \"href\": \"/tmp/tmpdsdpun_y/image.tif\",\n", " \"type\": \"image/tiff; application=geotiff\"\n", " }\n", " },\n", " \"bbox\": [\n", " 37.6616853489879,\n", " 55.73478197572927,\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " \"stac_extensions\": []\n", "}\n" ] } ], "source": [ "with open(item.self_href) as f:\n", " print(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, all links are saved with relative paths. That's because we used `catalog_type=CatalogType.SELF_CONTAINED`. If we save an Absolute Published catalog, we'll see absolute paths:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "catalog.save(catalog_type=pystac.CatalogType.ABSOLUTE_PUBLISHED)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the links included in the STAC item are all absolute:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"type\": \"Feature\",\n", " \"stac_version\": \"1.0.0\",\n", " \"id\": \"local-image\",\n", " \"properties\": {\n", " \"datetime\": \"2023-10-12T15:35:17.290343Z\"\n", " },\n", " \"geometry\": {\n", " \"type\": \"Polygon\",\n", " \"coordinates\": [\n", " [\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ]\n", " ]\n", " ]\n", " },\n", " \"links\": [\n", " {\n", " \"rel\": \"root\",\n", " \"href\": \"/tmp/tmpdsdpun_y/stac/catalog.json\",\n", " \"type\": \"application/json\"\n", " },\n", " {\n", " \"rel\": \"parent\",\n", " \"href\": 
\"/tmp/tmpdsdpun_y/stac/catalog.json\",\n", " \"type\": \"application/json\"\n", " },\n", " {\n", " \"rel\": \"self\",\n", " \"href\": \"/tmp/tmpdsdpun_y/stac/local-image/local-image.json\",\n", " \"type\": \"application/json\"\n", " }\n", " ],\n", " \"assets\": {\n", " \"image\": {\n", " \"href\": \"/tmp/tmpdsdpun_y/image.tif\",\n", " \"type\": \"image/tiff; application=geotiff\"\n", " }\n", " },\n", " \"bbox\": [\n", " 37.6616853489879,\n", " 55.73478197572927,\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " \"stac_extensions\": []\n", "}\n" ] } ], "source": [ "with open(item.get_self_href()) as f:\n", " print(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the Asset HREF is absolute in both cases. We can make the Asset HREF relative to the STAC Item by using `.make_all_asset_hrefs_relative()`:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "catalog.make_all_asset_hrefs_relative()\n", "catalog.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"type\": \"Feature\",\n", " \"stac_version\": \"1.0.0\",\n", " \"id\": \"local-image\",\n", " \"properties\": {\n", " \"datetime\": \"2023-10-12T15:35:17.290343Z\"\n", " },\n", " \"geometry\": {\n", " \"type\": \"Polygon\",\n", " \"coordinates\": [\n", " [\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " [\n", " 37.66573047610874,\n", " 55.73478197572927\n", " ],\n", " [\n", " 37.6616853489879,\n", " 55.73478197572927\n", " ]\n", " ]\n", " ]\n", " },\n", " \"links\": [\n", " {\n", " \"rel\": \"root\",\n", " \"href\": \"../catalog.json\",\n", " \"type\": \"application/json\"\n", " },\n", " {\n", " \"rel\": \"parent\",\n", " 
\"href\": \"../catalog.json\",\n", " \"type\": \"application/json\"\n", " }\n", " ],\n", " \"assets\": {\n", " \"image\": {\n", " \"href\": \"../../image.tif\",\n", " \"type\": \"image/tiff; application=geotiff\"\n", " }\n", " },\n", " \"bbox\": [\n", " 37.6616853489879,\n", " 55.73478197572927,\n", " 37.66573047610874,\n", " 55.73882710285011\n", " ],\n", " \"stac_extensions\": []\n", "}\n" ] } ], "source": [ "with open(item.get_self_href()) as f:\n", " print(f.read())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating an Item that implements the EO extension\n", "\n", "In the code above our item only implemented the core STAC Item specification. With [extensions](https://github.com/radiantearth/stac-spec/tree/v0.9.0/extensions) we can record more information and add additional functionality to the Item. Given that we know this is a World View 3 image that has earth observation data, we can enable the [eo extension](https://github.com/radiantearth/stac-spec/tree/v0.8.1/extensions/eo) to add band information." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To add eo information to an item we'll need to specify some more data. 
First, let's define the bands of World View 3:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "from pystac.extensions.eo import Band\n", "\n", "# From: https://www.spaceimagingme.com/downloads/sensors/datasheets/DG_WorldView3_DS_2014.pdf\n", "\n", "wv3_bands = [\n", "    Band.create(\n", "        name=\"Coastal\", description=\"Coastal: 400 - 450 nm\", common_name=\"coastal\"\n", "    ),\n", "    Band.create(name=\"Blue\", description=\"Blue: 450 - 510 nm\", common_name=\"blue\"),\n", "    Band.create(name=\"Green\", description=\"Green: 510 - 580 nm\", common_name=\"green\"),\n", "    Band.create(\n", "        name=\"Yellow\", description=\"Yellow: 585 - 625 nm\", common_name=\"yellow\"\n", "    ),\n", "    Band.create(name=\"Red\", description=\"Red: 630 - 690 nm\", common_name=\"red\"),\n", "    Band.create(\n", "        name=\"Red Edge\", description=\"Red Edge: 705 - 745 nm\", common_name=\"rededge\"\n", "    ),\n", "    Band.create(\n", "        name=\"Near-IR1\", description=\"Near-IR1: 770 - 895 nm\", common_name=\"nir08\"\n", "    ),\n", "    Band.create(\n", "        name=\"Near-IR2\", description=\"Near-IR2: 860 - 1040 nm\", common_name=\"nir09\"\n", "    ),\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we used the `.create` method to create new band information." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now create an Item, enable the eo extension, add the band information and add it to our catalog:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "eo_item = pystac.Item(\n", " id=\"local-image-eo\",\n", " geometry=footprint,\n", " bbox=bbox,\n", " datetime=datetime.utcnow(),\n", " properties={},\n", ")\n", "eo_item.ext.add(\"eo\")\n", "eo_item.ext.eo.bands = wv3_bands" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are also [common metadata](https://github.com/radiantearth/stac-spec/blob/v0.9.0/item-spec/common-metadata.md) fields that we can use to capture additional information about the WorldView 3 imagery:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "eo_item.common_metadata.platform = \"Maxar\"\n", "eo_item.common_metadata.instruments = [\"WorldView3\"]\n", "eo_item.common_metadata.gsd = 0.3" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
{\n", "
  \"type\": \"Feature\",\n", "
  \"stac_version\": \"1.0.0\",\n", "
  \"id\": \"local-image-eo\",\n", "
  \"properties\": {\n", "
    \"eo:bands\": [\n", "
      {\"name\": \"Coastal\", \"common_name\": \"coastal\", \"description\": \"Coastal: 400 - 450 nm\"},\n", "
      {\"name\": \"Blue\", \"common_name\": \"blue\", \"description\": \"Blue: 450 - 510 nm\"},\n", "
      {\"name\": \"Green\", \"common_name\": \"green\", \"description\": \"Green: 510 - 580 nm\"},\n", "
      {\"name\": \"Yellow\", \"common_name\": \"yellow\", \"description\": \"Yellow: 585 - 625 nm\"},\n", "
      {\"name\": \"Red\", \"common_name\": \"red\", \"description\": \"Red: 630 - 690 nm\"},\n", "
      {\"name\": \"Red Edge\", \"common_name\": \"rededge\", \"description\": \"Red Edge: 705 - 745 nm\"},\n", "
      {\"name\": \"Near-IR1\", \"common_name\": \"nir08\", \"description\": \"Near-IR1: 770 - 895 nm\"},\n", "
      {\"name\": \"Near-IR2\", \"common_name\": \"nir09\", \"description\": \"Near-IR2: 860 - 1040 nm\"}\n", "
    ],\n", "
    \"platform\": \"Maxar\",\n", "
    \"instruments\": [\"WorldView3\"],\n", "
    \"gsd\": 0.3,\n", "
    \"datetime\": \"2023-10-12T15:35:17.781985Z\"\n", "
  },\n", "
  \"geometry\": {\n", "
    \"type\": \"Polygon\",\n", "
    \"coordinates\": [[\n", "
      [37.6616853489879, 55.73478197572927],\n", "
      [37.6616853489879, 55.73882710285011],\n", "
      [37.66573047610874, 55.73882710285011],\n", "
      [37.66573047610874, 55.73478197572927],\n", "
      [37.6616853489879, 55.73478197572927]\n", "
    ]]\n", "
  },\n", "
  \"links\": [],\n", "
  \"assets\": {},\n", "
  \"bbox\": [37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011],\n", "
  \"stac_extensions\": [\"https://stac-extensions.github.io/eo/v1.1.0/schema.json\"]\n", "
}\n", "
" ], "text/plain": [ "" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eo_item" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the eo extension to add bands to the assets we add to the item:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "asset = pystac.Asset(href=img_path, media_type=pystac.MediaType.GEOTIFF)\n", "eo_item.add_asset(\"image\", asset)\n", "asset.ext.eo.bands = wv3_bands" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we look at the asset, we can see the appropriate band indexes are set:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
{\n", "
  \"href\": \"/tmp/tmpdsdpun_y/image.tif\",\n", "
  \"type\": \"image/tiff; application=geotiff\",\n", "
  \"eo:bands\": [\n", "
    {\"name\": \"Coastal\", \"common_name\": \"coastal\", \"description\": \"Coastal: 400 - 450 nm\"},\n", "
    {\"name\": \"Blue\", \"common_name\": \"blue\", \"description\": \"Blue: 450 - 510 nm\"},\n", "
    {\"name\": \"Green\", \"common_name\": \"green\", \"description\": \"Green: 510 - 580 nm\"},\n", "
    {\"name\": \"Yellow\", \"common_name\": \"yellow\", \"description\": \"Yellow: 585 - 625 nm\"},\n", "
    {\"name\": \"Red\", \"common_name\": \"red\", \"description\": \"Red: 630 - 690 nm\"},\n", "
    {\"name\": \"Red Edge\", \"common_name\": \"rededge\", \"description\": \"Red Edge: 705 - 745 nm\"},\n", "
    {\"name\": \"Near-IR1\", \"common_name\": \"nir08\", \"description\": \"Near-IR1: 770 - 895 nm\"},\n", "
    {\"name\": \"Near-IR2\", \"common_name\": \"nir09\", \"description\": \"Near-IR2: 860 - 1040 nm\"}\n", "
  ]\n", "
}\n", "
" ], "text/plain": [ "" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "asset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's clear the in-memory catalog, add the EO item, and save to a new STAC:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "catalog.clear_items()\n", "list(catalog.get_items())" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "catalog.add_item(eo_item)\n", "list(catalog.get_items())" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "catalog.normalize_and_save(\n", "    root_href=os.path.join(tmp_dir.name, \"stac-eo\"),\n", "    catalog_type=pystac.CatalogType.SELF_CONTAINED,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, if we read the catalog from the filesystem, PySTAC recognizes that the item implements the `eo` extension, so we can use its functionality, e.g.
getting the bands off the asset:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "catalog2 = pystac.read_file(os.path.join(tmp_dir.name, \"stac-eo\", \"catalog.json\"))" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assert isinstance(catalog2, pystac.Catalog)\n", "list(catalog2.get_items())" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "item: pystac.Item = next(catalog2.get_items(recursive=True))" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "assert item.ext.has(\"eo\")" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ,\n", " ]" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item.assets[\"image\"].ext.eo.bands" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Collections\n", "\n", "Collections are a subtype of Catalog that have some additional properties to make them more searchable. They also can define common properties so that items in the collection don't have to duplicate common data for each item. 
Let's create a collection to hold common properties between two images from the Spacenet 5 challenge.\n", "\n", "First we'll get another image, and its bbox and footprint:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('/tmp/tmpdsdpun_y/image2.tif', )" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url2 = (\n", "    \"https://spacenet-dataset.s3.amazonaws.com/\"\n", "    \"spacenet/SN5_roads/train/AOI_7_Moscow/MS/\"\n", "    \"SN5_roads_train_AOI_7_Moscow_MS_chip997.tif\"\n", ")\n", "# Use a distinct file name so the first image is not overwritten.\n", "img_path2 = os.path.join(tmp_dir.name, \"image2.tif\")\n", "urllib.request.urlretrieve(url2, img_path2)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "bbox2, footprint2 = get_bbox_and_footprint(img_path2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can take a look at the pydocs for Collection to see what information we need to supply in order to satisfy the spec."
] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0;31mInit signature:\u001b[0m\n", "\u001b[0mpystac\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mCollection\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mid\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mdescription\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mextent\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Extent'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mtitle\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mstac_extensions\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[str]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mhref\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[str]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mextra_fields\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Dict[str, Any]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mcatalog_type\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[CatalogType]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mlicense\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'str'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'proprietary'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", 
"\u001b[0;34m\u001b[0m \u001b[0mkeywords\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[str]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mproviders\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[List[Provider]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0msummaries\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Summaries]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0massets\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Dict[str, Asset]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m \n", "A Collection extends the Catalog spec with additional metadata that helps\n", "enable discovery.\n", "\n", "Args:\n", " id : Identifier for the collection. Must be unique within the STAC.\n", " description : Detailed multi-line description to fully explain the\n", " collection. `CommonMark 0.29 syntax `_ MAY\n", " be used for rich text representation.\n", " extent : Spatial and temporal extents that describe the bounds of\n", " all items contained within this Collection.\n", " title : Optional short descriptive one-line title for the\n", " collection.\n", " stac_extensions : Optional list of extensions the Collection\n", " implements.\n", " href : Optional HREF for this collection, which be set as the\n", " collection's self link's HREF.\n", " catalog_type : Optional catalog type for this catalog. Must\n", " be one of the values in :class`~pystac.CatalogType`.\n", " license : Collection's license(s) as a\n", " `SPDX License identifier `_,\n", " `various`, or `proprietary`. 
If collection includes\n", " data with multiple different licenses, use `various` and add a link for\n", " each. Defaults to 'proprietary'.\n", " keywords : Optional list of keywords describing the collection.\n", " providers : Optional list of providers of this Collection.\n", " summaries : An optional map of property summaries,\n", " either a set of values or statistics such as a range.\n", " extra_fields : Extra fields that are part of the top-level\n", " JSON properties of the Collection.\n", " assets : A dictionary mapping string keys to :class:`~pystac.Asset` objects. All\n", " :class:`~pystac.Asset` values in the dictionary will have their\n", " :attr:`~pystac.Asset.owner` attribute set to the created Collection.\n", "\u001b[0;31mFile:\u001b[0m ~/pystac/pystac/collection.py\n", "\u001b[0;31mType:\u001b[0m ABCMeta\n", "\u001b[0;31mSubclasses:\u001b[0m " ] } ], "source": [ "?pystac.Collection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Beyond what a Catalog requires, a Collection requires a license (which PySTAC defaults to `proprietary`) and an `Extent` that describes the range of space and time that the items it holds occupy."
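, "

To make the idea of an extent concrete, here is a minimal standard-library sketch that derives a collection-level bounding box and time interval from per-item values. The first bbox is the one shown for the item earlier in this notebook; the second bbox and both timestamps are made-up placeholders, and `union_bbox` is a hypothetical helper rather than a PySTAC function. The resulting dictionary follows the `extent` layout of the STAC Collection spec (`spatial.bbox` and `temporal.interval`).

```python
from datetime import datetime, timezone


def union_bbox(bboxes):
    # Smallest [west, south, east, north] box covering every input box.
    wests, souths, easts, norths = zip(*bboxes)
    return [min(wests), min(souths), max(easts), max(norths)]


item_bboxes = [
    [37.6616853489879, 55.73478197572927, 37.66573047610874, 55.73882710285011],
    [37.670, 55.730, 37.674, 55.734],  # placeholder for a second chip
]
item_times = [
    datetime(2019, 11, 7, 12, 0, tzinfo=timezone.utc),  # placeholder
    datetime(2019, 11, 5, 9, 30, tzinfo=timezone.utc),  # placeholder
]

extent = {
    'spatial': {'bbox': [union_bbox(item_bboxes)]},
    'temporal': {
        'interval': [[min(item_times).isoformat(), max(item_times).isoformat()]]
    },
}
print(extent)
```

This simple min/max union does not handle boxes that cross the antimeridian; it is only meant to show what the extent has to cover.
"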
] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[0;31mInit signature:\u001b[0m\n", "\u001b[0mpystac\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mExtent\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mspatial\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'SpatialExtent'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mtemporal\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'TemporalExtent'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m \u001b[0mextra_fields\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'Optional[Dict[str, Any]]'\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\n", "\u001b[0;34m\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mDocstring:\u001b[0m \n", "Describes the spatiotemporal extents of a Collection.\n", "\n", "Args:\n", " spatial : Potential spatial extent covered by the collection.\n", " temporal : Potential temporal extent covered by the collection.\n", " extra_fields : Dictionary containing additional top-level fields defined on the\n", " Extent object.\n", "\u001b[0;31mFile:\u001b[0m ~/pystac/pystac/collection.py\n", "\u001b[0;31mType:\u001b[0m type\n", "\u001b[0;31mSubclasses:\u001b[0m " ] } ], "source": [ "?pystac.Extent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An Extent comprises a SpatialExtent and a TemporalExtent. These hold one or more bounding boxes and time intervals, respectively, that completely cover the items contained in the collection.\n", "\n", "Let's start by creating two new items; these will be core Items. We enable the `eo` extension on each item's asset by calling `ext.add(\"eo\")`."
] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "collection_item = pystac.Item(\n", "    id=\"local-image-col-1\",\n", "    geometry=footprint,\n", "    bbox=bbox,\n", "    datetime=datetime.utcnow(),\n", "    properties={},\n", ")\n", "\n", "collection_item.common_metadata.gsd = 0.3\n", "collection_item.common_metadata.platform = \"Maxar\"\n", "collection_item.common_metadata.instruments = [\"WorldView3\"]\n", "\n", "asset = pystac.Asset(href=img_path, media_type=pystac.MediaType.GEOTIFF)\n", "collection_item.add_asset(\"image\", asset)\n", "asset.ext.add(\"eo\")\n", "asset.ext.eo.bands = wv3_bands\n", "\n", "collection_item2 = pystac.Item(\n", "    id=\"local-image-col-2\",\n", "    geometry=footprint2,\n", "    bbox=bbox2,\n", "    datetime=datetime.utcnow(),\n", "    properties={},\n", ")\n", "\n", "collection_item2.common_metadata.gsd = 0.3\n", "collection_item2.common_metadata.platform = \"Maxar\"\n", "collection_item2.common_metadata.instruments = [\"WorldView3\"]\n", "\n", "# The second item's asset points at the second image.\n", "asset2 = pystac.Asset(href=img_path2, media_type=pystac.MediaType.GEOTIFF)\n", "collection_item2.add_asset(\"image\", asset2)\n", "asset2.ext.add(\"eo\")\n", "asset2.ext.eo.bands = [\n", "    band for band in wv3_bands if band.name in [\"Red\", \"Green\", \"Blue\"]\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use our two items' metadata to find out what the proper bounds are:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "from shapely.geometry import shape\n", "\n", "unioned_footprint = shape(footprint).union(shape(footprint2))\n", "collection_bbox = list(unioned_footprint.bounds)\n", "spatial_extent = pystac.SpatialExtent(bboxes=[collection_bbox])" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "collection_interval = sorted([collection_item.datetime, collection_item2.datetime])\n", "temporal_extent = 
pystac.TemporalExtent(intervals=[collection_interval])" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [], "source": [ "collection_extent = pystac.Extent(spatial=spatial_extent, temporal=temporal_extent)" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "collection = pystac.Collection(\n", " id=\"wv3-images\",\n", " description=\"Spacenet 5 images over Moscow\",\n", " extent=collection_extent,\n", " license=\"CC-BY-SA-4.0\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now if we add our items to our Collection, and our Collection to our Catalog, we get the following STAC that can be saved:" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[>,\n", " >]" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "collection.add_items([collection_item, collection_item2])" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
{\n", "
  \"rel\": \"child\",\n", "
  \"href\": \"./wv3-images/collection.json\",\n", "
  \"type\": \"application/json\"\n", "
}\n", "
" ], "text/plain": [ ">" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "catalog.clear_items()\n", "catalog.clear_children()\n", "catalog.add_child(collection)" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* \n", " * \n", " * \n", " * \n" ] } ], "source": [ "catalog.describe()" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "catalog.normalize_and_save(\n", " root_href=os.path.join(tmp_dir.name, \"stac-collection\"),\n", " catalog_type=pystac.CatalogType.SELF_CONTAINED,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cleanup\n", "\n", "Don't forget to clean up the temporary directory!" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [], "source": [ "tmp_dir.cleanup()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a STAC of imagery from Spacenet 5 data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's take what we've learned and create a Catalog with more data in it.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Allowing PySTAC to read from AWS S3\n", "\n", "PySTAC aims to be virtually zero-dependency (notwithstanding the why-isn't-this-in-stdlib datetime-util), so it doesn't have the ability to read from or write to anything but the local file system. However, we can hook into PySTAC's IO in the following way. 
Learn more about how to customize I/O in PySTAC from the [documentation](https://pystac.readthedocs.io/en/stable/concepts.html#i-o-in-pystac):" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "from typing import Union, Any\n", "from urllib.parse import urlparse\n", "\n", "import boto3\n", "from pystac import Link\n", "from pystac.stac_io import DefaultStacIO\n", "\n", "\n", "class CustomStacIO(DefaultStacIO):\n", "    def __init__(self):\n", "        self.s3 = boto3.resource(\"s3\")\n", "        super().__init__()\n", "\n", "    def read_text(self, source: Union[str, Link], *args: Any, **kwargs: Any) -> str:\n", "        parsed = urlparse(source)\n", "        if parsed.scheme == \"s3\":\n", "            bucket = parsed.netloc\n", "            key = parsed.path[1:]\n", "\n", "            obj = self.s3.Object(bucket, key)\n", "            return obj.get()[\"Body\"].read().decode(\"utf-8\")\n", "        else:\n", "            return super().read_text(source, *args, **kwargs)\n", "\n", "    def write_text(\n", "        self, dest: Union[str, Link], txt: str, *args: Any, **kwargs: Any\n", "    ) -> None:\n", "        parsed = urlparse(dest)\n", "        if parsed.scheme == \"s3\":\n", "            bucket = parsed.netloc\n", "            key = parsed.path[1:]\n", "            self.s3.Object(bucket, key).put(Body=txt, ContentEncoding=\"utf-8\")\n", "        else:\n", "            super().write_text(dest, txt, *args, **kwargs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll need a utility to list keys for reading the lists of files from S3:" ] },
{ "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [], "source": [ "# From https://alexwlchan.net/2017/07/listing-s3-keys/\n", "from botocore import UNSIGNED\n", "from botocore.config import Config\n", "\n", "\n", "def get_s3_keys(bucket, prefix):\n", "    \"\"\"Generate all the keys in an S3 bucket.\"\"\"\n", "    s3 = boto3.client(\"s3\", config=Config(signature_version=UNSIGNED))\n", "    kwargs = {\"Bucket\": bucket, \"Prefix\": prefix}\n", "    while True:\n", "        resp = s3.list_objects_v2(**kwargs)\n", "        # \"Contents\" is absent when no keys match the prefix.\n", "        for obj in resp.get(\"Contents\", []):\n", "            yield obj[\"Key\"]\n", "\n", "        try:\n", "            kwargs[\"ContinuationToken\"] = resp[\"NextContinuationToken\"]\n", "        except KeyError:\n", "            break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's make a STAC of imagery over Moscow as part of the Spacenet 5 challenge. As a first step, we can list out the imagery and extract IDs from each of the chips." ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [ "moscow_training_chip_uris = list(\n", "    get_s3_keys(\n", "        bucket=\"spacenet-dataset\", prefix=\"spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/\"\n", "    )\n", ")" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [], "source": [ "import re\n", "\n", "chip_id_to_data = {}\n", "\n", "\n", "def get_chip_id(uri):\n", "    return re.search(r\".*\\_chip(\\d+)\\.\", uri).group(1)\n", "\n", "\n", "for uri in moscow_training_chip_uris:\n", "    chip_id = get_chip_id(uri)\n", "    chip_id_to_data[chip_id] = {\"img\": \"s3://spacenet-dataset/{}\".format(uri)}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this tutorial, we'll only take a subset of the data."
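, "

One detail worth seeing in isolation: S3 lists keys in lexicographic order, so the first ten entries are not chips 0 through 9 (the listing printed below starts 0, 1, 10, 100, 1000). The sketch here assumes nothing beyond the key naming pattern visible in this notebook. It extracts chip IDs with a regular expression and shows how a numeric sort would differ; `chip_id_from_key` is an illustrative stand-in for `get_chip_id`.

```python
import re

PREFIX = 'spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/'
# A few keys following the naming pattern used in the bucket listing.
keys = [
    PREFIX + f'SN5_roads_train_AOI_7_Moscow_PS-MS_chip{n}.tif'
    for n in (0, 1, 2, 10, 100)
]


def chip_id_from_key(key):
    # Illustrative equivalent of get_chip_id: digits between '_chip' and '.'.
    match = re.search('_chip([0-9]+)[.]', key)
    if match is None:
        raise ValueError(f'no chip id in {key!r}')
    return match.group(1)


lexicographic = [chip_id_from_key(k) for k in sorted(keys)]
numeric = sorted(lexicographic, key=int)
print(lexicographic)
print(numeric)
```

For the tutorial either order is fine, since any ten chips will do; the numeric sort only matters if a specific range of chips is wanted.
"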
] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [], "source": [ "chip_id_to_data = dict(list(chip_id_to_data.items())[:10])" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'0': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip0.tif'},\n", " '1': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1.tif'},\n", " '10': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip10.tif'},\n", " '100': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip100.tif'},\n", " '1000': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1000.tif'},\n", " '1001': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1001.tif'},\n", " '1002': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1002.tif'},\n", " '1003': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1003.tif'},\n", " '1004': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1004.tif'},\n", " '1005': {'img': 's3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1005.tif'}}" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "chip_id_to_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's turn each of those chips into a STAC Item that represents the image." 
] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [], "source": [ "chip_id_to_items = {}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll create core `Item`s for our imagery, but mark them with the `eo` extension as we did above, and store the `eo` data in a `Collection`.\n", "\n", "Note that the image CRS is WGS 84 (lat/lng). If it weren't, we'd have to reproject the footprint to WGS 84 to comply with the spec (which can easily be done with [pyproj](https://github.com/pyproj4/pyproj)).\n", "\n", "Here we're taking advantage of `rasterio`'s ability to read S3 URIs, which reads only the GeoTIFF metadata rather than pulling the whole file down." ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip0.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip10.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip100.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1000.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1001.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1002.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1003.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1004.tif\n", "Processing s3://spacenet-dataset/spacenet/SN5_roads/train/AOI_7_Moscow/PS-MS/SN5_roads_train_AOI_7_Moscow_PS-MS_chip1005.tif\n" ] } ], "source": [ "import os\n", "\n", "os.environ[\"AWS_NO_SIGN_REQUEST\"] = \"true\"\n", "\n", "for chip_id in chip_id_to_data:\n", "    img_uri = chip_id_to_data[chip_id][\"img\"]\n", "    print(\"Processing {}\".format(img_uri))\n", "    bbox, footprint = get_bbox_and_footprint(img_uri)\n", "\n", "    item = pystac.Item(\n", "        id=\"img_{}\".format(chip_id),\n", "        geometry=footprint,\n", "        bbox=bbox,\n", "        datetime=datetime.utcnow(),\n", "        properties={},\n", "    )\n", "\n", "    item.common_metadata.gsd = 0.3\n", "    item.common_metadata.platform = \"Maxar\"\n", "    item.common_metadata.instruments = [\"WorldView3\"]\n", "\n", "    item.ext.add(\"eo\")\n", "    item.ext.eo.bands = wv3_bands\n", "    asset = pystac.Asset(href=img_uri, media_type=pystac.MediaType.COG)\n", "    item.add_asset(key=\"ps-ms\", asset=asset)\n", "    asset.ext.eo.bands = wv3_bands\n", "    chip_id_to_items[chip_id] = item" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating the Collection\n", "\n", "All of these images are over Moscow. Spacenet 5 has imagery for a couple of cities, so a `Collection` per city is a good way to separate the imagery. We can store all of the common `eo` metadata in the collection." 
] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [], "source": [ "from shapely.geometry import shape, MultiPolygon\n", "\n", "footprints = list(map(lambda i: shape(i.geometry).envelope, chip_id_to_items.values()))\n", "collection_bbox = MultiPolygon(footprints).bounds\n", "spatial_extent = pystac.SpatialExtent(bboxes=[collection_bbox])" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "datetimes = sorted(list(map(lambda i: i.datetime, chip_id_to_items.values())))\n", "temporal_extent = pystac.TemporalExtent(intervals=[[datetimes[0], datetimes[-1]]])" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [], "source": [ "collection_extent = pystac.Extent(spatial=spatial_extent, temporal=temporal_extent)" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [], "source": [ "collection = pystac.Collection(\n", " id=\"wv3-images\",\n", " description=\"Spacenet 5 images over Moscow\",\n", " extent=collection_extent,\n", " license=\"CC-BY-SA-4.0\",\n", ")" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[>,\n", " >,\n", " >,\n", " >,\n", " >,\n", " >,\n", " >,\n", " >,\n", " >,\n", " >]" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "collection.add_items(chip_id_to_items.values())" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n" ] } ], "source": [ "collection.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can create a Catalog and add the collection." ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
" ], "text/plain": [ ">" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "catalog = pystac.Catalog(id=\"spacenet5\", description=\"Spacenet 5 Data (Test)\")\n", "catalog.add_child(collection)" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n", " * \n" ] } ], "source": [ "catalog.describe()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.1" } }, "nbformat": 4, "nbformat_minor": 4 }