{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quickstart\n", "\n", "This notebook is a quick introduction to using PySTAC for reading an existing STAC catalog. For more in-depth examples check out the other tutorials.\n", "\n", "## Dependencies\n", "\n", "- PySTAC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading a Catalog\n", "\n", "[A STAC Catalog](https://github.com/radiantearth/stac-spec/tree/master/catalog-spec) is used to group other STAC objects like Items, Collections, or even other Catalogs.\n", "\n", "We will be using a small example catalog adapted from the [example Landsat Collection](https://github.com/geotrellis/geotrellis-server/tree/977bad7a64c409341479c281c8c72222008861fd/stac-example/catalog/landsat-stac-collection) in the [GeoTrellis](https://geotrellis.io) repository. All STAC Items and Collections can be found in the [docs/example-catalog](https://github.com/stac-utils/pystac/tree/main/docs/example-catalog) directory of this repo; all Assets are hosted in the Landsat S3 bucket.\n", "\n", "First, we import the PySTAC classes we will be working with." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import shutil\n", "import tempfile\n", "from pathlib import Path\n", "\n", "from pystac import Catalog, get_stac_version" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we read the example catalog and print some basic metadata." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ID: landsat-stac-collection-catalog\n", "Title: STAC for Landsat data\n", "Description: STAC for Landsat data\n" ] } ], "source": [ "root_catalog = Catalog.from_file(\"./example-catalog/catalog.json\")\n", "print(f\"ID: {root_catalog.id}\")\n", "print(f\"Title: {root_catalog.title or 'N/A'}\")\n", "print(f\"Description: {root_catalog.description or 'N/A'}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Note that we do not print the \"stac_version\" here. PySTAC automatically updates any Catalogs to the most recent supported STAC version and will automatically write this to the JSON object during serialization.*\n", "\n", "Let's confirm the latest STAC Spec version supported by PySTAC." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.0.0\n" ] } ], "source": [ "print(get_stac_version())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Crawling Child Catalogs/Collections\n", "\n", "[STAC Collections](https://github.com/radiantearth/stac-spec/tree/master/collection-spec) are used to group related Items and provide aggregate or summary metadata for those Items.\n", "\n", "STAC Catalogs may have many nested layers of Catalogs or Collections within the top-level Catalog. Our example catalog has one Collection within the main Catalog at [landsat-8-l1/collection.json](./example-catalog/landsat-8-l1/collection.json). We can list the Collections in a given Catalog using the [Catalog.get_collections](https://pystac.readthedocs.io/en/latest/api.html#pystac.Catalog.get_collections) method. This method returns an iterable of PySTAC [Collection](https://pystac.readthedocs.io/en/latest/api.html#collection) instances, which we will turn into a `list`." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of collections: 1\n", "Collection IDs:\n", "- landsat-8-l1\n" ] } ], "source": [ "collections = list(root_catalog.get_collections())\n", "\n", "print(f\"Number of collections: {len(collections)}\")\n", "print(\"Collection IDs:\")\n", "for collection in collections:\n", " print(f\"- {collection.id}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's grab that Collection as a PySTAC [Collection](https://pystac.readthedocs.io/en/latest/api.html#collection) instance using the [Catalog.get_child method](https://pystac.readthedocs.io/en/latest/api.html#pystac.Catalog.get_child) so we can look at it in more detail. This method gets a child Catalog or Collection by ID, so we'll use the Collection ID that we printed above. Since this method returns `None` if no child exists with the given ID, we'll check to make sure we actually got the `Collection`." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "collection = root_catalog.get_child(\"landsat-8-l1\")\n", "assert collection is not None" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Crawling Items\n", "\n", "[STAC Items](https://github.com/radiantearth/stac-spec/tree/master/item-spec) are the fundamental building blocks of a STAC Catalog. Each Item represents a single spatiotemporal resource (e.g. a satellite scene).\n", "\n", "Both Catalogs and Collections may have Items associated with them. Let's crawl our catalog, starting at the root, to see what Items we have. The [Catalog.get_items method](https://pystac.readthedocs.io/en/latest/api.html#pystac.Catalog.get_items) provides a convenient way of recursively listing all Items associated with a Catalog and all of its sub-Catalogs by including the `recursive=True` option." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of items: 4\n", "- LC80140332018166LGN00\n", "- LC80150322018141LGN00\n", "- LC80150332018189LGN00\n", "- LC80300332018166LGN00\n" ] } ], "source": [ "items = list(root_catalog.get_items(recursive=True))\n", "\n", "print(f\"Number of items: {len(items)}\")\n", "for item in items:\n", " print(f\"- {item.id}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These IDs are not very descriptive; in the next section, we will take a look at how we can access the rich metadata associated with each Item." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Item Metadata\n", "\n", "Items can have *a lot* of metadata. This can be a bit overwhelming at first, but we can break the metadata fields down into a few categories:\n", "\n", "- Core Item Metadata\n", "- Common Metadata\n", "- STAC Extensions\n", "\n", "We will walk through each of these metadata categories in the following sections. \n", "\n", "First, let's grab one of the Items using the [Catalog.get_items method](https://pystac.readthedocs.io/en/latest/api.html#pystac.Catalog.get_items). We will use `recursive=True` to recursively crawl all child Catalogs and/or Collections to find the Item." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "item = next(root_catalog.get_items(\"LC80140332018166LGN00\", recursive=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Core Item Metadata\n", "\n", "The core Item metadata fields include spatiotemporal information and the ID of the collection to which the Item belongs. These fields are all at the top level of the Item JSON and we can access them through attributes on the [PySTAC Item](https://pystac.readthedocs.io/en/latest/api.html#item) instance." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'type': 'Polygon',\n", " 'coordinates': [[[-76.12180471942207, 39.95810181489563],\n", " [-73.94910518227414, 39.55117185146004],\n", " [-74.49564725552679, 37.826064511480496],\n", " [-76.66550404911956, 38.240699151776084],\n", " [-76.12180471942207, 39.95810181489563]]]}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item.geometry" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[-76.66703, 37.82561, -73.94861, 39.95958]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item.bbox" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "datetime.datetime(2018, 6, 15, 15, 39, 9, tzinfo=tzutc())" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item.datetime" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'landsat-8-l1'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item.collection_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want the actual `Collection` instance instead of just the ID, we can use the [Item.get_collection](https://pystac.readthedocs.io/en/latest/api.html#pystac.Item.get_collection) method." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "