{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Create and manipulate SpaceNet Vegas STAC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial shows how to create and manipulate STACs using pystac.\n", "\n", "- Create (in memory) a pystac catalog of [SpaceNet 2 imagery from the Las Vegas AOI](https://spacenetchallenge.github.io/AOI_Lists/AOI_2_Vegas.html) using data hosted in a public S3 bucket\n", "- Set relative paths for all STAC objects\n", "- Normalize links from a root directory and save the STAC there" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "import sys\n", "\n", "sys.path.append(\"..\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may need to install the following packages, which are not included in the Python 3 standard library. If you do not have any of these installed, you can do so with pip:\n", "\n", "[boto3](https://pypi.org/project/boto3/): `pip install boto3` \n", "[botocore](https://pypi.org/project/botocore/): `pip install botocore` \n", "[rasterio](https://pypi.org/project/rasterio/): `pip install rasterio` \n", "[shapely](https://pypi.org/project/Shapely/): `pip install Shapely` \n", "[rio-cogeo](https://github.com/cogeotiff/rio-cogeo): `pip install rio-cogeo`" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "from datetime import datetime\n", "from os.path import basename, join\n", "\n", "import boto3\n", "import rasterio\n", "import pystac\n", "from shapely.geometry import GeometryCollection, box, shape, mapping" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create SpaceNet Vegas STAC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize a STAC for the SpaceNet 2 dataset." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "spacenet = pystac.Catalog(id=\"spacenet\", description=\"SpaceNet 2 STAC\")" ] }, {
"cell_type": "markdown", "metadata": {}, "source": [ "We do not yet know the spatial extent of the Vegas AOI. We will need to determine it once we have read the bounds of all of the images. As a placeholder, we will create a spatial extent of null values." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "sp_extent = pystac.SpatialExtent([None, None, None, None])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The capture date for SpaceNet 2 Vegas imagery is October 22, 2015. Create a Python datetime object using that date." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "capture_date = datetime.strptime(\"2015-10-22\", \"%Y-%m-%d\")\n", "tmp_extent = pystac.TemporalExtent([(capture_date, None)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create an Extent object that will define both the spatial and temporal extents of the Vegas collection." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "extent = pystac.Extent(sp_extent, tmp_extent)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a collection that will encompass the Vegas data and add it to the spacenet catalog." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<Link rel=child target=<Collection id=vegas>>" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vegas = pystac.Collection(\n", "    id=\"vegas\", description=\"Vegas SpaceNet 2 dataset\", extent=extent\n", ")\n", "spacenet.add_child(vegas)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* <Catalog id=spacenet>\n", "    * <Collection id=vegas>\n" ] } ], "source": [ "spacenet.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find the locations of SpaceNet images. To keep this example quick, we will limit the number of scenes that we use to 10." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "client = boto3.client(\"s3\")\n", "scenes = client.list_objects(\n", "    Bucket=\"spacenet-dataset\",\n", "    Prefix=\"spacenet/SN2_buildings/train/AOI_2_Vegas/PS-RGB/\",\n", "    MaxKeys=20,\n", ")\n", "scenes = [s[\"Key\"] for s in scenes[\"Contents\"] if s[\"Key\"].endswith(\".tif\")][0:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For each scene, create an item with a defined bounding box. Each item will include the GeoTIFF as an asset. We will add labels in the next section."
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "for scene in scenes:\n", "    uri = join(\"s3://spacenet-dataset/\", scene)\n", "    params = {}\n", "    params[\"id\"] = basename(uri).split(\".\")[0]\n", "    with rasterio.open(uri) as src:\n", "        params[\"bbox\"] = list(src.bounds)\n", "        params[\"geometry\"] = mapping(box(*params[\"bbox\"]))\n", "    params[\"datetime\"] = capture_date\n", "    params[\"properties\"] = {}\n", "    i = pystac.Item(**params)\n", "    i.add_asset(\n", "        key=\"image\",\n", "        asset=pystac.Asset(\n", "            href=uri, title=\"Geotiff\", media_type=pystac.MediaType.GEOTIFF\n", "        ),\n", "    )\n", "    vegas.add_item(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now reset the spatial extent of the Vegas collection using the geometry objects from the items we just added." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "bounds = [\n", "    list(\n", "        GeometryCollection(\n", "            [shape(s.geometry) for s in spacenet.get_items(recursive=True)]\n", "        ).bounds\n", "    )\n", "]\n", "vegas.extent.spatial = pystac.SpatialExtent(bounds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Currently, this STAC only exists in memory. We need to set all of the paths based on the root directory we want to save the catalog to, and then save a \"self-contained\" catalog, in which all links are relative and there are no 'self' links. We can do this by using the `normalize_hrefs` method to set the HREFs of all of our STAC objects. We'll then validate the catalog and save it:" ] },
{ "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "spacenet.normalize_hrefs(\"spacenet-stac\")" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "spacenet.validate_all()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "spacenet.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 4 }