{ "cells": [ { "cell_type": "markdown", "id": "be2c57c1-798a-4eaf-b2b8-41c261b657d1", "metadata": {}, "source": [ "# How to read data from STAC\n", "\n", "This notebook shows how to read data from a STAC asset using [xarray](https://docs.xarray.dev/en/stable/) and a little hidden helper library called [xpystac](https://pypi.org/project/xpystac/).\n", "\n", "## tl;dr\n", "\n", "For any PySTAC object that can be represented as an n-dimensional dataset, you can read the data using the following command:\n", "\n", "```python\n", "xr.open_dataset(object)\n", "```\n", "\n", "## Dependencies\n", "\n", "Many of the dependencies are optional; which ones you need depends on where and how the data you are interested in are stored. Here are some of the libraries that you will probably need:\n", "\n", "- dask - to delay data loading until access\n", "- fsspec - to access data from remote storage\n", "- pystac - STAC object structures\n", "- xarray, rioxarray - data structures\n", "- xpystac, stackstac - helpers for loading pystac objects into xarray objects" ] }, { "cell_type": "code", "execution_count": 1, "id": "11dddb09-6313-4822-90ba-26eb6e5c143b", "metadata": {}, "outputs": [], "source": [ "!pip install adlfs dask 'fsspec[http]' planetary_computer stackstac xarray xpystac zarr --quiet" ] }, { "cell_type": "markdown", "id": "ad3fb6dc-3529-47bd-a5b3-f5260f23db88", "metadata": {}, "source": [ "Despite the long install list, the import block is very straightforward:" ] }, { "cell_type": "code", "execution_count": 2, "id": "2a8afebd-b397-4e7a-b448-0f59cc030e66", "metadata": {}, "outputs": [], "source": [ "import pystac\n", "import xarray as xr" ] }, { "cell_type": "markdown", "id": "6b24745c-b2d5-43d6-9c7e-66458b3a88e3", "metadata": {}, "source": [ "## Examples\n", "\n", "Here are a few examples of the different types of objects that you can open in xarray."
] }, { "cell_type": "markdown", "id": "30da7cfd-2861-4095-b15b-9952a7d824d9", "metadata": {}, "source": [ "### COGs\n", "\n", "Read all the data from the COGs referenced by the assets on an item." ] }, { "cell_type": "code", "execution_count": 3, "id": "c77432e6-8b0d-44d2-a947-ec74a529b8cb", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset>\n", "Dimensions: (time: 1, y: 7802, x: 7762, band: 19)\n", "Coordinates: (12/32)\n", " * time (time) datetime64[ns] 2023-04-08T23:37:51.63...\n", " id (time) <U31 ...\n", " * x (x) float64 3.774e+05 3.774e+05 ... 6.102e+05\n", " * y (y) float64 -3.713e+06 ... -3.947e+06\n", " proj:shape object ...\n", " sci:doi <U16 ...\n", " ... ...\n", " raster:bands (band) object ...\n", " classification:bitfields (band) object ...\n", " common_name (band) object ...\n", " center_wavelength (band) object ...\n", " full_width_half_max (band) object ...\n", " epsg int64 ...\n", "Dimensions without coordinates: band\n", "Data variables: (12/19)\n", " qa (time, y, x) float64 ...\n", " red (time, y, x) float64 ...\n", " blue (time, y, x) float64 ...\n", " drad (time, y, x) float64 ...\n", " emis (time, y, x) float64 ...\n", " emsd (time, y, x) float64 ...\n", " ... ...\n", " swir16 (time, y, x) float64 ...\n", " swir22 (time, y, x) float64 ...\n", " coastal (time, y, x) float64 ...\n", " qa_pixel (time, y, x) float64 ...\n", " qa_radsat (time, y, x) float64 ...\n", " qa_aerosol (time, y, x) float64 ...\n", "Attributes:\n", " spec: RasterSpec(epsg=32656, bounds=(377370.0, -3947130.0, 610230....\n", " crs: epsg:32656\n", " transform: | 30.00, 0.00, 377370.00|\\n| 0.00,-30.00,-3713070.00|\\n| 0.0...\n", " resolution: 30.0
<xarray.Dataset>\n", "Dimensions: (time: 14965, y: 584, x: 284, nv: 2)\n", "Coordinates:\n", " lat (y, x) float32 ...\n", " lon (y, x) float32 ...\n", " * time (time) datetime64[ns] 1980-01-01T12:00:00 ... 20...\n", " * x (x) float32 -5.802e+06 -5.801e+06 ... -5.519e+06\n", " * y (y) float32 -3.9e+04 -4e+04 ... -6.21e+05 -6.22e+05\n", "Dimensions without coordinates: nv\n", "Data variables:\n", " dayl (time, y, x) float32 ...\n", " lambert_conformal_conic int16 ...\n", " prcp (time, y, x) float32 ...\n", " srad (time, y, x) float32 ...\n", " swe (time, y, x) float32 ...\n", " time_bnds (time, nv) datetime64[ns] ...\n", " tmax (time, y, x) float32 ...\n", " tmin (time, y, x) float32 ...\n", " vp (time, y, x) float32 ...\n", " yearday (time) int16 ...\n", "Attributes:\n", " Conventions: CF-1.6\n", " Version_data: Daymet Data Version 4.0\n", " Version_software: Daymet Software Version 4.0\n", " citation: Please see http://daymet.ornl.gov/ for current Daymet ...\n", " references: Please see http://daymet.ornl.gov/ for current informa...\n", " source: Daymet Software Version 4.0\n", " start_year: 1980" ], "text/plain": [ "
<xarray.Dataset>\n", "Dimensions: (time: 23741, lat: 600, lon: 1440)\n", "Coordinates:\n", " * lat (lat) float64 -59.88 -59.62 -59.38 -59.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 0.125 0.375 0.625 0.875 ... 359.1 359.4 359.6 359.9\n", " * time (time) datetime64[us] 1950-01-01T12:00:00 ... 2014-12-31T12:00:00\n", "Data variables:\n", " hurs (time, lat, lon) float32 ...\n", " huss (time, lat, lon) float32 ...\n", " pr (time, lat, lon) float32 ...\n", " rlds (time, lat, lon) float32 ...\n", " rsds (time, lat, lon) float32 ...\n", " sfcWind (time, lat, lon) float32 ...\n", " tas (time, lat, lon) float32 ...\n", " tasmax (time, lat, lon) float32 ...\n", " tasmin (time, lat, lon) float32 ...\n", "Attributes: (12/22)\n", " Conventions: CF-1.7\n", " activity: NEX-GDDP-CMIP6\n", " cmip6_institution_id: CSIRO-ARCCSS\n", " cmip6_license: CC-BY-SA 4.0\n", " cmip6_source_id: ACCESS-CM2\n", " contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr. Bridget...\n", " ... ...\n", " scenario: historical\n", " source: BCSD\n", " title: ACCESS-CM2, r1i1p1f1, historical, global downscale...\n", " tracking_id: 16d27564-470f-41ea-8077-f4cc3efa5bfe\n", " variant_label: r1i1p1f1\n", " version: 1.0