{ "cells": [ { "cell_type": "markdown", "id": "be2c57c1-798a-4eaf-b2b8-41c261b657d1", "metadata": {}, "source": [ "# How to read data from STAC\n", "\n", "This notebook shows how to read the data in from a STAC asset using [xarray](https://docs.xarray.dev/en/stable/) and a little hidden helper library called [xpystac](https://pypi.org/project/xpystac/).\n", "\n", "## tl;dr\n", "\n", "For any PySTAC asset that can be represented as an xarray dataset you can read the data using the following command:\n", "\n", "```python\n", "xr.open_dataset(asset)\n", "```\n", "\n", "If you want to load multiple assets from the same item and/or many items at once use:\n", "\n", "\n", "```python\n", "odc.stac.load([items])\n", "```\n", "\n", "## Dependencies\n", "\n", "There are lots of optional dependencies depending on where and how the data you are interested in are stored. Here are some of the libraries that you will probably need:\n", "\n", "- dask - to delay data loading until access\n", "- pystac - STAC object structures\n", "- xarray, rioxarray - data structures\n", "- xpystac, odc-stac - helper for loading pystac into xarray objects" ] }, { "cell_type": "markdown", "id": "ad3fb6dc-3529-47bd-a5b3-f5260f23db88", "metadata": {}, "source": [ "Despite all these install instructions, the import block is very straightforward" ] }, { "cell_type": "code", "execution_count": 1, "id": "2a8afebd-b397-4e7a-b448-0f59cc030e66", "metadata": { "execution": { "iopub.execute_input": "2025-08-21T19:55:58.808745Z", "iopub.status.busy": "2025-08-21T19:55:58.808345Z", "iopub.status.idle": "2025-08-21T19:56:01.983744Z", "shell.execute_reply": "2025-08-21T19:56:01.983228Z", "shell.execute_reply.started": "2025-08-21T19:55:58.808709Z" } }, "outputs": [], "source": [ "import odc.stac\n", "import planetary_computer\n", "import rioxarray\n", "import xarray as xr\n", "\n", "import pystac" ] }, { "cell_type": "markdown", "id": "6b24745c-b2d5-43d6-9c7e-66458b3a88e3", "metadata": {}, "source": [ "## Examples\n", "\n", "Here are a few examples of the different types of objects that you can open in xarray." ] }, { "cell_type": "markdown", "id": "30da7cfd-2861-4095-b15b-9952a7d824d9", "metadata": {}, "source": [ "### COGs\n", "\n", "Read all the data from the COGs referenced by the assets on an item." ] }, { "cell_type": "code", "execution_count": 2, "id": "c77432e6-8b0d-44d2-a947-ec74a529b8cb", "metadata": { "execution": { "iopub.execute_input": "2025-08-21T19:56:03.945808Z", "iopub.status.busy": "2025-08-21T19:56:03.945372Z", "iopub.status.idle": "2025-08-21T19:56:05.448439Z", "shell.execute_reply": "2025-08-21T19:56:05.448060Z", "shell.execute_reply.started": "2025-08-21T19:56:03.945773Z" } }, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 2GB\n",
"Dimensions: (y: 7801, x: 7761, time: 1)\n",
"Coordinates:\n",
" * y (y) float64 62kB -3.713e+06 -3.713e+06 ... -3.947e+06\n",
" * x (x) float64 62kB 3.774e+05 3.774e+05 ... 6.102e+05 6.102e+05\n",
" spatial_ref int32 4B 32656\n",
" * time (time) datetime64[ns] 8B 2023-04-08T23:37:51.630731\n",
"Data variables: (12/19)\n",
" qa (time, y, x) int16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" red (time, y, x) uint16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" blue (time, y, x) uint16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" drad (time, y, x) int16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" emis (time, y, x) int16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" emsd (time, y, x) int16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" ... ...\n",
" swir16 (time, y, x) uint16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" swir22 (time, y, x) uint16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" coastal (time, y, x) uint16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" qa_pixel (time, y, x) uint16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" qa_radsat (time, y, x) uint16 121MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray>\n",
" qa_aerosol (time, y, x) uint8 61MB dask.array<chunksize=(1, 1024, 1024), meta=np.ndarray><xarray.DataArray 'blue' ()> Size: 8B\n",
"array(13940.01841915)\n",
"Coordinates:\n",
" spatial_ref int32 4B 32656<xarray.DataArray (band: 1, y: 122, x: 122)> Size: 30kB\n",
"[14884 values with dtype=uint16]\n",
"Coordinates:\n",
" * band (band) int64 8B 1\n",
" * x (x) float64 976B 3.783e+05 3.802e+05 ... 6.074e+05 6.093e+05\n",
" * y (y) float64 976B -3.714e+06 -3.716e+06 ... -3.946e+06\n",
" spatial_ref int64 8B 0\n",
"Attributes:\n",
" AREA_OR_POINT: Point\n",
" _FillValue: 0\n",
" scale_factor: 1.0\n",
" add_offset: 0.0<xarray.DataArray ()> Size: 8B\n",
"array(14472.21996775)\n",
"Coordinates:\n",
" spatial_ref int64 8B 0<xarray.Dataset> Size: 69GB\n",
"Dimensions: (time: 14965, y: 584, x: 284, nv: 2)\n",
"Coordinates:\n",
" lat (y, x) float32 663kB dask.array<chunksize=(584, 284), meta=np.ndarray>\n",
" lon (y, x) float32 663kB dask.array<chunksize=(584, 284), meta=np.ndarray>\n",
" * time (time) datetime64[ns] 120kB 1980-01-01T12:00:00 ...\n",
" * x (x) float32 1kB -5.802e+06 ... -5.519e+06\n",
" * y (y) float32 2kB -3.9e+04 -4e+04 ... -6.22e+05\n",
"Dimensions without coordinates: nv\n",
"Data variables:\n",
" dayl (time, y, x) float32 10GB dask.array<chunksize=(365, 584, 284), meta=np.ndarray>\n",
" lambert_conformal_conic int16 2B ...\n",
" prcp (time, y, x) float32 10GB dask.array<chunksize=(365, 584, 284), meta=np.ndarray>\n",
" srad (time, y, x) float32 10GB dask.array<chunksize=(365, 584, 284), meta=np.ndarray>\n",
" swe (time, y, x) float32 10GB dask.array<chunksize=(365, 584, 284), meta=np.ndarray>\n",
" time_bnds (time, nv) datetime64[ns] 239kB dask.array<chunksize=(365, 2), meta=np.ndarray>\n",
" tmax (time, y, x) float32 10GB dask.array<chunksize=(365, 584, 284), meta=np.ndarray>\n",
" tmin (time, y, x) float32 10GB dask.array<chunksize=(365, 584, 284), meta=np.ndarray>\n",
" vp (time, y, x) float32 10GB dask.array<chunksize=(365, 584, 284), meta=np.ndarray>\n",
" yearday (time) int16 30kB dask.array<chunksize=(365,), meta=np.ndarray>\n",
"Attributes:\n",
" Conventions: CF-1.6\n",
" Version_data: Daymet Data Version 4.0\n",
" Version_software: Daymet Software Version 4.0\n",
" citation: Please see http://daymet.ornl.gov/ for current Daymet ...\n",
" references: Please see http://daymet.ornl.gov/ for current informa...\n",
" source: Daymet Software Version 4.0\n",
" start_year: 1980<xarray.Dataset> Size: 738GB\n",
"Dimensions: (time: 23741, lat: 600, lon: 1440)\n",
"Coordinates:\n",
" * lat (lat) float64 5kB -59.88 -59.62 -59.38 -59.12 ... 89.38 89.62 89.88\n",
" * lon (lon) float64 12kB 0.125 0.375 0.625 0.875 ... 359.4 359.6 359.9\n",
" * time (time) datetime64[us] 190kB 1950-01-01T12:00:00 ... 2014-12-31T1...\n",
"Data variables:\n",
" hurs (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" huss (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" pr (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" rlds (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" rsds (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" sfcWind (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" tas (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" tasmax (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
" tasmin (time, lat, lon) float32 82GB dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n",
"Attributes: (12/22)\n",
" Conventions: CF-1.7\n",
" activity: NEX-GDDP-CMIP6\n",
" cmip6_institution_id: CSIRO-ARCCSS\n",
" cmip6_license: CC-BY-SA 4.0\n",
" cmip6_source_id: ACCESS-CM2\n",
" contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr. Bridget...\n",
" ... ...\n",
" scenario: historical\n",
" source: BCSD\n",
" title: ACCESS-CM2, r1i1p1f1, historical, global downscale...\n",
" tracking_id: 16d27564-470f-41ea-8077-f4cc3efa5bfe\n",
" variant_label: r1i1p1f1\n",
" version: 1.0