diff --git a/README.md b/README.md
index 74977d0..fa429a1 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 thumbnail
-# Scikit-learn with Argo Observations Cookbook
+# Scikit-learn on Argo Observations

 [![nightly-build](https://github.com/song-sangmin/sklearn-argo/actions/workflows/nightly-build.yaml/badge.svg)](https://github.com/song-sangmin/sklearn-argo/actions/workflows/nightly-build.yaml) [![Binder](https://binder.projectpythia.org/badge_logo.svg)](https://binder.projectpythia.org/v2/gh/song-sangmin/sklearn-argo/main?labpath=notebooks)
@@ -18,7 +18,7 @@ This Project Pythia Cookbook covers two objectives:

 ## Authors

-[Song Sangmin (@song-sangmin), [Second Author](@second-author), etc. _Acknowledge primary content authors here_
+[Song Sangmin](@song-sangmin), [Michael Chen](@second-author).

 ### Contributors

@@ -30,17 +30,15 @@ This Project Pythia Cookbook covers two objectives:

-(State one or more sections that will comprise the notebook. E.g., _This cookbook is broken up into two main sections - "Foundations" and "Example Workflows."_ Then, describe each section below.)
-This cookbook is broken up into __ sections:
+This cookbook is broken up into two main sections:

-1. Accessing BGC-Argo data with Argopy
-2. Applying scikit-learn to prepare data for machine learning
-3. Using sckikit-learn to develop regression models (e.g. RFR, XGBoost)
+1. Argo Foundations
+2. Scikit-learn Workflows

-### Section 1: Accessing BGC-Argo ( Replace with the title of this section, e.g. "Foundations" )
+### Section 1: Argo Foundations

-### Section 2: Scikit-learn ( Replace with the title of this section, e.g. "Example workflows" )
+### Section 2: Scikit-learn Workflows
diff --git a/environment.yml b/environment.yml
index db2263d..e658246 100644
--- a/environment.yml
+++ b/environment.yml
@@ -8,6 +8,8 @@ dependencies:
   - python=3.9
   - numpy
   - scipy
+  - dask
+  - pip

   # Data packages
   - pandas
@@ -34,3 +35,5 @@ dependencies:
   - hyperopt
   - dataframe_image

+  - pip:
+    - argovisHelpers==0.0.26
diff --git a/notebooks/argo-access.ipynb b/notebooks/argo-access.ipynb
index dad9f26..85e829a 100644
--- a/notebooks/argo-access.ipynb
+++ b/notebooks/argo-access.ipynb
@@ -17,9 +17,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-    "# Project Pythia Notebook Template\n",
-    "\n",
-    "Next, title your notebook appropriately with a top-level Markdown header, `#`. Do not use this level header anywhere else in the notebook. Our book build process will use this title in the navbar, table of contents, etc. Keep it short, keep it descriptive. Follow this with a `---` cell to visually distinguish the transition to the prerequisites section."
+    "# Accessing Argo Data"
 ]
},
{
@@ -34,13 +32,17 @@
 "metadata": {},
 "source": [
 "## Overview\n",
-    "If you have an introductory paragraph, lead with it here! Keep it short and tied to your material, then be sure to continue into the required list of topics below,\n",
 "\n",
-    "1. This is a numbered list of the specific topics\n",
-    "1. These should map approximately to your main sections of content\n",
-    "1. Or each second-level, `##`, header in your notebook\n",
-    "1. Keep the size and scope of your notebook in check\n",
-    "1. And be sure to let the reader know up front the important concepts they'll be leaving with"
+    "Building on the previous notebook, [Introduction to Argo](argo-introduction.ipynb), we next explore how to access Argo data in a few different ways.\n",
+    "\n",
+    "1. Data formats for Argo profiles\n",
+    "2. Downloading [monthly snapshots](http://www.argodatamgt.org/Access-to-data/Argo-DOI-Digital-Object-Identifier) using Argo DOIs\n",
+    "3. 
Using [Argovis](https://argovis.colorado.edu/argo) for API-based queries\n",
+    "4. Using the [GO-BGC Toolbox](https://github.com/go-bgc/workshop-python)\n",
+    "5. Using [Argopy](https://argopy.readthedocs.io/en/latest/user-guide/fetching-argo-data/index.html), a dedicated Python package\n",
+    "\n",
+    "After working through this notebook, you will be able to retrieve Argo data for a given time frame, geographic region, or platform identifier. There are many other ways of working with Argo data that are not covered here; further information on data access can be found on the [Argo website](https://argo.ucsd.edu/data/)."
 ]
},
{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
 "## Prerequisites\n",
-    "This section was inspired by [this template](https://github.com/alan-turing-institute/the-turing-way/blob/master/book/templates/chapter-template/chapter-landing-page.md) of the wonderful [The Turing Way](https://the-turing-way.netlify.app) Jupyter Book.\n",
-    "\n",
-    "Following your overview, tell your reader what concepts, packages, or other background information they'll **need** before learning your material. Tie this explicitly with links to other pages here in Foundations or to relevant external resources. Remove this body text, then populate the Markdown table, denoted in this cell with `|` vertical brackets, below, and fill out the information following. In this table, lay out prerequisite concepts by explicitly linking to other Foundations material or external resources, or describe generally helpful concepts.\n",
 "\n",
-    "Label the importance of each concept explicitly as **helpful/necessary**.\n",
-    "\n",
 "| Concepts | Importance | Notes |\n",
 "| --- | --- | --- |\n",
-    "| [Intro to Cartopy](https://foundations.projectpythia.org/core/cartopy/cartopy.html) | Necessary | |\n",
-    "| [Understanding of NetCDF](https://foundations.projectpythia.org/core/data-formats/netcdf-cf.html) | Helpful | Familiarity with metadata structure |\n",
-    "| Project management | Helpful | |\n",
+    "| [Intro to NumPy](https://numpy.org/learn/) | Necessary | |\n",
+    "| [Intro to NetCDF](https://foundations.projectpythia.org/core/data-formats/netcdf-cf.html) | Necessary | Familiarity with metadata structure |\n",
+    "| [Intro to Xarray](https://foundations.projectpythia.org/core/xarray.html) | Necessary | |\n",
 "\n",
-    "- **Time to learn**: estimate in minutes. For a rough idea, use 5 mins per subsection, 10 if longer; add these up for a total. Safer to round up and overestimate.\n",
-    "- **System requirements**:\n",
-    "    - Populate with any system, version, or non-Python software requirements if necessary\n",
-    "    - Otherwise use the concepts table above and the Imports section below to describe required packages as necessary\n",
-    "    - If no extra requirements, remove the **System requirements** point altogether"
+    "- **Time to learn**: 20 min\n"
 ]
},
{
@@ -84,18 +79,624 @@
},
{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
+    "# Import packages\n",
+    "import sys\n",
+    "import os\n",
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "import scipy\n",
+    "import xarray as xr\n",
+    "from datetime import datetime, timedelta\n",
+    "\n",
+    "import matplotlib.pyplot as plt\n",
+    "import matplotlib.colors as mcolors\n",
+    "import seaborn as sns\n",
+    "\n",
+    "from argovisHelpers import 
helpers as avh" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Snapshots" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "# # Base filepath. Need for Argo GDAC function.\n", + "root = '/Users/sangminsong/Library/CloudStorage/OneDrive-UW/Code/2024_Pythia/'\n", + "profile_dir = root + 'SOCCOM_GO-BGC_LoResQC_LIAR_28Aug2023_netcdf/'" + ] + }, + { + "cell_type": "code", + "execution_count": 6, "metadata": {}, "outputs": [], "source": [ - "import sys" + "DSdict = {}\n", + "for filename in os.listdir(profile_dir):\n", + " if filename.endswith(\".nc\"):\n", + " fp = profile_dir + filename\n", + " single_dataset = xr.open_dataset(fp, decode_times=False)\n", + " DSdict[filename[0:7]] = single_dataset\n", + "# DSdict['5906030']" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div><pre class='xr-text-repr-fallback'>&lt;xarray.Dataset&gt; Size: 2MB (full repr in the text/plain output below)</pre></div>
"
],
"text/plain": [
 "<xarray.Dataset> Size: 2MB\n",
 "Dimensions:              (N_PROF: 117, N_LEVELS: 69, NPARAMETER: 42)\n",
 "Dimensions without coordinates: N_PROF, N_LEVELS, NPARAMETER\n",
 "Data variables: (12/60)\n",
 "    Cruise               |S11 11B ...\n",
 "    Station              (N_PROF) int32 468B ...\n",
 "    Lon                  (N_PROF) float64 936B ...\n",
 "    Lat                  (N_PROF) float64 936B ...\n",
 "    Lat_QF               (N_PROF) |S1 117B ...\n",
 "    Lat_QFA              (N_PROF) float64 936B ...\n",
 "    ...                   ...\n",
 "    Type                 |S1 1B ...\n",
 "    mon_day_yr           (N_PROF) |S10 1kB ...\n",
 "    hh_mm                (N_PROF) |S5 585B ...\n",
 "    Parameters           (NPARAMETER) |S19 798B ...\n",
 "    JULD                 (N_PROF) float64 936B ...\n",
 "    REFERENCE_DATE_TIME  object 8B ...\n",
 "Attributes:\n",
 "    Comments:  \\n//0\\n//<Encoding>UTF-8</Encoding>\\n//File updated on 08/26/2..."
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
 "DSdict['5906007']\n"
]
},
{
 "cell_type": "code",
 "execution_count": 11,
 "metadata": {},
 "outputs": [
  {
   "data": {
    "text/plain": [
     "'/Users/sangminsong/Library/CloudStorage/OneDrive-UW/Code/2024_Pythia/SOCCOM_GO-BGC_LoResQC_LIAR_28Aug2023_netcdf/'"
    ]
   },
   "execution_count": 11,
   "metadata": {},
   "output_type": "execute_result"
  }
 ],
 "source": [
  "profile_dir"
 ]
},
{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
 "source": [
  "# Lazily open all snapshot profiles at once; requires dask (see environment.yml)\n",
  "xr.open_mfdataset(profile_dir + '*.nc')"
 ]
},
{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-    "## Your first content section"
+    "## Argovis"
 ]
},
{
@@ -295,7 +896,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
- "version": "3.10.8"
+ "version": "3.9.19"
},
"nbdime-conflicts": {
"local_diff": [
diff --git a/notebooks/argo-introduction.ipynb b/notebooks/argo-introduction.ipynb
index dad9f26..e6148a9 100644
--- a/notebooks/argo-introduction.ipynb
+++ b/notebooks/argo-introduction.ipynb
@@ -4,22 +4,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-    "Let's start here! If you can directly link to an image relevant to your notebook, such as [canonical logos](https://github.com/numpy/numpy/blob/main/doc/source/_static/numpylogo.svg), do so here at the top of your notebook. You can do this with Markdown syntax,\n",
-    "\n",
-    "> `![](http://link.com/to/image.png \"image alt text\")`\n",
-    "\n",
-    "or edit this cell to see raw HTML `img` demonstration. This is preferred if you need to shrink your embedded image. **Either way be sure to include `alt` text for any embedded images to make your content more accessible.**\n",
-    "\n",
-    "\"Project Pythia Logo\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
-    "# Project Pythia Notebook Template\n",
-    "\n",
-    "Next, title your notebook appropriately with a top-level Markdown header, `#`. Do not use this level header anywhere else in the notebook. Our book build process will use this title in the navbar, table of contents, etc. Keep it short, keep it descriptive. 
Follow this with a `---` cell to visually distinguish the transition to the prerequisites section."
+    "# Introduction to Argo Observations"
]
},
{
@@ -36,11 +21,11 @@
 "## Overview\n",
-    "If you have an introductory paragraph, lead with it here! Keep it short and tied to your material, then be sure to continue into the required list of topics below,\n",
 "\n",
-    "1. This is a numbered list of the specific topics\n",
-    "1. These should map approximately to your main sections of content\n",
-    "1. Or each second-level, `##`, header in your notebook\n",
-    "1. Keep the size and scope of your notebook in check\n",
-    "1. And be sure to let the reader know up front the important concepts they'll be leaving with"
+    "Here, we introduce Argo profiling floats: autonomous instruments that operate remotely and continuously sample the ocean interior.\n",
+    "\n",
+    "1. What are Argo floats?\n",
+    "2. What data are available, and in what formats?\n",
+    "\n",
+    "In the next notebook, [Accessing Argo Data](argo-access.ipynb), we will explore different ways of downloading and retrieving float profiles.\n"
]
},
{
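The snapshot-loading loop in `argo-access.ipynb` above keys each opened dataset by `filename[0:7]`, the seven-digit WMO identifier that leads each SOCCOM/GO-BGC profile filename. A slightly more defensive, standard-library-only sketch of that convention (the example filenames here are hypothetical, invented for illustration):

```python
import re

def wmo_id(filename):
    """Extract the leading 7-digit WMO float ID from a profile filename.

    Returns None for files that do not follow the convention, which is
    safer than the bare filename[0:7] slice used in the notebook loop.
    """
    match = re.match(r"(\d{7})\D.*\.nc$", filename)
    return match.group(1) if match else None

# Hypothetical filenames in the style of a SOCCOM/GO-BGC snapshot directory:
files = ["5906007_QC.nc", "5906030_QC.nc", "README.txt"]
ids = [wmo_id(f) for f in files]
print(ids)  # → ['5906007', '5906030', None]
```

A float's WMO ID can then be used to pull the matching dataset back out of a dictionary like `DSdict` in the notebook.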
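The `## Argovis` section of `argo-access.ipynb` above is still a stub. As a preview of the kind of API-based query it will cover, the snippet below assembles a query URL using only the standard library. The API root and the parameter names (`platform`, `startDate`, `endDate`) follow the public Argovis documentation but should be treated as assumptions to verify; the notebook itself will use the `argovisHelpers` package rather than raw URLs.

```python
from urllib.parse import urlencode

# Assumed Argovis API root -- check the current Argovis docs before relying on it.
API_ROOT = "https://argovis-api.colorado.edu"

def build_argovis_url(route, **params):
    """Return a GET URL for an Argovis-style route with encoded query parameters."""
    return f"{API_ROOT}/{route}?{urlencode(params)}"

# Query profiles from one float (WMO ID 5906007, used elsewhere in this notebook)
# over a one-month window:
url = build_argovis_url(
    "argo",
    platform="5906007",
    startDate="2023-01-01T00:00:00Z",
    endDate="2023-02-01T00:00:00Z",
)
print(url)
```

Sending the request (e.g. with `urllib.request` or `requests`) would return JSON profiles; Argovis also requires a free API key for authenticated access, which the helper package handles.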