From 1564732bd5e4afe72f2ee4de6d64594b82099083 Mon Sep 17 00:00:00 2001 From: ghiggi Date: Fri, 24 Feb 2023 18:48:16 +0100 Subject: [PATCH] Fix indentation --- ...L0.readers.rst => disdrodb.l0.readers.rst} | 0 .../api/{disdrodb.L0.rst => disdrodb.l0.rst} | 0 .../api/{disdrodb.L1.rst => disdrodb.l1.rst} | 0 .../api/{disdrodb.L2.rst => disdrodb.l2.rst} | 0 docs/source/reader_preparation.ipynb | 1210 ++++++++--------- 5 files changed, 559 insertions(+), 651 deletions(-) rename docs/source/api/{disdrodb.L0.readers.rst => disdrodb.l0.readers.rst} (100%) rename docs/source/api/{disdrodb.L0.rst => disdrodb.l0.rst} (100%) rename docs/source/api/{disdrodb.L1.rst => disdrodb.l1.rst} (100%) rename docs/source/api/{disdrodb.L2.rst => disdrodb.l2.rst} (100%) diff --git a/docs/source/api/disdrodb.L0.readers.rst b/docs/source/api/disdrodb.l0.readers.rst similarity index 100% rename from docs/source/api/disdrodb.L0.readers.rst rename to docs/source/api/disdrodb.l0.readers.rst diff --git a/docs/source/api/disdrodb.L0.rst b/docs/source/api/disdrodb.l0.rst similarity index 100% rename from docs/source/api/disdrodb.L0.rst rename to docs/source/api/disdrodb.l0.rst diff --git a/docs/source/api/disdrodb.L1.rst b/docs/source/api/disdrodb.l1.rst similarity index 100% rename from docs/source/api/disdrodb.L1.rst rename to docs/source/api/disdrodb.l1.rst diff --git a/docs/source/api/disdrodb.L2.rst b/docs/source/api/disdrodb.l2.rst similarity index 100% rename from docs/source/api/disdrodb.L2.rst rename to docs/source/api/disdrodb.l2.rst diff --git a/docs/source/reader_preparation.ipynb b/docs/source/reader_preparation.ipynb index 9fab0c65..321322a1 100644 --- a/docs/source/reader_preparation.ipynb +++ b/docs/source/reader_preparation.ipynb @@ -2,28 +2,24 @@ "cells": [ { "cell_type": "markdown", - "id": "e1a34600", - "metadata": {}, "source": [ "# Step-by-step guide for DISDRODB reader preparation " - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "546ca031", - "metadata": {}, "source": [ "This notebook aims to guide you through creating the reader for the raw files logged by a disdrometer device. \n", "\n", "In first place, this notebook will provide you with functions that will display and enable to investigate the content of your raw data files.\n", "\n", "Successively, you will define a series of parameters defining the reader behaviour. These pieces of code will be consolidated in the [`reader_template.py`](https://github.com/ltelab/disdrodb/blob/main/disdrodb/L0/readers/reader_template.py) file to generate a DISDRODB L0 reader.\n" - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "1103b734", - "metadata": {}, "source": [ "In this notebook, we uses a lightweight dataset for illustratory purposes. You may use it and readapt it for exploring your own dataset, when preparing a new reader. \n", "\n", @@ -32,20 +28,18 @@ "* Step 1 : We set up the data within the correct directory structure\n", "* Step 2 : We start digging into the data to set up the transformation parameters.\n", "* Step 3 : We create the new reader" - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "7327d18c", - "metadata": {}, "source": [ "## Step 1: Set up the data within the correct directory structure" - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "68c9fee7", - "metadata": {}, "source": [ "For this example, you will find the sample data in the folder [`data`](https://github.com/ltelab/disdrodb/tree/main/data/DISDRODB) of the [disdrodb](https://github.com/ltelab/disdrodb/) repository. \n", "It corresponds to some measurements taken at two stations (`station_name_1` and `station_name_2`) during two days of a field campaign led by the EPFL LTE laboratory.\n", @@ -72,51 +66,38 @@ "```\n", "\n", "This structure fulfills the requirements described in the documentation to [Add a new reader](https://disdrodb.readthedocs.io/en/latest/readers.html#adding-a-new-reader).\n" - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "8dfbc56d", - "metadata": {}, "source": [ "## Step 2: Read and analyse the data" - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "51e00d83", - "metadata": {}, "source": [ "Once the dataset and metadata are set up in the correct directory structure, we can now start analysing our data. \n", "\n", "The objectives of Step 2 is to define the specifications to read the raw data into a dataframe and ensure that the dataframe columns match the DISDRODB standards.\n", "\n", "At the end, you should be able to generate Apache Parquet files from your input raw data. \n" - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "b0a1ff68", - "metadata": {}, "source": [ "--------------------------------------------------------------------\n", "Here we load the modules and packages required. *Nothing must be changed here*. " - ] + ], + "metadata": {} }, { "cell_type": "code", "execution_count": 2, - "id": "1053ef28", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "/home/ghiggi/Projects/disdrodb\n" - ] - } - ], "source": [ "# Define project root directory\n", "import os\n", @@ -125,27 +106,33 @@ " os.getcwd()\n", ") # something like /home/ghiggi/Projects/disdrodb\n", "print(root_path)" - ] + ], + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "/home/ghiggi/Projects/disdrodb\n" + ] + } + ], + "metadata": {} }, { "cell_type": "code", "execution_count": 3, - "id": "efd4f3ed", - "metadata": {}, - "outputs": [], "source": [ "# If you didn't installed disdrodb, but you are running this tutorial within the cloned repository:\n", "import sys\n", "\n", "sys.path.insert(0, root_path)" - ] + ], + "outputs": [], + "metadata": {} }, { "cell_type": "code", "execution_count": 4, - "id": "1541060f", - "metadata": {}, - "outputs": [], "source": [ "import logging\n", "import pandas as pd\n", @@ -172,7 +159,7 @@ ")\n", "\n", "# L0A processing\n", - "from disdrodb.l0.L0A_processing import (\n", + "from disdrodb.l0.l0a_processing import (\n", " read_raw_data,\n", " read_raw_file_list,\n", " cast_column_dtypes,\n", @@ -180,7 +167,7 @@ ")\n", "\n", "# L0B processing\n", - "from disdrodb.l0.L0B_processing import (\n", + "from disdrodb.l0.l0b_processing import (\n", " retrieve_l0b_arrays,\n", " create_l0b_from_l0a,\n", " set_encodings,\n", @@ -191,12 +178,12 @@ "\n", "# Standards\n", "from disdrodb.l0.check_standards import check_sensor_name, check_l0a_column_names" - ] + ], + "outputs": [], + "metadata": {} }, { "cell_type": "markdown", - "id": "83f4cb20", - "metadata": {}, "source": [ "**1. Define paths and running parameters**\n", "\n", @@ -205,90 +192,82 @@ "NB:\n", "- In the real use case, the `DATA_SOURCE` and `CAMPAIGN_NAME`should be replaced by meaningul names ! \n", "- The `raw_dir` and `processed_dir` must end with the same `CAMPAIGN_NAME` (in upper case format)" - ] + ], + "metadata": {} }, { "cell_type": "code", "execution_count": 13, - "id": "b5141e00", - "metadata": {}, + "source": [ + "disdrodb_dir = os.path.join(root_path, \"data\", \"DISDRODB\")\n", + "raw_dir = os.path.join(disdrodb_dir, \"Raw\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n", + "processed_dir = os.path.join(disdrodb_dir, \"Processed\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n", + "assert os.path.exists(raw_dir), \"Raw directory does not exist\"\n", + "print(f\"raw_dir: {raw_dir}\")\n", + "print(f\"processed_dir: {processed_dir}\")" + ], "outputs": [ { - "name": "stdout", "output_type": "stream", + "name": "stdout", "text": [ "raw_dir: /home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME\n", "processed_dir: /home/ghiggi/Projects/disdrodb/data/DISDRODB/Processed/DATA_SOURCE/CAMPAIGN_NAME\n" ] } ], - "source": [ - "disdrodb_dir = os.path.join(root_path, \"data\", \"DISDRODB\")\n", - "raw_dir = os.path.join(disdrodb_dir, \"Raw\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n", - "processed_dir = os.path.join(disdrodb_dir, \"Processed\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n", - "assert os.path.exists(raw_dir), \"Raw directory does not exist\"\n", - "print(f\"raw_dir: {raw_dir}\")\n", - "print(f\"processed_dir: {processed_dir}\")" - ] + "metadata": {} }, { "cell_type": "markdown", - "id": "95a2efee", - "metadata": {}, "source": [ "Then we define the reader execution parameters. When the new reader will be created, these parameters will be become the reader function arguments. Please have a look [at the documentation](https://disdrodb.readthedocs.io/en/latest/readers.html#runing-a-reader) to get a full description. " - ] + ], + "metadata": {} }, { "cell_type": "code", "execution_count": 15, - "id": "fcef471a", - "metadata": {}, - "outputs": [], "source": [ "force = True\n", "parallel = False\n", "verbose = True\n", "debugging_mode = True\n", "sensor_name = \"OTT_Parsivel\"" - ] + ], + "outputs": [], + "metadata": {} }, { "cell_type": "markdown", - "id": "e1e69858", - "metadata": {}, "source": [ "**3. Selection of the station**\n", "\n", "In this example, we choose to implement and run the reader for station `station_name_1`. However, feel free to change the station name :)" - ] + ], + "metadata": {} }, { "cell_type": "code", "execution_count": 16, - "id": "34e43ceb", - "metadata": {}, - "outputs": [], "source": [ "station_name = \"station_name_1\"" - ] + ], + "outputs": [], + "metadata": {} }, { "cell_type": "markdown", - "id": "b0228129", - "metadata": {}, "source": [ "**2. Initialization**\n", "\n", "We initiate some checks, and get some variable. *Nothing must be changed here.*" - ] + ], + "metadata": {} }, { "cell_type": "code", "execution_count": 17, - "id": "31a948ae", - "metadata": {}, - "outputs": [], "source": [ "# Create directory structure\n", "create_initial_directory_structure(\n", @@ -298,43 +277,31 @@ " force=force,\n", " verbose=False,\n", ")" - ] + ], + "outputs": [], + "metadata": {} }, { "cell_type": "markdown", - "id": "de7d4a3c", - "metadata": {}, "source": [ "Please, be sure to run the cell above only one time. If it is run many times, the log file blocks the folder creation. " - ] + ], + "metadata": {} }, { "cell_type": "markdown", - "id": "ef0074c9", - "metadata": {}, "source": [ "**4. Get the list of file to process**\n", "\n", "We now list all files that are in selected station.\n", "Here we need to specify the [glob pattern](https://en.wikipedia.org/wiki/Glob_(programming)) that enables to select all the relevant data files. \n", "Since the files in this case study are named like `file_