From 1564732bd5e4afe72f2ee4de6d64594b82099083 Mon Sep 17 00:00:00 2001
From: ghiggi <gionata.ghiggi@gmail.com>
Date: Fri, 24 Feb 2023 18:48:16 +0100
Subject: [PATCH] Fix indentation

---
 ...L0.readers.rst => disdrodb.l0.readers.rst} |    0
 .../api/{disdrodb.L0.rst => disdrodb.l0.rst}  |    0
 .../api/{disdrodb.L1.rst => disdrodb.l1.rst}  |    0
 .../api/{disdrodb.L2.rst => disdrodb.l2.rst}  |    0
 docs/source/reader_preparation.ipynb          | 1210 ++++++++---------
 5 files changed, 559 insertions(+), 651 deletions(-)
 rename docs/source/api/{disdrodb.L0.readers.rst => disdrodb.l0.readers.rst} (100%)
 rename docs/source/api/{disdrodb.L0.rst => disdrodb.l0.rst} (100%)
 rename docs/source/api/{disdrodb.L1.rst => disdrodb.l1.rst} (100%)
 rename docs/source/api/{disdrodb.L2.rst => disdrodb.l2.rst} (100%)

diff --git a/docs/source/api/disdrodb.L0.readers.rst b/docs/source/api/disdrodb.l0.readers.rst
similarity index 100%
rename from docs/source/api/disdrodb.L0.readers.rst
rename to docs/source/api/disdrodb.l0.readers.rst
diff --git a/docs/source/api/disdrodb.L0.rst b/docs/source/api/disdrodb.l0.rst
similarity index 100%
rename from docs/source/api/disdrodb.L0.rst
rename to docs/source/api/disdrodb.l0.rst
diff --git a/docs/source/api/disdrodb.L1.rst b/docs/source/api/disdrodb.l1.rst
similarity index 100%
rename from docs/source/api/disdrodb.L1.rst
rename to docs/source/api/disdrodb.l1.rst
diff --git a/docs/source/api/disdrodb.L2.rst b/docs/source/api/disdrodb.l2.rst
similarity index 100%
rename from docs/source/api/disdrodb.L2.rst
rename to docs/source/api/disdrodb.l2.rst
diff --git a/docs/source/reader_preparation.ipynb b/docs/source/reader_preparation.ipynb
index 9fab0c65..321322a1 100644
--- a/docs/source/reader_preparation.ipynb
+++ b/docs/source/reader_preparation.ipynb
@@ -2,28 +2,24 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "e1a34600",
-   "metadata": {},
    "source": [
     "# Step-by-step guide for DISDRODB reader preparation "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "546ca031",
-   "metadata": {},
    "source": [
     "This notebook aims to guide you through creating the reader for the raw files logged by a disdrometer device. \n",
     "\n",
     "First, this notebook provides functions that display the content of your raw data files and let you investigate it.\n",
     "\n",
     "Next, you will define a series of parameters that determine the reader behaviour. These pieces of code will be consolidated in the [`reader_template.py`](https://github.com/ltelab/disdrodb/blob/main/disdrodb/L0/readers/reader_template.py) file to generate a DISDRODB L0 reader.\n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "1103b734",
-   "metadata": {},
    "source": [
     "In this notebook, we use a lightweight dataset for illustrative purposes. You may reuse and adapt it to explore your own dataset when preparing a new reader. \n",
     "\n",
@@ -32,20 +28,18 @@
     "* Step 1 : We set up the data within the correct directory structure\n",
     "* Step 2 : We start digging into the data to set up the transformation parameters.\n",
     "* Step 3 : We create the new reader"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "7327d18c",
-   "metadata": {},
    "source": [
     "## Step 1: Set up the data within the correct directory structure"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "68c9fee7",
-   "metadata": {},
    "source": [
     "For this example, you will find the sample data in the folder [`data`](https://github.com/ltelab/disdrodb/tree/main/data/DISDRODB) of the [disdrodb](https://github.com/ltelab/disdrodb/) repository. \n",
     "It corresponds to some measurements taken at two stations (`station_name_1` and `station_name_2`) during two days of a field campaign led by the EPFL LTE laboratory.\n",
@@ -72,51 +66,38 @@
     "```\n",
     "\n",
     "This structure fulfills the requirements described in the documentation to [Add a new reader](https://disdrodb.readthedocs.io/en/latest/readers.html#adding-a-new-reader).\n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "8dfbc56d",
-   "metadata": {},
    "source": [
     "## Step 2: Read and analyse the data"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "51e00d83",
-   "metadata": {},
    "source": [
     "Once the dataset and metadata are set up in the correct directory structure, we can now start analysing our data. \n",
     "\n",
     "The objective of Step 2 is to define the specifications to read the raw data into a dataframe and ensure that the dataframe columns match the DISDRODB standards.\n",
     "\n",
     "At the end, you should be able to generate Apache Parquet files from your input raw data. \n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "b0a1ff68",
-   "metadata": {},
    "source": [
     "--------------------------------------------------------------------\n",
     "Here we load the modules and packages required. *Nothing must be changed here*. "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 2,
-   "id": "1053ef28",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "/home/ghiggi/Projects/disdrodb\n"
-     ]
-    }
-   ],
    "source": [
     "# Define project root directory\n",
     "import os\n",
@@ -125,27 +106,33 @@
     "    os.getcwd()\n",
     ")  # something like /home/ghiggi/Projects/disdrodb\n",
     "print(root_path)"
-   ]
+   ],
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "/home/ghiggi/Projects/disdrodb\n"
+     ]
+    }
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 3,
-   "id": "efd4f3ed",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "# If you have not installed disdrodb but are running this tutorial within the cloned repository:\n",
     "import sys\n",
     "\n",
     "sys.path.insert(0, root_path)"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 4,
-   "id": "1541060f",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "import logging\n",
     "import pandas as pd\n",
@@ -172,7 +159,7 @@
     ")\n",
     "\n",
     "# L0A processing\n",
-    "from disdrodb.l0.L0A_processing import (\n",
+    "from disdrodb.l0.l0a_processing import (\n",
     "    read_raw_data,\n",
     "    read_raw_file_list,\n",
     "    cast_column_dtypes,\n",
@@ -180,7 +167,7 @@
     ")\n",
     "\n",
     "# L0B processing\n",
-    "from disdrodb.l0.L0B_processing import (\n",
+    "from disdrodb.l0.l0b_processing import (\n",
     "    retrieve_l0b_arrays,\n",
     "    create_l0b_from_l0a,\n",
     "    set_encodings,\n",
@@ -191,12 +178,12 @@
     "\n",
     "# Standards\n",
     "from disdrodb.l0.check_standards import check_sensor_name, check_l0a_column_names"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "83f4cb20",
-   "metadata": {},
    "source": [
     "**1. Define paths and running parameters**\n",
     "\n",
@@ -205,90 +192,82 @@
     "NB:\n",
     "- In a real use case, `DATA_SOURCE` and `CAMPAIGN_NAME` should be replaced by meaningful names! \n",
     "- The `raw_dir` and `processed_dir` must end with the same `CAMPAIGN_NAME` (in upper case format)"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 13,
-   "id": "b5141e00",
-   "metadata": {},
+   "source": [
+    "disdrodb_dir = os.path.join(root_path, \"data\", \"DISDRODB\")\n",
+    "raw_dir = os.path.join(disdrodb_dir, \"Raw\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n",
+    "processed_dir = os.path.join(disdrodb_dir, \"Processed\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n",
+    "assert os.path.exists(raw_dir), \"Raw directory does not exist\"\n",
+    "print(f\"raw_dir: {raw_dir}\")\n",
+    "print(f\"processed_dir: {processed_dir}\")"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "raw_dir: /home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME\n",
       "processed_dir: /home/ghiggi/Projects/disdrodb/data/DISDRODB/Processed/DATA_SOURCE/CAMPAIGN_NAME\n"
      ]
     }
    ],
-   "source": [
-    "disdrodb_dir = os.path.join(root_path, \"data\", \"DISDRODB\")\n",
-    "raw_dir = os.path.join(disdrodb_dir, \"Raw\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n",
-    "processed_dir = os.path.join(disdrodb_dir, \"Processed\", \"DATA_SOURCE\", \"CAMPAIGN_NAME\")\n",
-    "assert os.path.exists(raw_dir), \"Raw directory does not exist\"\n",
-    "print(f\"raw_dir: {raw_dir}\")\n",
-    "print(f\"processed_dir: {processed_dir}\")"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "95a2efee",
-   "metadata": {},
    "source": [
     "Then we define the reader execution parameters. When the new reader is created, these parameters will become the reader function arguments. Please have a look [at the documentation](https://disdrodb.readthedocs.io/en/latest/readers.html#runing-a-reader) to get a full description. "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 15,
-   "id": "fcef471a",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "force = True\n",
     "parallel = False\n",
     "verbose = True\n",
     "debugging_mode = True\n",
     "sensor_name = \"OTT_Parsivel\""
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "e1e69858",
-   "metadata": {},
    "source": [
     "**3. Selection of the station**\n",
     "\n",
     "In this example, we choose  to implement and run the reader for station `station_name_1`. However, feel free to change the station name :)"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 16,
-   "id": "34e43ceb",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "station_name = \"station_name_1\""
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "b0228129",
-   "metadata": {},
    "source": [
     "**2. Initialization**\n",
     "\n",
     "We run some checks and retrieve some variables. *Nothing must be changed here.*"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 17,
-   "id": "31a948ae",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "# Create directory structure\n",
     "create_initial_directory_structure(\n",
@@ -298,43 +277,31 @@
     "    force=force,\n",
     "    verbose=False,\n",
     ")"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "de7d4a3c",
-   "metadata": {},
    "source": [
     "Please be sure to run the cell above only once. If it is run multiple times, the log file blocks the folder creation.  "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "ef0074c9",
-   "metadata": {},
    "source": [
     "**4. Get the list of files to process**\n",
     "\n",
     "We now list all files of the selected station.\n",
     "Here we need to specify the [glob pattern](https://en.wikipedia.org/wiki/Glob_(programming)) that selects all the relevant data files. \n",
     "Since the files in this case study are named like `file<XXX>_<TIME>.dat.gz`, we define the glob pattern `\"*.dat*\"`. Note that `\"*.dat.gz\"` or `\"file*.dat.gz\"` would also have worked.\n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 18,
-   "id": "5bace37d",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      " -  - 2 files to process in /home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME/data/station_name_1\n",
-      "['/home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME/data/station_name_1/file60_20180817.dat.gz', '/home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME/data/station_name_1/file60_20180818.dat.gz']\n"
-     ]
-    }
-   ],
    "source": [
     "glob_pattern = \"*.dat*\"\n",
     "\n",
@@ -347,36 +314,42 @@
     ")\n",
     "\n",
     "print(file_list)"
-   ]
+   ],
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      " -  - 2 files to process in /home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME/data/station_name_1\n",
+      "['/home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME/data/station_name_1/file60_20180817.dat.gz', '/home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME/data/station_name_1/file60_20180818.dat.gz']\n"
+     ]
+    }
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "1ba6bb6f",
-   "metadata": {},
    "source": [
     "🚨 The `glob_pattern` variable definition will be transferred into the [`reader_template.py`](https://github.com/ltelab/disdrodb/blob/main/disdrodb/L0/readers/reader_template.py) file at the end of this notebook.\n",
     "\n",
     "Remember that the `glob_pattern` variable depends on the file extensions of your dataset !!!"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "ddb42a48",
-   "metadata": {},
    "source": [
     "**5. Retrieve metadata from YAML files**\n",
     "\n",
     "We now load the metadata file of the station.\n",
     "\n",
     "If the name of the station is not correctly defined, an error message is raised."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 19,
-   "id": "293c8a91",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "# Retrieve metadata\n",
     "attrs = read_metadata(campaign_dir=raw_dir, station_name=station_name)\n",
@@ -384,33 +357,145 @@
     "# Retrieve sensor name\n",
     "sensor_name = attrs[\"sensor_name\"]\n",
     "check_sensor_name(sensor_name)"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "fa1cc737",
-   "metadata": {},
    "source": [
     "**5. Load one file into a dataframe**\n",
     "\n",
     "In the  `reader_kwargs` dictionary, you may set [any arguments](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html) that need to be passed to read the raw text file into a `pandas.DataFrame`."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 20,
-   "id": "06e74312",
-   "metadata": {},
+   "source": [
+    "reader_kwargs = {}\n",
+    "\n",
+    "# - Define delimiter\n",
+    "reader_kwargs[\"delimiter\"] = \",\"\n",
+    "\n",
+    "# - Avoid first column to become df index !!!\n",
+    "reader_kwargs[\"index_col\"] = False\n",
+    "\n",
+    "# Since column names are expected to be passed explicitly, header is set to None\n",
+    "reader_kwargs[\"header\"] = None\n",
+    "\n",
+    "# - Number of rows to be skipped at the beginning of the file\n",
+    "reader_kwargs[\"skiprows\"] = None\n",
+    "\n",
+    "# - Define behaviour when encountering bad lines\n",
+    "reader_kwargs[\"on_bad_lines\"] = \"skip\"\n",
+    "\n",
+    "# - Define reader engine\n",
+    "#   - C engine is faster\n",
+    "#   - Python engine is more feature-complete\n",
+    "reader_kwargs[\"engine\"] = \"python\"\n",
+    "\n",
+    "# - Define on-the-fly decompression of on-disk data\n",
+    "#   - Available: gzip, bz2, zip\n",
+    "reader_kwargs[\"compression\"] = \"infer\"\n",
+    "\n",
+    "# - Strings to recognize as NA/NaN and replace with standard NA flags\n",
+    "#   - Already included: ‘#N/A’, ‘#N/A N/A’, ‘#NA’, ‘-1.#IND’, ‘-1.#QNAN’,\n",
+    "#                       ‘-NaN’, ‘-nan’, ‘1.#IND’, ‘1.#QNAN’, ‘<NA>’, ‘N/A’,\n",
+    "#                       ‘NA’, ‘NULL’, ‘NaN’, ‘n/a’, ‘nan’, ‘null’\n",
+    "reader_kwargs[\"na_values\"] = [\"na\", \"\", \"error\"]\n",
+    "\n",
+    "\n",
+    "# -----------------------------------------------------------\n",
+    "# Select first file\n",
+    "filepath = file_list[0]\n",
+    "\n",
+    "# Try to read the raw file\n",
+    "df_raw = read_raw_data(filepath, column_names=None, reader_kwargs=reader_kwargs)\n",
+    "# Print the dataframe\n",
+    "print(f\"Dataframe for the file {os.path.basename(filepath)} :\")\n",
+    "display(df_raw)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "Dataframe for the file file60_20180817.dat.gz :\n"
      ]
     },
     {
+     "output_type": "display_data",
      "data": {
+      "text/plain": [
+       "          0          1           2                    3    4   5         6   \\\n",
+       "0     362511  4612.0301  00847.4977  01-08-2018 12:44:30  NaN  OK  0000.000   \n",
+       "1     362512  4612.0301  00847.4978  01-08-2018 12:45:01  NaN  OK  0000.000   \n",
+       "2     362513  4612.0301  00847.4985  01-08-2018 12:45:30  NaN  OK  0000.000   \n",
+       "3     362514  4612.0305  00847.4990  01-08-2018 12:46:01  NaN  OK  0000.000   \n",
+       "4     362515  4612.0303  00847.4992  01-08-2018 12:46:31  NaN  OK  0000.000   \n",
+       "...      ...        ...         ...                  ...  ...  ..       ...   \n",
+       "4736  367249  4612.0313  00847.4956  03-08-2018 04:13:25  NaN  OK  0000.000   \n",
+       "4737  367250  4612.0313  00847.4955  03-08-2018 04:13:56  NaN  OK  0000.000   \n",
+       "4738  367251  4612.0313  00847.4955  03-08-2018 04:14:26  NaN  OK  0000.000   \n",
+       "4739  367252  4612.0313  00847.4954  03-08-2018 04:14:55  NaN  OK  0000.000   \n",
+       "4740  367253  4612.0313  00847.4954  03-08-2018 04:15:25  NaN  OK  0000.000   \n",
+       "\n",
+       "           7   8   9   ...   14    15    16 17       18   19  \\\n",
+       "0     0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
+       "1     0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
+       "2     0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
+       "3     0056.49  00  00  ...  035  0.05  24.9  0  005.649  000   \n",
+       "4     0056.49  00  00  ...  034  0.06  24.9  0  005.649  000   \n",
+       "...       ...  ..  ..  ...  ...   ...   ... ..      ...  ...   \n",
+       "4736  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
+       "4737  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
+       "4738  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
+       "4739  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
+       "4740  0056.71  00  00  ...  015  0.07  24.9  0  005.671  000   \n",
+       "\n",
+       "                                                     20  \\\n",
+       "0     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "1     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "2     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "3     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "...                                                 ...   \n",
+       "4736  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4737  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4738  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4739  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4740  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "\n",
+       "                                                     21  \\\n",
+       "0     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "1     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "2     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "3     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "...                                                 ...   \n",
+       "4736  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4737  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4738  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4739  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4740  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "\n",
+       "                                                     22 23  \n",
+       "0     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "1     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "2     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "3     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "4     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "...                                                 ... ..  \n",
+       "4736  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "4737  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "4738  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "4739  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "4740  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "\n",
+       "[4741 rows x 24 columns]"
+      ],
       "text/html": [
        "<div>\n",
        "<style scoped>\n",
@@ -722,134 +807,24 @@
        "</table>\n",
        "<p>4741 rows × 24 columns</p>\n",
        "</div>"
-      ],
-      "text/plain": [
-       "          0          1           2                    3    4   5         6   \\\n",
-       "0     362511  4612.0301  00847.4977  01-08-2018 12:44:30  NaN  OK  0000.000   \n",
-       "1     362512  4612.0301  00847.4978  01-08-2018 12:45:01  NaN  OK  0000.000   \n",
-       "2     362513  4612.0301  00847.4985  01-08-2018 12:45:30  NaN  OK  0000.000   \n",
-       "3     362514  4612.0305  00847.4990  01-08-2018 12:46:01  NaN  OK  0000.000   \n",
-       "4     362515  4612.0303  00847.4992  01-08-2018 12:46:31  NaN  OK  0000.000   \n",
-       "...      ...        ...         ...                  ...  ...  ..       ...   \n",
-       "4736  367249  4612.0313  00847.4956  03-08-2018 04:13:25  NaN  OK  0000.000   \n",
-       "4737  367250  4612.0313  00847.4955  03-08-2018 04:13:56  NaN  OK  0000.000   \n",
-       "4738  367251  4612.0313  00847.4955  03-08-2018 04:14:26  NaN  OK  0000.000   \n",
-       "4739  367252  4612.0313  00847.4954  03-08-2018 04:14:55  NaN  OK  0000.000   \n",
-       "4740  367253  4612.0313  00847.4954  03-08-2018 04:15:25  NaN  OK  0000.000   \n",
-       "\n",
-       "           7   8   9   ...   14    15    16 17       18   19  \\\n",
-       "0     0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
-       "1     0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
-       "2     0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
-       "3     0056.49  00  00  ...  035  0.05  24.9  0  005.649  000   \n",
-       "4     0056.49  00  00  ...  034  0.06  24.9  0  005.649  000   \n",
-       "...       ...  ..  ..  ...  ...   ...   ... ..      ...  ...   \n",
-       "4736  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
-       "4737  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
-       "4738  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
-       "4739  0056.71  00  00  ...  015  0.06  24.9  0  005.671  000   \n",
-       "4740  0056.71  00  00  ...  015  0.07  24.9  0  005.671  000   \n",
-       "\n",
-       "                                                     20  \\\n",
-       "0     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "1     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "2     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "3     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "...                                                 ...   \n",
-       "4736  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4737  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4738  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4739  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4740  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "\n",
-       "                                                     21  \\\n",
-       "0     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "1     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "2     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "3     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "...                                                 ...   \n",
-       "4736  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4737  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4738  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4739  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4740  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "\n",
-       "                                                     22 23  \n",
-       "0     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "1     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "2     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "3     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "4     000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "...                                                 ... ..  \n",
-       "4736  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "4737  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "4738  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "4739  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "4740  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "\n",
-       "[4741 rows x 24 columns]"
       ]
      },
-     "metadata": {},
-     "output_type": "display_data"
+     "metadata": {}
     }
    ],
-   "source": [
-    "reader_kwargs = {}\n",
-    "\n",
-    "# - Define delimiter\n",
-    "reader_kwargs[\"delimiter\"] = \",\"\n",
-    "\n",
-    "# - Avoid first column to become df index !!!\n",
-    "reader_kwargs[\"index_col\"] = False\n",
-    "\n",
-    "# Since column names are expected to be passed explicitly, header is set to None\n",
-    "reader_kwargs[\"header\"] = None\n",
-    "\n",
-    "# - Number of rows to be skipped at the beginning of the file\n",
-    "reader_kwargs[\"skiprows\"] = None\n",
-    "\n",
-    "# - Define behaviour when encountering bad lines\n",
-    "reader_kwargs[\"on_bad_lines\"] = \"skip\"\n",
-    "\n",
-    "# - Define reader engine\n",
-    "#   - C engine is faster\n",
-    "#   - Python engine is more feature-complete\n",
-    "reader_kwargs[\"engine\"] = \"python\"\n",
-    "\n",
-    "# - Define on-the-fly decompression of on-disk data\n",
-    "#   - Available: gzip, bz2, zip\n",
-    "reader_kwargs[\"compression\"] = \"infer\"\n",
-    "\n",
-    "# - Strings to recognize as NA/NaN and replace with standard NA flags\n",
-    "#   - Already included: ‘#N/A’, ‘#N/A N/A’, ‘#NA’, ‘-1.#IND’, ‘-1.#QNAN’,\n",
-    "#                       ‘-NaN’, ‘-nan’, ‘1.#IND’, ‘1.#QNAN’, ‘<NA>’, ‘N/A’,\n",
-    "#                       ‘NA’, ‘NULL’, ‘NaN’, ‘n/a’, ‘nan’, ‘null’\n",
-    "reader_kwargs[\"na_values\"] = [\"na\", \"\", \"error\"]\n",
-    "\n",
-    "\n",
-    "# -----------------------------------------------------------\n",
-    "# Select first file\n",
-    "filepath = file_list[0]\n",
-    "\n",
-    "# Try to read the raw file\n",
-    "df_raw = read_raw_data(filepath, column_names=None, reader_kwargs=reader_kwargs)\n",
-    "# Print the dataframe\n",
-    "print(f\"Dataframe for the file {os.path.basename(filepath)} :\")\n",
-    "display(df_raw)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 22,
-   "id": "8467d8cc",
-   "metadata": {},
+   "source": [
+    "print(\"Column names:\", df_raw.columns)\n",
+    "print(\"Row Index:\", df_raw.index)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "Column names: Int64Index([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,\n",
       "            17, 18, 19, 20, 21, 22, 23],\n",
@@ -858,37 +833,30 @@
      ]
     }
    ],
-   "source": [
-    "print(\"Column names:\", df_raw.columns)\n",
-    "print(\"Row Index:\", df_raw.index)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "c15137c3",
-   "metadata": {},
    "source": [
     "Here we expect the `df_raw` to have: \n",
     "- numeric column names (i.e.  `Int64Index`) \n",
     "- numeric row index (i.e. `RangeIndex`)  \n",
     "\n",
     "If the structure of the dataframe looks fine (no header and no row index), we are on the right track! \n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "4abbee26",
-   "metadata": {},
    "source": [
     "Depending on the schema of your data, this `reader_kwargs` dictionary may be fairly different from the one above. \n",
     "\n",
     "> 🚨 The `reader_kwargs` dictionary will be transferred to the [`reader_template.py`](https://github.com/ltelab/disdrodb/blob/main/disdrodb/L0/readers/reader_template.py) file at the end of this notebook. "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "d840a872",
-   "metadata": {},
    "source": [
     "**6. Data exploration**\n",
     "\n",
@@ -899,26 +867,27 @@
     "* Do not assign column names to the dataframe columns yet\n",
     "* Do not assign a dtype to the dataframe columns yet\n",
     "* Possibly look at multiple files ;)\n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "e2fac36b",
-   "metadata": {},
    "source": [
     "We print the content of the first 3 rows:\n",
     " (*Feel free to change the value of n to see more/less rows*)"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 23,
-   "id": "e93889f2",
-   "metadata": {},
+   "source": [
+    "print_df_first_n_rows(df_raw, n=2, column_names=False)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - Column 0 :\n",
       "      ['362511' '362512' '362513']\n",
@@ -977,18 +946,46 @@
      ]
     }
    ],
-   "source": [
-    "print_df_first_n_rows(df_raw, n=2, column_names=False)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 24,
-   "id": "c7f0b258",
-   "metadata": {},
+   "source": [
+    "df_raw.head(3)"
+   ],
    "outputs": [
     {
+     "output_type": "execute_result",
      "data": {
+      "text/plain": [
+       "       0          1           2                    3    4   5         6   \\\n",
+       "0  362511  4612.0301  00847.4977  01-08-2018 12:44:30  NaN  OK  0000.000   \n",
+       "1  362512  4612.0301  00847.4978  01-08-2018 12:45:01  NaN  OK  0000.000   \n",
+       "2  362513  4612.0301  00847.4985  01-08-2018 12:45:30  NaN  OK  0000.000   \n",
+       "\n",
+       "        7   8   9   ...   14    15    16 17       18   19  \\\n",
+       "0  0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
+       "1  0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
+       "2  0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
+       "\n",
+       "                                                  20  \\\n",
+       "0  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "1  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "2  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "\n",
+       "                                                  21  \\\n",
+       "0  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "1  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "2  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "\n",
+       "                                                  22 23  \n",
+       "0  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "1  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "2  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
+       "\n",
+       "[3 rows x 24 columns]"
+      ],
       "text/html": [
        "<div>\n",
        "<style scoped>\n",
@@ -1108,62 +1105,31 @@
        "</table>\n",
        "<p>3 rows × 24 columns</p>\n",
        "</div>"
-      ],
-      "text/plain": [
-       "       0          1           2                    3    4   5         6   \\\n",
-       "0  362511  4612.0301  00847.4977  01-08-2018 12:44:30  NaN  OK  0000.000   \n",
-       "1  362512  4612.0301  00847.4978  01-08-2018 12:45:01  NaN  OK  0000.000   \n",
-       "2  362513  4612.0301  00847.4985  01-08-2018 12:45:30  NaN  OK  0000.000   \n",
-       "\n",
-       "        7   8   9   ...   14    15    16 17       18   19  \\\n",
-       "0  0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
-       "1  0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
-       "2  0056.49  00  00  ...  035  0.06  24.9  0  005.649  000   \n",
-       "\n",
-       "                                                  20  \\\n",
-       "0  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "1  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "2  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "\n",
-       "                                                  21  \\\n",
-       "0  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "1  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "2  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "\n",
-       "                                                  22 23  \n",
-       "0  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "1  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "2  000,000,000,000,000,000,000,000,000,000,000,00...  0  \n",
-       "\n",
-       "[3 rows x 24 columns]"
       ]
      },
-     "execution_count": 24,
      "metadata": {},
-     "output_type": "execute_result"
+     "execution_count": 24
     }
    ],
-   "source": [
-    "df_raw.head(3)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "a0d34466",
-   "metadata": {},
    "source": [
     "We print the content of `n` rows picked at random:"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 25,
-   "id": "470c735b",
-   "metadata": {},
+   "source": [
+    "print_df_random_n_rows(df_raw, n=6, with_column_names=False)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "- Column 0 : ['365205' '363869' '366700' '366371' '366659' '363330']\n",
       "- Column 1 : ['4612.0319' '4612.0293' '4612.0293' '4612.0312' '4612.0305' '4612.0328']\n",
@@ -1209,85 +1175,79 @@
      ]
     }
    ],
-   "source": [
-    "print_df_random_n_rows(df_raw, n=6, with_column_names=False)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "a1c012d8",
-   "metadata": {},
    "source": [
     "Get the number of columns:"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 26,
-   "id": "1f7cb7cd",
-   "metadata": {},
+   "source": [
+    "len(df_raw.columns)"
+   ],
    "outputs": [
     {
+     "output_type": "execute_result",
      "data": {
       "text/plain": [
        "24"
       ]
      },
-     "execution_count": 26,
      "metadata": {},
-     "output_type": "execute_result"
+     "execution_count": 26
     }
    ],
-   "source": [
-    "len(df_raw.columns)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "2a29ab66",
-   "metadata": {},
    "source": [
     "Look at unique values for a single column :"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 27,
-   "id": "afefbf9e",
-   "metadata": {},
+   "source": [
+    "print_df_columns_unique_values(df_raw, column_indices=11, column_names=False)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - Column 11 :\n",
       "      ['0824', '0906', '1363', '1397', '2921', '3203', '3326', '3816', '4465', '9999']\n"
      ]
     }
    ],
-   "source": [
-    "print_df_columns_unique_values(df_raw, column_indices=11, column_names=False)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "1fd596dd",
-   "metadata": {},
    "source": [
     "Look at unique values for a few columns :\n",
     "\n",
     "Note: Use `column_indices=None` to get the unique values for all columns"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 28,
-   "id": "c2a2141d",
-   "metadata": {},
+   "source": [
+    "print_df_columns_unique_values(df_raw, column_indices=slice(10, 12), column_names=False)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - Column 10 :\n",
       "      ['-9.999', '02.669', '04.241', '04.745', '04.826', '04.879', '05.430', '06.095', '06.220', '07.415', '08.436', '08.489', '08.506', '08.724', '08.956', '09.079', '09.894', '10.057', '10.567', '11.705', '12.097', '12.390', '12.923', '13.114', '13.407', '13.684', '14.324', '15.060', '16.530', '16.636', '16.668', '17.194', '17.382', '17.829', '17.918', '18.334', '18.655', '19.526', '20.329', '21.134', '21.426', '23.098', '23.664', '23.760', '24.472', '25.473', '25.957', '29.270', '31.271', '32.255', '33.844', '36.196']\n",
@@ -1296,25 +1256,26 @@
      ]
     }
    ],
-   "source": [
-    "print_df_columns_unique_values(df_raw, column_indices=slice(10, 12), column_names=False)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "520d948d",
-   "metadata": {},
    "source": [
     "Get the unique values as a dictionary:"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 29,
-   "id": "f6e178a1",
-   "metadata": {},
+   "source": [
+    "get_df_columns_unique_values_dict(\n",
+    "    df_raw, column_indices=slice(10, 12), column_names=False\n",
+    ")"
+   ],
    "outputs": [
     {
+     "output_type": "execute_result",
      "data": {
       "text/plain": [
        "{'Column 10': ['-9.999',\n",
@@ -1381,42 +1342,37 @@
        "  '9999']}"
       ]
      },
-     "execution_count": 29,
      "metadata": {},
-     "output_type": "execute_result"
+     "execution_count": 29
     }
    ],
-   "source": [
-    "get_df_columns_unique_values_dict(\n",
-    "    df_raw, column_indices=slice(10, 12), column_names=False\n",
-    ")"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "ec2b51d7",
-   "metadata": {},
    "source": [
     "**7. Column names**\n",
     "\n",
     "Now that we have validated the content of our data, it's time to take care of its structure (the column names)."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "045d1826",
-   "metadata": {},
    "source": [
     "The function `infer_df_str_column_names()` tries to guess candidate names for each column from string patterns, according to `L0A_encodings.yml` and the type of sensor."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 30,
-   "id": "d8959784",
-   "metadata": {},
+   "source": [
+    "infer_df_str_column_names(df_raw, sensor_name=sensor_name)"
+   ],
    "outputs": [
     {
+     "output_type": "execute_result",
      "data": {
       "text/plain": [
        "{0: [],\n",
@@ -1445,47 +1401,40 @@
        " 23: ['sensor_status']}"
       ]
      },
-     "execution_count": 30,
      "metadata": {},
-     "output_type": "execute_result"
+     "execution_count": 30
     }
    ],
-   "source": [
-    "infer_df_str_column_names(df_raw, sensor_name=sensor_name)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "7e34acf1",
-   "metadata": {},
    "source": [
     "This can help us define the `column_names` list later.\n",
     "\n",
     "For reference, here is the list of valid column names (taken from `L0A_encodings.yml`):"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 31,
-   "id": "612d7c6f",
-   "metadata": {},
+   "source": [
+    "print_valid_L0_column_names(sensor_name)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "['rainfall_rate_32bit', 'rainfall_accumulated_32bit', 'weather_code_synop_4680', 'weather_code_synop_4677', 'weather_code_metar_4678', 'weather_code_nws', 'reflectivity_32bit', 'mor_visibility', 'sample_interval', 'laser_amplitude', 'number_particles', 'sensor_temperature', 'sensor_serial_number', 'firmware_iop', 'firmware_dsp', 'sensor_heating_current', 'sensor_battery_voltage', 'sensor_status', 'start_time', 'sensor_time', 'sensor_date', 'station_name', 'station_number', 'rainfall_amount_absolute_32bit', 'error_code', 'rainfall_rate_16bit', 'rainfall_rate_12bit', 'rainfall_accumulated_16bit', 'reflectivity_16bit', 'raw_drop_concentration', 'raw_drop_average_velocity', 'raw_drop_number']\n"
      ]
     }
    ],
-   "source": [
-    "print_valid_L0_column_names(sensor_name)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "b97019e1",
-   "metadata": {},
    "source": [
     "It's now time to define our column names:\n",
     "\n",
@@ -1493,14 +1442,12 @@
     "* get information from the disdrometer user guide and the data logger employed. \n",
     "* use `infer_df_str_column_names()` to help you\n",
     "* analyse the content column after column with `print_df_columns_unique_values()`  "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 32,
-   "id": "5063d300",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "column_names = [\n",
     "    \"unknown1\",\n",
@@ -1528,33 +1475,34 @@
     "    \"raw_drop_number\",\n",
     "    \"unknown6\",\n",
     "]"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "e0e04424",
-   "metadata": {},
    "source": [
     "> 🚨 The `column_names` list will be transferred to the [`reader_template.py`](https://github.com/ltelab/disdrodb/blob/main/disdrodb/L0/readers/reader_template.py) file at the end of this notebook. "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "1d74b116",
-   "metadata": {},
    "source": [
     "Check the validity of your definition "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 33,
-   "id": "28b1845c",
-   "metadata": {},
+   "source": [
+    "check_column_names(column_names, sensor_name)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "The following columns do no met the DISDRODB standards: ['unknown2', 'timestep', 'unknown4', 'unknown1', 'unknown6', 'unknown3', 'unknown5'].\n",
       "Please remove such columns within the df_sanitizer_fun\n",
@@ -1563,60 +1511,55 @@
      ]
     }
    ],
-   "source": [
-    "check_column_names(column_names, sensor_name)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "591f7651",
-   "metadata": {},
    "source": [
     "Ok, fair enough.\n",
     "There are columns that need to be removed, and we also need to define a `time` column with dtype `datetime` to meet the DISDRODB standards.\n",
     "\n",
     "These points will be addressed in Section 10 of this notebook!"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "a400794a",
-   "metadata": {},
    "source": [
     "**8. Read the dataframe with the correct column names**\n",
     "\n",
     "We can now create a new dataframe with the column names:"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 34,
-   "id": "0d1ffc85",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "df = read_raw_data(\n",
     "    filepath=filepath, column_names=column_names, reader_kwargs=reader_kwargs\n",
     ")"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "907c6ebc",
-   "metadata": {},
    "source": [
     "And print the dataframe column names : "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 35,
-   "id": "7e829beb",
-   "metadata": {},
+   "source": [
+    "print_df_column_names(df)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - Column 0 : unknown1\n",
       " - Column 1 : unknown2\n",
@@ -1645,35 +1588,34 @@
      ]
     }
    ],
-   "source": [
-    "print_df_column_names(df)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "4066c4bf",
-   "metadata": {},
    "source": [
     "**9. Perform further tests and analysis to check the correctness of `column_names`**"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "2fc28539",
-   "metadata": {},
    "source": [
     "For example, you can check some summary statistics for a specific column."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 36,
-   "id": "29ec6e8c",
-   "metadata": {},
+   "source": [
+    "column_name = \"rainfall_rate_32bit\"\n",
+    "array_of_values = df.loc[:, [column_name]].astype(\"float\")\n",
+    "print_df_summary_stats(array_of_values)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - Column 0 ( rainfall_rate_32bit ):\n",
       "                    \n",
@@ -1686,37 +1628,33 @@
      ]
     }
    ],
-   "source": [
-    "column_name = \"rainfall_rate_32bit\"\n",
-    "array_of_values = df.loc[:, [column_name]].astype(\"float\")\n",
-    "print_df_summary_stats(array_of_values)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "06268e19",
-   "metadata": {},
    "source": [
     "**10. Final columns formatting**"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 37,
-   "id": "9f3febec",
-   "metadata": {},
+   "source": [
+    "check_l0a_column_names(df, sensor_name=sensor_name)"
+   ],
    "outputs": [
     {
-     "name": "stderr",
      "output_type": "stream",
+     "name": "stderr",
      "text": [
       "The following columns do no met the DISDRODB standards: ['unknown2', 'timestep', 'unknown4', 'unknown1', 'unknown6', 'unknown3', 'unknown5']\n"
      ]
     },
     {
+     "output_type": "error",
      "ename": "ValueError",
      "evalue": "The following columns do no met the DISDRODB standards: ['unknown2', 'timestep', 'unknown4', 'unknown1', 'unknown6', 'unknown3', 'unknown5']",
-     "output_type": "error",
      "traceback": [
       "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
       "\u001b[0;31mValueError\u001b[0m                                Traceback (most recent call last)",
@@ -1726,19 +1664,18 @@
      ]
     }
    ],
-   "source": [
-    "check_l0a_column_names(df, sensor_name=sensor_name)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 38,
-   "id": "e2096f88",
-   "metadata": {},
+   "source": [
+    "check_column_names(column_names, sensor_name)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "The following columns do no met the DISDRODB standards: ['unknown2', 'timestep', 'unknown4', 'unknown1', 'unknown6', 'unknown3', 'unknown5'].\n",
       "Please remove such columns within the df_sanitizer_fun\n",
@@ -1747,84 +1684,76 @@
      ]
     }
    ],
-   "source": [
-    "check_column_names(column_names, sensor_name)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "84da87fe",
-   "metadata": {},
    "source": [
     "Now, it's time to remove all the columns that do not match the DISDRODB standard."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 39,
-   "id": "bb368134",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "df = df.drop(\n",
     "    columns=[\"unknown1\", \"unknown2\", \"unknown3\", \"unknown4\", \"unknown5\", \"unknown6\"]\n",
     ")"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "040c283a",
-   "metadata": {},
    "source": [
     "It's also time to define the `time` column, which is required by the DISDRODB standard."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 40,
-   "id": "147f96d0",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "df[\"time\"] = pd.to_datetime(df[\"timestep\"], format=\"%m-%d-%Y %H:%M:%S\")\n",
     "df = df.drop(columns=[\"timestep\"])"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "e15c85b7",
-   "metadata": {},
    "source": [
     "Check that the column names meet the DISDRODB standards after the custom processing:"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 41,
-   "id": "de6aa80e",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "check_l0a_column_names(df, sensor_name=sensor_name)"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "91587cca",
-   "metadata": {},
    "source": [
     "Check that the dataframe looks as desired:"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 42,
-   "id": "d9b7e269",
-   "metadata": {},
+   "source": [
+    "print_df_column_names(df)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - Column 0 : rainfall_rate_32bit\n",
       " - Column 1 : rainfall_accumulated_32bit\n",
@@ -1847,19 +1776,18 @@
      ]
     }
    ],
-   "source": [
-    "print_df_column_names(df)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 43,
-   "id": "f06da442",
-   "metadata": {},
+   "source": [
+    "print_df_random_n_rows(df, n=5)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "- Column 0 (rainfall_rate_32bit) : ['0000.000' '0000.000' '0000.000' '0000.000' '0000.114']\n",
       "- Column 1 (rainfall_accumulated_32bit) : ['0056.67' '0056.52' '0056.67' '0056.71' '0056.67']\n",
@@ -1896,33 +1824,28 @@
      ]
     }
    ],
-   "source": [
-    "print_df_random_n_rows(df, n=5)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 44,
-   "id": "3f700c36",
-   "metadata": {},
+   "source": [
+    "print_df_columns_unique_values(df, column_indices=2, column_names=True)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - Column 2 ( weather_code_synop_4680 ):\n",
       "      ['00', '57', '61', '62', '71', '72', '88']\n"
      ]
     }
    ],
-   "source": [
-    "print_df_columns_unique_values(df, column_indices=2, column_names=True)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "0f3a7881",
-   "metadata": {},
    "source": [
     "**11. Define the dataframe sanitizer function**\n",
     "\n",
@@ -1931,14 +1854,12 @@
     "With the data used in this notebook, we need to drop some columns and define the `time` column.\n",
     "\n",
     "From the code defined in Section 10, we define the following function: "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 45,
-   "id": "636f434c",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "def df_sanitizer_fun(df):\n",
     "    # Import pandas\n",
@@ -1962,30 +1883,28 @@
     "\n",
     "    # - Return the dataframe\n",
     "    return df"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "217a2627",
-   "metadata": {},
    "source": [
     "> 🚨 The `df_sanitizer_fun()` function will be transferred to the [`reader_template.py`](https://github.com/ltelab/disdrodb/blob/main/disdrodb/L0/readers/reader_template.py) file at the end of this notebook. "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "8561c1b5",
-   "metadata": {},
    "source": [
     "**12. Now let's try calling the reader function as it will be called in the DISDRODB L0 reader**\n",
     "\n",
     "* You may try with an increasing number of files (update `file_list`)\n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "d5829e79",
-   "metadata": {},
    "source": [
     "Here we combine all raw files into a single dataframe.\n",
     "\n",
@@ -1997,24 +1916,144 @@
     "* `df_sanitizer_fun`: the function to sanitize the data frame (defined previously)\n",
     "\n",
     "All these arguments are defined either in the data directory structure, or earlier in the code."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 46,
-   "id": "b3d8d4c4",
-   "metadata": {},
+   "source": [
+    "subset_file_list = file_list[:1]\n",
+    "\n",
+    "df = read_raw_file_list(\n",
+    "    file_list=subset_file_list,\n",
+    "    column_names=column_names,\n",
+    "    reader_kwargs=reader_kwargs,\n",
+    "    sensor_name=sensor_name,\n",
+    "    verbose=verbose,\n",
+    "    df_sanitizer_fun=df_sanitizer_fun,\n",
+    ")\n",
+    "display(df)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       " - 1 / 1 processed successfully. File name: /home/ghiggi/Projects/disdrodb/data/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME/data/station_name_1/file60_20180817.dat.gz\n",
       " -  - 0 of 1 have been skipped.\n"
      ]
     },
     {
+     "output_type": "display_data",
      "data": {
+      "text/plain": [
+       "      rainfall_rate_32bit  rainfall_accumulated_32bit  \\\n",
+       "0                     0.0                   56.490002   \n",
+       "1                     0.0                   56.490002   \n",
+       "2                     0.0                   56.490002   \n",
+       "3                     0.0                   56.490002   \n",
+       "4                     0.0                   56.490002   \n",
+       "...                   ...                         ...   \n",
+       "4736                  0.0                   56.709999   \n",
+       "4737                  0.0                   56.709999   \n",
+       "4738                  0.0                   56.709999   \n",
+       "4739                  0.0                   56.709999   \n",
+       "4740                  0.0                   56.709999   \n",
+       "\n",
+       "      weather_code_synop_4680  weather_code_synop_4677  reflectivity_32bit  \\\n",
+       "0                         0.0                      0.0              -9.999   \n",
+       "1                         0.0                      0.0              -9.999   \n",
+       "2                         0.0                      0.0              -9.999   \n",
+       "3                         0.0                      0.0              -9.999   \n",
+       "4                         0.0                      0.0              -9.999   \n",
+       "...                       ...                      ...                 ...   \n",
+       "4736                      0.0                      0.0              -9.999   \n",
+       "4737                      0.0                      0.0              -9.999   \n",
+       "4738                      0.0                      0.0              -9.999   \n",
+       "4739                      0.0                      0.0              -9.999   \n",
+       "4740                      0.0                      0.0              -9.999   \n",
+       "\n",
+       "      mor_visibility  laser_amplitude  number_particles  sensor_temperature  \\\n",
+       "0             9999.0          12611.0               0.0                35.0   \n",
+       "1             9999.0          12617.0               0.0                35.0   \n",
+       "2             9999.0          12600.0               0.0                35.0   \n",
+       "3             9999.0          12603.0               0.0                35.0   \n",
+       "4             9999.0          12606.0               0.0                34.0   \n",
+       "...              ...              ...               ...                 ...   \n",
+       "4736          9999.0          11059.0               0.0                15.0   \n",
+       "4737          9999.0          11175.0               0.0                15.0   \n",
+       "4738          9999.0          11275.0               0.0                15.0   \n",
+       "4739          9999.0          11361.0               0.0                15.0   \n",
+       "4740          9999.0          11492.0               0.0                15.0   \n",
+       "\n",
+       "      sensor_heating_current  sensor_battery_voltage  sensor_status  \\\n",
+       "0                       0.06                    24.9            0.0   \n",
+       "1                       0.06                    24.9            0.0   \n",
+       "2                       0.06                    24.9            0.0   \n",
+       "3                       0.05                    24.9            0.0   \n",
+       "4                       0.06                    24.9            0.0   \n",
+       "...                      ...                     ...            ...   \n",
+       "4736                    0.06                    24.9            0.0   \n",
+       "4737                    0.06                    24.9            0.0   \n",
+       "4738                    0.06                    24.9            0.0   \n",
+       "4739                    0.06                    24.9            0.0   \n",
+       "4740                    0.07                    24.9            0.0   \n",
+       "\n",
+       "      rainfall_amount_absolute_32bit  error_code  \\\n",
+       "0                              5.649         0.0   \n",
+       "1                              5.649         0.0   \n",
+       "2                              5.649         0.0   \n",
+       "3                              5.649         0.0   \n",
+       "4                              5.649         0.0   \n",
+       "...                              ...         ...   \n",
+       "4736                           5.671         0.0   \n",
+       "4737                           5.671         0.0   \n",
+       "4738                           5.671         0.0   \n",
+       "4739                           5.671         0.0   \n",
+       "4740                           5.671         0.0   \n",
+       "\n",
+       "                                 raw_drop_concentration  \\\n",
+       "0     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "1     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "2     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "3     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "...                                                 ...   \n",
+       "4736  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4737  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4738  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4739  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "4740  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
+       "\n",
+       "                              raw_drop_average_velocity  \\\n",
+       "0     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "1     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "2     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "3     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "...                                                 ...   \n",
+       "4736  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4737  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4738  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4739  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "4740  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
+       "\n",
+       "                                        raw_drop_number                time  \n",
+       "0     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:44:30  \n",
+       "1     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:45:01  \n",
+       "2     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:45:30  \n",
+       "3     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:46:01  \n",
+       "4     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:46:31  \n",
+       "...                                                 ...                 ...  \n",
+       "4736  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:13:25  \n",
+       "4737  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:13:56  \n",
+       "4738  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:14:26  \n",
+       "4739  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:14:55  \n",
+       "4740  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:15:25  \n",
+       "\n",
+       "[4741 rows x 18 columns]"
+      ],
       "text/html": [
        "<div>\n",
        "<style scoped>\n",
@@ -2290,150 +2329,31 @@
        "</table>\n",
        "<p>4741 rows × 18 columns</p>\n",
        "</div>"
-      ],
-      "text/plain": [
-       "      rainfall_rate_32bit  rainfall_accumulated_32bit  \\\n",
-       "0                     0.0                   56.490002   \n",
-       "1                     0.0                   56.490002   \n",
-       "2                     0.0                   56.490002   \n",
-       "3                     0.0                   56.490002   \n",
-       "4                     0.0                   56.490002   \n",
-       "...                   ...                         ...   \n",
-       "4736                  0.0                   56.709999   \n",
-       "4737                  0.0                   56.709999   \n",
-       "4738                  0.0                   56.709999   \n",
-       "4739                  0.0                   56.709999   \n",
-       "4740                  0.0                   56.709999   \n",
-       "\n",
-       "      weather_code_synop_4680  weather_code_synop_4677  reflectivity_32bit  \\\n",
-       "0                         0.0                      0.0              -9.999   \n",
-       "1                         0.0                      0.0              -9.999   \n",
-       "2                         0.0                      0.0              -9.999   \n",
-       "3                         0.0                      0.0              -9.999   \n",
-       "4                         0.0                      0.0              -9.999   \n",
-       "...                       ...                      ...                 ...   \n",
-       "4736                      0.0                      0.0              -9.999   \n",
-       "4737                      0.0                      0.0              -9.999   \n",
-       "4738                      0.0                      0.0              -9.999   \n",
-       "4739                      0.0                      0.0              -9.999   \n",
-       "4740                      0.0                      0.0              -9.999   \n",
-       "\n",
-       "      mor_visibility  laser_amplitude  number_particles  sensor_temperature  \\\n",
-       "0             9999.0          12611.0               0.0                35.0   \n",
-       "1             9999.0          12617.0               0.0                35.0   \n",
-       "2             9999.0          12600.0               0.0                35.0   \n",
-       "3             9999.0          12603.0               0.0                35.0   \n",
-       "4             9999.0          12606.0               0.0                34.0   \n",
-       "...              ...              ...               ...                 ...   \n",
-       "4736          9999.0          11059.0               0.0                15.0   \n",
-       "4737          9999.0          11175.0               0.0                15.0   \n",
-       "4738          9999.0          11275.0               0.0                15.0   \n",
-       "4739          9999.0          11361.0               0.0                15.0   \n",
-       "4740          9999.0          11492.0               0.0                15.0   \n",
-       "\n",
-       "      sensor_heating_current  sensor_battery_voltage  sensor_status  \\\n",
-       "0                       0.06                    24.9            0.0   \n",
-       "1                       0.06                    24.9            0.0   \n",
-       "2                       0.06                    24.9            0.0   \n",
-       "3                       0.05                    24.9            0.0   \n",
-       "4                       0.06                    24.9            0.0   \n",
-       "...                      ...                     ...            ...   \n",
-       "4736                    0.06                    24.9            0.0   \n",
-       "4737                    0.06                    24.9            0.0   \n",
-       "4738                    0.06                    24.9            0.0   \n",
-       "4739                    0.06                    24.9            0.0   \n",
-       "4740                    0.07                    24.9            0.0   \n",
-       "\n",
-       "      rainfall_amount_absolute_32bit  error_code  \\\n",
-       "0                              5.649         0.0   \n",
-       "1                              5.649         0.0   \n",
-       "2                              5.649         0.0   \n",
-       "3                              5.649         0.0   \n",
-       "4                              5.649         0.0   \n",
-       "...                              ...         ...   \n",
-       "4736                           5.671         0.0   \n",
-       "4737                           5.671         0.0   \n",
-       "4738                           5.671         0.0   \n",
-       "4739                           5.671         0.0   \n",
-       "4740                           5.671         0.0   \n",
-       "\n",
-       "                                 raw_drop_concentration  \\\n",
-       "0     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "1     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "2     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "3     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4     -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "...                                                 ...   \n",
-       "4736  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4737  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4738  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4739  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "4740  -9.999,-9.999,-9.999,-9.999,-9.999,-9.999,-9.9...   \n",
-       "\n",
-       "                              raw_drop_average_velocity  \\\n",
-       "0     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "1     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "2     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "3     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4     00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "...                                                 ...   \n",
-       "4736  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4737  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4738  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4739  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "4740  00.000,00.000,00.000,00.000,00.000,00.000,00.0...   \n",
-       "\n",
-       "                                        raw_drop_number                time  \n",
-       "0     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:44:30  \n",
-       "1     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:45:01  \n",
-       "2     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:45:30  \n",
-       "3     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:46:01  \n",
-       "4     000,000,000,000,000,000,000,000,000,000,000,00... 2018-01-08 12:46:31  \n",
-       "...                                                 ...                 ...  \n",
-       "4736  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:13:25  \n",
-       "4737  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:13:56  \n",
-       "4738  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:14:26  \n",
-       "4739  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:14:55  \n",
-       "4740  000,000,000,000,000,000,000,000,000,000,000,00... 2018-03-08 04:15:25  \n",
-       "\n",
-       "[4741 rows x 18 columns]"
       ]
      },
-     "metadata": {},
-     "output_type": "display_data"
+     "metadata": {}
     }
    ],
-   "source": [
-    "subset_file_list = file_list[:1]\n",
-    "\n",
-    "df = read_raw_file_list(\n",
-    "    file_list=subset_file_list,\n",
-    "    column_names=column_names,\n",
-    "    reader_kwargs=reader_kwargs,\n",
-    "    sensor_name=sensor_name,\n",
-    "    verbose=verbose,\n",
-    "    df_sanitizer_fun=df_sanitizer_fun,\n",
-    ")\n",
-    "display(df)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "d40dc5ef",
-   "metadata": {},
    "source": [
     "Here we derive the corresponding xr.Dataset object "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 47,
-   "id": "795fda4f",
-   "metadata": {},
+   "source": [
+    "ds = create_l0b_from_l0a(df, attrs, verbose=False)\n",
+    "print(ds)"
+   ],
    "outputs": [
     {
-     "name": "stdout",
      "output_type": "stream",
+     "name": "stdout",
      "text": [
       "<xarray.Dataset>\n",
       "Dimensions:                         (time: 4741, diameter_bin_center: 32,\n",
@@ -2483,42 +2403,34 @@
      ]
     }
    ],
-   "source": [
-    "ds = create_l0b_from_l0a(df, attrs, verbose=False)\n",
-    "print(ds)"
-   ]
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "739eee56",
-   "metadata": {},
    "source": [
     "which can be saved as DISDRODB L0B netCDF by running the following code:"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 48,
-   "id": "d1e75d44",
-   "metadata": {},
-   "outputs": [],
    "source": [
     "# ds = set_encodings(ds, sensor_name)\n",
     "# ds.to_netcdf(\"/path/where/to/save/the/file.nc\")"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "74876274",
-   "metadata": {},
    "source": [
     "## Step 3 : Create the reader"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "6158d520",
-   "metadata": {},
    "source": [
     "We have now all the elements to start creating the new reader. \n",
     "All the modifications that we did in this notebook must be now transcribed into a DISDRODB L0 reader file.\n",
@@ -2532,12 +2444,11 @@
     "4. Add the `reader` name to the metadata YAML files of the stations.\n",
     "\n",
     "---------------------------------------------------------------------------------------"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "7738a3c0",
-   "metadata": {},
    "source": [
     "\n",
     "4. **Update the `columns_names` list**\n",
@@ -2666,12 +2577,11 @@
     "    ```\n",
     " \n",
     " ---------------------------------------------------------------------------------------   "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "e0abfb7c",
-   "metadata": {},
    "source": [
     "7. **Run the script**\n",
     "\n",
@@ -2696,12 +2606,11 @@
     "ATTENTION: For this to command to run, you need to have added the `reader` name to the station metadata YAML file !\n",
     "\n",
     " ---------------------------------------------------------------------------------------"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "b7f1edaf",
-   "metadata": {},
    "source": [
     "8. **Check if the script has correctly executed**\n",
     "\n",
@@ -2743,12 +2652,11 @@
     "    ```\n",
     "\n",
     "---------------------------------------------------------------------------------------"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "id": "92d2b761",
-   "metadata": {},
    "source": [
     "Well done 👋👋👋 \n",
     "\n",
@@ -2758,15 +2666,15 @@
     "Have a look at the [contributors guidelines](https://disdrodb.readthedocs.io/en/latest/contributors_guidelines.html) for more information and do not hesitate to open a [GitHub Issue](https://github.com/ltelab/disdrodb/issues) if you need any clarification. \n",
     "\n",
     "The DISDRODB team hope you enjoyed this tutorial "
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "dd52830d",
-   "metadata": {},
+   "source": [],
    "outputs": [],
-   "source": []
+   "metadata": {}
   }
  ],
  "metadata": {
@@ -2795,4 +2703,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}
\ No newline at end of file