diff --git a/freight_economic_competitiveness/ReadMe.md b/freight_economic_competitiveness/ReadMe.md new file mode 100644 index 000000000..5c3575d20 --- /dev/null +++ b/freight_economic_competitiveness/ReadMe.md @@ -0,0 +1 @@ +Freight Economic Competitiveness Analysis conducted for Kelly McClendon to contribute to the SB125 conversation diff --git a/freight_economic_competitiveness/freight_ec_data.csv b/freight_economic_competitiveness/freight_ec_data.csv new file mode 100644 index 000000000..a4edec8aa --- /dev/null +++ b/freight_economic_competitiveness/freight_ec_data.csv @@ -0,0 +1,32 @@ +sln,freight_ec +03, +04,17860.5 +05,3370.16 +07,19599.28 +10,3323.42 +11,3119.47 +12,1467.86 +15,3473.47 +18,4199.73 +19,4555.1 +22,3479.98 +23,16643.05 +25, +27,771.96 +29,72.21 +30,10533.22 +32,1085.17 +37,7913.15 +39,12896.31 +40,11352.45 +42,4181.13 +43, +44,2524.16 +45,3950.36 +47,2863.96 +50,9210.17 +53,618.38 +54,12667.98 +61,443.47 +62,9753.58 +63,7552.66 diff --git a/freight_economic_competitiveness/freight_ec_data.parquet b/freight_economic_competitiveness/freight_ec_data.parquet new file mode 100644 index 000000000..7e6be911f Binary files /dev/null and b/freight_economic_competitiveness/freight_ec_data.parquet differ diff --git a/freight_economic_competitiveness/freight_truck_ec.ipynb b/freight_economic_competitiveness/freight_truck_ec.ipynb new file mode 100644 index 000000000..9f9684f55 --- /dev/null +++ b/freight_economic_competitiveness/freight_truck_ec.ipynb @@ -0,0 +1,1133 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "08ac9974-c868-4ba3-8b63-f265950ab2d2", + "metadata": {}, + "source": [ + "# SB 125\n", + "## Cycle 4, Spring FY2024\n", + "### Freight Truck Economic Competitiveness\n", + "Created March 2024 \n", + "Analysis and write-up completed by Noah Sanchez for Kelly McClendon for a request he received from CTC, Angel, and Hannah \n", + "Geodatabase provided by Affi N'Guessan contained data from the FAF5 https://www.bts.gov/faf " + ] + }, + { + 
"cell_type": "markdown", + "id": "1f582986-fe0c-42e8-b59d-642a330bd15f", + "metadata": {}, + "source": [ + "#### TCEP/SCCP Cycle 4\n", + "Project's included in the TCEP/SCCP Cycle 4 (https://experience.arcgis.com/experience/1173a09d9f7a452ca7be858c39546678/) were analyzed for freight movement to identify Freight Truck Economic Competitiveness. " + ] + }, + { + "cell_type": "markdown", + "id": "22f01fe4-6bf6-4c37-ad1d-982268929218", + "metadata": {}, + "source": [ + "#### Methodology\n", + "ArcGIS was used to identify the segments in the FAF5 datasets that corresponded with Caltrans' Projects that were included in the TCEP/SCCP Cycle 4. Not all projects were included, only non-rail projects that had project lines that were within the limits of the various projects. Attribute tables that included the segments of the various projects were exported from ArcGIS Pro and imported into JupyterLab for this analysis. Each Project had the values in the column ['TOT_Tons_All_22'] averaged. " + ] + }, + { + "cell_type": "markdown", + "id": "8ac14f1d-4898-44d1-afea-13b51f039a96", + "metadata": {}, + "source": [ + "#### Deliverable\n", + "This analysis is not a comprehensive economic analysis, but is being used to add to the conversation. The final deliverable is a CSV or Excel doc containing the Economic Competitive Analysis results and other general project information. The final deliverable was sent to Kelly McClendon and Affi N'Guessan via email on 3/21/2024. " + ] + }, + { + "cell_type": "markdown", + "id": "05dee661-6eba-4926-9cb8-514ceeefb9f3", + "metadata": {}, + "source": [ + "#### Additional Research\n", + "We discussed potential future analysis could be performed, including a more detailed breakdown of the freight being transported per segment in an effort to identify the average value of the freight in a given area." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "780fed8b-b6f2-49f6-ba32-878834b95c4f", + "metadata": {}, + "outputs": [], + "source": [ + "# import modules\n", + "import pandas as pd\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "import seaborn as sns\n", + "import pyarrow as pa\n", + "import pyarrow.parquet as pq\n", + "import os\n", + "import nbformat\n", + "from nbconvert import PDFExporter\n", + "from nbformat import read" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "0805fa7d-834c-41bf-9ffd-e1f06a4bd32d", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/opt/conda/lib/python3.9/site-packages/google/auth/_default.py:78: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a \"quota exceeded\" or \"API not enabled\" error. See the following page for troubleshooting: https://cloud.google.com/docs/authentication/adc-troubleshooting/user-creds. 
\n", + " warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)\n" + ] + } + ], + "source": [ + "\n", + "\n", + "test_df = pd.read_csv(\"gs://calitp-analytics-data/data-analyses/freight_ec_2024/03_links_flow_trucks_SR132_West_3A.csv\")\n", + "\n", + "# do a for loop to read in the data\n", + "# f strings can be used to create an easier way to read-in all the data" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "a05d78c1-aef6-4992-8dc1-9ca62afed631", + "metadata": {}, + "outputs": [], + "source": [ + "# Create an easy to use GCS path\n", + "GCS_PATH = \"gs://calitp-analytics-data/data-analyses/freight_ec_2024/\"" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "fd83db53-11ed-47b4-82ec-9a8c51171cb7", + "metadata": {}, + "outputs": [], + "source": [ + "#test_df.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "e45ef50d-279a-469d-9b5f-e3c21916f1ac", + "metadata": {}, + "outputs": [], + "source": [ + "# Assign the DataFrame names\n", + "\n", + "# assign names to the datafile that were exported from the FAF5 geodatabase\n", + "# the exported data includes FAF5 segments (similar to OSM segments) that contain Truck Freight Flow data\n", + "# sln == \"submission log number\"\n", + "sln03_data = '03_links_flow_trucks_SR132_West_3A.csv'\n", + "sln04_data = '04_links_flow_trucks_Sac5_managed_lanes.csv'\n", + "sln05_data = '05_links_flow_trucks_Konocti.csv'\n", + "sln07_data = '07_links_flow_trucks_sr132_west_phase2.csv'\n", + "sln10_data = '10_links_flow_trucks_SR46East_UnionRoad.csv'\n", + "sln11_data = '11_links_flow_trucks_SR46_AntelopeGrade.csv'\n", + "sln12_data = '12_links_flow_trucks_805_15_Transit_Only_Connector.csv'\n", + "sln15_data = '15_links_flow_trucks_SantaBarbaraUS101.csv'\n", + "sln18_data = '18_links_flow_trucks_sr84_us101_Interchange.csv'\n", + "sln19_data = '19_links_flow_trucks_I680_SR4_Interchange.csv'\n", + "sln22_data = '22_links_flow_trucks_SR37_SearsPoint_US101.csv'\n", + "sln23_data = 
'23_links_flow_trucks_TulareSixLane.csv'\n", + "sln25_data = '25_links_flow_trucks_centennial_corridor.csv'\n", + "sln27_data = '27_links_flow_trucks_HarborDrive_2_0.csv'\n", + "sln29_data = '29_links_flow_trucks_ScenicRoute_68.csv'\n", + "sln30_data = '30_links_flow_trucks_I680_NB_ExpressLane_phase1.csv'\n", + "sln32_data = '32_links_flow_trucks_I10_RiversideAvenue.csv'\n", + "sln37_data = '37_links_flow_trucks_AmericanCanyonSR29.csv'\n", + "sln39_data = '39_links_flow_trucks_SR60_WorldLogistics.csv'\n", + "sln40_data = '40_links_flow_trucks_SR60_RedlandsBlvd.csv'\n", + "sln42_data = '42_links_flow_trucks_i15_ExpressLanes_Southern.csv'\n", + "sln43_data = '43_links_flow_trucks_McCall_Boulevard.csv'\n", + "sln44_data = '44_links_flow_trucks_i15_sr74_ii.csv'\n", + "sln45_data = '45_links_flow_trucks_Watsonville1_SantaCruz.csv'\n", + "sln47_data = '47_links_flow_trucks_SR91_Central_Ave.csv'\n", + "sln50_data = '50_links_flow_trucks_harbor_scenic.csv'\n", + "sln53_data = '53_links_flow_trucks_HuenemeRoad.csv'\n", + "sln54_data = '54_links_flow_trucks_i5_managed_lanes.csv'\n", + "sln61_data = '61_links_flow_trucks_castrovilleBoulevard.csv'\n", + "sln62_data = '62_links_flow_trucks_multimodal_skyway.csv'\n", + "sln63_data = '63_links_flow_trucks_SC_SR71_GapClosure.csv'" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "ab8ea0e0-cbcb-40c6-b7b6-5475190f19f5", + "metadata": {}, + "outputs": [], + "source": [ + "# create a function to import the data from a csv file\n", + "def getData(path):\n", + " # reads in the data from a .csv file\n", + " # add in an f string to designate the data path\n", + " df = pd.read_csv(f\"{GCS_PATH}{path}\")\n", + " return df\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "e934ca92-cf41-47e8-9087-eea0589dd8c3", + "metadata": {}, + "outputs": [], + "source": [ + "# Pull in data\n", + "\n", + "sln03_data = getData(sln03_data)\n", + "sln04_data = getData(sln04_data)\n", + "sln05_data = 
getData(sln05_data)\n", + "sln07_data = getData(sln07_data)\n", + "sln10_data = getData(sln10_data)\n", + "sln11_data = getData(sln11_data)\n", + "sln12_data = getData(sln12_data)\n", + "sln15_data = getData(sln15_data)\n", + "sln18_data = getData(sln18_data)\n", + "sln19_data = getData(sln19_data)\n", + "sln22_data = getData(sln22_data)\n", + "sln23_data = getData(sln23_data)\n", + "sln25_data = getData(sln25_data)\n", + "sln27_data = getData(sln27_data)\n", + "sln29_data = getData(sln29_data)\n", + "sln30_data = getData(sln30_data)\n", + "sln32_data = getData(sln32_data)\n", + "sln37_data = getData(sln37_data)\n", + "sln39_data = getData(sln39_data)\n", + "sln40_data = getData(sln40_data)\n", + "sln42_data = getData(sln42_data)\n", + "sln43_data = getData(sln43_data)\n", + "sln44_data = getData(sln44_data)\n", + "sln45_data = getData(sln45_data)\n", + "sln47_data = getData(sln47_data)\n", + "sln50_data = getData(sln50_data)\n", + "sln53_data = getData(sln53_data)\n", + "sln54_data = getData(sln54_data)\n", + "sln61_data = getData(sln61_data)\n", + "sln62_data = getData(sln62_data)\n", + "sln63_data = getData(sln63_data)" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "4c307bf0-0788-4951-ae24-6333444a622a", + "metadata": {}, + "outputs": [], + "source": [ + "# Create subsets\n", + "\n", + "# Create subsets that keep only the [TOT_Tons_22_All] column\n", + "data_03 = sln03_data[['TOT_Tons_22_All']]\n", + "data_04 = sln04_data[['TOT_Tons_22_All']]\n", + "data_05 = sln05_data[['TOT_Tons_22_All']]\n", + "data_07 = sln07_data[['TOT_Tons_22_All']]\n", + "data_10 = sln10_data[['TOT_Tons_22_All']]\n", + "data_11 = sln11_data[['TOT_Tons_22_All']]\n", + "data_12 = sln12_data[['TOT_Tons_22_All']]\n", + "data_15 = sln15_data[['TOT_Tons_22_All']]\n", + "data_18 = sln18_data[['TOT_Tons_22_All']]\n", + "data_19 = sln19_data[['TOT_Tons_22_All']]\n", + "data_22 = sln22_data[['TOT_Tons_22_All']]\n", + "data_23 = sln23_data[['TOT_Tons_22_All']]\n", + 
"data_25 = sln25_data[['TOT_Tons_22_All']]\n", + "data_27 = sln27_data[['TOT_Tons_22_All']]\n", + "data_29 = sln29_data[['TOT_Tons_22_All']]\n", + "data_30 = sln30_data[['TOT_Tons_22_All']]\n", + "data_32 = sln32_data[['TOT_Tons_22_All']]\n", + "data_37 = sln37_data[['TOT_Tons_22_All']]\n", + "data_39 = sln39_data[['TOT_Tons_22_All']]\n", + "data_40 = sln40_data[['TOT_Tons_22_All']]\n", + "data_42 = sln42_data[['TOT_Tons_22_All']]\n", + "data_43 = sln43_data[['TOT_Tons_22_All']]\n", + "data_44 = sln44_data[['TOT_Tons_22_All']]\n", + "data_45 = sln45_data[['TOT_Tons_22_All']]\n", + "data_47 = sln47_data[['TOT_Tons_22_All']]\n", + "data_50 = sln50_data[['TOT_Tons_22_All']]\n", + "data_53 = sln53_data[['TOT_Tons_22_All']]\n", + "data_54 = sln54_data[['TOT_Tons_22_All']]\n", + "data_61 = sln61_data[['TOT_Tons_22_All']]\n", + "data_62 = sln62_data[['TOT_Tons_22_All']]\n", + "data_63 = sln63_data[['TOT_Tons_22_All']]" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "83d96202-50e4-4a24-8931-4d2821057b08", + "metadata": {}, + "outputs": [], + "source": [ + "# Create a function to average the Totals Column\n", + "def calculate_average_combined_freight(path):\n", + " try:\n", + " # Identify the dataset\n", + " data = path\n", + " \n", + " # Filter out NaN values from the specified column\n", + " filtered_data = data.dropna(subset=['TOT_Tons_22_All'])\n", + " \n", + " # Calculate the total of the specified column\n", + " total = filtered_data['TOT_Tons_22_All'].sum()\n", + " \n", + " # Calculate teh number of records with data in the column\n", + " count = filtered_data['TOT_Tons_22_All'].count()\n", + " \n", + " # Ensure count is not zero to avoid division by zero\n", + " if count != 0:\n", + " # Calculate the average\n", + " average = (total)/count\n", + " # Format the average to have two digits past the decimal point\n", + " formatted_average =\"{:.2f}\".format(average)\n", + " #Convert the formatted average back to a float\n", + " average_float = 
float(formatted_average)\n", + " # Convert the float to a DataFrame\n", + " # I had trouble with this one, still working on it\n", + " #formatted_average = pd.DataFrame(formatted_average)\n", + " \n", + " return average_float\n", + " else:\n", + " #print(\"No records with data in the column.\") # This step has been changed to a comment to clean up the final PDF version\n", + " return None\n", + " except Exception as e:\n", + " print(\"An error occurred:\", e)\n", + " return None\n", + "\n", + "# Create a function to rename the columns\n", + "def rename_col(df):\n", + " # rename the columns\n", + " mapping = {\n", + " df.columns[0]: 'freight_ec',\n", + " df.columns[1]: 'sln'\n", + " }\n", + " df = df.rename(columns=mapping)\n", + " return df\n", + "\n", + "\n", + "# Create a function to reorder the columns so the [sln] column appears first\n", + "def reorder_columns(df):\n", + " \"\"\"\n", + " Reorder columns from 'freight_ec' and 'sln' to 'sln' and 'freight_ec'\n", + " \n", + " Parameters:\n", + " df (pandas.DataFrame): Input DataFrame.\n", + " \n", + " Returns:\n", + " pandas.DataFrame: DataFrame with reordered columns\n", + " \"\"\"\n", + " # Ensure that the columns exist in the DataFrame\n", + " if 'freight_ec' in df.columns and 'sln' in df.columns:\n", + " # Reorder columns\n", + " new_df = df[['sln', 'freight_ec']]\n", + " return new_df\n", + " else:\n", + " print(\"Error: 'freight_ec' and/or 'sln' columns not found in the DataFrame.\")\n", + " return df \n", + "\n", + "# Create a function to export the data to a Parquet file\n", + "def export_to_parquet(df, output_file):\n", + " \"\"\"\n", + " Export a Pandas DataFrame to a Parquet file.\n", + " \n", + " Parameters:\n", + " df (pandas.DataFrame): The DataFrame to export\n", + " output_file (str): The path to the output Parquet file.\n", + " \n", + " Returns:\n", + " None\n", + " \"\"\"\n", + " # Convert the DataFrame to a PyArrow table\n", + " table = pa.Table.from_pandas(df)\n", + " \n", + " # write the 
PyArrow table to a Parquet file\n", + " pq.write_table(table, output_file)\n", + " \n", + " print(f\"DataFrame exported to Parquet successfully at {output_file}.\")\n", + "\n", + "# Create a function to export a notebook to a PDF\n", + "def notebook_to_pdf_with_code(input_notebook, output_pdf):\n", + " \"\"\"\n", + " Convert a Jupyter Notebook to PDF.\n", + " \n", + " Parameters: \n", + " input_notebook (str): Path to the input Jupyter Notebook.\n", + " output_pdf (str): Path to save the output PDF file. \n", + " \"\"\"\n", + " # Use the function arguments rather than the notebook-level *_c globals\n", + " if not input_notebook.endswith('.ipynb'):\n", + " raise ValueError(\"Input file should be a Jupyter Notebook (.ipynb)\")\n", + " \n", + " if not output_pdf.endswith('.pdf'):\n", + " raise ValueError(\"Output file should be a PDF (.pdf)\")\n", + " \n", + " if not os.path.isfile(input_notebook):\n", + " raise FileNotFoundError(\"Input notebook not found.\")\n", + " \n", + " pdf_exporter = PDFExporter()\n", + " with open(input_notebook, 'rb') as f:\n", + " notebook_content = read(f, as_version=4)\n", + " body, _ = pdf_exporter.from_notebook_node(notebook_content)\n", + " \n", + " with open(output_pdf, 'wb') as f:\n", + " f.write(body)\n", + " \n", + " print(f\"Notebook successfully converted to PDF: {output_pdf}\") \n", + " \n", + "# Create a function to export a notebook to a PDF with the code cells hidden\n", + "def notebook_to_pdf_without_code(input_notebook, output_pdf):\n", + " # Read the notebook\n", + " with open(input_notebook, 'r', encoding='utf-8') as f:\n", + " notebook = nbformat.read(f, as_version=4)\n", + " \n", + " # Iterate through each cell\n", + " for cell in notebook.cells:\n", + " # Hide code cells\n", + " if cell.cell_type == 'code':\n", + " cell['execution_count'] = None\n", + " cell['source'] = ''\n", + " \n", + " # Export to PDF\n", + " pdf_exporter = PDFExporter()\n", + " pdf_exporter.exclude_input = True\n", + " pdf_exporter.exclude_output_prompt = False # This can be changed if you want to hide the output prompts as well\n", + " (body, resources) = 
pdf_exporter.from_notebook_node(notebook)\n", + " \n", + " # Write PDF to file\n", + " with open(output_pdf, 'wb') as f:\n", + " f.write(body)\n", + " \n", + " print(f\"Notebook successfully converted to PDF: {output_pdf}\") " + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "4e9de515-610a-4b6c-abf8-348d3b09e79d", + "metadata": {}, + "outputs": [], + "source": [ + "# Use the calculate_average_combined_freight(path) function to identify the average\n", + "# freight tonnage for the segments in each of the project's limits\n", + "average_03 = calculate_average_combined_freight(data_03)\n", + "average_04 = calculate_average_combined_freight(data_04)\n", + "average_05 = calculate_average_combined_freight(data_05)\n", + "average_07 = calculate_average_combined_freight(data_07)\n", + "average_10 = calculate_average_combined_freight(data_10)\n", + "average_11 = calculate_average_combined_freight(data_11)\n", + "average_12 = calculate_average_combined_freight(data_12)\n", + "average_15 = calculate_average_combined_freight(data_15)\n", + "average_18 = calculate_average_combined_freight(data_18)\n", + "average_19 = calculate_average_combined_freight(data_19)\n", + "average_22 = calculate_average_combined_freight(data_22)\n", + "average_23 = calculate_average_combined_freight(data_23)\n", + "average_25 = calculate_average_combined_freight(data_25)\n", + "average_27 = calculate_average_combined_freight(data_27)\n", + "average_29 = calculate_average_combined_freight(data_29)\n", + "average_30 = calculate_average_combined_freight(data_30)\n", + "average_32 = calculate_average_combined_freight(data_32)\n", + "average_37 = calculate_average_combined_freight(data_37)\n", + "average_39 = calculate_average_combined_freight(data_39)\n", + "average_40 = calculate_average_combined_freight(data_40)\n", + "average_42 = calculate_average_combined_freight(data_42)\n", + "average_43 = calculate_average_combined_freight(data_43)\n", + "average_44 = 
calculate_average_combined_freight(data_44)\n", + "average_45 = calculate_average_combined_freight(data_45)\n", + "average_47 = calculate_average_combined_freight(data_47)\n", + "average_50 = calculate_average_combined_freight(data_50)\n", + "average_53 = calculate_average_combined_freight(data_53)\n", + "average_54 = calculate_average_combined_freight(data_54)\n", + "average_61 = calculate_average_combined_freight(data_61)\n", + "average_62 = calculate_average_combined_freight(data_62)\n", + "average_63 = calculate_average_combined_freight(data_63)" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "a5816f34-8da2-40a7-9d34-2ea4d41f4102", + "metadata": {}, + "outputs": [], + "source": [ + "# Create a DataFrame for each average value\n", + "df_03 = pd.DataFrame([average_03])\n", + "df_04 = pd.DataFrame([average_04])\n", + "df_05 = pd.DataFrame([average_05])\n", + "df_07 = pd.DataFrame([average_07])\n", + "df_10 = pd.DataFrame([average_10])\n", + "df_11 = pd.DataFrame([average_11])\n", + "df_12 = pd.DataFrame([average_12])\n", + "df_15 = pd.DataFrame([average_15])\n", + "df_18 = pd.DataFrame([average_18])\n", + "df_19 = pd.DataFrame([average_19])\n", + "df_22 = pd.DataFrame([average_22])\n", + "df_23 = pd.DataFrame([average_23])\n", + "df_25 = pd.DataFrame([average_25])\n", + "df_27 = pd.DataFrame([average_27])\n", + "df_29 = pd.DataFrame([average_29])\n", + "df_30 = pd.DataFrame([average_30])\n", + "df_32 = pd.DataFrame([average_32])\n", + "df_37 = pd.DataFrame([average_37])\n", + "df_39 = pd.DataFrame([average_39])\n", + "df_40 = pd.DataFrame([average_40])\n", + "df_42 = pd.DataFrame([average_42])\n", + "df_43 = pd.DataFrame([average_43])\n", + "df_44 = pd.DataFrame([average_44])\n", + "df_45 = pd.DataFrame([average_45])\n", + "df_47 = pd.DataFrame([average_47])\n", + "df_50 = pd.DataFrame([average_50])\n", + "df_53 = pd.DataFrame([average_53])\n", + "df_54 = pd.DataFrame([average_54])\n", + "df_61 = pd.DataFrame([average_61])\n", + "df_62 = 
pd.DataFrame([average_62])\n", + "df_63 = pd.DataFrame([average_63])" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "a3195816-0c8a-4999-87bb-75f564fb387c", + "metadata": {}, + "outputs": [], + "source": [ + "# adding a column to the datasets called 'sln' which stands for 'submission log number'\n", + "# the value of the 'sln' column will correspond with that record's submission log number that is found on the TCEP_SCCP_Cycle_4... Excel doc\n", + "df_03['sln'] = '03'\n", + "df_04['sln'] = '04'\n", + "df_05['sln'] = '05'\n", + "df_07['sln'] = '07'\n", + "df_10['sln'] = '10'\n", + "df_11['sln'] = '11'\n", + "df_12['sln'] = '12'\n", + "df_15['sln'] = '15'\n", + "df_18['sln'] = '18'\n", + "df_19['sln'] = '19'\n", + "df_22['sln'] = '22'\n", + "df_23['sln'] = '23'\n", + "df_25['sln'] = '25'\n", + "df_27['sln'] = '27'\n", + "df_29['sln'] = '29'\n", + "df_30['sln'] = '30'\n", + "df_32['sln'] = '32'\n", + "df_37['sln'] = '37'\n", + "df_39['sln'] = '39'\n", + "df_40['sln'] = '40'\n", + "df_42['sln'] = '42'\n", + "df_43['sln'] = '43'\n", + "df_44['sln'] = '44'\n", + "df_45['sln'] = '45'\n", + "df_47['sln'] = '47'\n", + "df_50['sln'] = '50'\n", + "df_53['sln'] = '53'\n", + "df_54['sln'] = '54'\n", + "df_61['sln'] = '61'\n", + "df_62['sln'] = '62'\n", + "df_63['sln'] = '63'" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "b25a8fae-95b5-4bad-a389-296787043cc8", + "metadata": {}, + "outputs": [], + "source": [ + "# Rename the columns using the rename column function\n", + "df_03 = rename_col(df_03)\n", + "df_04 = rename_col(df_04)\n", + "df_05 = rename_col(df_05)\n", + "df_07 = rename_col(df_07)\n", + "df_10 = rename_col(df_10)\n", + "df_11 = rename_col(df_11)\n", + "df_12 = rename_col(df_12)\n", + "df_15 = rename_col(df_15)\n", + "df_18 = rename_col(df_18)\n", + "df_19 = rename_col(df_19)\n", + "df_22 = rename_col(df_22)\n", + "df_23 = rename_col(df_23)\n", + "df_25 = rename_col(df_25)\n", + "df_27 = rename_col(df_27)\n", + "df_29 = 
rename_col(df_29)\n", + "df_30 = rename_col(df_30)\n", + "df_32 = rename_col(df_32)\n", + "df_37 = rename_col(df_37)\n", + "df_39 = rename_col(df_39)\n", + "df_40 = rename_col(df_40)\n", + "df_42 = rename_col(df_42)\n", + "df_43 = rename_col(df_43)\n", + "df_44 = rename_col(df_44)\n", + "df_45 = rename_col(df_45)\n", + "df_47 = rename_col(df_47)\n", + "df_50 = rename_col(df_50)\n", + "df_53 = rename_col(df_53)\n", + "df_54 = rename_col(df_54)\n", + "df_61 = rename_col(df_61)\n", + "df_62 = rename_col(df_62)\n", + "df_63 = rename_col(df_63)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "639bddbf-2091-451a-a580-f739a9dcffd3", + "metadata": {}, + "outputs": [], + "source": [ + "# Reorder the columns using the Reorder column function\n", + "df_03 = reorder_columns(df_03)\n", + "df_04 = reorder_columns(df_04)\n", + "df_05 = reorder_columns(df_05)\n", + "df_07 = reorder_columns(df_07)\n", + "df_10 = reorder_columns(df_10)\n", + "df_11 = reorder_columns(df_11)\n", + "df_12 = reorder_columns(df_12)\n", + "df_15 = reorder_columns(df_15)\n", + "df_18 = reorder_columns(df_18)\n", + "df_19 = reorder_columns(df_19)\n", + "df_22 = reorder_columns(df_22)\n", + "df_23 = reorder_columns(df_23)\n", + "df_25 = reorder_columns(df_25)\n", + "df_27 = reorder_columns(df_27)\n", + "df_29 = reorder_columns(df_29)\n", + "df_30 = reorder_columns(df_30)\n", + "df_32 = reorder_columns(df_32)\n", + "df_37 = reorder_columns(df_37)\n", + "df_39 = reorder_columns(df_39)\n", + "df_40 = reorder_columns(df_40)\n", + "df_42 = reorder_columns(df_42)\n", + "df_43 = reorder_columns(df_43)\n", + "df_44 = reorder_columns(df_44)\n", + "df_45 = reorder_columns(df_45)\n", + "df_47 = reorder_columns(df_47)\n", + "df_50 = reorder_columns(df_50)\n", + "df_53 = reorder_columns(df_53)\n", + "df_54 = reorder_columns(df_54)\n", + "df_61 = reorder_columns(df_61)\n", + "df_62 = reorder_columns(df_62)\n", + "df_63 = reorder_columns(df_63)" + ] + }, + { + "cell_type": "markdown", + "id": 
"6902a4b4-bed8-4e38-b596-4dac17961795", + "metadata": {}, + "source": [ + "#### Freight Economic Competitiveness Analysis Results" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "da39f086-15de-4792-81a4-bd090c413c04", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + " <tr style=\"text-align: right;\"><th></th><th>sln</th><th>freight_ec</th></tr>\n", + " </thead>\n", + " <tbody>\n", + "
 <tr><th>0</th><td>03</td><td>NaN</td></tr>\n", + "
 <tr><th>1</th><td>04</td><td>17860.50</td></tr>\n", + "
 <tr><th>2</th><td>05</td><td>3370.16</td></tr>\n", + "
 <tr><th>3</th><td>07</td><td>19599.28</td></tr>\n", + "
 <tr><th>4</th><td>10</td><td>3323.42</td></tr>\n", + "
 <tr><th>5</th><td>11</td><td>3119.47</td></tr>\n", + "
 <tr><th>6</th><td>12</td><td>1467.86</td></tr>\n", + "
 <tr><th>7</th><td>15</td><td>3473.47</td></tr>\n", + "
 <tr><th>8</th><td>18</td><td>4199.73</td></tr>\n", + "
 <tr><th>9</th><td>19</td><td>4555.10</td></tr>\n", + "
 <tr><th>10</th><td>22</td><td>3479.98</td></tr>\n", + "
 <tr><th>11</th><td>23</td><td>16643.05</td></tr>\n", + "
 <tr><th>12</th><td>25</td><td>NaN</td></tr>\n", + "
 <tr><th>13</th><td>27</td><td>771.96</td></tr>\n", + "
 <tr><th>14</th><td>29</td><td>72.21</td></tr>\n", + "
 <tr><th>15</th><td>30</td><td>10533.22</td></tr>\n", + "
 <tr><th>16</th><td>32</td><td>1085.17</td></tr>\n", + "
 <tr><th>17</th><td>37</td><td>7913.15</td></tr>\n", + "
 <tr><th>18</th><td>39</td><td>12896.31</td></tr>\n", + "
 <tr><th>19</th><td>40</td><td>11352.45</td></tr>\n", + "
 <tr><th>20</th><td>42</td><td>4181.13</td></tr>\n", + "
 <tr><th>21</th><td>43</td><td>NaN</td></tr>\n", + "
 <tr><th>22</th><td>44</td><td>2524.16</td></tr>\n", + "
 <tr><th>23</th><td>45</td><td>3950.36</td></tr>\n", + "
 <tr><th>24</th><td>47</td><td>2863.96</td></tr>\n", + "
 <tr><th>25</th><td>50</td><td>9210.17</td></tr>\n", + "
 <tr><th>26</th><td>53</td><td>618.38</td></tr>\n", + "
 <tr><th>27</th><td>54</td><td>12667.98</td></tr>\n", + "
 <tr><th>28</th><td>61</td><td>443.47</td></tr>\n", + "
 <tr><th>29</th><td>62</td><td>9753.58</td></tr>\n", + "
 <tr><th>30</th><td>63</td><td>7552.66</td></tr>\n", + "
 </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " sln freight_ec\n", + "0 03 NaN\n", + "1 04 17860.50\n", + "2 05 3370.16\n", + "3 07 19599.28\n", + "4 10 3323.42\n", + "5 11 3119.47\n", + "6 12 1467.86\n", + "7 15 3473.47\n", + "8 18 4199.73\n", + "9 19 4555.10\n", + "10 22 3479.98\n", + "11 23 16643.05\n", + "12 25 NaN\n", + "13 27 771.96\n", + "14 29 72.21\n", + "15 30 10533.22\n", + "16 32 1085.17\n", + "17 37 7913.15\n", + "18 39 12896.31\n", + "19 40 11352.45\n", + "20 42 4181.13\n", + "21 43 NaN\n", + "22 44 2524.16\n", + "23 45 3950.36\n", + "24 47 2863.96\n", + "25 50 9210.17\n", + "26 53 618.38\n", + "27 54 12667.98\n", + "28 61 443.47\n", + "29 62 9753.58\n", + "30 63 7552.66" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a DataFrame for each average value and then concatenate them together\n", + "freight_ec_data = pd.concat([df_03, df_04, df_05, df_07, df_10, df_11, df_12, df_15, df_18, df_19, df_22, df_23, df_25, df_27, df_29, df_30, df_32, df_37, df_39, df_40, df_42, df_43, df_44, df_45, df_47, df_50, df_53, df_54, df_61, df_62, df_63], ignore_index=True)\n", + "freight_ec_data" + ] + }, + { + "cell_type": "markdown", + "id": "fc2a93f7-9eab-4e98-87ea-13f0eafa8604", + "metadata": {}, + "source": [ + "### Data Visualizations" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "45798efb-7ef6-4f5c-a67e-dfe4557fc402", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# identifying the tonnage weight column ['freight_ec']\n", + "freight_ec_column = freight_ec_data['freight_ec']\n", + "\n", + "# Setting the style of seaborn\n", + "sns.set_style(\"whitegrid\")\n", + "\n", + "# Plotting the tonnage weight values\n", + "plt.figure(figsize=(10, 6))\n", + "sns.histplot(freight_ec_column, bins=20, color='skyblue', edgecolor='black')\n", + "plt.title('Freight Economic Competitiveness Analysis')\n", + "plt.xlabel('Tonnage Weight')\n", + "plt.ylabel('Frequency')\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "86fe2757-34d2-4add-9cab-2a6eaf62c3af", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# identifying the tonnage weight column ['freight_ec']\n", + "freight_ec_column = freight_ec_data['freight_ec']\n", + "\n", + "# Setting the style of seaborn\n", + "sns.set_style(\"whitegrid\")\n", + "\n", + "# Plotting the tonnage weight values using a violin plot\n", + "plt.figure(figsize=(10,6))\n", + "sns.violinplot(y=freight_ec_column, color='skyblue')\n", + "plt.title('Freight Economic Compeititve Analysis Violin Plot')\n", + "plt.ylabel('Tonnage Weight')\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "164efc26-83e4-4f0f-9c1c-4fb2d93ece84", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# identifying the tonnage weight column ['freight_ec']\n", + "freight_ec_column = freight_ec_data['freight_ec']\n", + "\n", + "# Generating x values (assuming sln numbers as the x values)\n", + "#x_values = freight_ec_data['sln']\n", + "x_values = freight_ec_data.index\n", + "\n", + "# Plotting the tonnage weight values using a scatterplot\n", + "plt.figure(figsize=(10,6))\n", + "plt.scatter(x_values, freight_ec_column, color='skyblue', alpha=0.6)\n", + "plt.title('Freight Economic Competitive Analysis Scatter Plot')\n", + "plt.ylabel('sln')\n", + "plt.xlabel('Tonnage Weight')\n", + "plt.grid(True)\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "c2d49962-c6ec-452d-93f8-1dc99ea4b8d4", + "metadata": {}, + "source": [ + "#### Exports" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "c2eec2bf-4d50-4b76-8ae6-f2da6772a6d0", + "metadata": {}, + "outputs": [], + "source": [ + "# Export Notebooks\n", + "\n", + "# PDF" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "b9ba289a-b0ae-4f73-805d-b80f81c026b5", + "metadata": {}, + "outputs": [], + "source": [ + "# Hide the code cells and write to a PDF paramters without code\n", + "input_notebook = 'freight_truck_ec.ipynb'\n", + "output_pdf = 'freight_truck_ec_analysis_hidden_code.pdf'" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "abeed225-b9a9-4549-a20a-f9c093897059", + "metadata": {}, + "outputs": [], + "source": [ + "# Hide the code cells and write to a PDF paramters with code\n", + "input_notebook_c = 'freight_truck_ec.ipynb'\n", + "output_pdf_c = 'freight_truck_ec_analysis.pdf'" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "6b3dc865-2103-4553-87ce-13891173438e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Notebook successfully converted to PDF: 
freight_truck_ec_analysis_hidden_code.pdf\n" + ] + } + ], + "source": [ + "# Create a PDF that hides the code cells\n", + "notebook_to_pdf_without_code(input_notebook, output_pdf)" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "0f038ef5-eb1b-4dc7-8385-4e5cdf275392", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Notebook successfully converted to PDF: freight_truck_ec_analysis.pdf\n" + ] + } + ], + "source": [ + "# Create a PDF that includes the code cells\n", + "notebook_to_pdf_with_code(input_notebook_c, output_pdf_c)" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "id": "063eb2d0-3d10-4bd7-9610-e20913eea951", + "metadata": {}, + "outputs": [], + "source": [ + "# CSV" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "id": "e06736a5-6b5e-460c-b954-1dafbc855b3d", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "DataFrame exported to CSV successfully at freight_ec_data.csv\n" + ] + } + ], + "source": [ + "# Create a CSV from the data\n", + "freight_ec_data.to_csv('freight_ec_data.csv', index=False)\n", + "\n", + "# the following script is pending approval\n", + "#freight_ec_data.to_csv(f\"{GCS_PATH}outputs/freight_ec_data.csv\", index=False)\n", + "\n", + "\n", + "# Print the success statement after the CSV has been exported\n", + "print(\"DataFrame exported to CSV successfully at freight_ec_data.csv\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "342a4e65-b594-487c-97ca-8e6dfe2dc5ba", + "metadata": {}, + "outputs": [], + "source": [ + "# Parquet" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0bb36f8d-10fc-4641-bcf4-7fcee9f782fa", + "metadata": {}, + "outputs": [], + "source": [ + "# Define the output file path for the Parquet file\n", + "parquet_output_file = 'freight_ec_data.parquet'\n", + "\n", + "# Export the DataFrame to Parquet using the 
export_to_parquet function\n", + "export_to_parquet(freight_ec_data, parquet_output_file)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/freight_economic_competitiveness/freight_truck_ec_analysis.pdf b/freight_economic_competitiveness/freight_truck_ec_analysis.pdf new file mode 100644 index 000000000..432209cb8 Binary files /dev/null and b/freight_economic_competitiveness/freight_truck_ec_analysis.pdf differ diff --git a/freight_economic_competitiveness/freight_truck_ec_analysis_hidden_code.pdf b/freight_economic_competitiveness/freight_truck_ec_analysis_hidden_code.pdf new file mode 100644 index 000000000..2fc540ff8 Binary files /dev/null and b/freight_economic_competitiveness/freight_truck_ec_analysis_hidden_code.pdf differ
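
Note on the notebook in this diff: its own comments suggest that a for loop could replace the per-project assignments. Below is a minimal sketch of that idea, not the author's implementation. It assumes each project's segments are already loaded as a DataFrame with a `TOT_Tons_22_All` column. The names `project_frames` and `summarize_projects` are hypothetical, and the small in-memory frames only stand in for the CSVs that the notebook reads from `GCS_PATH`.

```python
import pandas as pd

# Hypothetical stand-ins for the per-project CSV exports. In the notebook these
# would come from something like pd.read_csv(f"{GCS_PATH}{filename}").
project_frames = {
    "03": pd.DataFrame({"TOT_Tons_22_All": [float("nan"), float("nan")]}),
    "04": pd.DataFrame({"TOT_Tons_22_All": [17000.0, 18721.0]}),
    "29": pd.DataFrame({"TOT_Tons_22_All": [70.0, float("nan"), 74.42]}),
}


def summarize_projects(frames, column="TOT_Tons_22_All"):
    """Return one row per project with the mean of `column`, ignoring NaN."""
    rows = []
    for sln, frame in frames.items():
        # Series.mean() skips NaN by default and returns NaN when no values
        # remain, so projects without data end up with a missing freight_ec.
        mean = frame[column].mean()
        rows.append({"sln": sln, "freight_ec": round(mean, 2)})
    # Fix the column order so 'sln' comes first, as in the notebook's output.
    return pd.DataFrame(rows, columns=["sln", "freight_ec"])


freight_ec_data = summarize_projects(project_frames)
print(freight_ec_data)
```

A dictionary keyed by submission log number keeps each `sln` attached to its data, so the separate rename, reorder, and concatenate cells would no longer be needed in this variant.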