From 3ef6ca796229b9201fc4447b95aec6377859a110 Mon Sep 17 00:00:00 2001 From: sschmidt23 Date: Tue, 7 Jan 2025 12:56:59 -0800 Subject: [PATCH] add demo notebook for SOMSpecSelector --- .../example_SOMSpecSelector.ipynb | 514 ++++++++++++++++++ 1 file changed, 514 insertions(+) create mode 100644 examples/creation_examples/example_SOMSpecSelector.ipynb diff --git a/examples/creation_examples/example_SOMSpecSelector.ipynb b/examples/creation_examples/example_SOMSpecSelector.ipynb new file mode 100644 index 0000000..45cec0c --- /dev/null +++ b/examples/creation_examples/example_SOMSpecSelector.ipynb @@ -0,0 +1,514 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "963ec5d4-c617-41fc-8cb7-7b74d9f45195", + "metadata": {}, + "source": [ + "# SOMSpecSelector Demo\n", + "\n", + "Author: Sam Schmidt\n", + "\n", + "Last successfully run: Jan 7, 2025\n", + "\n", + "This is a short demo of the use of the SOM-based degrader `SOMSpecSelector` that is designed to select a subset of an input galaxy sample via SOM classification such that they match the properties of a reference sample, e.g. to make mock spectroscopic selections for a training set. \n", + "\n", + "The code works by training a SOM using the (presumably larger) input dataset, then classifying each galaxy from both the input dataset and the reference dataset to find the best SOM cell. It then loops over all occupied SOM cells, counts the number of reference galaxies in a cell, and selects the same number of input objects in that cell to include in the degraded sample (if there are more objects in the reference sample than are available to pick from in the reference sample, then it simply takes all available objects, which does mean that you can end up with some incompleteness if areas of your parameter space have more objects in the reference sample than the input). This should naturally force the chosen subsample to match (as much as possible given SOM cell classification given the input parameters) the properties of the reference sample.\n", + "\n", + "Note that, like other RAIL degraders, this degrader expects the input to be in a Pandas dataframe.\n", + "\n", + "We will demonstrate below, starting with some necessary imports. Note that the `SOMSpecSelector` code is in the `rail_som` package, so make sure that you have installed either the full RAIL package, or at least the `rail_som` algorithms with either `pip install pz-rail-som` or by going to https://github.com/LSSTDESC/rail_som and cloning and installing the package there.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cdf21773-a138-4f74-b3f0-f171433f4359", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import os\n", + "import matplotlib.pyplot as plt\n", + "import tables_io" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "88c11713-6302-4536-897d-5266a2abf9f0", + "metadata": {}, + "outputs": [], + "source": [ + "from rail.creation.degraders.specz_som import SOMSpecSelector\n", + "from rail.utils.path_utils import find_rail_file\n", + "from rail.core.data import TableHandle, PqHandle\n", + "from rail.core.stage import RailStage" + ] + }, + { + "cell_type": "markdown", + "id": "20885821-6835-4126-8f2d-f9fedbc3b8cd", + "metadata": {}, + "source": [ + "First, let's set up the RAIL DataStore, which is explained in other RAIL demo notebooks, as our interface to the data. We will read in two files, one for reference (e.g. a specz sample that we would wish to model), and an input dataset. Because the samples that are included with RAIL are both representative, we will first make a few cuts to mimic incompleteness before adding to the datastore.\n", + "\n", + "\n", + "**Note:** The SOMSpecSelector stage requires a reference/spectroscopic set, and the stage expects that stage to be labeled as \"spec_data\" in the DataStore! So, once we make a few cuts to our example file, we will specify the name in the datastore as \"spec_data\". Later in the notebook we will have to set a different name in the `aliases` for a second instance of the stage if we want to use a file with a different label as the reference/specz file (more on that below):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1325a9d4-e411-4bb4-b879-513243cc0dfa", + "metadata": {}, + "outputs": [], + "source": [ + "DS = RailStage.data_store\n", + "DS.__class__.allow_overwrite = True" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8d202f9d-d488-4865-b072-52a3ad5eb19b", + "metadata": {}, + "outputs": [], + "source": [ + "trainfile = find_rail_file('examples_data/testdata/test_dc2_training_9816.hdf5')\n", + "testfile = find_rail_file('examples_data/testdata/test_dc2_validation_9816.hdf5')\n", + "testhdf5 = tables_io.read(testfile)['photometry']\n", + "trainhdf5 = tables_io.read(trainfile)['photometry']\n", + "# convert the data to pandas dataframe\n", + "testpq = tables_io.convert(testhdf5, tables_io.types.PD_DATAFRAME)\n", + "trainpq = tables_io.convert(trainhdf5, tables_io.types.PD_DATAFRAME)\n", + "test_data = DS.add_data(\"input\",testpq, PqHandle)\n", + "\n", + "# make a few cuts to the \"training\" data to simulate some incompleteness so that the distributions do not match\n", + "# we'll cut all galaxies with redshift > 1.5 and mag_i_lsst > 24.4 and g-r color > 1.0\n", + "mask = np.logical_and(trainpq['redshift'] < 1.5,\n", + " np.logical_and(trainpq['mag_i_lsst'] < 24.4, \n", + " trainpq['mag_g_lsst'] - trainpq['mag_r_lsst'] < 1.0))\n", + "cutpq = trainpq[mask]\n", + "\n", + "ref_data = DS.add_data(\"spec_data\", cutpq, PqHandle)" + ] + }, + { + "cell_type": "markdown", + "id": "b07ea95b-f1af-461e-b474-b9d203eff071", + "metadata": {}, + "source": [ + "Let's plot our two samples in redshift and color to compare:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "05f11d2b-b040-453a-8e85-91e4d37511b9", + "metadata": {}, + "outputs": [], + "source": [ + "plt.figure(figsize=(8,5))\n", + "zbins = np.linspace(-.01, 3.01, 51)\n", + "plt.hist(ref_data()['redshift'], bins=zbins, alpha=0.5, color='k', label=\"specz data\");\n", + "plt.hist(test_data()['redshift'], bins=zbins, alpha=0.5, color='orange', label=\"input data\");\n", + "plt.legend(loc='upper right', fontsize=12)\n", + "plt.xlabel(\"redshift\", fontsize=14)\n", + "plt.ylabel(\"Number\", fontsize=14);" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e7033c67-f89c-4b67-b09b-d2d8559faaaa", + "metadata": {}, + "outputs": [], + "source": [ + "fig, axs = plt.subplots(2, 1, figsize=(7,10))\n", + "axs[0].scatter(test_data()['mag_i_lsst'], test_data()['mag_g_lsst'] - test_data()['mag_r_lsst'], \n", + " s=1, label='input data', alpha=0.4, color='orange')\n", + "axs[0].scatter(ref_data()['mag_i_lsst'], ref_data()['mag_g_lsst'] - ref_data()['mag_r_lsst'], \n", + " s=4, label='specz data', alpha=0.4, color='k')\n", + "axs[0].legend(loc='upper left', fontsize=12)\n", + "axs[0].set_xlabel(\"i-band magnitude\", fontsize=14)\n", + "axs[0].set_ylabel(\"g - r\", fontsize=14);\n", + "\n", + "\n", + "axs[1].scatter(test_data()['mag_g_lsst'] - test_data()['mag_r_lsst'], \n", + " test_data()['mag_r_lsst'] - test_data()['mag_i_lsst'], \n", + " s=1, label='input data', alpha=0.4, color='orange')\n", + "axs[1].scatter(ref_data()['mag_g_lsst'] - ref_data()['mag_r_lsst'],\n", + " ref_data()['mag_r_lsst'] - ref_data()['mag_i_lsst'], \n", + " s=4, label='specz data', alpha=0.4, color='k')\n", + "axs[1].legend(loc='upper right', fontsize=12)\n", + "axs[1].set_xlabel(\"g - r\", fontsize=14)\n", + "axs[1].set_ylabel(\"r - i\", fontsize=14);\n" + ] + }, + { + "cell_type": "markdown", + "id": "e94896ea-978c-48cb-a92b-a2e293b3d3bc", + "metadata": {}, + "source": [ + "We can see that, given our cuts, our \"specz\" data is no longer representaive of the input sample. Now, let's set up our degrader to try to select a subset of galaxies that matches the number and distribution of the specz sample. We'll start by setting up the `SOMSpecSelector` stage. As input, the stage takes in multiple config parameters, these are:\n", + "\n", + "- noncolor_cols: a list of column names in the files that will be used directly in training the SOM\n", + "\n", + "- color_cols: a list of column names in the files, these will be taken in order and differenced to make, e.g. colors. So, if you want to include u-g, g-r, and r-i as inputs to the SOM, you would specify ['u', 'g', 'r', 'i'] as the `color_cols` values, and these will be differenced before inclusion in the SOM.\n", + "\n", + " - nondetect_val: if this value is present in either `noncolor_cols` or `color_cols` columns as a value, it will be replaced with the corresponding \"nondetection value\" in `noncolor_nondet` and`color_nondet` respectively.\n", + " \n", + "- noncolor_nondet: the list of nondetect values that a non-detection in `noncolor_cols` should be replaced with\n", + "\n", + "- color_nondet: the list of nondetect values that a non-detection in `color_cols` should be replaced with\n", + " \n", + "- som_size: a tuple, e.g. (32, 32), that specifies the shape of the SOM. (32, 32) is the default.\n", + "\n", + "\n", + "Let's set up our inputs. As an example, let's train our SOM using i-band magnitude, redshift, and the colors u-g, g-r, r-i, i-z, and z-y. To do this, we will specify `noncolor_cols` of 'mag_i_lsst' and 'redshift', and color_cols with all six magnitudes. The code will difference the six magnitudes, producing the desired five colors. Thus, our SOM inputs will be trained on six inputs: `mag_i_lsst`, `u-g`, `g-r`, `r-i`, `i-z`, and `z-y`. Given that our mock data has true redshifts, we could also include `redshift` as an explicit feature, which would lead to even better results; however, for this demo we will test without redshift included as a test of how well the method does in recovering the redshift distribution with only the implicit color -> redshift relation information included.\n", + "\n", + "We also need to specify the magnitude and color limits, we'll use the 1 sigma i-band 10 year limit for i-band and just put -1.0 for redshift. For colors we'll just put 0.0 for all colors." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "840c89b8-c850-422a-a5f3-e98f527137f4", + "metadata": {}, + "outputs": [], + "source": [ + "bands = ['u', 'g', 'r', 'i', 'z', 'y']\n", + "noncol_cols = ['mag_i_lsst']\n", + "col_cols = []\n", + "for band in bands:\n", + " col_cols.append(f\"mag_{band}_lsst\")\n", + "\n", + "noncol_nondet = [28.62, -1.0]\n", + "col_nondet = np.zeros(5, dtype=float)\n", + "\n", + "som_dict = dict(color_cols=col_cols,\n", + " noncolor_cols=noncol_cols,\n", + " nondetect_val=99.0,\n", + " noncolor_nondet=noncol_nondet,\n", + " color_nondet=col_nondet)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3cb2e78a-6864-4da5-9387-40a4dda531e8", + "metadata": {}, + "outputs": [], + "source": [ + "som_degrade = SOMSpecSelector.make_stage(name=\"som_degrader\", output=\"specz_mock_sample.pq\", **som_dict)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "eef9045c-7146-4e07-8d66-1495f407c590", + "metadata": {}, + "outputs": [], + "source": [ + "trimdf = som_degrade(test_data)" + ] + }, + { + "cell_type": "markdown", + "id": "f99189ba-abfd-4850-832e-f0ac0016966b", + "metadata": {}, + "source": [ + "let's plot the redshift histogram and mag vs color plot to see how well our selection matches the reference set:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6fac189f-2f2b-4287-a09a-4dbe16dc213f", + "metadata": {}, + "outputs": [], + "source": [ + "fig, axs = plt.subplots(4,1, figsize=(8,24))\n", + "xbins = np.linspace(-.005, 3.005,52)\n", + "magbins = np.linspace(14, 25.5, 52)\n", + "axs[0].hist(test_data()['redshift'], bins=xbins, alpha=0.15, color='orange', label='input sample');\n", + "axs[0].hist(ref_data()['redshift'], bins=xbins, alpha=0.5, color='k', label='specz sample');\n", + "axs[0].hist(trimdf()['redshift'], bins=xbins, alpha=0.15, color='b', label='degraded sample')\n", + "axs[0].set_xlabel('redshift', fontsize=14)\n", + "axs[0].legend(loc='upper right', fontsize=12)\n", + "axs[0].set_ylabel('number', fontsize=14);\n", + "\n", + "axs[1].scatter(test_data()['mag_i_lsst'], test_data()['mag_g_lsst'] - test_data()['mag_r_lsst'], \n", + " s=2, label='input data', alpha=0.4, color='orange')\n", + "axs[1].scatter(ref_data()['mag_i_lsst'], ref_data()['mag_g_lsst'] - ref_data()['mag_r_lsst'], \n", + " s=4, label='specz data', alpha=0.4, color='k')\n", + "axs[1].scatter(trimdf()['mag_i_lsst'], trimdf()['mag_g_lsst'] - trimdf()['mag_r_lsst'], \n", + " s=4, label='degraded data', alpha=0.4, color='b')\n", + "axs[1].legend(loc='upper left', fontsize=12)\n", + "axs[1].set_ylim(-1,3.5);\n", + "axs[1].set_xlabel(\"i-band mag\", fontsize=14)\n", + "axs[1].set_ylabel(\"g - r\", fontsize=14)\n", + "\n", + "axs[2].scatter(test_data()['mag_g_lsst'] - test_data()['mag_r_lsst'], \n", + " test_data()['mag_r_lsst'] - test_data()['mag_i_lsst'], \n", + " s=2, label='input data', alpha=0.4, color='orange')\n", + "axs[2].scatter(ref_data()['mag_g_lsst'] - ref_data()['mag_r_lsst'],\n", + " ref_data()['mag_r_lsst'] - ref_data()['mag_i_lsst'], \n", + " s=4, label='specz data', alpha=0.4, color='k')\n", + "axs[2].scatter(trimdf()['mag_g_lsst'] - trimdf()['mag_r_lsst'],\n", + " trimdf()['mag_r_lsst'] - trimdf()['mag_i_lsst'], \n", + " s=4, label='degraded data', alpha=0.3, color='b')\n", + "axs[2].legend(loc='upper right', fontsize=12)\n", + "axs[2].set_xlabel(\"g - r\", fontsize=14)\n", + "axs[2].set_ylabel(\"r - i\", fontsize=14)\n", + "\n", + "axs[3].hist(test_data()['mag_i_lsst'], bins=magbins, alpha=0.15, color='orange', label='input sample');\n", + "axs[3].hist(ref_data()['mag_i_lsst'], bins=magbins, alpha=0.5, color='k', label='specz sample');\n", + "axs[3].hist(trimdf()['mag_i_lsst'], bins=magbins, alpha=0.15, color='b', label='degraded sample')\n", + "axs[3].set_xlabel('i-band magnitude', fontsize=14)\n", + "axs[3].legend(loc='upper left', fontsize=12)\n", + "axs[3].set_ylabel('number', fontsize=14);\n" + ] + }, + { + "cell_type": "markdown", + "id": "8843a3ce-eac8-4155-8a16-130501f138b5", + "metadata": {}, + "source": [ + "The redshift distribution of our degraded sample matches very well with the reference data, the magnitude vs color distribution is not as clean; however, this is very likely due to the small number of objects used to train the SOM, and performance and matchup should improve with larger samples. Below we will download a slightly larger data samples, and we can (optionally) test how well the results agree when more data is available. \n", + "\n", + "\n", + "**Note:**\n", + "The files are rather large, so you will need to uncomment the lines below in order to download the files and have the second half of this notebook run. Let's grab some data from the Roman-DESC sims, we'll grab a tar file with two files, one with 37,500 galaxies, and one with 75,000 galaxies:" + ] + }, + { + "cell_type": "markdown", + "id": "a49b9217-35db-4100-8b48-6416ebf9be98", + "metadata": {}, + "source": [ + "## Uncomment the lines in the cell below and execute to download the data needed for the rest of the notebook!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "45a97a88-e994-47a1-b975-58684927abb2", + "metadata": {}, + "outputs": [], + "source": [ + "training_file = \"./romandesc_specdeep.tar\"\n", + "\n", + "#UNCOMMENT THESE LINES TO GRAB THE LARGER DATA FILES!\n", + "\n", + "#if not os.path.exists(training_file):\n", + "# os.system('curl -O https://portal.nersc.gov/cfs/lsst/PZ/romandesc_specdeep.tar')\n", + "#!tar -xvf romandesc_specdeep.tar" + ] + }, + { + "cell_type": "markdown", + "id": "01ffce9b-5018-4ef3-b0ee-efbe6fe8b105", + "metadata": {}, + "source": [ + "We will read in the two files, make similar cuts to the mock \"spec\" file as we did in the example above, and then add the files to the datastore" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "84605d30-65b9-468e-b244-b48712c374b4", + "metadata": {}, + "outputs": [], + "source": [ + "rdspecfile = \"./romandesc_spec_data_37k.hdf5\"\n", + "rdtestfile = \"./romandesc_deep_data_75k.hdf5\"\n", + "\n", + "rdtest = tables_io.read(rdtestfile)\n", + "rdtestpq = tables_io.convert(rdtest, tables_io.types.PD_DATAFRAME)\n", + "big_test_data = DS.add_data(\"big_input\", rdtestpq, PqHandle)\n", + "\n", + "\n", + "rdspec = tables_io.read(rdspecfile)\n", + "rdspecpq = tables_io.convert(rdspec, tables_io.types.PD_DATAFRAME)\n", + "\n", + "mask = np.logical_and(rdspecpq['redshift'] < 1.5,\n", + " np.logical_and(rdspecpq['i'] < 24.4, \n", + " rdspecpq['g'] - rdspecpq['r'] < 1.0))\n", + "rdspecpqcut = rdspecpq[mask]\n", + "big_spec_data = DS.add_data(\"big_spec\", rdspecpqcut, PqHandle)\n" + ] + }, + { + "cell_type": "markdown", + "id": "aad8ec8e-e1f8-4497-be13-f8112186d831", + "metadata": {}, + "source": [ + "Let's take a look at the columns available, this file should contain both the magnitudes and colors for the Roman-DESC sims:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "57e1e907-9bbb-4cb4-ae58-1f5296c5786a", + "metadata": {}, + "outputs": [], + "source": [ + "big_spec_data().head()" + ] + }, + { + "cell_type": "markdown", + "id": "706609b1-8295-4521-af0a-60fa4a2d60d6", + "metadata": {}, + "source": [ + "As in the first example, we will just use one magnitude, `i`, and the five colors to build the SOM. Because the colors are already present we can just add them directly to the non-color columns. Let's set things up appropriately:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "26cf7979-1d15-4d93-9c1d-1684301c6dfa", + "metadata": {}, + "outputs": [], + "source": [ + "noncol_cols = ['i', 'ug', 'gr', 'ri', 'iz', 'zy']\n", + "col_cols = []\n", + "\n", + "noncol_nondet = [28.62, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0]\n", + "col_nondet = []\n", + "\n", + "som_dict = dict(color_cols=col_cols,\n", + " noncolor_cols=noncol_cols,\n", + " nondetect_val=99.0,\n", + " noncolor_nondet=noncol_nondet,\n", + " color_nondet=col_nondet)" + ] + }, + { + "cell_type": "markdown", + "id": "8ca62808-f505-4a3e-a9f2-640ad31ac5db", + "metadata": {}, + "source": [ + "**Note:** as mentioned earlier in this demo, the `SOMSpecSelector` stage expects the reference/spectroscopic data file to be labeled as \"spec_data\" in the DataStore. As we already loaded the previous example data with that name, we'll need to tell this second copy of the `SOMSpecSelector` stage that we will be using a dataset with a different label. We do this by setting the new label in a dictionary fed in as `aliases` to the stage. We added the new reference/specz file with the label \"big_spec\", so we can simply add `aliases=dict(spec_data=\"big_spec\")` to let the stage know which file to use as the reference/spec data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4b5e73ac-ee59-416d-8dc3-6ec6166944e0", + "metadata": {}, + "outputs": [], + "source": [ + "# note that \n", + "roman_som_degrade = SOMSpecSelector.make_stage(name=\"roman_som_degrader\", \n", + " output=\"roman_specz_mock_sample.pq\", \n", + " aliases = dict(spec_data=\"big_spec\"),\n", + " **som_dict)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aeb05326-e355-4f12-a853-25060d23813d", + "metadata": {}, + "outputs": [], + "source": [ + "roman_trim = roman_som_degrade(big_test_data)" + ] + }, + { + "cell_type": "markdown", + "id": "75402f9f-21db-49e7-a14d-b0d06a84b704", + "metadata": {}, + "source": [ + "Let's make the same plots as above:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c5e7b0d5-7d82-4cf7-9375-aabc9d864121", + "metadata": {}, + "outputs": [], + "source": [ + "fig, axs = plt.subplots(4,1, figsize=(8,24))\n", + "xbins = np.linspace(-.005, 3.005,52)\n", + "magbins = np.linspace(14, 25.5, 52)\n", + "axs[0].hist(big_test_data()['redshift'], bins=xbins, alpha=0.15, color='orange', label='input sample');\n", + "axs[0].hist(big_spec_data()['redshift'], bins=xbins, alpha=0.5, color='k', label='specz sample');\n", + "axs[0].hist(roman_trim()['redshift'], bins=xbins, alpha=0.15, color='b', label='degraded sample')\n", + "axs[0].set_xlabel('redshift', fontsize=14)\n", + "axs[0].legend(loc='upper right', fontsize=12)\n", + "axs[0].set_ylabel('number', fontsize=14);\n", + "\n", + "axs[1].scatter(big_test_data()['i'], big_test_data()['gr'], \n", + " s=2, label='input data', alpha=0.4, color='orange')\n", + "axs[1].scatter(big_spec_data()['i'], big_spec_data()['gr'], \n", + " s=4, label='specz data', alpha=0.4, color='k')\n", + "axs[1].scatter(roman_trim()['i'], roman_trim()['gr'], \n", + " s=4, label='degraded data', alpha=0.4, color='b')\n", + "axs[1].legend(loc='upper left', fontsize=12)\n", + "axs[1].set_ylim(-.5,2.2);\n", + "\n", + "axs[2].scatter(big_test_data()['gr'], \n", + " big_test_data()['ri'], \n", + " s=2, label='input data', alpha=0.4, color='orange')\n", + "axs[2].scatter(big_spec_data()['gr'],\n", + " big_spec_data()['ri'],\n", + " s=4, label='specz data', alpha=0.4, color='k')\n", + "axs[2].scatter(roman_trim()['gr'],\n", + " roman_trim()['ri'],\n", + " s=4, label='degraded data', alpha=0.3, color='b')\n", + "axs[2].legend(loc='upper right', fontsize=12)\n", + "axs[2].set_xlabel(\"g - r\", fontsize=14)\n", + "axs[2].set_ylabel(\"r - i\", fontsize=14)\n", + "\n", + "axs[3].hist(big_test_data()['i'], bins=magbins, alpha=0.15, color='orange', label='input sample');\n", + "axs[3].hist(big_spec_data()['i'], bins=magbins, alpha=0.5, color='k', label='specz sample');\n", + "axs[3].hist(roman_trim()['i'], bins=magbins, alpha=0.15, color='b', label='degraded sample')\n", + "axs[3].set_xlabel('i-band magnitude', fontsize=14)\n", + "axs[3].legend(loc='upper left', fontsize=12)\n", + "axs[3].set_ylabel('number', fontsize=14);" + ] + }, + { + "cell_type": "markdown", + "id": "1c42ef91-1de7-4466-b0a9-e4ba31f8d117", + "metadata": {}, + "source": [ + "We again see good agreement on the redshift and i-band magnitude distributions, and good but not perfect agreement on magnitude-color and color-color distributions. So, it appears that our mock specz selection algorithm is working as expected." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0b1871e6-ce51-4ebf-8ffb-a61bbbcc6a6e", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}