This repository has been archived by the owner on Oct 30, 2024. It is now read-only.

Load data #16

Merged
merged 27 commits into from Jan 26, 2023
Changes from 16 commits
Commits
27 commits
2c378c0
Add logic to load matlab files
lauraporta Dec 20, 2022
7d7c9df
Change to h5py
lauraporta Jan 6, 2023
558aed3
Add style changes
lauraporta Jan 6, 2023
6c27ab6
Merge branch 'developement' into load-data
lauraporta Jan 6, 2023
6629190
Add logic to read HDF5 data, i.e. read 7.3 matlab complex files
lauraporta Jan 9, 2023
eda6a8e
Add docstrings
lauraporta Jan 9, 2023
cf84da1
Merge commit 'd7338f16187d2ab121528255c51128c7042cafed' into load-data
lauraporta Jan 10, 2023
6fb4983
Change how to handle config path
lauraporta Jan 10, 2023
4ec4a94
Improvements to DataRaw, logging, classmethods and repr obj
lauraporta Jan 10, 2023
fb42944
Fix some errors in reading hdf5 files
lauraporta Jan 10, 2023
0095870
Add integration test with mock for loading hdf5 data
lauraporta Jan 10, 2023
2178039
Add h5py dependency to pyproject.toml
lauraporta Jan 10, 2023
c5db159
Update docstrings
lauraporta Jan 10, 2023
8b1011c
Add usage of numpy squeeze
lauraporta Jan 11, 2023
96b0485
Split tests
lauraporta Jan 11, 2023
bfd99d8
Add missing squeeze
lauraporta Jan 11, 2023
d3c8d4c
Add pyproject.toml dependency
lauraporta Jan 23, 2023
3b0b91b
Add instructions to handle configs
lauraporta Jan 23, 2023
3229083
Add improvements to load data
lauraporta Jan 23, 2023
d00926c
Add edits to DataRaw according to suggestions
lauraporta Jan 23, 2023
9656f22
Invert order of methods again
lauraporta Jan 23, 2023
b331b76
Improve tests
lauraporta Jan 23, 2023
6ebad71
Remove unnecessary tox edit
lauraporta Jan 23, 2023
e09d394
Add better example explanation
lauraporta Jan 23, 2023
c546648
Add example in folder name request
lauraporta Jan 24, 2023
1ad55e3
Fix decouple package name
lauraporta Jan 26, 2023
021d61f
Remove dot
lauraporta Jan 26, 2023
3 changes: 3 additions & 0 deletions .gitignore
@@ -85,3 +85,6 @@ venv/
# config specific
**/config.yml
.env

# used in testing
*.hdf5
36 changes: 32 additions & 4 deletions load_suite2p/load/load_data.py
@@ -1,15 +1,20 @@
import logging
from pathlib import Path
from typing import Tuple

import h5py
from decouple import config
Member
@JoeZiminski Jan 20, 2023

I think this needs to be added to pyproject.toml (from decouple import config).

Member Author

OK thanks!


from ..objects.data_raw import DataRaw
from ..objects.enums import AnalysisType, DataType
from ..objects.specifications import Specifications
from .read_config import read

CONFIG_PATH = config("CONFIG_PATH")
Member

This python-decouple module is very cool, I had not seen it before. As a completely naive user, I tried playing around with this in the console and got 'CONFIG_PATH not found. Declare it as envvar or define a default value.'; I can't quite figure out how it works. Is it possible to move these two lines of code into a function (e.g. a new function read_configs.get_config_path()) and add a docstring on how decouple works?

Would it be possible / sensible to get config_path in main.py and pass it down (load_data > get_specifications > read_configurations()), so that settings are displayed together in main.py for readability? But maybe this suggestion is not optimal for python-decouple.

Member Author

I am using decouple as a way to avoid storing the local paths of the config in the repo. It is set up in a way that takes for granted the existence of an .env file containing the variable CONFIG_PATH. I forgot to specify this, my bad. I will add it to the Readme.md file.
I am not yet sure how I want to handle the configuration file; that is why this bit is still vague and unpolished.
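For reference, a minimal sketch of what such a helper and its .env setup could look like (the function name get_config_path and the .env contents are illustrative, not code from this PR):

# .env at the project root (not committed; .env is already in .gitignore):
#   CONFIG_PATH=config/config.yml

from pathlib import Path

from decouple import config


def get_config_path() -> Path:
    """Resolve the path to config.yml.

    python-decouple reads CONFIG_PATH from an .env file or from an
    environment variable of the same name, and raises
    UndefinedValueError if neither is defined and no default is given.
    """
    return Path(__file__).parents[1] / config("CONFIG_PATH")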

config_path = Path(__file__).parents[1] / CONFIG_PATH


def load_data(folder_name: str) -> Tuple[list, Specifications]:
def load_data(folder_name: str) -> Tuple[DataRaw, Specifications]:
Member

Is folder_name the top-level folder of the experimental data directory? This is probably obvious, as I can't recall how they lay out their data, but a quick example would help, e.g. if the path is /windows/something/myproject, is the name myproject? How does the loader know the full path to the project directory? This all seems to be handled very gracefully; maybe just some docs for the naive user in the main() docstring would be useful.

Member Author

The path is stored in the config file, which was hidden from you, sorry ☹️

"""Creates the configuration object and loads the data.

Parameters
@@ -51,8 +56,31 @@ def get_specifications(folder_name: str) -> Specifications:
return specs
Member

I got a little confused about the difference between 'specifications' vs. 'configurations'. Instead of 'specifications', could this be a subset of configurations, e.g. project_configs (maybe later there will be other types of configs)?

Member Author

Yes, this is very confusing, partly because the decouple package also uses the word config, in that case as a method name. Specifications is a bigger container that holds the configurations loaded from the file config.yml plus other information that is extracted at runtime, such as the complete path to the project folder. I want to rename specifications and rework all of the config management in a separate PR.



def load(specs: Specifications) -> list:
raise NotImplementedError("TODO")
def load(specs: Specifications) -> DataRaw:
if specs.config["use-allen-dff"]:
if specs.config["analysis-type"] == "sf_tf":
Member

is it possible to expand this abbreviation?

Member Author

It is meant to be spatial frequency, temporal frequency, which is quite long...
I could just call it frequencies if it appears very confusing.

Member

maybe spat_freq_temp_freq? or spat_f_temp_f?

allen_data_files = [
file
for file in specs.folder_naming.all_files
if file.datatype == DataType.ALLEN_DFF
and file.analysistype == AnalysisType.SF_TF
]
if len(allen_data_files) == 1:
data_raw = DataRaw(
h5py.File(allen_data_files[0].path, "r"), is_allen=True
Member

This may already be happening in DataRaw, but if not it might be worth opening the file in a context manager that will close it automatically in case of an exception during loading, e.g.

with h5py.File(allen_data_files[0].path, "r") as h5py_file:
    data_raw = DataRaw(h5py_file, is_allen=True)

Member Author

Yes it makes sense, thanks!

)
logging.info(f"Allen data loaded: {data_raw}")
return data_raw
else:
raise ValueError(
"There is more than one Allen file for sf_tf analysis"
)
else:
raise NotImplementedError(
"Only sf_tf analysis is implemented for Allen data"
)
else:
raise NotImplementedError("Only loading for Allen data is implemented")


def read_configurations() -> dict:
Member

This function is similar to read_config.read() but with an extra layer of logging. Could these two functions be combined? (just a suggestion, this layering of logging could be very useful)

Member Author

You are right, it is quite useless

@@ -65,7 +93,7 @@ def read_configurations() -> dict:
"""

logging.debug("Reading configurations")
config = read(CONFIG_PATH)
config = read(config_path)
logging.debug(f"Configurations read: {config}")

return config
146 changes: 146 additions & 0 deletions load_suite2p/objects/data_raw.py
@@ -0,0 +1,146 @@
import logging
from typing import Union

import h5py
import numpy as np


class DataRaw:
"""Class to load and contain the raw data.
It can load data from Allen or from
a list of Paths. Only the Allen case is implemented so far.
"""

def __init__(self, data, is_allen: bool = True):
Member

data does not have class definition (but maybe it is not possible)

Member Author

It can, I forgot to add it!

if is_allen:
logging.info("Loading Allen data, starting to unpack...")

self.day = self.unpack_data(data["day"], data)
Member

Is self.unpack_data private, or is it also planned to be called from outside the class?

Member

Similar question for group_to_dict_recursive and ref_dataset_to_array.

Member Author

unpack_data and the others should be private I think, I'll edit the name

logging.info("Unpacked day")

self.imaging = self.unpack_data(data["imaging"], data)
logging.info("Unpacked imaging")

self.f = self.unpack_data(data["f"], data)
Member

could the field "f" be un-abbreviated?

Member Author

It could, it is frames!

logging.info("Unpacked f")

self.is_cell = self.unpack_data(data["is_cell"], data)
logging.info("Unpacked is_cell")

self.r_neu = self.unpack_data(data["r_neu"], data)
Member

maybe this is a candidate for expanding the field too (but maybe it is obvious in the field)

Member Author

OK, I'll unpack it

logging.info("Unpacked r_neu")

self.stim = self.unpack_data(data["stim"], data)
logging.info("Unpacked stim")

self.trig = self.unpack_data(data["trig"], data)
logging.info("Unpacked trig")
else:
self.day = data["day"]
self.imaging = data["imaging"]
self.f = data["f"]
self.is_cell = data["is_cell"]
self.r_neu = data["r_neu"]
self.stim = data["stim"]
self.trig = data["trig"]

def __repr__(self) -> str:
return f"DataRaw(day={self.day}, imaging={self.imaging}, f={self.f}, \
Member

this is cool

is_cell={self.is_cell}, r_neu={self.r_neu}, stim={self.stim}, \
trig={self.trig})"

@classmethod
def group_to_dict_recursive(cls, group: h5py._hl.group.Group) -> dict:
Member

I'm not sure if it's worth it, but it might be nice to have an example folder directory tree, just to get a handle on how the underlying data looks, to help interpret the functions.

Member Author

I do have it... I could add this overall info in the class docstring

Member Author

I added two examples in the unpack_data() docstring

"""Takes a Group and resolves its content. If the Group contains
other Groups, it calls itself recursively.
It assumes there are no more References.

Args:
group (h5py._hl.group.Group):
HDF5 Group containing references

Returns:
dict: the resolved dictionary
"""
dict = {}
for key in group:
if isinstance(group[key], h5py._hl.group.Group):
dict[key] = cls.group_to_dict_recursive(group[key])
else:
dict[key] = np.squeeze(group[key][:])
return dict

@classmethod
def ref_dataset_to_array(
Member

Could/should the order of these functions be rearranged for readability? If I understand correctly, unpack_data calls ref_dataset_to_array, which calls group_to_dict_recursive.

Member Author

OK I can re-arrange them!

cls,
dataset: h5py._hl.dataset.Dataset,
parent: Union[h5py._hl.group.Group, h5py.File],
) -> np.ndarray:
"""Takes a Dataset that contains references to other Datasets or
Groups and resolves its content.

Args:
dataset (h5py._hl.dataset.Dataset):
HDF5 Dataset containing references
parent_container (Union[h5py._hl.group.Group, h5py.File]):
is the object that contains the element.
It is used to resolve references.

Returns:
np.ndarray: an array of numbers or an array of dictionaries
"""
array = np.zeros((dataset.shape[0], dataset.shape[1]), dtype=object)

for i in range(dataset.shape[0]):
for j in range(dataset.shape[1]):
ref = dataset[i][j]
if isinstance(parent[ref], h5py._hl.group.Group):
array[i, j] = cls.group_to_dict_recursive(parent[ref])
else:
array[i, j] = np.squeeze(parent[ref][:])

return np.squeeze(array)

@classmethod
def unpack_data(
cls,
element: Union[h5py._hl.dataset.Dataset, h5py._hl.group.Group],
parent: Union[h5py.File, h5py._hl.group.Group],
) -> Union[np.ndarray, dict]:
"""This method unpack a complex MATLAB datastructure and returns a
nested dictionary or numpy array. Only the relevant subset (Dataset
and Groups) of the possible datastructures is implemented.
Datasets can be mapped to arrays. Groups can be mapped to
dictionaires, and each entry can be a Dataset or another Group.
Member

typo (dictionaries)

An array might contain numbers or point to other Arrays or Groups
through References.
References are a HDF5 type that can point either to an array or
to a group.
They need to be resolved in order to get the data. They are resolved
by calling the methods ref_dataset_to_array.
If element is a Group, its content is unpacked recursively.

Args:
element Union[h5py._hl.dataset.Dataset, h5py._hl.group.Group]:
is either a h5py Group or Dataset.
It is what we want to unpack.
parent Union[h5py.File, h5py._hl.group.Group]:
is the object that contains the element.
It is used to resolve references.

Returns:
Union[np.ndarray, dict]:
is either a numpy array or a nested dictionary.
"""
if isinstance(element, h5py._hl.dataset.Dataset):
if element.dtype == h5py.special_dtype(ref=h5py.Reference):
return cls.ref_dataset_to_array(element, parent)
else:
return np.squeeze(element[:])
elif isinstance(element, h5py._hl.group.Group):
dict = {}
for key in element:
dict[key] = cls.unpack_data(element[key], element)
Member

this recursive file / folder unpacking is handled very nicely

return dict
else:
return None
5 changes: 3 additions & 2 deletions load_suite2p/objects/photon_data.py
@@ -1,10 +1,11 @@
from .data_raw import DataRaw
from .specifications import Specifications


class PhotonData:
"""Class to load the formatted data from suite2p and registers2p."""

def __init__(self, data_raw: list, specs: Specifications):
def __init__(self, data_raw: DataRaw, specs: Specifications):

self.response_matrix = self.get_response_matrix()
self.preprocess(data_raw, specs)
@@ -20,7 +21,7 @@ def __init__(self, data_raw: list, specs: Specifications):
def get_response_matrix(self):
raise NotImplementedError("This method is not implemented yet")

def preprocess(self, data_raw: list, specs: Specifications):
def preprocess(self, data_raw: DataRaw, specs: Specifications):
raise NotImplementedError("This method is not implemented yet")

def reorder(self):
6 changes: 3 additions & 3 deletions load_suite2p/objects/specifications.py
@@ -8,8 +8,8 @@ class Specifications:
to be loaded."""

def __init__(self, config: dict, folder_name: str):
self.base_paths: dict = config["paths"]
self.config: dict = config
self.folder_name = folder_name
self.folder_naming_specs = FolderNamingSpecs(folder_name, config)
self.folder_naming_specs.extract_all_file_names()
self.folder_naming = FolderNamingSpecs(folder_name, config)
self.folder_naming.extract_all_file_names()
self.options = Options(config)
1 change: 1 addition & 0 deletions pyproject.toml
@@ -24,6 +24,7 @@ dependencies = [
"fancylog",
"PyYAML",
"types-PyYAML",
"h5py",
]

[project.urls]
86 changes: 86 additions & 0 deletions tests/test_integration/test_data_raw.py
@@ -0,0 +1,86 @@
import h5py
import numpy as np

from load_suite2p.objects.data_raw import DataRaw

array0 = np.array([1, 2, 3, 4, 5])
Member

Could use np.arange(1, 6) etc., which is more concise but less readable.

Member Author

yes thanks!

array1 = np.array([3, 4, 5, 6, 7])
array2 = np.array([5, 6, 7, 8, 9])
array3 = np.array([7, 8, 9, 10, 11])
array4 = np.array([9, 10, 11, 12, 13])


def create_mock_hdf5_data():
Member

A pytest fixture could be used in this case; it is a nice way to control the data that is passed to tests. The main benefits are:
a) you don't need to call the function in the code each time, you can pass it as an argument, e.g.

@pytest.fixture(scope="function")
def create_mock_hdf5_data():
    ....
    yield None
    teardown_code()

def test_unpack_of_simple_dataset(create_mock_hdf5_data):

I think this will work, but maybe not, as the create function doesn't explicitly return anything; that shouldn't matter, though, including it in this way should trigger the dataset creation for each function.

b) fixtures are nice because you can control the scope of the data. For example, with scope="function", create_mock_hdf5_data() will be called again for each function it is used in. For scope="class", the function will be called once and then re-used (nice to save time if you are definitely not changing the data). However, typically I would always use function scope just so you are 100% sure you are starting fresh for each test.

c) All code after the yield keyword is still run, which makes tearing down after tests easier. In this case I would suggest adding teardown code that deletes the written file.

I always think the pytest docs are a bit confusing, but this is a nice guide: https://towardsdatascience.com/make-your-python-tests-efficient-with-pytest-fixtures-3d7a1892265f

Member Author

Thank you very much, I like fixtures very much. I've just implemented better mocks!

Member

pytest has an automatic fixture tmp_path that gives you access to a temporary directory; it might be easier to write there rather than to the cwd (which, if I understand correctly, this currently does): https://docs.pytest.org/en/7.1.x/how-to/tmp_path.html

Member Author

Very cool! I've implemented it
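As a rough sketch (illustrative only, not the exact code merged in this PR, and the fixture name mock_hdf5_path is made up), the two suggestions above could combine like this, with tmp_path keeping the test file out of the working directory:

import h5py
import numpy as np
import pytest

from load_suite2p.objects.data_raw import DataRaw

array0 = np.array([1, 2, 3, 4, 5])


@pytest.fixture(scope="function")
def mock_hdf5_path(tmp_path):
    # tmp_path is pytest's built-in temporary-directory fixture
    path = tmp_path / "mytestfile.hdf5"
    with h5py.File(path, "w") as file:
        file.create_dataset("array0", data=array0, dtype="i")
    yield path
    # teardown code after the yield would run here; pytest cleans up tmp_path itself


def test_unpack_of_simple_dataset(mock_hdf5_path):
    with h5py.File(mock_hdf5_path, "r") as file:
        assert np.all(DataRaw.unpack_data(file["array0"], file) == array0)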

with h5py.File("mytestfile.hdf5", "w") as f:
Member

Just personal preference, but I like to un-abbreviate f to file in this case, just to be super explicit.

# create a file and add a simple dataset
f.create_dataset("array0", data=array0, dtype="i")

# create a file and add a group with a dataset inside
grp = f.create_group("mygroup")
grp.create_dataset("array1", data=array1, dtype="f")

# create group with subgroup and a dataset
subgroup = grp.create_group("subgroup")
subgroup.create_dataset("array2", data=array2, dtype="f")

# create a dataset with references of dataset
dataset_to_be_referenced = f.create_dataset(
"array3", data=array3, dtype="f"
)
ref = dataset_to_be_referenced.ref
ref_array = [[ref, ref], [ref, ref]]
f.create_dataset(
"ref_dataset",
data=ref_array,
dtype=h5py.special_dtype(ref=h5py.Reference),
)

# create a dataset with references of group with subgroup
group_to_be_referenced = f.create_group("#ref_group#")
subgroup2 = group_to_be_referenced.create_group("subgroup2")
subgroup2.create_dataset("array4", data=array4, dtype="f")
ref2 = group_to_be_referenced.ref
ref_array2 = [[ref2, ref2], [ref2, ref2]]
f.create_dataset(
"ref_dataset2",
data=ref_array2,
dtype=h5py.special_dtype(ref=h5py.Reference),
)


def test_unpack_of_simple_dataset():
create_mock_hdf5_data()
with h5py.File("mytestfile.hdf5", "r") as f:
assert np.all(DataRaw.unpack_data(f["array0"], f) == array0)


def test_unpack_of_dataset_in_group():
create_mock_hdf5_data()
with h5py.File("mytestfile.hdf5", "r") as f:
assert np.all(DataRaw.unpack_data(f["mygroup"]["array1"], f) == array1)


def test_unpack_of_dataset_in_subgroup():
create_mock_hdf5_data()
with h5py.File("mytestfile.hdf5", "r") as f:
assert np.all(
DataRaw.unpack_data(f["mygroup"]["subgroup"]["array2"], f)
== array2
)


def test_unpack_of_dataset_with_references_to_dataset():
create_mock_hdf5_data()
with h5py.File("mytestfile.hdf5", "r") as f:
assert np.all(DataRaw.unpack_data(f["ref_dataset"], f)[0][0] == array3)


def test_unpack_of_dataset_with_references_to_group_with_subgroup():
Member

Overall I think these tests look very nice. It may also be worth testing that the package handles any potentially common bad inputs the user might give. This can be done by checking that an error is raised (and its content) with pytest.raises().

Member Author

I wonder what a bad input could be in this case, maybe if they give the wrong file type? or a corrupted hdf5 🤔

Member

Hmm yes, I am not too sure. Maybe bad paths in the config file? Although this is already quite obvious from the error. Another possibility is that they pass a .mat file that is not structured in the expected way. But I'm not sure if that is possible in practice.

Another test possibility (although this may be overkill) is to go through one example .mat file and save all (or a subset) of the expected arrays into separate .mat files (or csv or something). Then you can load the file into load-suite2p and test all arrays in Python against MATLAB. This would be useful as a sanity check against a real dataset, and would highlight any small numerical changes that may occur during conversion (these are likely to be extremely small, if they exist at all).

Member Author

Thank you Joe for the suggestions. I will open a separate issue to address these use cases, also because they seem related to a further expansion of the load function that I aim to develop later.

Member Author

#21 here we go

create_mock_hdf5_data()
with h5py.File("mytestfile.hdf5", "r") as f:
assert np.all(
DataRaw.unpack_data(f["ref_dataset2"], f)[0][0]["subgroup2"][
"array4"
]
== array4
)
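
Following up on the pytest.raises() discussion above (now tracked in #21), a minimal sketch of a bad-input test might look like this; it is illustrative, not part of this PR, and only assumes that h5py raises OSError when asked to open a file that is not valid HDF5:

import h5py
import pytest


def test_opening_non_hdf5_file_raises(tmp_path):
    # a plain-text file is not a valid HDF5 container
    bad_file = tmp_path / "not_really.hdf5"
    bad_file.write_text("this is not HDF5 data")
    with pytest.raises(OSError):
        h5py.File(bad_file, "r")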