Move the whole mongo_storage module to utilix (#1445)
* switch to utilix for db operations

* replace straxen.mongo_downloader

* replace straxen.mongodownloader

* remove mongo_storage

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update init for mongo_storage

* update init

* fix import problem

* override base env with latest utilix release

* remove the override group as it doesn't work

* override utilix

* force upgrade utilix in pytest

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove doc for mongo_storage

* remove redundant override in toml

* Use latest utilix

* Remove patched `sharedarray`

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Dacheng Xu <[email protected]>
3 people authored Oct 14, 2024
1 parent f514d5e commit df0d488
Showing 13 changed files with 38 additions and 636 deletions.
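Downstream code that has to run against both older and newer straxen releases may want to locate the module without hard-coding one path. The following is a minimal sketch, not part of this commit: the helper name `resolve_mongo_storage` and the `CANDIDATES` tuple are hypothetical, and it assumes the old location was `straxen.storage.mongo_storage`, as the removed documentation entry in this diff suggests.

```python
import importlib
from types import ModuleType
from typing import Optional

# Candidate module paths, newest first. The second entry is the location
# removed by this commit and should only exist in older straxen releases.
CANDIDATES = ("utilix.mongo_storage", "straxen.storage.mongo_storage")


def resolve_mongo_storage() -> Optional[ModuleType]:
    """Return the first importable mongo_storage module, or None if neither exists."""
    for name in CANDIDATES:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Package or submodule is missing; try the next candidate.
            continue
    return None


mongo_storage = resolve_mongo_storage()
if mongo_storage is None:
    print("mongo_storage is not importable; utilix>=0.11.0 is probably missing")
else:
    print(f"using {mongo_storage.__name__}")
```

Code that only targets straxen after this commit can skip the helper and write `utilix.mongo_storage.MongoDownloader()` directly, as the changed files below do.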
26 changes: 13 additions & 13 deletions docs/source/config_storage.rst
@@ -9,13 +9,13 @@ what is done behind the scenes.

Downloading XENONnT files from the database
-------------------------------------------
Most generically one downloads files using the :py:class:`straxen.MongoDownloader`
Most generically one downloads files using the :py:class:`utilix.mongo_storage.MongoDownloader`
function. For example, one can download a file:

.. code-block:: python
import straxen
downloader = straxen.MongoDownloader()
import utilix
downloader = utilix.mongo_storage.MongoDownloader()
# The downloader allows one to download files from the mongo database by
# looking for the requested name in the files database. The downloader
#returns the path of the downloaded file.
@@ -55,7 +55,7 @@ Therefore, this manner of loading data is intended only for testing purposes.
How does the downloading work?
--------------------------------------
In :py:mod:`straxen/mongo_storage.py` there are two classes that take care of the
In :py:mod:`utilix/mongo_storage.py` there are two classes that take care of the
downloading and the uploading of files to the `files` database. In this
database we store configuration files under a :py:obj:`config_identifier` i.e. the
:py:obj:`'file_name'`. This is the label that is used to find the document one is
@@ -69,8 +69,8 @@ an admin user (with the credentials to upload files to the database) uploads a
file to the `files`- database (not shown) such that it can be downloaded later
by any user. The admin user can upload a file using the command
:py:obj:`MongoUploader.upload_from_dict({'file_name', '/path/to/file'})`.
This command will use the :py:class:`straxen.MongoUploader` class to put the file
:py:obj:`'file_name'` in the `files` database. The :py:class:`straxen.MongoUploader` will
This command will use the :py:class:`utilix.mongo_storage.MongoUploader` class to put the file
:py:obj:`'file_name'` in the `files` database. The :py:class:`utilix.mongo_storage.MongoUploader` will
communicate with the database via `GridFs
<https://docs.mongodb.com/manual/core/gridfs/>`_.
The GridFs interface communicates with two mongo-collections; :py:obj:`'fs.files'` and
@@ -81,15 +81,15 @@ storing pieces of data (not to be confused with :py:class:`strax.Chunks`).
Uploading
^^^^^^^^^
When the admin user issues the command to upload the :py:obj:`'file_name'`-file. The
:py:class:`straxen.MongoUploader` will check that the file is not already stored in the
database. To this end, the :py:class:`straxen.MongoUploader` computes the :py:obj:`md5-hash` of
:py:class:`utilix.mongo_storage.MongoUploader` will check that the file is not already stored in the
database. To this end, the :py:class:`utilix.mongo_storage.MongoUploader` computes the :py:obj:`md5-hash` of
the file stored under the :py:obj:`'/path/to/file'`. If this is the first time a file
with this :py:obj:`md5-hash` is uploaded, :py:class:`straxen.MongoUploader` will upload it to
with this :py:obj:`md5-hash` is uploaded, :py:class:`utilix.mongo_storage.MongoUploader` will upload it to
:py:obj:`GridFs`. If there is already an existing file with the :py:obj:`md5-hash`, there is no
need to upload. This however does mean that if there is already a file :py:obj:`'file_name'`
stored and you modify the :py:obj:`'file_name'`-file, it will be uploaded again! This is
a feature, not a bug. When a user requests the :py:obj:`'file_name'`-file, the
:py:class:`straxen.MongoDownloader` will fetch the :py:obj:`'file_name'`-file that was uploaded
:py:class:`utilix.mongo_storage.MongoDownloader` will fetch the :py:obj:`'file_name'`-file that was uploaded
last.


@@ -98,7 +98,7 @@ Downloading
Assuming that an admin user uploaded the :py:obj:`'file_name'`-file, any user (no
required admin rights) can now download the :py:obj:`'file_name'`-file (see above for the
example). When the user executes :py:obj:`MongoUploader.download_single('file_name')`,
the :py:class:`straxen.MongoDownloader` will check if the file is downloaded already. If
the :py:class:`utilix.mongo_storage.MongoDownloader` will check if the file is downloaded already. If
this is the case it will simply return the path of the file. Otherwise, it will
start downloading the file. It is important to notice that the files are saved
under their :py:obj:`md5-hash`-name. This means that wherever the files are stored,
@@ -112,8 +112,8 @@ already stored but it would be if the file has been changed as explained above.

Straxen Mongo config loader classes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Both the :py:class:`straxen.MongoUploader` and :py:class:`straxen.MongoDownloader` share a common
Both the :py:class:`utilix.mongo_storage.MongoUploader` and :py:class:`utilix.mongo_storage.MongoDownloader` share a common
parent class, the :py:class:`straxen.GridFsInterface` that provides the appropriate
shared functionality and connection to the database. The important difference
is the :py:obj:`readonly` argument that naturally has to be :py:obj:`False` for the
:py:class:`straxen.MongoUploader` but :py:obj:`True` for the :py:class:`straxen.MongoDownloader`.
:py:class:`utilix.mongo_storage.MongoUploader` but :py:obj:`True` for the :py:class:`utilix.mongo_storage.MongoDownloader`.
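The upload and download rules described in the documentation above (skip the upload when the md5 hash is already stored, return the most recently uploaded file for a given name) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `InMemoryFileStore` is a hypothetical stand-in, it does not talk to MongoDB or GridFS, and it is not the utilix implementation. It checks the md5 hash alone, which is how the text reads.

```python
import hashlib


class InMemoryFileStore:
    """Hypothetical stand-in for the GridFS-backed `files` database."""

    def __init__(self):
        # One dict per stored file, kept in upload order.
        self._docs = []

    def upload(self, config_identifier: str, content: bytes) -> bool:
        """Store the content unless the same md5 is already present.

        Returns True if a new document was stored, False if it was skipped.
        """
        md5 = hashlib.md5(content).hexdigest()
        if any(doc["md5"] == md5 for doc in self._docs):
            return False
        self._docs.append(
            {"config_identifier": config_identifier, "md5": md5, "content": content}
        )
        return True

    def latest(self, config_identifier: str) -> dict:
        """Return the document that was uploaded last under this identifier."""
        matches = [d for d in self._docs if d["config_identifier"] == config_identifier]
        if not matches:
            raise KeyError(config_identifier)
        return matches[-1]


store = InMemoryFileStore()
first = store.upload("file_name", b"version 1")      # new content, stored
duplicate = store.upload("file_name", b"version 1")  # same md5, skipped
modified = store.upload("file_name", b"version 2")   # changed content, stored again
newest = store.latest("file_name")
```

Because a modified file hashes differently, it is stored as a second document under the same name, and a lookup by name returns the newer one. That matches the "feature, not a bug" remark in the documentation.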
8 changes: 0 additions & 8 deletions docs/source/reference/straxen.storage.rst
@@ -4,14 +4,6 @@ straxen.storage package
Submodules
----------

straxen.storage.mongo\_storage module
-------------------------------------

.. automodule:: straxen.storage.mongo_storage
:members:
:undoc-members:
:show-inheritance:

straxen.storage.online\_monitor\_frontend module
------------------------------------------------

4 changes: 2 additions & 2 deletions docs/source/url_configs.rst
@@ -53,10 +53,10 @@ A concrete plugin example
print(f"Path is local. Loading {self.algorithm} TF model locally "
f"from disk.")
else:
downloader = straxen.MongoDownloader()
downloader = utilix.mongo_storage.MongoDownloader()
try:
self.model_file = downloader.download_single(self.model_file)
except straxen.mongo_storage.CouldNotLoadError as e:
except utilix.mongo_storage.CouldNotLoadError as e:
raise RuntimeError(f'Model files {self.model_file} is not found') from e
with tempfile.TemporaryDirectory() as tmpdirname:
tar = tarfile.open(self.model_file, mode="r:gz")
3 changes: 1 addition & 2 deletions pyproject.toml
@@ -47,9 +47,8 @@ docutils = "==0.18.1"
mistune = "==0.8.4"
pymongo = "*"
requests = "*"
utilix = ">=0.5.3"
utilix = ">=0.11.0"
xedocs = "*"
sharedarray = { url = "https://xenon.isi.edu/python/SharedArray-3.2.3.tar.gz", optional = true }
base_environment = { git = "https://github.com/XENONnT/base_environment.git", optional = true }
commonmark = { version = "0.9.1", optional = true }
nbsphinx = { version = "0.8.9", optional = true }
2 changes: 1 addition & 1 deletion pytest.ini
@@ -1,4 +1,4 @@
[pytest]
filterwarnings =
ignore::numba.NumbaExperimentalFeatureWarning
ignore::straxen.storage.mongo_storage.DownloadWarning
ignore::utilix.mongo_storage.DownloadWarning
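For readers less familiar with the `filterwarnings` entry, the snippet below sketches the same effect with the standard `warnings` module. The `DownloadWarning` class here is a hypothetical stand-in defined locally so the example runs without utilix, and its `UserWarning` base class is an assumption. As far as I know, pytest resolves the dotted path in `pytest.ini` by importing the module, so the entry only works when `utilix.mongo_storage` is importable in the test environment.

```python
import warnings


class DownloadWarning(UserWarning):
    """Hypothetical stand-in for utilix.mongo_storage.DownloadWarning."""


def fetch() -> str:
    """Pretend to download a file and warn the way a downloader might."""
    warnings.warn("file was already downloaded", DownloadWarning)
    return "/tmp/some_file"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record everything by default
    # Equivalent in spirit to the `ignore::...DownloadWarning` ini entry:
    warnings.filterwarnings("ignore", category=DownloadWarning)
    result = fetch()

print(f"{len(caught)} warnings recorded, result={result}")
```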
2 changes: 1 addition & 1 deletion straxen/analyses/daq_waveforms.py
@@ -82,7 +82,7 @@ def _board_to_host_link(daq_config: dict, board: int, add_crate=True) -> str:

def _get_cable_map(name: str = "xenonnt_cable_map.csv") -> pandas.DataFrame:
"""Download the cable map and return as a pandas dataframe."""
down = straxen.MongoDownloader()
down = utilix.mongo_storage.MongoDownloader()
cable_map = down.download_single(name)
cable_map = pandas.read_csv(cable_map)
return cable_map
3 changes: 2 additions & 1 deletion straxen/common.py
@@ -17,6 +17,7 @@
import numba
import strax
import straxen
import utilix

export, __all__ = strax.exporter()
__all__.extend(
@@ -222,7 +223,7 @@ def get_resource(x: str, fmt="text"):
return open_resource(x, fmt=fmt)
# 3. load from database
elif straxen.uconfig is not None:
downloader = straxen.MongoDownloader()
downloader = utilix.mongo_storage.MongoDownloader()
if x in downloader.list_files():
path = downloader.download_single(x)
return open_resource(path, fmt=fmt)
3 changes: 2 additions & 1 deletion straxen/config/protocols.py
@@ -16,6 +16,7 @@
from immutabledict import immutabledict

from utilix import xent_collection
import utilix
from scipy.interpolate import interp1d


@@ -42,7 +43,7 @@ def get_resource(name: str, fmt: str = "text", **kwargs):
"""Fetch a straxen resource Allow a direct download using <fmt='abs_path'> otherwise kwargs are
passed directly to straxen.get_resource."""
if fmt == "abs_path":
downloader = straxen.MongoDownloader()
downloader = utilix.mongo_storage.MongoDownloader()
return downloader.download_single(name)
return straxen.get_resource(name, fmt=fmt)

3 changes: 2 additions & 1 deletion straxen/corrections_services.py
@@ -288,7 +288,8 @@ def get_pmt_gains(
return to_pe

def get_config_from_cmt(self, run_id, model_type, version="ONLINE"):
"""Smart logic to return NN weights file name to be downloader by straxen.MongoDownloader()
"""Smart logic to return NN weights file name to be downloader by
utilix.mongo_storage.MongoDownloader()
:param run_id: run id from runDB
:param model_type: model type and neural network type; model_mlp, or model_gcn or model_cnn
4 changes: 2 additions & 2 deletions straxen/storage/__init__.py
@@ -10,5 +10,5 @@
from . import rundb
from .rundb import *

from . import mongo_storage
from .mongo_storage import *
from utilix import mongo_storage
from utilix.mongo_storage import *