Proposal to use pre-commit for continuous integration (#121)
* Proposal to use pre-commit for continuous integration

* Minor change

* Minor change
dachengx authored Sep 13, 2024
1 parent e768f7e commit 1c4a3ce
Showing 18 changed files with 740 additions and 709 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/Pipi.yml → .github/workflows/pypi.yml
@@ -1,4 +1,4 @@
# Lets upload utilix to PyPi to make it pip instalable
# Lets upload utilix to PyPi to make it pip instalable
# Mostly based on https://github.com/marketplace/actions/pypi-publish
on:
release:
2 changes: 1 addition & 1 deletion .github/workflows/test_package.yml
@@ -14,7 +14,7 @@ on:
       - master

 jobs:
-  update:
+  test:
     runs-on: ubuntu-latest
     steps:
       - name: Setup python
39 changes: 39 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,39 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files

  - repo: https://github.com/psf/black
    rev: 24.8.0
    hooks:
      - id: black
        args: [--safe, --line-length=100, --preview]
        language_version: python3

  - repo: https://github.com/pycqa/docformatter
    rev: v1.7.5
    hooks:
      - id: docformatter

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.2
    hooks:
      - id: mypy
        additional_dependencies: [
          types-PyYAML, types-tqdm, types-pytz,
          types-requests, types-setuptools,
        ]

  - repo: https://github.com/pycqa/flake8
    rev: 7.1.1
    hooks:
      - id: flake8

ci:
  autoupdate_schedule: weekly
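
Many of the README changes further down in this commit look like identical old/new line pairs. That is consistent with the `trailing-whitespace` and `end-of-file-fixer` hooks configured above having been run over the repository. The following is a minimal Python sketch of that kind of normalization. The function name `fix_whitespace` is hypothetical, and this is not the hooks' actual implementation, only an illustration of the effect:

```python
def fix_whitespace(text: str) -> str:
    """Strip trailing whitespace from each line and end with one newline.

    Illustrative only: the real pre-commit hooks handle more cases
    (for example, different line endings).
    """
    if not text:
        # Leave empty content untouched.
        return text
    # Remove trailing spaces and tabs from every line.
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop trailing blank lines, then terminate with a single newline.
    return "\n".join(lines).rstrip("\n") + "\n"


before = "You just need to add a section  \nfor your own purpose\t\n\n\n"
after = fix_whitespace(before)
print(repr(after))
```

Because such changes only touch invisible characters, the old and new versions of a line render identically in a diff view.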
58 changes: 29 additions & 29 deletions README.md
@@ -1,7 +1,7 @@
# utilix
[![PyPI version shields.io](https://img.shields.io/pypi/v/utilix.svg)](https://pypi.python.org/pypi/utilix/)

``utilix`` is a utility package for XENON software, mainly relating to analysis. It currently has two main features: (1) a general XENON configuration framework and (2) easy access to the runsDB by wrapping python calls to a RESTful API. Eventually, we would like to include easy functions for interacting with the Midway and OSG batch queues.
``utilix`` is a utility package for XENON software, mainly relating to analysis. It currently has two main features: (1) a general XENON configuration framework and (2) easy access to the runsDB by wrapping python calls to a RESTful API. Eventually, we would like to include easy functions for interacting with the Midway and OSG batch queues.

## Installation
`git clone` this repo and:
@@ -24,37 +24,37 @@ environment variables can be used in the form `$HOME`. Example:


The idea is that analysts could use this single config for multiple purposes/analyses.
You just need to add a (unique) section for your own purpose and then you can use the `utilix.Config`
You just need to add a (unique) section for your own purpose and then you can use the `utilix.Config`
easily. For example, if you made a new section called `WIMP` with `detected = yes` under it:

from utilix.config import Config
cfg = Config()
value = cfg.get('WIMP', 'detected') # value = 'yes'

For more information, see the [ConfigParser](https://docs.python.org/3.6/library/configparser.html)
documentation, from which `utilix.config.Config` inherits.
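
Since the README states that `utilix.config.Config` inherits from `ConfigParser`, the lookup above can be sketched with the standard library alone. This assumes that `Config` does not override `get`; the in-memory config string stands in for the real utilix configuration file:

```python
import configparser

# Stand-in for a section added to the utilix configuration file.
CONFIG_TEXT = """\
[WIMP]
detected = yes
"""

cfg = configparser.ConfigParser()
cfg.read_string(CONFIG_TEXT)

# Plain lookups return strings, as in the README example.
value = cfg.get("WIMP", "detected")

# ConfigParser can also interpret common truthy strings as booleans.
flag = cfg.getboolean("WIMP", "detected")

print(value, flag)
```

Other `ConfigParser` accessors such as `getint` and `getfloat` should be available in the same way, under the same assumption.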

## Runs Database
Nearly every analysis requires access to the XENON runsDB. The goal of utilix is to simplify the usage of this resource as much as possible. The ``rundb`` module includes two ways to access the runsDB:

1. A RESTful API: a Flask app running at Chicago that queries the runDB in a controlled manner. This is the recommended way to query the database if the specific query is supported. The source code for this app can be found [here](https://github.com/XENONnT/xenon_runsDB_api).
2. A wrapper around ``pymongo``, which sets up the Mongo client for you, similarly to how we did queries in XENON1T. In that case each package usually needed its own copy + pasted boilerplate code; that code is now just included in utilix where it can be easily imported by other packages.
2. A wrapper around ``pymongo``, which sets up the Mongo client for you, similarly to how we did queries in XENON1T. In that case each package usually needed its own copy + pasted boilerplate code; that code is now just included in utilix where it can be easily imported by other packages.


-------------
### RunDB API

#### RunDB API Authentication
The API authenticates using a token system. `utilix` makes the creation and renewal of these tokens easy with the `utilix.rundb.Token` class. When you specify a user/password in your utilix configuration file, as shown above, a token is saved locally at `~/.dbtoken` that contains this information. This token is used/renewed as needed, depending on the users specified in the config file.
The API authenticates using a token system. `utilix` makes the creation and renewal of these tokens easy with the `utilix.rundb.Token` class. When you specify a user/password in your utilix configuration file, as shown above, a token is saved locally at `~/.dbtoken` that contains this information. This token is used/renewed as needed, depending on the users specified in the config file.

Different API users have different permissions, with the general analysis user only able to read from the runDB and not write. This is an additional layer of security around the RunDB.
Different API users have different permissions, with the general analysis user only able to read from the runDB and not write. This is an additional layer of security around the RunDB.

#### Setting up the runDB
The goal of utilix is to make access to the runDB trivial. If using the runDB API, all you need to do to set up the runDB in your local script/shell is

from utilix import db
This instantiates the RunDB class, allowing for easy queries. Below we go through some examples of the type of queries currently supported by the runDB API wrapper in utilix.

This instantiates the RunDB class, allowing for easy queries. Below we go through some examples of the type of queries currently supported by the runDB API wrapper in utilix.

**If there is functionality missing that you think would be useful, please contact teamA or make a new issue (or even better, a pull request).**

@@ -69,23 +69,23 @@ Note that the interface returns pages of 1,000 entries, with the first page bein

#### Get a full document

You can also grab the full run document using the run number. A run name is also supported (from XENON1T days),
You can also grab the full run document using the run number. A run name is also supported (from XENON1T days),
but it will not be used for XENONnT:

doc = db.get_doc(7200)

#### Get only the data entry of a document

data = db.get_data(2000)


#### Strax(en) Contexts
In XENONnT we need to track the hash (or lineage) that specifies a configuration for each datatype. We keep that information in a specific collection of the runDB. We can access that collection using the runDB API as shown below.

For a given context name and straxen version, we can get the hash for each datatype. For example, for the xenonnt_online context and straxen version 0.11.0:

>>> db.get_context('xenonnt_online', '0.11.0')

{'_id': '5f89f588d33cced1fd104ea5',
'date_added': '2020-10-16T19:33:28.913000',
'hashes': {'aqmon_hits': '4gwju6gdto',
@@ -136,44 +136,44 @@ If you know the specific datatype whose hash you need, use instead `get_hash`:

>>> db.get_hash('xenonnt_online', 'peaklets', '0.11.0')
'nagx3zzuiv'
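
The relationship between the two calls can be sketched with a plain dictionary shaped like the context document shown above. The helper `lookup_hash` is hypothetical and is not the utilix implementation; the `name` and `straxen_version` keys are assumed from the pymongo query later in this README, and the two hash values are taken from the examples:

```python
# A trimmed-down context document, shaped like the get_context output above.
context_doc = {
    "name": "xenonnt_online",
    "straxen_version": "0.11.0",
    "hashes": {
        "aqmon_hits": "4gwju6gdto",
        "peaklets": "nagx3zzuiv",
    },
}


def lookup_hash(doc: dict, datatype: str) -> str:
    """Return the lineage hash stored for one datatype in a context document."""
    try:
        return doc["hashes"][datatype]
    except KeyError:
        raise KeyError(f"no hash recorded for datatype {datatype!r}") from None


print(lookup_hash(context_doc, "peaklets"))
```

In other words, `get_hash` appears to return a single entry of the `hashes` mapping that `get_context` returns in full.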


If you are deemed worthy to have write permissions to the runDB (you have a corresponding user/password with write access in your config file), you can also add documents to the context collection with

>>> db.update_context_collection(document_data)

where `document_data` is a dictionary that contains the context name, straxen version, hash information, and more as shown in the example above.


where `document_data` is a dictionary that contains the context name, straxen version, hash information, and more as shown in the example above.


### Boilerplate pymongo setup
The runDB API is the recommended option for most database queries, but sometimes a specific query isn't supported or you might want to do complex aggregations, etc. For that reason, `utilix` also includes a wrapper around `pymongo` to setup the MongoClient. To use this, you need to specify in your config file
The runDB API is the recommended option for most database queries, but sometimes a specific query isn't supported or you might want to do complex aggregations, etc. For that reason, `utilix` also includes a wrapper around `pymongo` to setup the MongoClient. To use this, you need to specify in your config file

[RunDB]
pymongo_url = [ask someone]
pymongo_database = [ask someone]
pymongo_user = [ask someone]
pymongo_password = [ask someone]
pymongo_password = [ask someone]

Note that this is needed in addition to the runDB API fields. Given the correct user/password, you can set up the XENONnT runDB collection for queries as follows:

>>> from utilix.rundb import pymongo_collection
>>> collection = pymongo_collection()

Then you can query the runDB using normal pymongo commands. For example:

>>> collection.find_one({'number': 9000}, {'number': 1})
{'_id': ObjectId('5f2d999448350bff030d2d3b'), 'number': 9000}

You can also access different collections by passing an argument to `pymongo_collection`. The analogous query to the contexts collection shown above would be:

>>> collection = pymongo_collection('contexts')
>>> collection.find_one({'name': 'xenonnt_online', 'straxen_version': '0.11.0'})

If you need to use different databases or do not want to use the information listed in your utilix configuration file, you can also pass keyword arguments to overwrite the information in the config file. This is useful if you need to e.g. use both XENONnT and XENON1T collections. To use the XENON1T collection, for example:

>>> xe1t_coll, xe1t_db, xe1t_user, xe1t_pw, xe1t_url = [ask someone]
>>> xe1t_collection = pymongo_collection(xe1t_coll, database=xe1t_db, user=xe1t_user, password=xe1t_pw, url=xe1t_url)

## Data processing requests
You may find yourself missing some data which requires a large amount of resources to process. In these cases, you can submit a processing request to the computing team.

@@ -267,7 +267,7 @@ submit_job(

## TODO
We want to implement functionality for easy job submission to the Midway batch queue.
Eventually we want to do the same for OSG.
Eventually we want to do the same for OSG.

It would be nice to port e.g. the admix database wrapper to utilix, which can then be used
easily by all analysts.
It would be nice to port e.g. the admix database wrapper to utilix, which can then be used
easily by all analysts.
26 changes: 26 additions & 0 deletions setup.cfg
@@ -0,0 +1,26 @@
[flake8]
# Set maximum width of the line to 100
max-line-length = 100

# E203 whitespace before ':'
# E402 module level import not at top of file
# E501 line too long
# E731 do not assign a lambda expression, use a def
# F541 f-string is missing placeholders
# F401 imported but unused
# F403 unable to detect undefined names
# F405 name may be undefined, or defined from star imports
# W503 line break before binary operator
# ignore = E203, E731, F541, W503
per-file-ignores =
    utilix/*__init__.py: F401, E402
    tests/*: F403, F405
    tests/test_import.py: F401


[docformatter]
in-place = true
blank = true
style = sphinx
wrap-summaries = 100
wrap-descriptions = 100
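
The `per-file-ignores` option above is a multi-line value that maps file patterns to the error codes suppressed for them. As a rough sketch of how that mapping reads, the snippet below parses a shortened copy of the `[flake8]` section with the standard-library `configparser`. The helper `parse_per_file_ignores` is hypothetical, and flake8's own option parsing may differ in its details:

```python
import configparser

# Shortened copy of the [flake8] section from setup.cfg.
SETUP_CFG = """\
[flake8]
max-line-length = 100
per-file-ignores =
    utilix/*__init__.py: F401, E402
    tests/*: F403, F405
    tests/test_import.py: F401
"""


def parse_per_file_ignores(text: str) -> dict:
    """Map each file pattern to the list of error codes ignored for it."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get("flake8", "per-file-ignores")
    ignores = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            # The value starts with an empty line after the '=' sign.
            continue
        pattern, codes = line.split(":", 1)
        ignores[pattern.strip()] = [code.strip() for code in codes.split(",")]
    return ignores


ignores = parse_per_file_ignores(SETUP_CFG)
for pattern, codes in ignores.items():
    print(pattern, codes)
```

Note that the indentation of the continuation lines matters: without it, each pattern would be read as a separate option rather than as part of `per-file-ignores`.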
16 changes: 7 additions & 9 deletions setup.py
@@ -1,25 +1,23 @@
 from setuptools import setup, find_packages

 # Get requirements from requirements.txt, stripping the version tags
-with open('requirements.txt') as f:
-    requires = [
-        r.split('/')[-1] if r.startswith('git+') else r
-        for r in f.read().splitlines()]
+with open("requirements.txt") as f:
+    requires = [r.split("/")[-1] if r.startswith("git+") else r for r in f.read().splitlines()]

-with open('README.md') as file:
+with open("README.md") as file:
     readme = file.read()

-with open('HISTORY.md') as file:
+with open("HISTORY.md") as file:
     history = file.read()

 setup(
     name="utilix",
     version="0.8.5",
-    url='https://github.com/XENONnT/utilix',
+    url="https://github.com/XENONnT/utilix",
     description="User-friendly interface to various utilities for XENON users",
-    long_description_content_type='text/markdown',
+    long_description_content_type="text/markdown",
     packages=find_packages(),
     install_requires=requires,
     python_requires=">=3.6",
-    long_description=readme + '\n\n' + history,
+    long_description=readme + "\n\n" + history,
 )
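
The requirements expression in `setup.py` is unchanged in behavior by this commit; only quoting and line wrapping differ. To make its effect concrete, here is a small sketch that applies the same expression to sample lines. The function name `parse_requirements` and the sample entries (including the `example_pkg` URL) are hypothetical:

```python
def parse_requirements(lines):
    """Apply the same expression setup.py uses to a list of requirement lines."""
    return [r.split("/")[-1] if r.startswith("git+") else r for r in lines]


sample = [
    "requests",
    "pymongo>=3.0",
    "git+https://github.com/XENONnT/example_pkg",  # hypothetical git requirement
]

print(parse_requirements(sample))
```

As the sketch shows, ordinary entries pass through untouched, version specifiers included, while `git+` URLs are reduced to their last path segment.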