Usability enhancements #2

Merged
merged 27 commits into from
Mar 26, 2024
18 changes: 12 additions & 6 deletions .github/workflows/test_pr_merge.yaml
@@ -10,11 +10,17 @@ on:

jobs:
tests:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.6", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12"]
runs-on: ubuntu-20.04
steps:
- name: Checkout repo content
uses: actions/checkout@v2
- name: install dependencies
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: ./scripts/ci_tests.sh install
- name: run tests
run: ./scripts/ci_tests.sh test
- name: Run tests
run: ./scripts/ci_tests.sh test
9 changes: 3 additions & 6 deletions Makefile
@@ -20,22 +20,19 @@ devenv: .venv ## create a python virtual environment with tools to dev, run and
@echo "To activate the virtual environment, run 'source $</bin/activate'"


requirements: devenv ## runs pip-tools to compile dependencies
# freezes requirements
pip-compile requirements/test.in --resolver=backtracking --output-file requirements/test.txt


.PHONY: install-test
install-test: ## install dependencies for testing
pip install -r requirements/test.txt
pip install -r requirements/test.in
pip list --verbose

.PHONY: tests-dev
tests-dev: ## run tests in development mode
.venv/bin/pytest --pdb -vvv tests

.PHONY: tests-ci
tests-ci: ## run tests in the CI
.venv/bin/pytest -vvv --color=yes --cov-report term --cov=activity_monitor tests
.venv/bin/pytest -vvv --color=yes --cov-report term --cov=activity_monitor --cov=activity tests


.PHONY: release
157 changes: 156 additions & 1 deletion README.md
@@ -1 +1,156 @@
# service-activity-monitor
# service-activity-monitor

Tooling for monitoring process activity inside a Docker container. Depends on Python and the well-supported `psutil` package.

Monitors:
- child process cpu usage
- child process disk usage
- overall container network usage
- jupyter kernel activity

Exposes Prometheus metrics regarding:
- total outgoing network usage
- total incoming network usage

# Quick-ish start

## Step 1

Inside your `Dockerfile` add the following. Replace `TARGET_VERSION` with the release you want to install and adjust every `BUSY_THRESHOLD` value for your application.

```Dockerfile
ARG ACTIVITY_MONITOR_VERSION=TARGET_VERSION

# Detection thresholds for application
ENV ACTIVITY_MONITOR_BUSY_THRESHOLD_CPU_PERCENT=1000
ENV ACTIVITY_MONITOR_BUSY_THRESHOLD_DISK_READ_BPS=1099511627776
ENV ACTIVITY_MONITOR_BUSY_THRESHOLD_DISK_WRITE_BPS=1099511627776
ENV ACTIVITY_MONITOR_BUSY_THRESHOLD_NETWORK_RECEIVE_BPS=1099511627776
ENV ACTIVITY_MONITOR_BUSY_THRESHOLD_NETWORK_SENT__BPS=1099511627776

# install service activity monitor
RUN apt-get update && \
apt-get install -y curl && \
# install using curl
curl -sSL https://raw.githubusercontent.com/ITISFoundation/service-activity-monitor/main/scripts/install.sh | \
bash -s -- ${ACTIVITY_MONITOR_VERSION} && \
# cleanup and remove curl
apt-get purge -y --auto-remove curl && \
rm -rf /var/lib/apt/lists/*
```

## Step 2

In your boot script, before starting your application, launch the monitor with something similar to:

```bash
python /usr/local/bin/service-monitor/activity_monitor.py &
```

In most cases something similar to the below will do the trick (don't forget to replace `USER`).

```bash
exec gosu "$USER" python /usr/local/bin/service-monitor/activity_monitor.py &
```

## Step 3

Your image's labels should end up containing something similar to this:

```yaml
...
services:
...
YOUR_SERVICE:
...
build:
labels:
...
simcore.service.callbacks-mapping: '{"inactivity": {"service": "container",
"command": ["python", "/usr/local/bin/service-monitor/activity.py"], "timeout":
1.0}}'
```
Note: if your service defines its own compose spec, `container` must be replaced with the name of the service where the monitor scripts are installed.

In most cases you will easily configure this by adding the following to your `.osparc/service-name/runtime.yaml` file:

```yaml
...
callbacks-mapping:
inactivity:
service: container
command: ["python", "/usr/local/bin/service-monitor/activity.py"]
timeout: 1
```
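
The label value shown in Step 3 is a JSON string, so it can be generated instead of typed by hand. The sketch below is only an illustration: the helper name `callbacks_mapping_label` and its parameters are hypothetical and not part of this repository; the command path and default values are copied from the examples above.

```python
import json


def callbacks_mapping_label(service="container", timeout=1.0):
    """Build the JSON string for the `simcore.service.callbacks-mapping` label.

    Hypothetical helper: `service` must match the compose service in which
    the monitor scripts are installed.
    """
    mapping = {
        "inactivity": {
            "service": service,
            "command": ["python", "/usr/local/bin/service-monitor/activity.py"],
            "timeout": timeout,
        }
    }
    return json.dumps(mapping)


if __name__ == "__main__":
    print(callbacks_mapping_label())
```

Generating the value this way avoids quoting mistakes when the service name differs from `container`.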
# Available configuration options

##### The following flags disable the monitors. By default all the monitors are enabled.
- `ACTIVITY_MONITOR_DISABLE_JUPYTER_KERNEL_MONITOR` default=`False`: disables and does not configure the jupyter kernel monitor
- `ACTIVITY_MONITOR_DISABLE_CPU_USAGE_MONITOR` default=`False`: disables and does not configure the cpu usage monitor
- `ACTIVITY_MONITOR_DISABLE_DISK_USAGE_MONITOR` default=`False`: disables and does not configure the disk usage monitor
- `ACTIVITY_MONITOR_DISABLE_NETWORK_USAGE_MONITOR` default=`False`: disables and does not configure the network usage monitor

##### All the following env vars are to be interpreted as follows: if the measured value is strictly greater than (>) the threshold, the corresponding monitor will report busy.
- `ACTIVITY_MONITOR_BUSY_THRESHOLD_CPU_PERCENT` [percentage(%)], default=`1000`: used by the cpu usage monitor
- `ACTIVITY_MONITOR_BUSY_THRESHOLD_DISK_READ_BPS` [bytes/second], default=`1099511627776`: used by the disk usage monitor
- `ACTIVITY_MONITOR_BUSY_THRESHOLD_DISK_WRITE_BPS` [bytes/second], default=`1099511627776`: used by the disk usage monitor
- `ACTIVITY_MONITOR_BUSY_THRESHOLD_NETWORK_RECEIVE_BPS` [bytes/second], default=`1099511627776`: used by the network usage monitor
- `ACTIVITY_MONITOR_BUSY_THRESHOLD_NETWORK_SENT__BPS` [bytes/second], default=`1099511627776`: used by the network usage monitor
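
The comparison rule for the thresholds can be sketched as follows. This is a minimal illustration and not the monitor's actual implementation: the helper names `read_threshold` and `is_busy` are hypothetical, and only the env var names and defaults are taken from the list above.

```python
import os

# Defaults copied from the list above (only two shown for brevity).
DEFAULT_THRESHOLDS = {
    "ACTIVITY_MONITOR_BUSY_THRESHOLD_CPU_PERCENT": 1000.0,
    "ACTIVITY_MONITOR_BUSY_THRESHOLD_DISK_READ_BPS": 1099511627776.0,
}


def read_threshold(name, environ=os.environ):
    """Return the threshold from the environment, or the documented default."""
    return float(environ.get(name, DEFAULT_THRESHOLDS[name]))


def is_busy(measured, threshold):
    """Busy only when the measured value is strictly greater than the threshold."""
    return measured > threshold


if __name__ == "__main__":
    cpu_threshold = read_threshold("ACTIVITY_MONITOR_BUSY_THRESHOLD_CPU_PERCENT")
    print(is_busy(12.5, cpu_threshold))
```

Note that a value exactly equal to the threshold does not count as busy under this rule.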

##### Other:
- `ACTIVITY_MONITOR_JUPYTER_NOTEBOOK_BASE_URL` [str] default=`http://localhost:8888`: endpoint where the jupyter notebook is exposed
- `ACTIVITY_MONITOR_JUPYTER_NOTEBOOK_KERNEL_CHECK_INTERVAL_S` [float] default=`5`: used by the jupyter kernel monitor to update its metrics
- `ACTIVITY_MONITOR_MONITOR_INTERVAL_S` [float] default=`1`: all other monitors use this interval to update their metrics
- `ACTIVITY_MONITOR_LISTEN_PORT` [int] default=`19597`: port on which the http server will be exposed



# Exposed API


### `GET /activity`

Used by oSPARC to retrieve the status of the service, i.e. whether it is active or not.

```json
{"seconds_inactive": 0}
```

```bash
curl http://localhost:19597/activity
```
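
The same request can be made from Python using only the standard library. This is a sketch under the assumption that the monitor listens on the default port `19597`; the function names are hypothetical, and only the `seconds_inactive` field comes from the response shown above.

```python
import json
from urllib.request import urlopen


def parse_seconds_inactive(body):
    """Extract `seconds_inactive` from the JSON body of `GET /activity`."""
    return float(json.loads(body)["seconds_inactive"])


def fetch_seconds_inactive(port=19597):
    """Query a locally running monitor (requires the monitor to be up)."""
    with urlopen(f"http://localhost:{port}/activity", timeout=1.0) as response:
        return parse_seconds_inactive(response.read().decode("utf-8"))
```

Keeping the parsing separate from the HTTP call makes it possible to test the parsing without a running monitor.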

### `GET /debug`

Used for debugging and not used by oSPARC

```json
{
"kernel_monitor": {"is_busy": true},
"cpu_usage": {"is_busy": false, "total": 0},
"disk_usage": {"is_busy": false, "total": {"bytes_read_per_second": 0, "bytes_write_per_second": 0}},
"network_usage": {"is_busy": false, "total": {"bytes_received_per_second": 345452, "bytes_sent_per_second": 343809}}
}
```

```bash
curl http://localhost:19597/debug
```
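
When debugging, it is often enough to know which monitors currently report busy. The sketch below assumes the payload has the shape shown in the example response above; `busy_monitors` is a hypothetical helper, not part of the repository.

```python
import json


def busy_monitors(debug_body):
    """Return the sorted names of monitors whose `is_busy` flag is true."""
    payload = json.loads(debug_body)
    return sorted(name for name, state in payload.items() if state.get("is_busy"))


if __name__ == "__main__":
    sample = (
        '{"kernel_monitor": {"is_busy": true}, '
        '"cpu_usage": {"is_busy": false, "total": 0}}'
    )
    print(busy_monitors(sample))
```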

### `GET /metrics`

Exposes Prometheus metrics relative to the running processes.

```
# HELP network_bytes_received_total Total number of bytes received across all network interfaces.
# TYPE network_bytes_received_total counter
network_bytes_received_total 23434790

# HELP network_bytes_sent_total Total number of bytes sent across all network interfaces.
# TYPE network_bytes_sent_total counter
network_bytes_sent_total 22893843
```

```bash
curl http://localhost:19597/metrics
```
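
For quick checks without a Prometheus server, the two counters can be read with a few lines of Python. This is a deliberately minimal sketch: `parse_metrics` is a hypothetical helper that only handles unlabelled `name value` samples like the ones shown above, and it is not a full parser for the Prometheus exposition format.

```python
def parse_metrics(text):
    """Map metric names to float values, skipping comments and blank lines.

    Assumes each sample line is `name value`, with no labels or timestamps.
    """
    samples = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


if __name__ == "__main__":
    example = (
        "# TYPE network_bytes_received_total counter\n"
        "network_bytes_received_total 23434790\n"
    )
    print(parse_metrics(example))
```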
4 changes: 1 addition & 3 deletions requirements/test.in
@@ -1,12 +1,10 @@
# from jupyter
# required packages

psutil
tornado

# testing

pytest
pytest-asyncio
pytest-cov
pytest-mock
requests
54 changes: 0 additions & 54 deletions requirements/test.txt

This file was deleted.

1 change: 0 additions & 1 deletion scripts/ci_tests.sh
@@ -11,7 +11,6 @@ install() {
make .venv
source .venv/bin/activate
make install-test
pip list --verbose
}

test() {
15 changes: 8 additions & 7 deletions scripts/install.sh
@@ -15,7 +15,7 @@ IFS=$'\n\t'
# Function to display usage information
usage() {
echo "Usage: $0 <tag>"
echo "Example: $0 v.0.0.9-debug"
echo "Example: $0 v.0.0.1"
exit 1
}

@@ -24,22 +24,23 @@ if [ $# -ne 1 ]; then
usage
fi

TAG=$1
URL="https://github.com/ITISFoundation/service-activity-monitor/releases/download/$TAG/release_archive_$TAG.zip"

# Download and install
TAG=$1
URL="https://github.com/ITISFoundation/service-activity-monitor/releases/download/$TAG/release_archive_$TAG.zip"
echo "Downloading release $TAG..."
curl -sSL -o /tmp/release.zip "$URL"

echo "Extracting files..."
echo "Installing..."

# python scripts
mkdir -p /usr/local/bin/service-monitor
unzip -q /tmp/release.zip -d /usr/local/bin/service-monitor
# requirements
pip install psutil

echo "Installing..."
# Here you can write your installation steps, for now let's just echo the installation is complete
echo "Installation complete."

# Cleanup
rm /tmp/release.zip

echo "Done!"
14 changes: 12 additions & 2 deletions src/activity.py
@@ -1,4 +1,14 @@
import os

import requests

r = requests.get("http://localhost:19597")
print(r.text)
LISTEN_PORT: int = int(os.environ.get("ACTIVITY_MONITOR_LISTEN_PORT", 19597))


def main():
response = requests.get(f"http://localhost:{LISTEN_PORT}/activity")
print(response.text)


if __name__ == "__main__":
main()