The Basic Usage of RAFAEL

RAFAEL (RApid Federated Analysis in ELastic Framework) is a federated analytic framework provides two high-level features:

composable federated analytic algorithms in biomedical domain
accelerate the computation using the JAX library, which can be further extended to use GPU.

We provide the Dockerfile to make users easily deploy a federated analysis service.

Index

The Basic Usage of RAFAEL
The Advanced Usage of RAFAEL
- Environment Variables to Set Up Service
  - Service Configuration
  - Uvicorn Service
- Main Federated Analytic Algorithms
References

The Basic Usage of RAFAEL

1. Clone the RAFAEL Repository

git clone https://github.com/ailabstw/RAFAEL.git /rafael
cd rafael

2. Build the RAFAEL Service Docker Image

docker build -t rafael .

3. Initialize a RAFAEL service

The RAFAEL service automatically generates a service configuration in services/configs from the given environment variables and logs in services/log. To specify the service log path, use -e SERVER_LOG_PATH=/your/log/path, -e COMPENSATOR_LOG_PATH=/your/log/path, or -e CLIENT_LOG_PATH=/your/log/path. You can also specify the path to save output results in the request.

The uvicorn service will run on the http://0.0.0.0:${UVICORN_PORT} in the docker container. The container listen to the uvicorn service by using docker run ... -p ${UVICORN_PORT}:${DOCKER_PORT} .... See also https://docker-fastapi-projects.readthedocs.io/en/latest/uvicorn.html#troubleshoots.

3.0 Create the Local Docker Network

docker network create --subnet=172.18.0.0/16 rafael-net

3.1 Run a RAFAEL Server

docker run -it --rm --name rafael-server --network rafael-net --ip 172.18.0.2 -p 8000:8000 -v ./services:/rafael/services rafael

3.2 Run a RAFAEL Compensator

docker run -it --rm --name rafael-compensator --network rafael-net --ip 172.18.0.6 -p 8080:8080 -v ./services:/rafael/services -e ROLE=compensator -e PORT=8080 rafael

3.3 Run RAFAEL Clients

Start the RAFAEL clients with assigned IDs:

client 1 ID: f01b3208-11b8-446d-a178-39e18f16f89b

docker run -it --rm --name rafael-client1 --network rafael-net --ip 172.18.0.3 -p 8001:8001 -v ./services:/rafael/services -e ROLE=client -e PORT=8001 -e CLIENT_NODE_ID=f01b3208-11b8-446d-a178-39e18f16f89b rafael

client 2 ID: ab5f2e2a-3860-4c56-983d-cd16ea184098

docker run -it --rm --name rafael-client2 --network rafael-net --ip 172.18.0.4 -p 8002:8002 -v ./services:/rafael/services -e ROLE=client -e PORT=8002 -e CLIENT_NODE_ID=ab5f2e2a-3860-4c56-983d-cd16ea184098 rafael

client 3 ID: 97bc8985-9ce5-481b-a0fb-8c5e6f872158

docker run -it --rm --name rafael-client3 --network rafael-net --ip 172.18.0.5 -p 8003:8003 -v ./services:/rafael/services -e ROLE=client -e PORT=8003 -e CLIENT_NODE_ID=97bc8985-9ce5-481b-a0fb-8c5e6f872158 rafael

Note: Make sure the server and compensator are ready to be connected by client.

4. Run a Demo

It's recommended to terminate all services after completing a federated analysis. The current in-memory data repository is a Python dictionary, which might cause parameter issues when conducting multiple analyses.

4.0 The introduction of analysis request

Post an analysis request to server

The base analysis request format in RAFAEL:

{
    "node_id": "${SERVER_NODE_ID}",
    "args": {
        "config": {
            // parameters for the analysis API
        }
    },
    "api": "${ANALYSIS_API}"
}

The config are the parameters for the analysis API.

For example, the available parameters in CoxPHRegression are:

clients: The list of client IDs participating in the analysis.
feature_cols: The feature columns to perform the analysis. Default is to use all features.
clinical_data_path: The paths to the clinical data. The clinical data should contain columns named as event and time to perform survival analysis.
meta_cols: The sample metadata to be excluded from the survival analysis.
save_dir: The directory path to save the results.
r: The number of samples in the global anchor matrix. Default is 100.
k: The latent dimensions of the SVD. The decomposed matrix is used for creating proxy data matrix. Default is 20.
bs_prop: The proportion of samples to be sampled for each bootstrap. Default is 0.6.
bs_times: The number of bootstrap iterations. Default is 20.
alpha: The statistical significance level. Default is 0.05.
step_size: Deal with the fitting error, delta contains nan value(s). Default is 0.5.

Example

The following script is the example of POST http://${SERVER_HOST}:${SERVER_PORT}/tasks:

import requests

req = {
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098",
                "97bc8985-9ce5-481b-a0fb-8c5e6f872158"
            ],
            "clinical_data_path":[
                "data/client1/GSE62564-1.csv",
                "data/client2/GSE62564-2.csv",
                "data/client3/GSE62564-3.csv"
            ],
            "save_dir":[
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098",
                "/rafael/services/results/97bc8985-9ce5-481b-a0fb-8c5e6f872158"
            ],
            "meta_cols":[
                "sample-id"
            ]
        }
    },
    "api": "CoxPHRegression"
}

requests.post("http://localhost:8000/tasks", json=req)

Refer to rafael/datamodel.py for parameter specifications of other analysis APIs.

4.1 Genome Wide Association Study (GWAS)

Dataset: The two clients' demo data hapmap1 is stored in data/.

POST http://localhost:8000/tasks

Quantitative trait:

{
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098"
            ],
            "compensators": [
                "8c66f4e8-9d4c-446d-9e6c-cbdf8b285554"
            ],
            "bfile_path": [
                "data/client1/hapmap1_100_1",
                "data/client2/hapmap1_100_2"
            ],
            "cov_path":[
                "data/client1/hapmap1_100_1.cov",
                "data/client2/hapmap1_100_2.cov"
            ],
            "regression_save_dir": [
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098"
            ],
            "local_qc_output_path": [
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b/qc",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098/qc"
            ],
            "global_qc_output_path": "/rafael/services/results/7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49/qc",
            "snp_chunk_size": 10,
            "maf": 0.05,
            "geno": 0.05,
            "mind": 0.05
        }
    },
    "api": "FullQuantGWAS"
}

Binary trait: Assign binary phenotype data with pheno_path.

{
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098"
            ],
            "compensators": [
                "8c66f4e8-9d4c-446d-9e6c-cbdf8b285554"
            ],
            "bfile_path": [
                "data/client1/hapmap1_100_1",
                "data/client2/hapmap1_100_2"
            ],
            "cov_path":[
                "data/client1/hapmap1_100_1.cov",
                "data/client2/hapmap1_100_2.cov"
            ],
            "pheno_path": [
                "data/client1/hapmap1_100_1.pheno",
                "data/client2/hapmap1_100_2.pheno"
            ],
            "regression_save_dir": [
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098"
            ],
            "local_qc_output_path": [
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b/qc",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098/qc"
            ],
            "global_qc_output_path": "/rafael/services/results/7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49/qc",
            "snp_chunk_size": 10,
            "maf": 0.05,
            "geno": 0.05,
            "mind": 0.05
        }
    },
    "api": "FullBinGWAS"
}

4.2 Principal Component Analysis (PCA)

Dataset: The three clients' demo data GSE62564 are stored in data/.

POST http://localhost:8000/tasks

{
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098",
                "97bc8985-9ce5-481b-a0fb-8c5e6f872158"
            ],
            "file_path":[
                "data/client1/GSE62564-1.csv",
                "data/client2/GSE62564-2.csv",
                "data/client3/GSE62564-3.csv"
            ],
            "svd_save_dir":[
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b/pca",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098/pca",
                "/rafael/services/results/97bc8985-9ce5-481b-a0fb-8c5e6f872158/pca"
            ],
            "to_pc": true,
            "meta_cols":[
                "sample-id",
                "time",
                "event"
            ]
        }
    },
    "api": "PCAfromTabular"
}

4.3 Singular Value Decomposition (SVD)

Dataset: The three clients' demo data GSE62564 are stored in data/.

POST http://localhost:8000/tasks

{
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098",
                "97bc8985-9ce5-481b-a0fb-8c5e6f872158"
            ],
            "file_path":[
                "data/client1/GSE62564-1.csv",
                "data/client2/GSE62564-2.csv",
                "data/client3/GSE62564-3.csv"
            ],
            "svd_save_dir":[
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b/svd",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098/svd",
                "/rafael/services/results/97bc8985-9ce5-481b-a0fb-8c5e6f872158/svd"
            ],
            "to_pc": false,
            "meta_cols":[
                "sample-id",
                "time",
                "event"
            ]
        }
    },
    "api": "SVDfromTabular"
}

4.4 Cox Proportion Hazard Regression

Dataset: The three clients' demo data GSE62564 are stored in data/.

POST http://localhost:8000/tasks

{
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098",
                "97bc8985-9ce5-481b-a0fb-8c5e6f872158"
            ],
            "clinical_data_path":[
                "data/client1/GSE62564-1.csv",
                "data/client2/GSE62564-2.csv",
                "data/client3/GSE62564-3.csv"
            ],
            "save_dir":[
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b/cox",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098/cox",
                "/rafael/services/results/97bc8985-9ce5-481b-a0fb-8c5e6f872158/cox"
            ],
            "meta_cols":[
                "sample-id"
            ]
        }
    },
    "api": "CoxPHRegression"
}

4.5 Kaplan-Meier

Dataset: The three clients' demo data GSE62564 are stored in data/.

POST http://localhost:8000/tasks

{
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098",
                "97bc8985-9ce5-481b-a0fb-8c5e6f872158"
            ],
            "clinical_data_path":[
                "data/client1/GSE62564-1.csv",
                "data/client2/GSE62564-2.csv",
                "data/client3/GSE62564-3.csv"
            ],
            "save_dir":[
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b/km",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098/km",
                "/rafael/services/results/97bc8985-9ce5-481b-a0fb-8c5e6f872158/km"
            ],
            "meta_cols":[
                "sample-id"
            ]
        }
    },
    "api": "KaplanMeier"
}

5. Run a Federated Analysis on Your Own Dataset

Similar to the 3.3 Run RAFAEL Clients with an additional parameter -v to mount your own data:

docker run -it --rm --name ${CLIENT_NAME} --network rafael-net --ip ${CLIENT_HOST} -p ${CLIENT_PORT}:${CLIENT_PORT} -v ./services:/rafael/services -v /path/to/data/directory:/rafael/data -e ROLE=client -e PORT=${CLIENT_PORT} -e CLIENT_NODE_ID=${CLIENT_NODE_ID} rafael

In fact, the /rafael/data is a recommended path to mount, not necessarily. The input file path of the analysis can be specified in the request.

Example:

Initialize services

> Initialize a server
> Initialize a compensator

# Client1
docker run -it --rm --name rafael-client1 --network rafael-net --ip 172.18.0.3 -p 8001:8001 -v ./services:/rafael/services -v ~/Desktop/client1_data:/mnt -e ROLE=client -e PORT=8001 -e CLIENT_NODE_ID=f01b3208-11b8-446d-a178-39e18f16f89b rafael

# Client2
docker run -it --rm --name rafael-client2 --network rafael-net --ip 172.18.0.4 -p 8002:8002 -v ./services:/rafael/services -v ~/Desktop/client2_data:/mnt -e ROLE=client -e PORT=8002 -e CLIENT_NODE_ID=ab5f2e2a-3860-4c56-983d-cd16ea184098 rafael

POST http://localhost:8000/tasks

{
    "node_id": "7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49",
    "args": {
        "config": {
            "clients": [
                "f01b3208-11b8-446d-a178-39e18f16f89b",
                "ab5f2e2a-3860-4c56-983d-cd16ea184098"
            ],
            "compensators": [
                "8c66f4e8-9d4c-446d-9e6c-cbdf8b285554"
            ],
            "bfile_path": [
                "/mnt/hapmap1_100_1",
                "/mnt/hapmap1_100_2"
            ],
            "cov_path":[
                "/mnt/hapmap1_100_1.cov",
                "/mnt/hapmap1_100_2.cov"
            ],
            "regression_save_dir": [
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098"
            ],
            "local_qc_output_path": [
                "/rafael/services/results/f01b3208-11b8-446d-a178-39e18f16f89b/qc",
                "/rafael/services/results/ab5f2e2a-3860-4c56-983d-cd16ea184098/qc"
            ],
            "global_qc_output_path": "/rafael/services/results/7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49/qc",
            "snp_chunk_size": 10,
            "maf": 0.05,
            "geno": 0.05,
            "mind": 0.05
        }
    },
    "api": "FullQuantGWAS"
}

The Advanced Usage of RAFAEL

Make sure you have been familiar with how to run a RAFAEL service in the previous section.
This section is to provide the hyperparameters for the customized service configurations and more comprehensive analysis.

Environment Variables to Set Up Service

The following environment variables can be assigned like so:

docker run --rm -it ... -e ROLE=${ROLE} -e PORT=${PORT} ... rafael

Service Configuration

These variables control the service role, who to connect, its identity, where to save log, and the network protocol.

ROLE: The service to be created as ROLE. Available parameters are server, compensator and client. Default is server.
SERVER_NODE_ID: The server node ID. Default is 7a5f34c4-4415-4b9a-bab7-ebbcdcc23a49.
COMPENSATOR_NODE_ID: The compressor node ID. Default is 8c66f4e8-9d4c-446d-9e6c-cbdf8b285554.
CLIENT_NODE_ID: The client node ID. Default is randomly generated UUID4.
SERVER_LOG_PATH: The path to save the server log. Default is /rafael/services/log/server.log.
COMPENSATOR_LOG_PATH: The path to save the compressor log. Default is /rafael/services/log/compensator.log.
CLIENT_LOG_PATH: The path to save the client log. Default is /rafael/services/log/client-${CLIENT_NODE_ID}.log
PROTOCOL: The web service protocol to contruct the address as ${PROTOCOL}://${SERVER_HOST}:${SERVER_PORT} or ${PROTOCOL}://${COMPENSATOR_HOST}:${COMPENSATOR_PORT}. Default is ws.
SERVER_HOST: The sever host to connect to. Default is 172.18.0.2.
SERVER_PORT: The server port to connect to. Default is 8000.
COMPENSATOR_HOST: The compensator host to connect to. Default is 172.18.0.6.
COMPENSATOR_PORT: The compensator port to connect to. Default is 8080.

Uvicorn Service

These variables are the uvicorn parameters to initialize a service. See also uvicorn options.

PORT: The uvicorn service in the container running at PORT. Default is 8000.
PING_INTERVAL: The uvicorn websocket implementation parameter ws_ping_interval. Default is set to 600 due to some time-consuming calculations.
PING_TIMEOUT: The uvicorn websocket implementation parameter ws_ping_timeout. Default is set to 300 due to some time-consuming calculations.
MAX_MESSAGE_SIZE: The uvicorn websocket implementation parameter ws_max_size. Default is set to 1e20 due to tremendous memory consumption in GWAS.

Main Federated Analytic Algorithms

GWAS

The base parameters for performing federated GWAS are:

bfile_path: The path to bfile with prefix. For example, client1.bed, client1.fam and client1.bim are under ~/data/, the bfile_path should be ~/data/client1.
cov_path: The path to the covariate file. Default is None.
pheno_path: The path to the phenotype file. Default is None.
pheno_name: The column name of the phenotype in phenotype file.
snp_chunk_size: The number of SNPs to be calculated in a single chunk. Default is 10000.
regression_save_dir: The path to save the results, gwas.glm, gwas.manhattan.png and gwas.qq.png.

The recommended combinations for the federated GWAS (means POST to these APIs step by step):

QunatGWAS for qunatitative trait.
BinGWAS for binary trait.
BasicBfileQC → QunatGWAS for the additional quality control.
BasicBfileQC → BinGWAS for the additional quality control.
FullQuantGWAS for the complete GWAS, including the LD-pruning and PCA.
FullBinGWAS for the complete GWAS, including the LD-pruning and PCA.

GWAS - BasicBfileQC (GWASConfig)

maf: Filter out all variants with minor allele frequency below the given threshold. Default is 0.05.
geno: Filter out all SNPs with missing call rates exceeding the given value. Default is 0.02.
hwe: Filter out all SNPs having Hardy-Weinberg equilibrium test p-value below the given threshold. Default is 5e-7.
mind: Filters out all samples with missing call rates exceeding the given value. Default is 0.02.
local_qc_output_path: The path to the output file for the QC report in client.
global_qc_output_path: The path to the output file for the QC report in server.

GWAS - LDPruning (GWASConfig)

prune_method: The method to deal with the remained SNPs from clients after pruning at each client side. Default is "intersect". Available options are "intersect" and "union".
win_size: Window size in variant count. Default is 50.
step: Variant count to shift the window at the end of each step. Default is 5.
r2: Variants whose $r^2$ is greater than given threshold were removed. Default is 0.2.

GWAS - GenotypePCA (GWASConfig)

Same in RandomizedSVD, but the default of svd_save_dir is set to local_qc_output_path.

GWAS - CovariateStdz (GWASConfig)

No hyperparameters

GWAS - QuantGWAS (GWASConfig)

block_size: Number of SNPs to run in a block in a process. Defaults is 10000.
num_core: The number of cores to perform parallel computation. Default is 4.

GWAS - FullQuantGWAS (GWASConfig)

The union of BasicBfileQC, LDPruning, GenotypePCA and QuantGWAS.

GWAS - BinGWAS (GWASConfig)

logistic_max_iters: The maximum number of iterations in logistic regression.

GWAS - FullBinGWAS (GWASConfig)

The union of BasicBfileQC, LDPruning, GenotypePCA and BinGWAS.

Linear Algebra

The federated linear algebra in RAFAEL currently supports PCA and SVD, the former can leverage the latter and the APIs supporting the federated standardization to achieve. Hence, the PCA shares the same parameters as the SVD. The tabular data reading APIs are shared as well.

LinAlg - RandomizedSVD (SVDConfig)

This API cannot be directly used. It requires other APIs to prepare the variable A and save it to the data repository.

k1: The initial number of latent dimensions. Default is 20.
k2: The output number of latent dimensions. Default is 20.
svd_max_iters: The maximum number of iterations to update the eigenvectors. Default is 20.
epsilon: The tolerance of the convergence. Default is 1e-9.
first_n: The first n latent dimensions to share globally. Default is 4. It is noted that this parameter and its corresponding APIs should be removed when considering a rigorus federatad scenario.
to_pc: The outputs are eigenvecots ($U$ and $V$) or PCs ($U\Sigma$ and $V\Sigma$). Default is False.
label: The data in output figures are colored by what label. Default is None.
svd_save_dir: The directory to save the eigenvectors.

LinAlg - PCA (SVDConfig)

This API cannot be directly used. It requires other APIs to prepare the variable A and save it to the data repository.

Same in RandomizedSVD.

LinAlg - SVDfromTabular (TabularDataSVDConfig)

Inherited from RandomizedSVD.
file_path: The path to the data.
meta_cols: The metadata column names, which are used for coloring the data in output figures and excluding the unwanted data to participate in SVD/PCA. Default is None, meaning no unwanted data.
drop_cols: The column names to be dropped. Different from meta_cols, drop_cols are not not used in any downstream task, while the meta_cols may be used for labeling. When drop_cols is given, the keep_cols shouldn't be used. Default is None.
keep_cols: The column names to be kept to perform SVD/PCA. When keep_cols is given, the drop_cols shouldn't be used. Default is None, meaning to use all feature in the provided data.

LinAlg - PCAfromTabular (TabularDataSVDConfig)

Same in SVDfromTabular.

Survival

The Cox PH Regression is the implementation of DC-COX, which secures the data in a methematical way. The current Kaplan-Meier survival analysis in RAFAEL only supports continuous data. It leverages the federated standardization APIs to divide samples into two groups and perform Kaplan-Meier survival analysis respectively.

CoxPHRegression and KaplanMeier share the same base parameters:

clinical_data_path: The paths to the clinical data. The clinical data should contain columns named as event and time to perform survival analysis.
feature_cols: The columns to perform the survival analysis. Default is to use all features.
meta_cols: The sample metadata to be excluded from the survival analysis.
save_dir: The directory path to save the results.

Survival - CoxPHRegression (CoxPHRegressionConfig)

r: The number of samples in the global anchor matrix. Default is 100.
k: The latent dimensions of the SVD. The decomposed matrix is used for creating proxy data matrix. Default is 20.
bs_prop: The proportion of samples to be sampled for each bootstrap. Default is 0.6.
bs_times: The number of bootstrap iterations. Default is 20.
alpha: The statistical significance level. Default is 0.05.
step_size: Deal with the fitting error, delta contains nan value(s). Default is 0.5.

Survival - KaplanMeier (KaplanMeierConfig)

alpha: The statistical significance level. Default is 0.05.
n_std: Regard n_std as $k$. Separate samples into $\geq k*\sigma$ and $\leq-k*\sigma$, but with the standardization, $\sigma=1$, so we can simplify as $\geq k$ and $\leq-k$. Default is 1.

References

Federated GWAS Regression & Mechanism: sPLINK: a hybrid federated tool as a robust alternative to meta-analysis in genome-wide association studies
Federated SVD: Federated horizontally partitioned principal component analysis for biomedical applications
Federated Cox PH Regression: DC-COX: Data collaboration Cox proportional hazards model for privacy-preserving survival analysis on multiple parties
PLINK2: Second-generation PLINK: rising to the challenge of larger and richer datasets

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
benchmarks		benchmarks
configs		configs
data		data
docs		docs
examples/http		examples/http
pipelines		pipelines
rafael		rafael
services		services
test		test
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Dockerfile		Dockerfile
MANIFEST.in		MANIFEST.in
README.md		README.md
build.py		build.py
pyproject.toml		pyproject.toml
requirements-dev.txt		requirements-dev.txt
requirements.txt		requirements.txt

ailabstw/RAFAEL

Folders and files

Latest commit

History

Repository files navigation

The Basic Usage of RAFAEL

1. Clone the RAFAEL Repository

2. Build the RAFAEL Service Docker Image

3. Initialize a RAFAEL service

3.0 Create the Local Docker Network

3.1 Run a RAFAEL Server

3.2 Run a RAFAEL Compensator

3.3 Run RAFAEL Clients

4. Run a Demo

4.0 The introduction of analysis request

Post an analysis request to server

See also

Example

4.1 Genome Wide Association Study (GWAS)

4.2 Principal Component Analysis (PCA)

4.3 Singular Value Decomposition (SVD)

4.4 Cox Proportion Hazard Regression

4.5 Kaplan-Meier

5. Run a Federated Analysis on Your Own Dataset

Example:

The Advanced Usage of RAFAEL

Environment Variables to Set Up Service

Service Configuration

Uvicorn Service

Main Federated Analytic Algorithms

GWAS

GWAS - BasicBfileQC (GWASConfig)

GWAS - LDPruning (GWASConfig)

GWAS - GenotypePCA (GWASConfig)

GWAS - CovariateStdz (GWASConfig)

GWAS - QuantGWAS (GWASConfig)

GWAS - FullQuantGWAS (GWASConfig)

GWAS - BinGWAS (GWASConfig)

GWAS - FullBinGWAS (GWASConfig)

Linear Algebra

LinAlg - RandomizedSVD (SVDConfig)

LinAlg - PCA (SVDConfig)

LinAlg - SVDfromTabular (TabularDataSVDConfig)

LinAlg - PCAfromTabular (TabularDataSVDConfig)

Survival

Survival - CoxPHRegression (CoxPHRegressionConfig)

Survival - KaplanMeier (KaplanMeierConfig)

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages