Central repository for results #274

Open
k-rister opened this issue Dec 6, 2022 · 3 comments

Comments

@k-rister
Contributor

k-rister commented Dec 6, 2022

No description provided.

@k-rister k-rister added the enhancement New feature or request label Jan 18, 2024
@k-rister k-rister self-assigned this Jan 18, 2024
@k-rister
Contributor Author

  • Support both local and shared repositories
  • Queries should be able to target one or more repositories
  • File store for raw result data
  • Elasticsearch (ES) server
  • May need a system of time-based indices (weekly, monthly, etc.) in order to scale storage, queries, etc.
  • Authentication tokens for uploading data and indexing
  • Tight integration between existing CDM queries and the repository to allow for cross querying
  • Investigate migrating to OpenSearch in place of Elasticsearch
  • SCP/SSH target
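
To make the time-based indices item more concrete, here is a minimal sketch of how index names could be derived from a result's date, and how a query over a date range could be narrowed to the matching indices. The `cdm-run` prefix, the name layout, and both function names are hypothetical and are not taken from the current CDM code.

```python
from datetime import date, timedelta


def index_name(prefix: str, day: date, granularity: str = "monthly") -> str:
    """Return the (hypothetical) index name that holds documents for `day`."""
    if granularity == "monthly":
        return f"{prefix}-{day:%Y.%m}"
    if granularity == "weekly":
        # ISO weeks can belong to a different year than the calendar date,
        # so use the ISO year rather than day.year.
        iso_year, iso_week, _ = day.isocalendar()
        return f"{prefix}-{iso_year}.w{iso_week:02d}"
    raise ValueError(f"unsupported granularity: {granularity}")


def indices_for_range(prefix: str, start: date, end: date,
                      granularity: str = "monthly") -> list:
    """List every index name that a query covering start..end must touch."""
    names = []
    day = start
    while day <= end:
        name = index_name(prefix, day, granularity)
        # Days are visited in order, so a repeated name is always the last one.
        if not names or names[-1] != name:
            names.append(name)
        day += timedelta(days=1)
    return names
```

With this layout, a query for results between 2022-12-06 and 2023-01-18 would only need to touch two monthly indices instead of one ever-growing index.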

@atheurer
Contributor

After some more thought on this, using a cluster filesystem (CephFS) that RH supports seems like the most appropriate thing to do. I think we just need to research some best practices and run very specific tests that are relevant to our use case (uploading a crucible result from a user, processing a result for indexing, etc.).

Since we are going to use VMs for OpenSearch anyway, perhaps we can make them dual purpose and run both an OpenSearch node and a Ceph OSD on each VM. Each VM could be given a dedicated SSD on the host.

@atheurer
Contributor

For "Tight integration between existing CDM queries and repository to allow for cross querying", it's possible that the central ES and a user's local ES are using different versions of the CDM (for example, cdmv6 and cdmv7). We will need to decide what to do in this scenario: (a) have queries support two different versions, (b) force users to migrate to the same version as the shared ES, or (c) something else. I suspect (a) is what we will have to do, and in practice there is little to no difference in the queries when going from one version to the next. The bigger difference is in indexing, and we generally support two versions already.
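
A rough sketch of what option (a) could look like when one query fans out to a local and a shared repository that are on different CDM versions. Everything here is an assumption for illustration: the `Repository` class, the `cdm<version>-<doc_type>` index naming, the URLs, and the idea that the same query body works unchanged across versions.

```python
from dataclasses import dataclass

# Assumption: two CDM versions are supported at any one time.
SUPPORTED_CDM_VERSIONS = ("v6", "v7")


@dataclass(frozen=True)
class Repository:
    """A hypothetical description of one ES/OpenSearch instance."""
    name: str
    url: str
    cdm_version: str


def index_for(repo: Repository, doc_type: str) -> str:
    """Pick the index name matching the repository's CDM version."""
    if repo.cdm_version not in SUPPORTED_CDM_VERSIONS:
        raise ValueError(
            f"{repo.name}: CDM {repo.cdm_version} is not supported")
    return f"cdm{repo.cdm_version}-{doc_type}"


def plan_query(repos, doc_type: str, body: dict) -> list:
    """Build one request per repository; only the index name differs."""
    return [
        {"url": repo.url, "index": index_for(repo, doc_type), "body": body}
        for repo in repos
    ]


local = Repository("local", "http://localhost:9200", "v6")
shared = Repository("shared", "https://es.example.com:9200", "v7")
requests_to_send = plan_query([local, shared], "run", {"query": {"match_all": {}}})
```

The caller would then send each planned request and merge the hits. If a query ever does differ between versions, `plan_query` is the one place where a per-version body would need to be selected.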

Projects
Status: Todo