Flexibility in scheduling the job #97

Open
dbbaughe opened this issue Nov 10, 2021 · 0 comments
Labels
enhancement New feature or request

Comments

@dbbaughe
Contributor

From: opendistro-for-elasticsearch/job-scheduler#32

  • Is there any way to schedule a one-time job?
  • Is it possible to start the job immediately after scheduling, rather than running it as a delayed job?
    If these options are already available, please update the documentation with relevant examples.

Comments:

From @zengyan-amazon
Hi agllno,

Currently we don't support one-time jobs out of the box. You can consider implementing your job runner so that it updates your job config to disable the job, or deletes the job from your job index, once it has run. Hope this helps.
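
A minimal sketch of the self-disabling approach suggested above, written in Python for brevity. It does not use the Job Scheduler plugin's interfaces: `JobConfig`, `JobIndex`, `run_once`, and `tick` are hypothetical stand-ins for a job document, the job index, a custom job runner, and one scheduler firing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class JobConfig:
    """Hypothetical stand-in for a job document stored in a job index."""
    job_id: str
    enabled: bool = True
    run_count: int = 0


class JobIndex:
    """Hypothetical in-memory substitute for the job index."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobConfig] = {}

    def put(self, job: JobConfig) -> None:
        self._jobs[job.job_id] = job

    def get(self, job_id: str) -> Optional[JobConfig]:
        return self._jobs.get(job_id)

    def delete(self, job_id: str) -> None:
        self._jobs.pop(job_id, None)

    def enabled_jobs(self) -> List[JobConfig]:
        return [job for job in self._jobs.values() if job.enabled]


def run_once(job: JobConfig, index: JobIndex,
             work: Callable[[JobConfig], None],
             delete_after_run: bool = False) -> None:
    """Runner that does the work, then disables or deletes the job."""
    work(job)  # if this raises, the job stays enabled and is retried next tick
    job.run_count += 1
    if delete_after_run:
        index.delete(job.job_id)
    else:
        job.enabled = False
        index.put(job)


def tick(index: JobIndex, work: Callable[[JobConfig], None]) -> None:
    """Simulates one scheduler firing: runs every job that is still enabled."""
    for job in index.enabled_jobs():
        run_once(job, index, work)
```

Disabling keeps the job document around for inspection, while deleting leaves no trace of the run; which is preferable depends on whether the run needs to be audited later.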


From @JackRyanson
@zengyan-amazon we also need one-time job scheduling; it would be great to have this. Thanks.


From @agllno
@zengyan-amazon can you please clarify the second point as well?


From @zengyan-amazon
@agllno,

Sorry for missing the second point. The job scheduler doesn't support executing a job immediately after scheduling. You can create a feature request for it, and you are also welcome to contribute this feature.

Meanwhile, if you want to trigger something immediately, maybe you can consider implementing it as an API.
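
One way to read the API suggestion above, sketched in Python: keep the job logic in one place and expose a second entry point that runs it on demand. `JobService`, `on_schedule`, and `handle_run_now` are hypothetical names for illustration, not part of the job scheduler.

```python
from typing import Callable, Dict, List, Tuple


class JobService:
    """Hypothetical service holding job logic shared by the scheduler and an API."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Callable[[], str]] = {}
        self.runs: List[Tuple[str, str]] = []  # (job_id, trigger) audit trail

    def register(self, job_id: str, fn: Callable[[], str]) -> None:
        self._jobs[job_id] = fn

    def _execute(self, job_id: str, trigger: str) -> str:
        result = self._jobs[job_id]()
        self.runs.append((job_id, trigger))
        return result

    def on_schedule(self, job_id: str) -> str:
        """Path taken when the scheduler fires at the configured interval."""
        return self._execute(job_id, "schedule")

    def handle_run_now(self, job_id: str) -> dict:
        """Hypothetical API handler: runs the job immediately, outside the schedule."""
        if job_id not in self._jobs:
            return {"status": 404, "error": f"unknown job: {job_id}"}
        return {"status": 200, "result": self._execute(job_id, "api")}
```

Because both paths call the same `_execute`, an immediate run behaves exactly like a scheduled one and only the trigger differs.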


From @dbbaughe
Hi @agllno and @JackRyanson,

I'm curious what your use case is for one-time job scheduling.

Thanks,
Drew


From @JackRyanson
@dbbaughe one-time data enrichment triggered on the user side, e.g. running NLP over this document set, or lengthy computations such as an advanced algorithm that writes its results to an index.


From @dbbaughe

@JackRyanson,

Ah, so a manually started job from the user side?
Is your use case something like:
[User does something] -> [Creates job] -> [Triggers once and deletes itself]
Or
[Creates job that only triggers manually] -> [User does something] -> [Triggers job once]
... sometime later [User does something] -> [Triggers job once]

Thanks


From @agllno

Sorry for the late reply.
Our use case is as follows:
We use the job scheduler to aggregate historical data and save the result. The job runs at a scheduled interval, say every hour.
If a scheduled run fails for any reason, we may lose that data from the aggregation, because the raw data will be quarantined. A provision to run a one-time job for these failed cases would be great.
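
The recovery flow described above could look roughly like the following Python sketch: the hourly run records any window it fails on, and a one-off re-run later processes only those windows. All names here (`hourly_window`, `run_scheduled`, `rerun_failed`) are hypothetical, and the sketch assumes the re-run happens before the raw data is quarantined.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Set, Tuple

Window = Tuple[datetime, datetime]


def hourly_window(end: datetime) -> Window:
    """Return the one-hour window that ends at `end`."""
    return (end - timedelta(hours=1), end)


def run_scheduled(window: Window,
                  aggregate: Callable[[Window], None],
                  failed: Set[Window]) -> bool:
    """Regular hourly run; remembers the window if aggregation fails."""
    try:
        aggregate(window)
        return True
    except Exception:
        failed.add(window)
        return False


def rerun_failed(aggregate: Callable[[Window], None],
                 failed: Set[Window]) -> List[Window]:
    """One-off re-run of every recorded failed window, oldest first."""
    recovered: List[Window] = []
    for window in sorted(failed):
        try:
            aggregate(window)
        except Exception:
            continue  # still failing: keep it recorded for a later attempt
        recovered.append(window)
    failed.difference_update(recovered)
    return recovered
```

In a real deployment the `failed` set would have to be persisted (for example in an index) so that it survives restarts; the in-memory set is only for illustration.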


From @JackRyanson

@agllno @dbbaughe sorry for the late reply

Yes, those use cases are correct. Say I have 10M text documents and an NLP service. I want to launch a job that, for each of the 10M documents, performs an NLP analysis and may create more documents in other indexes (e.g. update specific other indexes according to the entities that were found), as well as reindex the original document.

This would be a one-off and should start immediately. It should be a job (I can see its status, cancel it, etc.).

Would this make sense (or is it instead something for which you think one should use the basic reindex API in combination with pipeline operators)? Do you see yourself adding this capability anytime soon? It is highly needed in our project. Thanks.
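
To make the requirements above concrete (one-off, starts immediately, exposes status, can be cancelled), here is a small Python sketch. `OneOffJob` and `Status` are hypothetical and unrelated to the job scheduler's interfaces; `process` stands in for the per-document NLP call.

```python
import threading
from enum import Enum
from typing import Any, Callable, Iterable


class Status(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    CANCELLED = "cancelled"
    FAILED = "failed"


class OneOffJob:
    """Hypothetical one-off job with observable status and cooperative cancel."""

    def __init__(self, items: Iterable[Any], process: Callable[[Any], None]) -> None:
        self._items = list(items)
        self._process = process
        self._cancel = threading.Event()
        self.status = Status.PENDING
        self.processed = 0

    def cancel(self) -> None:
        """Request cancellation; takes effect before the next item."""
        self._cancel.set()

    def run(self) -> None:
        self.status = Status.RUNNING
        try:
            for item in self._items:
                if self._cancel.is_set():
                    self.status = Status.CANCELLED
                    return
                self._process(item)
                self.processed += 1
            self.status = Status.COMPLETED
        except Exception:
            self.status = Status.FAILED
            raise

    def start(self) -> threading.Thread:
        """Start immediately on a background thread and return that thread."""
        thread = threading.Thread(target=self.run, daemon=True)
        thread.start()
        return thread
```

Cancellation here is cooperative: it is checked between documents, so an in-flight `process` call always finishes before the job stops.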

Projects
Status: Backlog
Development

No branches or pull requests

1 participant