Simplify Parallel Executor #2031

Open
wants to merge 8 commits into
base: fix-streaming

Conversation

CyrusNuevoDia
Collaborator

No description provided.

@CyrusNuevoDia changed the base branch from main to fix-streaming on January 9, 2025 10:56
Comment on lines -71 to -106
def test_multi_thread_evaluate_call_cancelled(monkeypatch):
    # slow LM that sleeps for 1 second before returning the answer
    class SlowLM(DummyLM):
        def __call__(self, *args, **kwargs):
            import time

            time.sleep(1)
            return super().__call__(*args, **kwargs)

    dspy.settings.configure(lm=SlowLM({"What is 1+1?": {"answer": "2"}, "What is 2+2?": {"answer": "4"}}))

    devset = [new_example("What is 1+1?", "2"), new_example("What is 2+2?", "4")]
    program = Predict("question -> answer")
    assert program(question="What is 1+1?").answer == "2"

    # spawn a thread that will sleep for .1 seconds then send a KeyboardInterrupt
    def sleep_then_interrupt():
        import time

        time.sleep(0.1)
        import os

        os.kill(os.getpid(), signal.SIGINT)

    input_thread = threading.Thread(target=sleep_then_interrupt)
    input_thread.start()

    with pytest.raises(KeyboardInterrupt):
        ev = Evaluate(
            devset=devset,
            metric=answer_exact_match,
            display_progress=False,
            num_threads=2,
        )
        score = ev(program)
        assert score == 100.0
Collaborator Author

This effectively got moved to tests/utils/test_parallelizer.py

@okhat
Collaborator

okhat commented Jan 10, 2025

Thanks a lot, @CyrusNuevoDia! Is this ready to merge?

@CyrusNuevoDia
Collaborator Author

It works... can you think of any tricky testing scenarios?

else:
    logger.error(
        f"Error processing item {item}: {e}. Set `provide_traceback=True` to see the stack trace."
    )
with self._lock:
Collaborator

I would probably move this above the logging

    return self._execute_isolated_single_thread(wrapped_function, data)
else:
    return self._execute_multi_thread(wrapped_function, data)
exec_type = "multi" if self.num_threads != 1 else "single"
Collaborator

Could we do self._execute_single_thread if self.num_threads == 1 else self._execute_multi_thread?
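
A minimal sketch of the dispatch suggested above, selecting the bound method with a conditional expression rather than building a name from a string. The class `SketchExecutor` and the bodies of both methods are hypothetical stand-ins for illustration, not the PR's implementation:

```python
from concurrent.futures import ThreadPoolExecutor


class SketchExecutor:
    """Hypothetical stand-in for the executor discussed in this PR."""

    def __init__(self, num_threads):
        self.num_threads = num_threads

    def _execute_single_thread(self, function, data):
        # Run every item in the calling thread, in order.
        return [function(item) for item in data]

    def _execute_multi_thread(self, function, data):
        # pool.map returns results in input order.
        with ThreadPoolExecutor(max_workers=self.num_threads) as pool:
            return list(pool.map(function, data))

    def execute(self, function, data):
        # Pick the method object directly; no string lookup needed.
        run = self._execute_single_thread if self.num_threads == 1 else self._execute_multi_thread
        return run(function, data)


def double(x):
    return x * 2


single_result = SketchExecutor(num_threads=1).execute(double, [1, 2, 3])
multi_result = SketchExecutor(num_threads=4).execute(double, [1, 2, 3])
```

Choosing the method object keeps the call site greppable and lets a typo fail at attribute access instead of inside a `getattr` on a formatted string.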

        # If not in the main thread, skip setting signal handlers
        yield

def cancellable_function(parent_overrides, index_item):
Collaborator

If I recall correctly, this was really important. It seems to have gotten lost in the (otherwise extremely neat) refactor.

When launching multiple threads, we want each thread to inherit the parent thread's local overrides.
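
To illustrate why this matters, here is a small sketch using only the standard library. It assumes the overrides live in a `threading.local` object (the name `local_state` and the `"lm"` key are made up for this example): values set in the parent thread are simply not visible in a newly started thread unless they are passed along explicitly.

```python
import threading

# Hypothetical stand-in for a thread-local settings store.
local_state = threading.local()
local_state.overrides = {"lm": "parent-lm"}

seen_in_child = {}


def child():
    # A new thread starts with an empty threading.local namespace,
    # so the parent's attribute is missing here.
    seen_in_child["overrides"] = getattr(local_state, "overrides", None)


worker = threading.Thread(target=child)
worker.start()
worker.join()

parent_still_has_overrides = local_state.overrides == {"lm": "parent-lm"}
child_saw_nothing = seen_in_child["overrides"] is None
```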

def _execute_multi_thread(self, function, data):
    pbar = self._create_pbar(data)
    total_score = 0
    total_processed = 0
Collaborator

Somewhere here, we'd need to have something like:

from dspy.dsp.utils.settings import thread_local_overrides
parent_overrides = thread_local_overrides.overrides.copy()

and then we should pass parent_overrides in data, so that wrapped(item) can handle using the parent thread's overrides, not the new child's overrides.
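
One way this could look, as a rough sketch rather than the actual fix. To keep it self-contained, `thread_local_overrides` below is a plain `threading.local` standing in for the object imported from `dspy.dsp.utils.settings`, and `current_overrides` and `run_with_parent_overrides` are hypothetical helpers. Here the snapshot is captured in a closure instead of being packed into `data`, which has the same effect:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in (assumption) for dspy's thread_local_overrides.
thread_local_overrides = threading.local()


def current_overrides():
    # Threads that never set overrides see an empty dict.
    return getattr(thread_local_overrides, "overrides", {})


def run_with_parent_overrides(function, data, num_threads=2):
    # Snapshot the overrides in the launching (parent) thread.
    parent_overrides = current_overrides().copy()

    def wrapped(item):
        # Install the parent's snapshot in the worker, then restore
        # whatever the worker had before, since pool threads are reused.
        original = current_overrides()
        thread_local_overrides.overrides = parent_overrides.copy()
        try:
            return function(item)
        finally:
            thread_local_overrides.overrides = original

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(wrapped, data))


def read_lm(item):
    return (item, current_overrides().get("lm"))


thread_local_overrides.overrides = {"lm": "parent-lm"}
results = run_with_parent_overrides(read_lm, [1, 2, 3])
```

Copying the snapshot per item means a worker that mutates its overrides cannot leak changes into sibling items or back into the parent.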

@okhat
Collaborator

okhat commented Jan 12, 2025

Thanks a ton, @CyrusNuevoDia! I reviewed this carefully. Two issues:

  • Like in the comments above, we need to carefully handle parent overrides.
  • Will also need to update requirements.txt until Hanna moves us to purely pyproject.toml
