How can I use joblib with parallel backend as "dask" #3104
Unanswered
Sinha-Ujjawal
asked this question in
Q&A
Replies: 1 comment
-
I resolved the issue by doing this- from prefect import task, Flow
# from prefect.engine.executors import LocalDaskExecutor
from prefect.engine.executors import DaskExecutor
import dask.dataframe as dd
# from dask.distributed import Client
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
def getXy():
df = dd.read_csv("train-*.csv")
X = df[["feature_1", "feature_2"]]
y = df["label"]
return X, y
@task
def fit_linear_regression():
X, y = getXy()
model = LinearRegression()
# client = Client()
with joblib.parallel_backend("dask"):
model.fit(X, y)
@task
def fit_decision_tree():
X, y = getXy()
model = DecisionTreeRegressor()
# client = Client()
with joblib.parallel_backend("dask"):
model.fit(X, y)
with Flow("Train 2 models") as flow:
fit_linear_regression()
fit_decision_tree()
if __name__ == "__main__":
# state = flow.run(executor=LocalDaskExecutor())
state = flow.run(executor=DaskExecutor()) Didn't used Client(), didn't used LocalDaskExecutor(), used DaskExecutor() instead. It resolved the issue. But I still want to understand why this ran, but the former didn't |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Example code-
When I ran this, the first time the flow exited unexpectedly, but when I ran it the second time, It's stuck.
I kind of understand the issue, can anyone help me with properly using dask backend for ml models?
Beta Was this translation helpful? Give feedback.
All reactions