What is your question?
I'm trying to run NYCTaxi-E2E and noticed very slow CSV processing.
The part below takes 2min 34s on 16 V100s. Is that normal?
%%time
X_train = taxi_df.query('day < 25').persist()
# create a Y_train ddf with just the target variable
Y_train = X_train[['fare_amount']].persist()
# drop the target variable from the training ddf
X_train = X_train[X_train.columns.difference(['fare_amount'])]
# this won't return until all data is in GPU memory
done = wait([X_train, Y_train])
Hey @BlueFelix, this part may be slow because it's downloading ~300GB of data into GPU memory, and bandwidth/speed can vary. It takes some time for me too; I've found that getting the data is sometimes the longest part of a notebook. :) Does this help?
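For a rough sanity check, here is a back-of-the-envelope sketch of the throughput those numbers imply. It assumes the full ~300GB is moved during the timed cell and is spread evenly across the 16 GPUs, neither of which is confirmed in this thread; the function name is just illustrative.

```python
def throughput_gb_per_s(data_gb: float, elapsed_s: float) -> float:
    """Average throughput in GB/s for a transfer of data_gb taking elapsed_s."""
    return data_gb / elapsed_s


# Figures quoted in this thread; the even split across GPUs is an assumption.
data_gb = 300.0          # "~300GB" from the reply above
elapsed_s = 2 * 60 + 34  # "2min 34s" from the question
num_gpus = 16            # "16 V100"

total = throughput_gb_per_s(data_gb, elapsed_s)
per_gpu = total / num_gpus

print(f"elapsed: {elapsed_s} s")
print(f"aggregate: {total:.2f} GB/s")
print(f"per GPU: {per_gpu * 1000:.0f} MB/s")
```

Under those assumptions the cell averages about 1.95 GB/s overall, or roughly 120 MB/s per GPU, which looks more like a download or storage limit than a GPU compute limit.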