I generated a parquet file with tokenized data using the `tokenize_data.py` script in the `together-python` repo, following the steps here: https://docs.together.ai/docs/fine-tuning-data-preparation#tokenized-data

I then tried to upload this dataset to Together:

```ts
import { upload } from 'together-ai/lib/upload'

await upload('dataset.parquet')
```

This fails with:

```
failed to read parquet file dataset.parquet
```

With a bit of added logging in `together-typescript` I found the more specific error message:

```
invalid parquet version
```

This comes from inside the `parquetjs` library, which has been woefully unmaintained for 5+ years and does not support the majority of modern parquet files. #102 also ran into a parquet parsing issue. It might be worth using a different parquet library.
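For anyone who wants to check whether a given file is affected, here is a rough sketch that reads the format version out of a parquet footer without any parquet library. It relies on two things from the Parquet format: the file ends with a 4-byte little-endian footer length followed by the `PAR1` magic, and the footer is a Thrift compact-protocol `FileMetaData` struct whose first field is `version`. The helper name `readParquetFooterVersion` is made up for illustration and is not part of `together-ai` or `parquetjs`. I am also assuming, without having verified it against every release, that `parquetjs` raises `invalid parquet version` when this field is anything other than 1.

```typescript
// Hypothetical diagnostic helper (not part of together-ai or parquetjs).
// Parquet file tail: <FileMetaData> <4-byte LE footer length> "PAR1".
function readParquetFooterVersion(bytes: Uint8Array): number {
  const n = bytes.length
  if (n < 12) throw new Error('file too small to be a parquet file')

  // The last four bytes must be the magic string "PAR1".
  const magic = String.fromCharCode(bytes[n - 4], bytes[n - 3], bytes[n - 2], bytes[n - 1])
  if (magic !== 'PAR1') throw new Error('unexpected trailing magic: ' + magic)

  // The four bytes before the magic hold the footer length (little-endian).
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  const footerLength = view.getUint32(n - 8, true)
  const start = n - 8 - footerLength
  if (start < 4) throw new Error('footer length out of range')

  // Assumption: the writer emits field 1 (`version`, i32) first, so the
  // compact-protocol field header is 0x15 (id delta 1, type i32).
  if (bytes[start] !== 0x15) throw new Error('footer does not start with the version field')

  // Decode the zigzag varint that follows the field header.
  let raw = 0
  let shift = 0
  let pos = start + 1
  while (true) {
    if (pos >= n - 8) throw new Error('truncated varint in footer')
    const b = bytes[pos++]
    raw |= (b & 0x7f) << shift
    if ((b & 0x80) === 0) break
    shift += 7
  }
  return (raw >>> 1) ^ -(raw & 1)
}
```

In Node this could be called as `readParquetFooterVersion(readFileSync('dataset.parquet'))`, with `readFileSync` imported from `node:fs`. If it returns something other than 1, that would be consistent with the `invalid parquet version` error above, under the assumption stated.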