Column lengths are not equal from polars when reading parquet #14

Open
SteampunkIslande opened this issue Mar 29, 2024 · 0 comments

Using a real-world VCF, I hit this error when reading the generated parquet file with polars:

import pyvcf2parquet as pv
import polars as pl
pv.convert_vcf("realworld_data.vcf.gz","test.parquet")

In [5]: pl.scan_parquet("test.parquet").head().collect()
Out[5]: thread 'ipython' panicked at crates/polars-core/src/fmt.rs:513:13:
The column lengths in the DataFrame are not equal.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---------------------------------------------------------------------------
PanicException                            Traceback (most recent call last)
<ipython-input-5-5f2079c139cc> in ?()
----> 1 pl.scan_parquet("test.parquet").head().collect()

/media/charles/SANDISK/TousLesVCF_PPI/touslesppi/venv/lib/python3.10/site-packages/decorator.py in ?(*args, **kw)
    229         def fun(*args, **kw):
    230             if not kwsyntax:
    231                 args, kw = fix(args, kw, sig)
--> 232             return caller(func, *(extras + args), **kw)

/media/charles/SANDISK/TousLesVCF_PPI/touslesppi/venv/lib/python3.10/site-packages/polars/dataframe/frame.py in ?(self)
   1425     def __repr__(self) -> str:
-> 1426         return self.__str__()

/media/charles/SANDISK/TousLesVCF_PPI/touslesppi/venv/lib/python3.10/site-packages/polars/dataframe/frame.py in ?(self)
   1422     def __str__(self) -> str:
-> 1423         return self._df.as_str()

PanicException: The column lengths in the DataFrame are not equal.

However, duckdb reads the same parquet file without any problem.
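
To narrow down which column is affected, one option is to read each top-level column on its own and compare row counts. The sketch below is only a diagnostic idea, not something confirmed against this file: it assumes pyarrow is installed and can open the file, and the helper names `column_lengths` and `find_length_outliers` are made up for illustration.

```python
from collections import Counter


def find_length_outliers(lengths):
    """Return the columns whose length differs from the most common length.

    `lengths` maps column name -> number of rows read for that column.
    """
    if not lengths:
        return {}
    # Treat the most frequent length as the expected one.
    expected, _ = Counter(lengths.values()).most_common(1)[0]
    return {name: n for name, n in lengths.items() if n != expected}


def column_lengths(path):
    """Read every top-level column separately and record its row count.

    Assumption: pyarrow is available and is able to read the file at all.
    """
    import pyarrow.parquet as pq  # imported lazily; only needed for this helper

    parquet_file = pq.ParquetFile(path)
    return {
        name: parquet_file.read(columns=[name]).num_rows
        for name in parquet_file.schema_arrow.names
    }
```

Calling `find_length_outliers(column_lengths("test.parquet"))` would then list any column whose row count disagrees with the rest. An empty result would suggest the file itself is consistent and the problem is on the reader side.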
