You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have a stage with a few CSV files in it. I wanted to create a Snowpark dataframe that read the CSV files and placed the filename as a column in the dataframe. I was then trying to use the filename column downstream but I got a lot of strange errors.
However, if I look at the columns in the Snowpark dataframe, the "FILENAME" column doesn't exist:
csv_spdf.columns# ['TIME', 'CH1', 'CH2']
If I include the FILENAME in the schema section (SPT.StructField("FILENAME", SPT.StringType())), then it does show up in the csv_spdf.columns. However, when I try to convert it to a pandas dataframe I get an error that there are duplicate columns: SnowparkFetchDataException: (1406): Failed to fetch a Pandas Dataframe. The error is: Found non-unique column index
Some other strange things that happen here is that if I try to group by the FILENAME column, it can execute the query and return the results:
# This works and I get the expected pandas dataframe:
(
csv_spdf
.group_by(F.col("FILENAME"))
.count()
).to_pandas()
However, if I try to ask for the columns that are in the above dataframe expression, I get an error:
(
csv_spdf
.group_by(F.col("FILENAME"))
.count()
).columns# Returns SQL compilation error: error line 1 at position 7 invalid identifier 'FILENAME'
The text was updated successfully, but these errors were encountered:
Please answer these questions before submitting your issue. Thanks!
3.9.12
Linux-5.10.197-186.748.amzn2.x86_64-x86_64-with-glibc2.31
pip freeze
)?absl-py==1.1.0
acryl-datahub==0.12.1.0
adagio==0.2.4
adjustText==0.8
aenum==3.1.11
aeppl==0.0.34
aesara==2.7.9
affine==2.3.0
agate==1.7.1
aiobotocore==2.5.4
aiohttp==3.8.5
aioitertools==0.11.0
aiosignal==1.2.0
alembic==1.11.3
altair==5.1.1
altair-transform==0.2.0
angel-cd==1.0.3
ansi2html==1.8.0
antlr4-python3-runtime==4.11.1
anyio==3.6.2
apify-client==1.6.1
apify-shared==1.1.0
appdirs==1.4.4
argon2-cffi==21.1.0
argparse==1.4.0
arrow==1.2.2
arviz==0.12.1
asn1crypto==1.4.0
astor==0.8.1
astunparse==1.6.3
async-timeout==4.0.2
atomicwrites==1.4.1
atpublic==4.0
attrs==23.1.0
Authlib==1.2.1
autograd==1.5
autograd-gamma==0.5.0
autopep8==1.5.7
avro==1.11.3
avro-gen3==0.7.11
awswrangler==3.3.0
Babel==2.12.1
backcall==0.2.0
backoff==2.1.2
bayesian-optimization==1.2.0
bcrypt==3.2.0
beautifulsoup4==4.10.0
bertopic==0.15.0
bidict==0.22.1
bigframes==0.12.0
bimlpa==0.1.2
biopython==1.80
bitarray==2.5.1
black==22.12.0
bleach==4.1.0
blis==0.7.10
blosc2==2.0.0
bokeh==2.4.3
boto==2.49.0
boto3==1.28.17
botocore==1.31.17
boxsdk==3.9.2
branca==0.4.2
build==0.10.0
cached-property==1.5.2
cachetools==5.3.0
catalogue==2.0.6
catboost==1.2
cattrs==1.10.0
cdlib==0.2.6
certifi==2021.10.8
cffi==1.14.6
cftime==1.5.1
chainladder==0.8.18
charset-normalizer==2.0.6
chinese-whispers==0.8.0
chroma-hnswlib==0.7.1
chromadb==0.4.1
click==8.0.4
click-default-group==1.2.4
click-plugins==1.1.1
click-spinner==0.1.10
clickhouse-driver==0.2.3
cligj==0.7.2
clikit==0.6.2
cloudpickle==2.0.0
cma==3.2.2
cmdstanpy==1.0.4
colorama==0.4.6
colorcet==3.0.0
coloredlogs==15.0.1
colour==0.1.5
comm==0.1.4
confection==0.1.1
cons==0.4.5
contextily==1.1.0
convertdate==2.3.2
convoys==0.2.1
crashtest==0.3.1
cryptography==40.0.2
cvxpy==1.3.1
cycler==0.10.0
cymem==2.0.5
Cython==0.29.24
cytoolz==0.11.2
dash==2.9.2
dash-core-components==2.0.0
dash-html-components==2.0.0
dash-table==5.0.0
dask==2022.4.0
databricks-cli==0.17.7
dataclasses-json==0.5.9
datadog==0.35.0
datatable==1.0.0
db-dtypes==1.1.1
dbt-core==1.7.3
dbt-extractor==0.5.1
dbt-semantic-interfaces==0.4.2
dbt-snowflake==1.7.1
debugpy==1.6.7.post1
decorator==5.1.0
deepdish==0.3.7
defusedxml==0.7.1
demon==2.0.6
Deprecated==1.2.13
descartes==1.1.0
df2gspread==1.0.4
dill==0.3.4
distro==1.8.0
dm-tree==0.1.6
dnspython==2.2.0
docker==6.1.3
docutils==0.16
dropbox==11.36.2
duckdb==0.8.1
dulwich==0.21.7
dynetx==0.3.1
easychart==0.1.16
easypost==5.1.3
easytree==0.1.12
ecos==2.0.12
emcee==3.1.2
emoji==2.4.0
entrypoints==0.3
ephem==4.1
et-xmlfile==1.1.0
eth-abi==4.1.0
eth-account==0.9.0
eth-hash==0.5.2
eth-keyfile==0.6.1
eth-keys==0.4.0
eth-rlp==0.3.0
eth-typing==3.4.0
eth-utils==2.2.0
etils==0.9.0
etuples==0.3.5
eva-lcd==0.1.1
expandvars==0.12.0
fastapi==0.99.1
fastdiff==0.3.0
fastprogress==1.0.0
fbprophet @ git+https://github.com/hex-inc/prophet.git@f7cd9b9fa71f5a941421f540cb10b751e59ae2ba
filelock==3.8.0
Fiona==1.8.20
flake8==4.0.1
Flask==2.1.3
flatbuffers==2.0.7
folium==0.12.1.post1
fonttools==4.34.4
formulaic==0.5.2
fredapi==0.5.0
frozenlist==1.3.0
fs==2.4.16
fsspec==2023.9.0
fugue==0.8.3
fugue-sql-antlr==0.1.6
funcy==1.16
future==0.18.2
fuzzywuzzy==0.18.0
gast==0.4.0
gcsfs==2023.9.0
gensim==4.3.2
geographiclib==1.52
geopandas==0.12.2
geopy==2.2.0
gitdb==4.0.9
GitPython==3.1.27
google-api-core==2.11.0
google-api-python-client==1.6.7
google-api-support==0.1.3
google-auth==2.17.1
google-auth-httplib2==0.1.0
google-auth-oauthlib==1.0.0
google-cloud-aiplatform==1.36.0
google-cloud-appengine-logging==1.1.1
google-cloud-audit-log==0.2.0
google-cloud-bigquery==3.13.0
google-cloud-bigquery-connection==1.13.2
google-cloud-bigquery-storage==2.19.1
google-cloud-billing==1.5.1
google-cloud-core==2.1.0
google-cloud-functions==1.13.3
google-cloud-iam==2.12.2
google-cloud-logging==3.0.0
google-cloud-resource-manager==1.10.4
google-cloud-storage==2.1.0
google-crc32c==1.3.0
google-pasta==0.2.0
google-resumable-media==2.0.3
googleapis-common-protos==1.59.0
googlemaps==4.5.3
gql==3.4.0
graphql-core==3.2.0
graphviz==0.20.1
greenlet==1.1.2
gremlinpython==3.6.4
grpc-google-iam-v1==0.12.6
grpcio==1.50.0
grpcio-status==1.41.0
gspread==5.7.1
gspread-dataframe==3.3.0
gspread-pandas==3.2.2
gunicorn==21.2.0
h11==0.14.0
h3==3.7.6
h5py==3.1.0
hdbscan==0.8.33
Editable install with no version control (hex-api==1.0.0)
-e /python-sdk
Editable install with no version control (hex-data-service==0.1.0)
-e /data-service-python
Editable install with no version control (hex-lazy-installer==0.1.0)
-e /hex-lazy-installer
Editable install with no version control (hex-packages==0.1.0)
-e /python-kernel-packages
Editable install with no version control (hex-shared==0.1.0)
-e /python-shared
hexbytes==0.2.2
Editable install with no version control (hextoolkit==0.1.0)
-e /python-api
hijri-converter==2.2.2
holidays==0.14.2
holoviews==1.15.0
httpcore==1.0.2
httplib2==0.20.1
httpstan==4.8.1
httptools==0.6.0
httpx==0.25.2
hubspot-api-client==8.1.1
huggingface-hub==0.11.1
humanfriendly==10.0
humanize==4.8.0
hyperopt==0.2.7
ibis-framework==6.2.0
idna==3.2
igraph==0.9.11
ijson==3.2.3
imageio==2.9.0
imbalanced-learn==0.9.1
importlib-metadata==6.8.0
importlib-resources==5.10.0
interface-meta==1.3.0
ipykernel==6.25.1
ipython==7.32.0
ipython-genutils==0.2.0
ipywidgets==7.8.1
isodate==0.6.1
itsdangerous==2.1.2
jaraco.classes==3.3.0
jax==0.3.23
jaxlib==0.3.22
jedi==0.17.2
jeepney==0.8.0
Jinja2==3.1.2
jinjasql==0.1.8
jmespath==0.10.0
joblib==1.2.0
jsonpatch==1.33
jsonpath-ng==1.5.3
jsonpointer==2.4
jsonref==1.1.0
jsonschema==4.0.1
jupyter-client==7.0.6
jupyter-dash==0.4.2
jupyter_core==5.3.1
jupyterlab-pygments==0.1.2
jupyterlab-widgets==1.0.2
kaleido==0.2.1
keplergl==0.1.2
keras==2.12.0
Keras-Applications==1.0.8
keyring==24.3.0
kiwisolver==1.3.2
korean-lunar-calendar==0.2.1
langchain==0.0.347
langchain-core==0.0.11
langcodes==3.3.0
langsmith==0.0.69
leather==0.3.4
libclang==14.0.1
lifelines==0.27.4
lightfm==1.17
lightgbm==3.3.5
littleutils==0.2.2
llvmlite==0.40.1
locket==0.2.1
Logbook==1.5.3
logical-unification==0.4.5
loguru==0.6.0
looker-sdk==22.2.1
lru-dict==1.1.7
LunarCalendar==0.0.9
lxml==4.8.0
Mako==1.2.4
Markdown==3.3.4
markdown-it-py==2.2.0
markov-clustering==0.0.6.dev0
MarkupSafe==2.0.1
marshmallow==3.17.0
marshmallow-enum==1.5.1
mashumaro==3.11
matplotlib==3.5.2
matplotlib-inline==0.1.3
matplotlib-venn==0.11.7
mccabe==0.6.1
mdurl==0.1.2
mercantile==1.2.1
miniKanren==1.0.3
minimal-snowplow-tracker==0.0.2
mistune==0.8.4
mixpanel==4.10.0
mizani==0.7.4
mlflow==2.6.0
modelbit==0.29.0
monotonic==1.6
more-itertools==8.10.0
mpmath==1.3.0
msgpack==1.0.5
multidict==6.0.2
multipledispatch==0.6.0
multiprocess==0.70.12.2
multitasking==0.0.11
munch==2.5.0
murmurhash==1.0.5
mypy==0.961
mypy-extensions==0.4.3
natsort==8.4.0
nbclient==0.5.4
nbconvert==6.2.0
nbformat==5.1.3
nest-asyncio==1.5.1
netCDF4==1.5.7
networkx==2.6.3
nevergrad==0.5.0
nf1==0.0.4
nltk==3.7
notebook==6.4.12
numba==0.57.1
numexpr==2.8.5
numpy==1.23.4
numpy-financial==1.0.0
numpyro==0.10.1
oauth2client==4.1.3
oauthlib==3.1.1
onnxruntime==1.15.1
openai==1.3.7
openapi-schema-pydantic==1.2.4
opencv-python==4.8.0.74
openpyxl==3.0.9
opensearch-py==1.1.0
opt-einsum==3.3.0
optbinning==0.17.3
orbit-ml==1.1.4.2
orjson==3.8.9
ortools==9.4.1874
oscrypto==1.2.1
osqp==0.6.2.post9
outdated==0.2.2
overrides==7.4.0
packaging==21.3
palettable==3.3.0
pandas==1.5.3
pandas-flavor==0.6.0
pandas-gbq==0.19.1
pandasql==0.7.3
pandocfilters==1.5.0
panel==0.13.1
param==1.12.2
paramiko==2.10.2
parsedatetime==2.6
parsimonious==0.9.0
parso==0.7.1
parsy==2.1
partd==1.2.0
pastel==0.2.1
pathlib-mate==1.0.1
pathspec==0.9.0
pathy==0.10.2
patsy==0.5.2
pexpect==4.8.0
pg8000==1.29.2
pickleshare==0.7.5
Pillow==9.1.1
pinecone-client==2.2.4
pingouin==0.5.3
pip==23.3.1
pkginfo==1.9.6
platformdirs==3.0.0
plotly==5.13.0
plotly-resampler==0.8.3.2
plotnine==0.9.0
pluggy==0.13.1
ply==3.11
polars==0.19.2
pooch==1.6.0
posthog==3.0.2
preshed==3.0.5
presto-python-client==0.7.0
progressbar2==4.0.0
prometheus-client==0.11.0
prompt-toolkit==3.0.20
prophet==1.1
proto-plus==1.22.1
protobuf==4.24.1
psutil==5.8.0
psycopg2==2.9.1
ptyprocess==0.7.0
PuLP==2.6.0
pulsar-client==3.2.0
py==1.10.0
py-cpuinfo==9.0.0
py4j==0.10.9.7
pyarrow==10.0.1
pyarrow-hotfix==0.6
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyclustering==0.10.1.2
pycodestyle==2.8.0
pycparser==2.20
pycryptodome==3.14.1
pycryptodomex==3.11.0
pyct==0.4.8
pydantic==1.10.12
pydata-google-auth==1.8.2
pydeck==0.7.1
pydot==1.4.2
pyerf==1.0.1
pyflakes==2.4.0
Pygments==2.10.0
pygraphviz==1.10
pygsheets==2.0.5
pyjanitor==0.25.0
PyJWT==2.4.0
pyLDAvis==3.2.2
pylev==1.4.0
pymc==4.1.5
pymc3==3.11.4
PyMeeus==0.5.11
pymer4==0.8.0
pymongo==4.0.1
pymsteams==0.2.2
PyMySQL==1.0.2
PyNaCl==1.5.0
pynmeagps==1.0.20
pynndescent==0.5.4
pyodbc==4.0.32
pyOpenSSL==23.2.0
pyparsing==2.4.7
PyPika==0.48.9
pyproj==3.6.1
pyproject_hooks==1.0.0
pyro-api==0.1.2
pyro-ppl==1.8.4
pyrsistent==0.18.0
pysftp==0.2.9
pyshp==2.1.3
pysimdjson==3.2.0
pystan==3.5.0
pytest==5.4.3
python-box==7.0.1
python-dateutil==2.8.2
python-dotenv==1.0.0
python-igraph==0.9.11
python-json-logger==2.0.7
Editable install with no version control (python-kernel-startup==0.1.0)
-e /python-kernel-startup
python-Levenshtein==0.12.2
python-louvain==0.16
python-slugify==8.0.1
Editable install with no version control (python-universal-dataframe==0.1.0)
-e /python-universal-dataframe
python-utils==3.3.3
pytimeparse==1.1.8
pytorch-ignite==0.4.6
pytrends==4.7.3
pytz==2023.3.post1
pyunormalize==15.0.0
pyviz-comms==2.2.0
PyWavelets==1.1.1
PyYAML==6.0.1
pyzmq==22.3.0
qdldl==0.1.7
qpd==0.4.0
querystring-parser==1.2.4
rasterio==1.2.9
redmail==0.4.0
redshift-connector==2.0.915
regex==2023.5.5
requests==2.28.0
requests-aws4auth==1.2.3
requests-file==1.5.1
requests-oauthlib==1.3.0
requests-toolbelt==0.9.1
retrying==1.3.4
rich==13.2.0
rlp==3.0.0
ropwr==1.0.0
rpy2==3.5.13
rsa==4.7.2
Rtree==0.9.7
ruamel.yaml==0.17.26
ruamel.yaml.clib==0.2.7
s3fs==2023.9.0
s3transfer==0.6.0
SALib==1.4.7
scikit-image==0.18.3
scikit-learn==1.2.2
scikits.bootstrap==1.1.0
scipy==1.10.1
scramp==1.4.1
scs==3.2.3
seaborn==0.12.1
SecretStorage==3.3.3
selenium==3.141.0
semver==2.13.0
Send2Trash==1.8.0
sentence-transformers==2.2.2
sentencepiece==0.1.97
sentry-sdk==1.39.0
setuptools==68.2.2
setuptools-git==1.2
shap==0.41.0
shapely==2.0.1
sidetable==0.9.0
simple-salesforce==1.11.4
simplejson==3.18.1
six==1.15.0
sklearn-pandas==1.8.0
sktime==0.16.1
slack-sdk==3.18.3
slicer==0.0.7
smart-open==5.2.1
smmap==5.0.0
snapshottest==0.6.0
sniffio==1.3.0
snowflake-connector-python==3.1.0
snowflake-ml-python==1.0.7
snowflake-snowpark-python==1.6.1
snowflake-sqlalchemy==1.4.7
snuggs==1.4.7
sortedcontainers==2.4.0
soupsieve==2.2.1
spacy==3.6.0
spacy-legacy==3.0.12
spacy-loggers==1.0.4
sparse==0.14.0
splunk-sdk==1.6.18
sql-metadata==2.5.0
SQLAlchemy==1.4.25
sqlalchemy-redshift==0.7.9
sqlalchemy2-stubs==0.0.2a32
sqlglot==11.4.5
sqlmodel==0.0.8
sqlparse==0.4.2
srsly==2.4.7
starlette==0.27.0
statsforecast==1.5.0
statsmodels==0.14.0
stone==3.2.1
stripe==2.60.0
styleframe==4.1
sympy==1.12
tableauserverclient==0.24
tables==3.8.0
tabulate==0.8.9
tenacity==8.2.2
tensorboard==2.12.1
tensorboard-data-server==0.7.0
tensorboard-plugin-wit==1.8.0
tensorflow==2.12.0
tensorflow-decision-forests==1.3.0
tensorflow-estimator==2.12.0
tensorflow-hub==0.14.0
tensorflow-io-gcs-filesystem==0.26.0
tensorflow-probability==0.19.0
termcolor==1.1.0
terminado==0.12.1
testpath==0.5.0
text-unidecode==1.3
textblob==0.15.3
texttable==1.6.4
tfcausalimpact==0.0.13
Theano-PyMC==1.1.2
thinc==8.1.10
threadpoolctl==3.0.0
thresholdclustering==1.1
tifffile==2021.8.30
tiktoken==0.5.2
tokenizers==0.13.2
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.1
toolz==0.11.1
torch==1.12.1
torchvision==0.13.1
tornado==6.3.3
tqdm==4.65.0
trace-updater==0.0.9
traitlets==5.9.0
traittypes==0.2.1
transformers==4.25.1
triad==0.8.4
typed-ast==1.5.1
typer==0.4.0
types-cachetools==5.3.0.5
types-pkg-resources==0.1.3
types-PyYAML==6.0.12
types-requests==2.28.2
types-urllib3==1.26.15
typing-inspect==0.9.0
typing_extensions==4.7.1
tzlocal==3.0
ua-parser==0.10.0
ujson==5.4.0
umap-learn==0.5.1
UpSetPlot==0.6.1
uritemplate==3.0.1
urllib3==1.26.16
user-agents==2.2.0
uszipcode==0.2.6
uvicorn==0.23.2
uvloop==0.17.0
validators==0.20.0
vegafusion==1.5.0
vegafusion-python-embed==1.5.0
vl-convert-python==0.13.1
wasabi==1.1.2
wasmer==1.1.0
wasmer_compiler_cranelift==1.1.0
watchfiles==0.19.0
wcwidth==0.2.5
web3==6.8.0
webargs==8.2.0
webencodings==0.5.1
websocket-client==1.6.1
websockets==11.0.3
Werkzeug==2.0.2
wheel==0.38.4
widgetsnbextension==3.6.6
wordcloud==1.8.1
wrapt==1.12.1
wurlitzer==3.0.2
xarray==0.19.0
xarray-einstats==0.3.0
xgboost==1.7.3
xlrd==2.0.1
xxhash==3.4.1
yarl==1.7.2
yfinance==0.1.87
youtube-data-api==0.0.21
zipp==3.10.0
zstandard==0.21.0
I have a stage with a few CSV files in it. I wanted to create a Snowpark dataframe that read the CSV files and placed the filename as a column in the dataframe. I was then trying to use the filename column downstream but I got a lot of strange errors.
Here's what I started with:
When I convert it to a pandas dataframe and look at the columns, I see everything I expect:
However, if I look at the columns in the Snowpark dataframe, the "FILENAME" column doesn't exist:
If I include the FILENAME in the schema section (
SPT.StructField("FILENAME", SPT.StringType())
), then it does show up in thecsv_spdf.columns
. However, when I try to convert it to a pandas dataframe I get an error that there are duplicate columns:SnowparkFetchDataException: (1406): Failed to fetch a Pandas Dataframe. The error is: Found non-unique column index
Some other strange things that happen here is that if I try to group by the FILENAME column, it can execute the query and return the results:
However, if I try to ask for the columns that are in the above dataframe expression, I get an error:
The text was updated successfully, but these errors were encountered: