Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function only reading filtered rows. How do I read all rows from a sheet? #246

Open
abhi250372 opened this issue Dec 15, 2022 · 13 comments
Open

Comments

@abhi250372
Copy link

Here is my code -
filepath = os.path.join(FolderPath, FileName)
a = load_workbook(filepath, read_only=True)

def read_excel_file(path: str, sheet_index: int) -> pd.DataFrame:
buffer = StringIO()
Xlsx2csv(path, outputencoding="utf8").convert(buffer,sheetid=sheet_index)
buffer.seek(0)
df = pd.read_csv(buffer)
return df

dfinal=read_excel_file(filepath,a.index(a.get_sheet_by_name('Sheetname'))+1)

@abhi250372
Copy link
Author

I am using XLSX2CSV because it is faster in reading excel files as compared to read_excel() from openpyxl. But using XLSX2CSV only reads filtered rows of a sheet whereas read_excel reads all rows regardless if the sheet is filtered or not.

@tongngo
Copy link

tongngo commented Dec 21, 2022

Hello, I have just experimented it also. Is there a way/option that XLSX2CSV would read all rows of a sheet ?

@serene-dev
Copy link

what is a filtered row? can you share a sample xlsx file?

@tongngo
Copy link

tongngo commented Dec 21, 2022

In the attached file, ColumnA is filtering rows only showing 'A' value in ColumnA (not showing the rows beginning with 'B' value)
The xlsx2csv would only convert the visible rows, not all rows as Excel would do.
Samplefile.xlsx

@abhi250372
Copy link
Author

abhi250372 commented Dec 21, 2022 via email

@abhi250372
Copy link
Author

Here is sample file Car.xlsx

@abhi250372
Copy link
Author

Looks like dilshod has fixed the issue. I downloaded the xlsx2csvc.py file in the repository and replaced the old file in my local machine with this new one. I added the skip_hidden_rows as False in my function mentioned above and now it works. However, I need to know when this version will be available via pip install command so that all the users can get the updated version instead of manually downloading this file. Can someone please help with that? def read_excel_file(path: str, sheet_index: int) -> pd.DataFrame:
buffer = StringIO()
Xlsx2csv(path, outputencoding="utf-8",skip_hidden_rows = False).convert(buffer,sheetid = sheet_index)
buffer.seek(0)
df = pd.read_csv(buffer)
return df

@tongngo
Copy link

tongngo commented Dec 26, 2022

Looks like dilshod has fixed the issue. I downloaded the xlsx2csvc.py file in the repository and replaced the old file in my local machine with this new one. I added the skip_hidden_rows as False in my function mentioned above and now it works. However, I need to know when this version will be available via pip install command so that all the users can get the updated version instead of manually downloading this file. Can someone please help with that? def read_excel_file(path: str, sheet_index: int) -> pd.DataFrame: buffer = StringIO() Xlsx2csv(path, outputencoding="utf-8",skip_hidden_rows = False).convert(buffer,sheetid = sheet_index) buffer.seek(0) df = pd.read_csv(buffer) return df

Thank you again @dilshod !!

@abhi250372
Copy link
Author

@dilshod When will the latest version be released so that I can do pip install?

@abhi250372
Copy link
Author

@dilshod is there an update on this?

@hendrikschafer
Copy link

It is available via pip now, this issue can be closed!

@mcrumiller
Copy link

mcrumiller commented Jun 27, 2023

I think skip_hidden_rows=False should be the default. Nobody expects filtered rows to be removed when reading an Excel.

@tongngo
Copy link

tongngo commented Jun 27, 2023

I think skip_hidden_rows=False should be the default. Nobody expects filtered rows to be removed when reading an Excel.

Yes, also, this is the default beahvior of excel when saviong to csv file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants