BUG: pd.options.display.float_format did not follow left side or before decimal places format #59876

Open
2 of 3 tasks
yasirroni opened this issue Sep 23, 2024 · 11 comments
Labels: Bug, good first issue, Output-Formatting (__repr__ of pandas objects, to_string)


@yasirroni

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# Set the global float format
pd.options.display.float_format = '{:6.3f}'.format

# Example DataFrame
df = pd.DataFrame({
    'A': [123.456, 789.1011],
    'B': [2.71828, 3.14159]
})

df

Issue Description

pd.options.display.float_format does not honor the width part of the format specification, i.e. the space reserved to the left of the decimal point.

Expected Behavior

The displayed values should also follow the width part of the format, i.e. the padding before the decimal point.
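For readers unfamiliar with the format specification, the sketch below shows what the width and precision fields do on their own, outside pandas. It uses only Python's built-in str.format; the variable names are illustrative and not part of the report.

```python
# In '{:6.3f}', 6 is the minimum total width and 3 is the number of
# digits after the decimal point. Width is a minimum, not a maximum.
fmt = "{:6.3f}".format

values = [123.456, 789.1011, 2.71828, 3.14159]
rendered = [fmt(v) for v in values]

for text in rendered:
    # repr() makes the leading padding visible
    print(repr(text), len(text))

# A wider field pads every value to the same length
wide = "{:12.3f}".format(2.71828)
print(repr(wide), len(wide))
```

Note that with a width of 6, values such as 123.456 already need 7 characters, so only the shorter values in column B receive any padding. A larger width such as 12 makes the effect easier to see.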

Installed Versions

INSTALLED VERSIONS

commit : d9cdd2e
python : 3.10.11.final.0
python-bits : 64
OS : Darwin
OS-release : 23.6.0
Version : Darwin Kernel Version 23.6.0: Mon Jul 29 21:14:21 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T8103
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 69.1.0
pip : 24.0
Cython : None
pytest : 8.1.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.23.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.4
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : None
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.11.4
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

@yasirroni added the Bug and Needs Triage labels on Sep 23, 2024
@rhshadrach
Member

Thanks for the report. Can you include the output you currently get and the output you expect to get?

@rhshadrach added the Output-Formatting and Needs Info labels and removed the Needs Triage label on Sep 24, 2024
@yasirroni
Author

Current output:

image

Expected output:

All columns should have the same width, based on the space allocated by the format.

image

Example code to generate my output:

import pandas as pd

# Set the global float format
fmt = '{:6.3f}'.format
pd.options.display.float_format = fmt

df = pd.DataFrame({
    'A': [123.456, 789.1011],
    'B': [2.71828, 3.14159]
})

for index, row in df.iterrows():
    formatted_row = [fmt(value) for value in row]  # Format each value in the row
    print(f"Col {index}: {formatted_row}")

@rhshadrach
Member

Thanks for the information. It appears to me you are using something akin to Jupyter's display, which is a different method than printing.

pd.options.display.float_format = '{:12.3f}'.format

# Example DataFrame
df = pd.DataFrame({
    'A': [123.456, 789.1011],
    'B': [2.71828, 3.14159]
})

print(df)
#              A            B
# 0      123.456        2.718
# 1      789.101        3.142

You can see the option is having the expected impact on printed DataFrames. It's not clear to me whether this is due to a limitation on the Jupyter (or other notebooks) side. Further investigations are welcome!
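For anyone reproducing this, a scoped variant of the snippet above may be convenient. This is only a sketch: it uses pd.option_context so the global option is left untouched, and it calls df.to_string() in place of print(df) so the text can be inspected. Both choices are substitutions of mine, not something proposed in the thread.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [123.456, 789.1011],
    "B": [2.71828, 3.14159],
})

# The option is only active inside the with block
with pd.option_context("display.float_format", "{:12.3f}".format):
    text = df.to_string()

print(text)

# Each float cell carries the padding requested by the 12-character width
padded_a = "{:12.3f}".format(123.456)
padded_b = "{:12.3f}".format(2.71828)
print(padded_a in text, padded_b in text)
```

If both checks print True, the text renderer is honoring the width, which matches the printed output shown above.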

@rhshadrach added the Needs Discussion label and removed the Needs Info label on Sep 25, 2024
@yasirroni
Author

Thank you. I'm using a Jupyter notebook in VS Code, and I can confirm that print works as expected but display does not. JupyterLab behaves the same way.

import pandas as pd

from IPython.display import display

pd.options.display.float_format = '{:.3f}'.format

# Example DataFrame
df = pd.DataFrame({
    'A': [123.456, 789.1011],
    'B': [2.71828, 3.14159]
})
print(df)
display(df)

pd.options.display.float_format = '{:12.3f}'.format

# Example DataFrame
df = pd.DataFrame({
    'A': [123.456, 789.1011],
    'B': [2.71828, 3.14159]
})
print(df)
display(df)
image

@yasirroni
Author

So, should we close this and hand it over to the Jupyter developers? Please let me know where this is best raised (pandas or Jupyter).

@yasirroni
Author

yasirroni commented Sep 25, 2024

After some investigation, even string formatting isn't respected by display.

pd.options.display.float_format = '{:.3f}'.format  # if float, use .3f
# Convert to padded strings so that float_format no longer applies
df_formatted = df.map(lambda x: f'{x:12.3f}').astype('string')
display(df_formatted)  # padding is not respected in the notebook output
print(df_formatted)  # correctly shows the f'{x:12.3f}' padding

A workaround is to set the cell width directly through the Styler:

styled_df = df_formatted.style.set_table_styles(
    [{'selector': 'td', 'props': [('min-width', '80px')]}]
)

display(styled_df)

@rhshadrach reopened this on Sep 25, 2024
@rhshadrach
Member

Thanks for the investigation - I think your investigation suggests this is an issue with HTML formatting. We still control the HTML that is produced by display(df), so I suspect we may be able to fix it. Even if that is the case, perhaps we should consider having some formatting options only for printed DataFrames.

Leaving this open for now. I plan to investigate it in the near future.

@rhshadrach
Member

Two things need to change in order to implement this. The first is passing get_option("display.float_format") to DataFrameFormatter in frame.DataFrame._repr_html_. The second is adding " ": "&nbsp;" to esc in io.formats.html.HTMLFormatter._write_cell.

The second change also fixes other issues with multiple spaces in strings, e.g.

df = pd.DataFrame({"A": ["foo      foo", "bar"]})
display(df)

fixed:

image

main:

image

I don't think either of these is controversial, so I'm marking this as a good first issue for now. But cc @pandas-dev/pandas-core for any thoughts.
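To illustrate the idea behind the second change: browsers collapse runs of ordinary spaces in HTML, so padding survives only if each space is emitted as a non-breaking space entity. The sketch below is a standalone approximation, not the pandas implementation. The helper name escape_cell and the exact contents of its mapping are hypothetical, and it assumes the intended replacement for a space is the &nbsp; entity.

```python
def escape_cell(text: str) -> str:
    """Escape a cell value for HTML (hypothetical helper, not pandas code)."""
    esc = {
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        " ": "&nbsp;",  # assumed mapping: keeps runs of spaces from collapsing
    }
    # Map character by character so an "&" produced by one replacement
    # is never escaped a second time
    return "".join(esc.get(ch, ch) for ch in text)


padded = "{:12.3f}".format(2.71828)  # 7 spaces followed by '2.718'
cell = escape_cell(padded)
print(f"<td>{cell}</td>")

spaced = escape_cell("foo      foo")
print(f"<td>{spaced}</td>")
```

One trade-off of this approach is that non-breaking spaces also prevent line wrapping inside the cell, which may or may not be desirable for long strings.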

@rhshadrach added the good first issue label and removed the Needs Discussion label on Sep 29, 2024
@saldanhad
Contributor

Are you expecting a pytest script to cover this change?

@rhshadrach
Member

Yes - I think something along the lines of test_info_repr_html would be sufficient.

@saldanhad
Contributor

take


3 participants