MS Excel opens CSV files with incorrect encoding #167

Open
currycoder opened this issue Jun 9, 2020 · 1 comment

Comments

@currycoder

This issue is only tangentially related to the notify service, but I wanted to flag it for discussion.

When MS Excel opens a CSV file, it does not assume UTF-8. Unless the file starts with a Byte Order Mark (for UTF-8, the three octets EF BB BF), Excel falls back to a legacy encoding, typically the system's ANSI code page. Further details can be found here: https://stackoverflow.com/a/155176

The result of Excel assuming the incorrect encoding is that non-ASCII characters (such as the ü in the example below) are rendered incorrectly to the user.
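
To illustrate the effect, here is a minimal sketch that reproduces the garbling outside Excel. It assumes Windows-1252 (`cp1252`) as the fallback code page; the actual fallback depends on the user's Windows locale, so treat the garbled string as one plausible outcome rather than what every Excel installation shows.

```python
import codecs

csv_contents = "Büyükdere Cad,foo,bar"
utf8_bytes = csv_contents.encode("utf-8")

# Assumption: cp1252 stands in for the legacy code page Excel falls back to.
# Each ü is two bytes in UTF-8 (C3 BC), which cp1252 reads as two characters.
misread = utf8_bytes.decode("cp1252")
print(misread)

# Prefixing the UTF-8 BOM (EF BB BF) marks the file as UTF-8.
with_bom = codecs.BOM_UTF8 + utf8_bytes
print(with_bom[:3])

# A BOM-aware reader strips the marker and recovers the original text.
decoded = with_bom.decode("utf-8-sig")
print(decoded)
```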

Excel's behaviour here is quite different to other standard spreadsheet applications. Calc on Linux and Numbers on Mac do a pretty good job of automatically detecting the encoding of the file.

I believe that this issue is a good one to discuss now that the Notify service supports CSV file uploads/downloads fully with the addition of is_csv to prepare_upload.

One simple option is to call this behaviour out in the client docs (there's a new section on CSV uploads), to get ahead of this issue for developers who want to send CSVs from their apps.

Example Code: CSV which is badly rendered in Excel:

import io
from notifications_python_client import prepare_upload
from notifications_python_client.notifications import NotificationsAPIClient

notifications_client = NotificationsAPIClient("XXX")

csv_contents = 'Büyükdere Cad,foo,bar'
# UTF-8 bytes with no BOM: Excel misreads the non-ASCII characters
buf = io.BytesIO(csv_contents.encode('utf-8'))
file_content = prepare_upload(buf, is_csv=True)

notifications_client.send_email_notification(
    email_address='[email protected]',
    template_id='XXX',
    personalisation={
        'link_to_file': file_content,
    },
)

Example Code: CSV which is rendered correctly in Excel:

import codecs
import io
from notifications_python_client import prepare_upload
from notifications_python_client.notifications import NotificationsAPIClient

notifications_client = NotificationsAPIClient("XXX")

csv_contents = 'Büyükdere Cad,foo,bar'
# Prefix the UTF-8 BOM so that Excel detects the encoding
buf = io.BytesIO(codecs.BOM_UTF8 + csv_contents.encode('utf-8'))
file_content = prepare_upload(buf, is_csv=True)

notifications_client.send_email_notification(
    email_address='[email protected]',
    template_id='XXX',
    personalisation={
        'link_to_file': file_content,
    },
)
@NickFitz

NickFitz commented Jul 4, 2022

Python's encode() method will add the UTF-8 BOM if the encoding is specified as utf-8-sig rather than utf-8.
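
A short sketch of that suggestion follows. The first part checks that `utf-8-sig` produces the same bytes as prepending `codecs.BOM_UTF8` by hand. The helper `csv_bytes_with_bom` is hypothetical (it is not part of the Notify client) and shows one way to get the same result when building the file with the standard `csv` module.

```python
import codecs
import csv
import io

csv_contents = "Büyükdere Cad,foo,bar"

# 'utf-8-sig' prepends the BOM during encoding, so no manual concatenation is needed.
with_bom = csv_contents.encode("utf-8-sig")
assert with_bom == codecs.BOM_UTF8 + csv_contents.encode("utf-8")


def csv_bytes_with_bom(rows):
    """Hypothetical helper: write rows as CSV into a BytesIO that starts with the UTF-8 BOM."""
    buf = io.BytesIO()
    # newline='' lets the csv module control line endings itself.
    text = io.TextIOWrapper(buf, encoding="utf-8-sig", newline="")
    csv.writer(text).writerows(rows)
    text.flush()
    text.detach()  # release the wrapper without closing buf
    buf.seek(0)
    return buf


buf = csv_bytes_with_bom([["Büyükdere Cad", "foo", "bar"]])
```

The resulting `buf` could then be passed to `prepare_upload(buf, is_csv=True)` in the same way as in the examples above.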
