Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SVCS-531] Improve tabular renderer to handle more TSV cases #308

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from
Open

[SVCS-531] Improve tabular renderer to handle more TSV cases #308

wants to merge 4 commits into from

Conversation

TomBaxter
Copy link
Member

Ticket

SVCS-531

Purpose

[SVCS-531] Improve tabular renderer to handle more TSV cases

Changes

mfr/extensions/tabular/libs/init.py
mfr/extensions/tabular/libs/stdlib_tools.py
mfr/extensions/tabular/settings.py

  • separate tsv_stdlib from csv_stdlib
  • improve exceptions

tests/extensions/tabular/files/invalid_null.csv
tests/extensions/tabular/test_stdlib_tools.py

  • add tests

tests/extensions/ipynb/files/no_metadata.ipynb - quiet test (incidental)

tests/extensions/zip/test_renderer.py - quiet test (incidental)

Side effects

Possible increase in system resource use due to full file being sniffed

QA Notes

In addition to the tsv files found in https://osf.io/pexrv/?view_only=af771fef666049a9aefa85642230e80a also test json_test.csv attached to ticket.
This ticket only effects .tsv and .csv files rendered in MFR.

Deployment Notes

None

AddisonSchiller and others added 4 commits November 22, 2017 13:12
Csv.sniff could cause random characters or spaces to be used
as the delimiter. Separating  these functions and using a hard coded
dialect fixes this display problem.
SVCS-531
Our handling of CVS/TSV files was convoluted and unnecessary, this
cleaned up and simplified.

Collateral-
Updated two tests that were working fine, but producing unnecessary noise in
the test logs.
@coveralls
Copy link

coveralls commented Dec 21, 2017

Coverage Status

Coverage increased (+0.5%) to 68.501% when pulling 3048f77 on TomBaxter:feature/SVCS-531 into 8bb2dd4 on CenterForOpenScience:develop.

@cslzchen
Copy link
Contributor

cslzchen commented Jan 5, 2018

@felliott The previous PR was already Phase 2. Do you want to review this or should I do another round of phase 1?

@felliott
Copy link
Member

felliott commented Jan 8, 2018

I'll take it. Phase 2 is good.

@cslzchen
Copy link
Contributor

cslzchen commented Jan 8, 2018

Sounds good! Added Phase 2 label to the PR.

@jonathon-love
Copy link
Contributor

hey guys,

i think you may have some issues with this:

dialect = csv.Sniffer().sniff(fp.read(), ',')

https://github.com/CenterForOpenScience/modular-file-renderer/pull/308/files#diff-875eb910328d4d6758865a7db8ef38cdR13

i remember having issues with the jamovi csv importer when the data returned didn't end with a complete line. i worked around it thus:

some_data = csvfile.read(4096)
if len(some_data) == 4096:  # csv sniffer doesn't like partial lines
    some_data = trim_after_last_newline(some_data)
    dialect = csv.Sniffer().sniff(some_data, ', \t;')

and

def trim_after_last_newline(text):

    index = text.rfind('\r\n')
    if index == -1:
        index = text.rfind('\n')
        if index == -1:
            index = text.rfind('\r')

    if index != -1:
        text = text[:index]

return text

https://github.com/jamovi/jamovi/blob/master/server/jamovi/server/formatio/csv.py#L85

i'd also take exception to this:

# CSVs are always values seperated by commas

https://github.com/CenterForOpenScience/modular-file-renderer/pull/308/files#diff-875eb910328d4d6758865a7db8ef38cdR11

i don't think excel saves .csv files separated by commas by default

Copy link
Contributor

@cslzchen cslzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned by @jonathon-love, there are some issues with out implementation and we need to double check our assumptions about CSV and TSV files.

Credits goes to @AddisonSchiller and @TomBaxter for the original work. I will take over for the update/fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants