Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Preprocess Text: set highest absolute frequency #807

Merged
merged 2 commits into from
Apr 25, 2022

Conversation

noahnovsak
Copy link
Contributor

@noahnovsak noahnovsak commented Apr 5, 2022

Issue

Fixes #669

Description of changes

Changes the behaviour of absolute frequency filter to return no upper cutoff point when the values of both boxes are equal.

Includes
  • Code changes
  • Tests
  • Documentation

@noahnovsak noahnovsak changed the title [FIX] OWPreprocess: max abs frequency [FIX] Preprocess Text: max abs frequency Apr 5, 2022
@noahnovsak noahnovsak changed the title [FIX] Preprocess Text: max abs frequency [FIX] Preprocess Text: set highest absolute frequency Apr 5, 2022
@ajdapretnar ajdapretnar requested a review from VesnaT April 14, 2022 13:19
@noahnovsak noahnovsak closed this Apr 21, 2022
@noahnovsak noahnovsak deleted the fix_max_abs_freq branch April 21, 2022 07:46
@PrimozGodec
Copy link
Collaborator

@needNewUsername, was this PR closed by mistake?

@noahnovsak
Copy link
Contributor Author

Yes, i thought it had been dealt with, but I now remember I was waiting for @ajdapretnar to check if this is the type of solution we are looking for, and make sure it doesn't break anything else.

@noahnovsak noahnovsak restored the fix_max_abs_freq branch April 21, 2022 07:56
@PrimozGodec PrimozGodec reopened this Apr 21, 2022
@codecov-commenter
Copy link

codecov-commenter commented Apr 21, 2022

Codecov Report

Merging #807 (1393608) into master (4a28f85) will increase coverage by 0.96%.
The diff coverage is 78.26%.

❗ Current head 1393608 differs from pull request most recent head a65f149. Consider uploading reports for the commit a65f149 to get more accurate results

@@            Coverage Diff             @@
##           master     #807      +/-   ##
==========================================
+ Coverage   74.95%   75.92%   +0.96%     
==========================================
  Files          75       76       +1     
  Lines       10189    10644     +455     
  Branches     1626     1788     +162     
==========================================
+ Hits         7637     8081     +444     
- Misses       2303     2308       +5     
- Partials      249      255       +6     

@ajdapretnar
Copy link
Collaborator

There are some issues with the spin box, unfortunately.

If I set min=2 and max=max, the results are ok.
Then I increase min to 3. The spin box value of max changes to 3 as well (visually), but the output is really max.
The spin box also allows me to pass max value smaller than min (i.e. min=5, max=3). There are a couple of options how to handle this (raising an error, reversing the values). But I think the nicest one is simply preventing the spin box to go below min value.

What do you think, @PrimozGodec?

@PrimozGodec
Copy link
Collaborator

@ajdapretnar I agree. I think that when the value of one spin changes the minimum/maximum values of the other one can be adjusted via setMinimum/setMaximum

@ajdapretnar
Copy link
Collaborator

This looks good to me, but I would sleep much easier if there was a test that would check this behavior. 😅
For example passing:

  • 5, None (check that it turns to max and that the output is ok)
  • 5, 3 (check that it turns to max and that the output is ok)
  • raising the final 3 to 6 and checking that it remains a 6 (not max)

@ajdapretnar
Copy link
Collaborator

I can tell you that for deerwester you should get:

  • 1 word ("of")
  • 1 word ("of")
  • 1 word ("of")
  • then perhaps try also 3-4, where you'd get "system, user, the, trees, graph".

@ajdapretnar
Copy link
Collaborator

As I need this quite urgently, I'll open another issue with a request for tests.

@ajdapretnar ajdapretnar merged commit 57c2009 into biolab:master Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Preprocess Text: set highest absolute frequency
4 participants