Custom clustering #66

Open
wants to merge 449 commits into
base: master

whalebot-helmsman
Contributor

In this PR:

  • fix default_clustering_score so that a clustering with threshold=0 gets a lower score
  • allow custom distance and position functions to be passed to the clustering procedure
  • add _get_tree_position and _get_tree_distance functions, which give much better grouping results

I have not made these new functions the defaults. Using them as defaults would improve grouping quality.
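
To illustrate the idea of pluggable position and distance functions, here is a minimal sketch. It is not the code from this PR: `cluster_by_threshold` and `tree_distance` are hypothetical names, and only the parameter names `get_position_func` and `get_distance_func` are taken from the diff. It assumes items arrive in document order and that a new cluster starts whenever the distance between neighbouring positions exceeds the threshold.

```python
def cluster_by_threshold(items, threshold,
                         get_position_func=None,
                         get_distance_func=None):
    """Group consecutive items whose positions are at most `threshold` apart.

    Hypothetical helper for illustration only; not the project's API.
    """
    if get_position_func is None:
        # Assumed default: the item itself is its position.
        get_position_func = lambda item: item
    if get_distance_func is None:
        # Assumed default: absolute difference of numeric positions.
        get_distance_func = lambda a, b: abs(a - b)

    clusters = []
    prev_pos = None
    for item in items:
        pos = get_position_func(item)
        if clusters and get_distance_func(prev_pos, pos) <= threshold:
            clusters[-1].append(item)   # close enough: extend current cluster
        else:
            clusters.append([item])     # too far (or first item): new cluster
        prev_pos = pos
    return clusters


def tree_distance(a, b):
    """Number of edges between two nodes given as root-to-node index paths.

    Hypothetical stand-in for a tree-based distance.
    """
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return len(a) + len(b) - 2 * common


# Each item is a path of child indices from the root of a tree.
paths = [(0, 0), (0, 1), (1, 0, 0), (1, 0, 1)]
groups = cluster_by_threshold(paths, threshold=2,
                              get_distance_func=tree_distance)
print(groups)  # [[(0, 0), (0, 1)], [(1, 0, 0), (1, 0, 1)]]
```

With the assumed numeric defaults, `cluster_by_threshold([1, 1, 3], threshold=0)` puts only identical positions together, which is the degenerate case the `threshold=0` score fix is concerned with.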

tpeng and others added 30 commits April 22, 2014 21:31
It is better to put models somewhere else, and notebooks were broken.
add base classifier and global ngrams feature functions
1. rename DEFAULT_TAGSET to EXAMPLE_TAGSET;
2. rename DEFAULT_FEATURES to EXAMPLE_TOKEN_FEATURES;
3. make token_features empty by default in create_wapiti_pipeline.
except model_filename must be kwargs now. Also, this fixes the example
from the tutorial.
score_func=None,
score_kwargs=None,
get_position_func=None,
get_distance_func=None):
Member

Could you please document these two new parameters?

Contributor Author

done

codecov bot commented Aug 28, 2018

Codecov Report

Merging #66 into master will decrease coverage by 0.59%.
The diff coverage is 68.18%.

@@            Coverage Diff            @@
##           master      #66     +/-   ##
=========================================
- Coverage   81.02%   80.43%   -0.6%     
=========================================
  Files          40       40             
  Lines        2092     2131     +39     
=========================================
+ Hits         1695     1714     +19     
- Misses        397      417     +20
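
The percentages in the report can be reproduced from the Hits and Lines rows. The short check below uses a hypothetical helper, `coverage_percent`, and assumes coverage is simply hits divided by lines. It also shows why the summary says 0.59% while the table shows -0.6%: the table appears to round to one decimal place.

```python
def coverage_percent(hits, lines):
    """Coverage as a percentage of covered lines (assumed definition)."""
    return 100.0 * hits / lines


master = coverage_percent(1695, 2092)   # figures from the "master" column
merged = coverage_percent(1714, 2131)   # figures from the "#66" column

print(round(master, 2), round(merged, 2), round(merged - master, 2))
# 81.02 80.43 -0.59
```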

8 participants