feature: new tests added for tsne to expand test coverage #2229
base: main
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅
Flags with carried forward coverage won't be shown.
```python
@@ -16,11 +16,175 @@
import numpy as np
from numpy.testing import assert_allclose

import pytest
#Note: n_componets must be 2 for now
```
Suggested change:
```diff
-#Note: n_componets must be 2 for now
+#Note: n_components must be 2 for now
```
Hi Ethan,
Thanks for the comments; this is fixed.
Best,
Yue
/intelci: run
```python
assert tsne.embedding_.shape == (4, 2)

# Test with random data
np.random.seed(42)
```
I believe some tests might be getting executed in parallel by pytest (or at least through joblib, as I see in some logs). Better to change to the non-global `np.random.Generator`.
What I meant here was this module: https://numpy.org/doc/stable/reference/random/generator.html

To use it, you need to create a `Generator` object, for example like this:
```python
rng = np.random.default_rng(seed=123)
```
Then, you need to call methods from that generator object, like this:
```python
rng.random(size=(100, 10))
```
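To illustrate the reviewer's suggestion, here is a minimal sketch of replacing the global `np.random.seed(42)` with a local generator; the seed value and array shape are illustrative, not taken from the PR:

```python
import numpy as np

# Non-global generator, as suggested in the review; the seed is illustrative.
rng = np.random.default_rng(seed=123)

# Draws come from this generator rather than the shared global state,
# so parallel pytest workers do not interfere with each other.
X = rng.random(size=(100, 10))

# The same seed reproduces the same data, independently of global state.
rng2 = np.random.default_rng(seed=123)
assert np.array_equal(X, rng2.random(size=(100, 10)))
```

Because each test owns its generator, reseeding in one test cannot change the data another test sees.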
```python
tsne_perplexity = TSNE(n_components=2, perplexity=9).fit(X_perplexity)
assert tsne_perplexity.embedding_.shape == (10, 2)

# Test large data
```
It feels like this one is perhaps not needed, considering that there's already a similar test earlier on with shape (100,10).
```python
def test_basic_tsne_functionality():
    """Test TSNE with valid data: basic functionality, random data, reproducibility, and edge cases."""
    # Test basic functionality
```
Any chance that all of these could be parameterized instead? Otherwise, if one of them fails, then the rest wouldn't execute.
Thanks for looking into it. But what I meant was to parameterize this whole function by turning the inputs and expectations into parameters, so that one parameterization would be "Test basic functionality", another parameterization "Test with random data", and so on.
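As a sketch of that suggestion, the scenarios could each become one parameter set of a single test. This uses scikit-learn's `TSNE` as a stand-in for the estimator under review, and the data, shapes, and test name are invented for illustration:

```python
import numpy as np
import pytest
from sklearn.manifold import TSNE  # stand-in for the sklearnex estimator

rng = np.random.default_rng(seed=42)

# Each scenario from the monolithic test becomes one parameter set,
# so a failure in one case does not block the others from running.
@pytest.mark.parametrize(
    "X, expected_shape",
    [
        # "Test basic functionality": four points on the unit square.
        (np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]), (4, 2)),
        # "Test with random data": drawn from a local generator.
        (rng.random(size=(50, 10)), (50, 2)),
    ],
)
def test_tsne_embedding_shape(X, expected_shape):
    tsne = TSNE(n_components=2, perplexity=3, random_state=0).fit(X)
    assert tsne.embedding_.shape == expected_shape
```

Reproducibility and edge-case checks could be added as further parameter sets or a second parametrized test.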
```python
@pytest.mark.parametrize("dtype", [np.float32, np.float64])
def test_tsne_gpu_validation(dataframe, queue, dtype):
    """
    GPU validation test for TSNE with a specific complex dataset.
```
Could this one be merged into the one above (`test_tsne_with_specific_complex_dataset`)?
Looks like the tests for non-missingness were lost after merging them. Could be good to have them just in case, considering that this tests GPU execution, where more things can go wrong.
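A non-missingness check of the kind mentioned could look like the following sketch; scikit-learn's `TSNE` stands in for the sklearnex estimator, and the data shape and hyperparameters are illustrative:

```python
import numpy as np
from sklearn.manifold import TSNE  # stand-in for the sklearnex estimator

rng = np.random.default_rng(seed=0)
X = rng.random(size=(30, 5))

embedding = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(X)

# Non-missingness checks: no NaNs or infinities in the embedding.
# Cheap to run, and a useful canary on less-exercised GPU code paths.
assert np.all(np.isfinite(embedding))
assert not np.any(np.isnan(embedding))
```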
Description
Added additional tests in sklearnex/manifold/tests/test_tsne.py to expand the test coverage for the t-SNE algorithm.
PR completeness and readability
Testing