
feature: new tests added for tsne to expand test coverage #2229

Open
wants to merge 6 commits into base: main
Conversation

@yuejiaointel commented Dec 17, 2024

Description

Added tests in sklearnex/manifold/tests/test_tsne.py to expand test coverage for the t-SNE algorithm.

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended the testing suite if new functionality was introduced in this PR.

codecov bot commented Dec 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Flag Coverage Δ
github 83.18% <ø> (ø)

Flags with carried forward coverage won't be shown.

@ethanglaser marked this pull request as draft December 17, 2024 19:02
@@ -16,11 +16,175 @@

import numpy as np
from numpy.testing import assert_allclose

import pytest
#Note: n_componets must be 2 for now
Contributor:
Suggested change:
- #Note: n_componets must be 2 for now
+ #Note: n_components must be 2 for now

Author:

Hi Ethan,
Thanks for the comments; this is fixed.
Best,
Yue

@yuejiaointel (Author)

/intelci: run

@yuejiaointel marked this pull request as ready for review December 19, 2024 00:00
@ethanglaser (Contributor)

/intelci: run

sklearnex/manifold/tests/test_tsne.py (resolved)
assert tsne.embedding_.shape == (4, 2)

# Test with random data
np.random.seed(42)
Contributor:

I believe some tests might be getting executed in parallel by pytest (or at least through joblib, as I see in some logs). It would be better to switch to a non-global np.random.Generator.

Contributor:

What I meant here was this module: https://numpy.org/doc/stable/reference/random/generator.html

To use it, you need to create a Generator object, for example like this:

rng = np.random.default_rng(seed=123)

Then, you need to call methods from that generator object, like this:

rng.random(size=(100,10))
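
For illustration, a minimal sketch of what this suggestion might look like inside the random-data part of the test; the variable names, shapes, and seed below are illustrative assumptions, not the PR's actual code:

import numpy as np
from sklearnex.manifold import TSNE

# A local Generator keeps the random state isolated from other tests,
# which matters when pytest runs tests in parallel; the seed is arbitrary.
rng = np.random.default_rng(seed=42)
X_random = rng.random(size=(100, 10))

tsne_random = TSNE(n_components=2, perplexity=30.0).fit(X_random)
assert tsne_random.embedding_.shape == (100, 2)

Unlike np.random.seed, this does not mutate global state, so concurrently running tests cannot interfere with each other's random streams.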

tsne_perplexity = TSNE(n_components=2, perplexity=9).fit(X_perplexity)
assert tsne_perplexity.embedding_.shape == (10, 2)

# Test large data
Contributor:

It feels like this one is perhaps not needed, considering that there's already a similar test earlier on with shape (100,10).

sklearnex/manifold/tests/test_tsne.py (outdated, resolved)

def test_basic_tsne_functionality():
"""Test TSNE with valid data: basic functionality, random data, reproducibility, and edge cases."""
# Test basic functionality
Contributor:

Any chance that all of these could be parameterized instead? Otherwise, if one of them fails, then the rest wouldn't execute.

@david-cortes-intel (Contributor) Dec 19, 2024:

Thanks for looking into it. But what I meant was to parameterize this whole function by turning the inputs and expectations into parameters, so that one parameterization would be "Test basic functionality", another parameterization "Test with random data", and so on.
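
As a rough sketch of the kind of parameterization being suggested here (the case names, data, and expected shapes below are illustrative assumptions, not the PR's final code):

import numpy as np
import pytest
from sklearnex.manifold import TSNE

# Each case bundles input data with an expected embedding shape, so that
# a failure in one scenario does not prevent the others from running.
_rng = np.random.default_rng(123)
TSNE_CASES = {
    "basic functionality": np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]),
    "random data": _rng.random((100, 10)),
}

@pytest.mark.parametrize("case", list(TSNE_CASES))
def test_tsne_valid_data(case):
    X = TSNE_CASES[case]
    embedding = TSNE(n_components=2, perplexity=2.0).fit(X).embedding_
    assert embedding.shape == (X.shape[0], 2)

With pytest's generated test IDs, each scenario is then reported and can fail independently.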

sklearnex/manifold/tests/test_tsne.py (outdated, resolved)
@pytest.mark.parametrize("dtype", [np.float32, np.float64])
def test_tsne_gpu_validation(dataframe, queue, dtype):
"""
GPU validation test for TSNE with a specific complex dataset.
Contributor:

Could this one be merged into the one above? (test_tsne_with_specific_complex_dataset)

Contributor:

Looks like the tests for non-missingness were lost after merging them. It could be good to have them just in case, considering that this tests GPU execution, where more things can go wrong.
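
For reference, a minimal sketch of the kind of non-missingness check mentioned here, shown on host data for simplicity (on GPU the embedding would first be converted back to a NumPy array; names, shapes, and seed are illustrative):

import numpy as np
from sklearnex.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.random((50, 5))
embedding = TSNE(n_components=2, perplexity=5.0).fit(X).embedding_

# Non-missingness checks: the embedding should contain no NaN or infinite
# values and should not collapse to all zeros.
assert np.all(np.isfinite(embedding)), "Embedding contains NaN or infinite values"
assert np.any(embedding != 0), "Embedding collapsed to all zeros"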


4 participants