-
Dear SLEAP team @talmo @roomrys, I have a dataset similar to the one in mice_of. I have downloaded it and was wondering what strategy was followed for selecting and labeling frames for training. From what I can understand, the initial frames were selected using ... Thanks!
-
Hi @auesro, Apologies for the late response. I don't believe we have documented how the suggested labels were generated. @talmo might have the intermediate datasets, from the initial labeling through to the final dataset, which could answer your final question. Sorry I can't be of more help!
-
I believe we used image features for the first round (though I don't think we stored which parameters were used), and for subsequent rounds we just labeled using the prediction score as a guide. We made sure to include many different kinds of data in the annotations with varying numbers of animals and social interactions. |
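For concreteness, the score-guided part of this loop can be as simple as ranking predicted frames by their mean instance score and queueing the lowest-scoring ones for correction. A minimal sketch, assuming per-frame scores have already been extracted (the `frame_scores` structure and helper name are illustrative, not SLEAP API):

```python
def lowest_score_frames(frame_scores, n=20):
    """Pick the n frame indices whose predictions were least confident.

    frame_scores: iterable of (frame_index, mean prediction score) pairs,
    e.g. averaged over the predicted instances in each frame.
    """
    ranked = sorted(frame_scores, key=lambda pair: pair[1])
    return [idx for idx, _ in ranked[:n]]

# Example: lowest_score_frames([(0, 0.92), (1, 0.41), (2, 0.77)], n=2) -> [1, 2]
```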
-
Hi again, I have more questions regarding the training of this dataset (we consider it a good benchmark for our own data, which is very similar except that we have 1 mouse per video): ... Thanks!
-
Hi @auesro, I don't believe we documented how many and which (labeled, suggested, or random) frames were predicted during the subsequent training iterations... ...but here is a brief description of each parameter:

**Brisk Keypoint Threshold**
The BRISK keypoint threshold determines the sensitivity of the keypoint detection process. If the threshold is set too low, the algorithm may include many keypoints that are not distinctive. If the threshold is set too high, the algorithm may miss some keypoints that actually are distinctive.

**Bag of Features Vocab Size**
"Bag of features" refers to a method of representing an image as a collection of local features. The vocab size of a bag of features is the number of visual words used to represent an image. A larger vocab size allows the bag of features to capture more detailed information about the visual features of the training images, but it also increases the computational complexity of the algorithm.

**PCA Components**
Principal component analysis (PCA) is a technique for reducing the dimensionality of a dataset by identifying the underlying patterns in the data and projecting the data onto a lower-dimensional space. The PCA components of an image are the vectors that represent its most important features in that lower-dimensional space. The number of PCA components is a hyperparameter that trades off accuracy against efficiency: a larger number of components captures more detailed information about the image, but it also increases the computational complexity of the algorithm.

**K-Means Clusters**
K-means clustering is a method for grouping a set of data points into a specified number of clusters. The algorithm first randomly selects K cluster centers, then iteratively assigns each data point to the closest center and updates each center to the mean of the points in its cluster. The number of clusters, K, is an important parameter: a larger K results in more, smaller clusters, while a smaller K results in fewer, larger clusters.

**Samples per Cluster**
The samples per cluster is the number of frames drawn from each cluster after the clustering has been applied. In general, a larger number of samples per cluster gives more detailed and accurate coverage of each cluster, but it also increases the computational cost.

Thanks,
Liezl
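To make these parameters concrete, here is a minimal sketch of how they fit together in a feature-based suggestion pipeline, written against OpenCV and scikit-learn rather than SLEAP's internal implementation. The function name, default values, and descriptor/histogram choices are illustrative assumptions, not the settings used for the reference dataset:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def suggest_frames(frames, brisk_threshold=40, vocab_size=20,
                   pca_components=5, n_clusters=10, samples_per_cluster=5,
                   seed=0):
    """Pick a visually diverse subset of frame indices to label (illustrative)."""
    rng = np.random.default_rng(seed)
    brisk = cv2.BRISK_create(thresh=brisk_threshold)

    # 1. BRISK keypoints/descriptors per frame; the threshold controls how
    #    selective the detector is about which keypoints count as distinctive.
    descs = []
    for frame in frames:
        gray = frame if frame.ndim == 2 else cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, d = brisk.detectAndCompute(gray, None)
        descs.append(d if d is not None else np.zeros((0, 64), np.uint8))

    # 2. Bag-of-features vocabulary: cluster all descriptors into
    #    `vocab_size` visual words.
    all_descs = np.vstack([d for d in descs if len(d)]).astype(np.float32)
    vocab = KMeans(n_clusters=vocab_size, random_state=seed).fit(all_descs)

    # 3. Represent each frame as a normalized histogram over the vocabulary.
    hists = np.zeros((len(frames), vocab_size), np.float32)
    for i, d in enumerate(descs):
        if len(d):
            words = vocab.predict(d.astype(np.float32))
            hists[i] = np.bincount(words, minlength=vocab_size) / len(words)

    # 4. PCA down to a low-dimensional embedding of the histograms.
    embedded = PCA(n_components=pca_components).fit_transform(hists)

    # 5. K-means over frames, then draw `samples_per_cluster` frames from
    #    each cluster so the suggestions cover all the discovered groups.
    labels = KMeans(n_clusters=n_clusters, random_state=seed).fit_predict(embedded)
    picks = []
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k)
        take = min(samples_per_cluster, len(members))
        picks.extend(rng.choice(members, size=take, replace=False).tolist())
    return sorted(picks)
```

Note there are two separate k-means runs: one over descriptors to build the visual vocabulary (the vocab size), and one over whole frames to form the suggestion groups (the K-means clusters).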
-
Thanks a lot, Liezl! That should be enough to get us going. In the reference dataset, I assume you used K-means clusters = 40, since there are 40 different groups of frames, right? How did you come up with that number? What would be a good strategy to follow here: go for a number of clusters high enough to capture the different poses found in the dataset? And is there any way to check whether you fell short of that number or, instead, went for too large a number of clusters?
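Not an answer from the thread, but one general way to sanity-check the cluster count is a silhouette sweep: refit k-means for several candidate values of K on the same embedded features (e.g. the `embedded` array from the sketch above) and compare silhouette scores, where higher means better-separated clusters. A minimal sketch:

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def sweep_k(features, candidates=(10, 20, 40, 80), seed=0):
    """Return {K: silhouette score}; a peak suggests a reasonable cluster count."""
    scores = {}
    for k in candidates:
        labels = KMeans(n_clusters=k, random_state=seed).fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return scores
```

If the score is still climbing at your largest candidate, you may have too few clusters; if it peaks well below 40, the extra clusters are likely splitting visually similar poses.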