Missing documentation for vector_field_function #633

Closed
Baschdl opened this issue Dec 8, 2023 · 6 comments
Labels: bug, Stale

Comments

@Baschdl

Baschdl commented Dec 8, 2023

I would like to use the function f(x) as described in the "10min to dynamo" tutorial [1] to get the velocity for unseen gene expression vectors x. Unfortunately, dynamo.vf.vector_field_function (which I guess is f) is missing its documentation on readthedocs [2] despite having a docstring:

def vector_field_function(
    x: np.ndarray,
    vf_dict: VecFldDict,
    dim: Optional[Union[int, np.ndarray]] = None,
    kernel: str = "full",
    X_ctrl_ind: Optional[List] = None,
    **kernel_kwargs,
) -> np.ndarray:
    """vector field function constructed by sparseVFC.
    Reference: Regularized vector field learning with sparse approximation for mismatch removal, Ma, Jiayi, etc. al, Pattern Recognition
    Args:
        x: Set of cell expression state samples
        vf_dict: VecFldDict with stored parameters necessary for reconstruction
        dim: Index or indices of dimensions of the K gram matrix to return. Defaults to None.
        kernel: one of {"full", "df_kernel", "cf_kernel"}. Defaults to "full".
        X_ctrl_ind: Indices of control points at which kernels will be centered. Defaults to None.
    Raises:
        ValueError: If the kernel value specified is not one of "full", "df_kernel", or "cf_kernel"
    Returns:
        np.ndarray storing the `dim` dimensions of m x m gram matrix K storing the kernel evaluated at each pair of control points
    """

I tried running it according to this documentation but it's unclear to me what vf_dict is. Probably something generated from dyn.vf.VectorField:

|-----> <insert> velocity_S_SparseVFC to layers in AnnData Object.
|-----> <insert> VecFld to uns in AnnData Object.
|-----> <insert> control_point to obs in AnnData Object.
|-----> <insert> inlier_prob to obs in AnnData Object.
|-----> <insert> obs_vf_angle to obs in AnnData Object.

[1] https://dynamo-release.readthedocs.io/en/latest/ten_minutes_to_dynamo.html#vector-field-reconstruction
[2] https://dynamo-release.readthedocs.io/en/latest/_autosummary/dynamo.vf.vector_field_function.html

Baschdl added the bug label Dec 8, 2023
@Sichao25
Collaborator

Sichao25 commented Dec 8, 2023

Thanks for reporting the issue. Your guess is correct, dynamo.vf.vector_field_function is the vector field function. The docstring has not been generated correctly. I will add it soon. Actually, we have a helper function to retrieve the vector field. You may be able to call it like vf_dict, f = dynamo.vf.utils.vecfld_from_adata(adata, basis='umap'). Let me know if this helps.
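
For readers wondering what a vf_dict might contain, here is a minimal NumPy sketch of how a kernel-based vector field of the SparseVFC kind can be evaluated at new points. It assumes the dictionary stores control points, a kernel width, and a coefficient matrix. The key names "X_ctrl", "beta", and "C", the Gaussian kernel form, and the helper names are assumptions made for illustration, and nothing in this thread confirms them as dynamo's internals.

```python
import numpy as np


def gaussian_kernel(x, x_ctrl, beta):
    """Gram matrix K[i, j] = exp(-beta * ||x_i - x_ctrl_j||^2)."""
    diff = x[:, None, :] - x_ctrl[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-beta * sq_dist)


def toy_vector_field(x, vf_dict):
    """Evaluate f(x) = K(x, X_ctrl) @ C for a batch of points x (n x d)."""
    x = np.atleast_2d(x)
    K = gaussian_kernel(x, vf_dict["X_ctrl"], vf_dict["beta"])
    return K @ vf_dict["C"]


# Hypothetical parameters: two control points in a 2-D embedding.
toy_vf_dict = {
    "X_ctrl": np.array([[0.0, 0.0], [1.0, 0.0]]),  # control points (m x d)
    "beta": 0.5,                                    # kernel width parameter
    "C": np.array([[1.0, 0.0], [0.0, 1.0]]),        # coefficients (m x d)
}

# Velocity at an unseen point; the result has one row per query point.
velocity = toy_vector_field(np.array([[0.0, 0.0]]), toy_vf_dict)
print(velocity.shape)  # (1, 2)
```

In practice the dictionary and the callable would come from the helper mentioned above rather than being built by hand.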

@Baschdl
Author

Baschdl commented Dec 9, 2023

Thanks a lot for your help. vf_dict, f = dyn.vf.utils.vecfld_from_adata(adata, basis='umap') currently fails with:
ValueError: Vector field function VecFld_umap is not included in the adata object! Try firstly running dyn.vf.VectorField(adata, basis='umap').

The output of dyn.vf.VectorField says that everything worked correctly:

dyn.vf.VectorField(adata[:,adata.var.use_for_transition], velocity_key="velocity_S", layer="spliced", basis='umap')

|-----> VectorField reconstruction begins...
|-----> Retrieve X and V based on basis: UMAP. 
        Vector field will be learned in the UMAP space.
|-----> Generating high dimensional grids and convert into a row matrix.
|-----> Learning vector field with method: sparsevfc.
|-----> [SparseVFC] begins...
|-----> Sampling control points based on data velocity magnitude...
|-----> [SparseVFC] in progress: 100.0000%
|-----> [SparseVFC] finished [10.4133s]
|-----> <insert> velocity_umap_SparseVFC to obsm in AnnData Object.
|-----> <insert> X_umap_SparseVFC to obsm in AnnData Object.
|-----> <insert> VecFld_umap to uns in AnnData Object.
|-----> <insert> control_point_umap to obs in AnnData Object.
|-----> <insert> inlier_prob_umap to obs in AnnData Object.
|-----> <insert> obs_vf_angle_umap to obs in AnnData Object.
|-----> [VectorField] in progress: 100.0000%
|-----> [VectorField] finished [10.6760s]

but the inserts do not end up in adata, e.g. X_umap_SparseVFC should be in obsm but my adata looks like this: obsm: 'X_pca', 'X_umap', 'velocity_umap', 'X'.

@Sichao25
Collaborator

VectorField automatically stores its results in the given AnnData, but I suspect this does not work when it is given a view like adata[:,adata.var.use_for_transition]. Could you try setting copy=True and assigning the returned object back to adata, like this:

adata = dyn.vf.VectorField(adata[:, adata.var.use_for_transition], basis='umap', copy=True)
vecfld_dict, vecfld = dyn.vf.utils.vecfld_from_adata(adata, basis='umap')

Let me know if this helps you.
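
A toy illustration, using plain dictionaries rather than AnnData, of why results written onto a temporary subset can go missing and why returning a copy and assigning it avoids that. Every name here (subset_columns, fit, the dictionary layout) is a hypothetical stand-in, and AnnData's actual view handling is not reproduced.

```python
def subset_columns(table, keep):
    """Hypothetical stand-in for adata[:, mask]: builds a NEW object."""
    columns = [c for c, k in zip(table["columns"], keep) if k]
    return {"columns": columns, "results": {}}


def fit(table, copy=False):
    """Hypothetical stand-in for a fit that stores results on its input.

    With copy=True the results go onto a copy, which is returned;
    otherwise the input is modified and None is returned.
    """
    target = {"columns": list(table["columns"]), "results": {}} if copy else table
    target["results"]["VecFld_umap"] = f"fitted on {len(target['columns'])} columns"
    return target if copy else None


parent = {"columns": ["g1", "g2", "g3"], "results": {}}
mask = [True, False, True]

# In-place fit on a temporary subset: the result lands on an object
# that nothing references afterwards, so the parent never sees it.
fit(subset_columns(parent, mask))
print("VecFld_umap" in parent["results"])  # False

# copy=True plus assignment keeps the fitted object around.
fitted = fit(subset_columns(parent, mask), copy=True)
print(fitted["results"]["VecFld_umap"])  # fitted on 2 columns
```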

@Baschdl
Author

Baschdl commented Dec 13, 2023

Great, this works now. The view is also no longer needed once PR #619 is merged.

Is there any way of learning the vector field for all original genes and not only the ones in adata.var.use_for_transition? I tried to select all genes as transition genes, but this currently doesn't work, see PR #638.

@Sichao25
Collaborator

You may be able to do that by setting transition_genes=adata.var_names. Passing an index is preferred over boolean values.
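
If the gene selection starts out as a boolean mask, it can be turned into a list of names before being passed on. The sketch below uses plain NumPy arrays with made-up gene names as stand-ins for adata.var_names and adata.var.use_for_transition; how the resulting list is consumed by dynamo is not shown here.

```python
import numpy as np

# Hypothetical stand-ins for adata.var_names and a boolean column in adata.var.
var_names = np.array(["GeneA", "GeneB", "GeneC", "GeneD"])
use_for_transition = np.array([True, False, True, False])

# Boolean-mask indexing keeps only the names where the mask is True.
transition_genes = var_names[use_for_transition].tolist()
print(transition_genes)  # ['GeneA', 'GeneC']

# Selecting every gene is then simply the full list of names.
all_genes = var_names.tolist()
```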


This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days
