Issues running batch mode on own data #38

Open
ZeroLi-Bio opened this issue Jun 6, 2023 · 6 comments

@ZeroLi-Bio

Hello,

Thank you for this great software.
I'm trying to run SPROD on my own data, which has 88097 spots, so I ran it in batch mode, but I hit an error. The command is below:

/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/bin/python /jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/SPROD/sprod.py /jdfssz1/ST_SUPERCELLS/P21Z10200N0134/Project/25.ESCA/lizeyu/30.SPROD/inputs/ /jdfssz1/ST_SUPERCELLS/P21Z10200N0134/Project/25.ESCA/lizeyu/30.SPROD/outputs/ --input_type batch

The output of work.sh is:

Starting dirichlet process clustering...
Iteration 10
Iteration 20
Iteration 30
Iteration 40
Iteration 50
Iteration 60
Iteration 70
Iteration 80
Iteration 90
Iteration 100
Iteration 110
Iteration 120
Iteration 130
Iteration 140
Iteration 150
Iteration 160
Iteration 170
Iteration 180
Iteration 190
Iteration 200

And the sprod_log.txt is:

16:58:01,INFO:::Cold start, processing from counts and image data.
16:58:01,INFO:::Use spot cluster probability as pseudo image features
16:58:01,INFO:::Loading counts data and performing dimension reduction using UMAP.
17:10:59,ERROR:::/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/lib/python3.7/site-packages/pandas/core/indexing.py:1732: SettingWithCopyWarning:
17:10:59,ERROR:::A value is trying to be set on a copy of a slice from a DataFrame
17:10:59,ERROR:::
17:10:59,ERROR:::See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
17:10:59,ERROR:::  self._setitem_single_block(indexer, value, name)
17:10:59,ERROR:::/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/lib/python3.7/site-packages/pandas/core/indexing.py:723: SettingWithCopyWarning:
17:10:59,ERROR:::A value is trying to be set on a copy of a slice from a DataFrame
17:10:59,ERROR:::
17:10:59,ERROR:::See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
17:10:59,ERROR:::  iloc._setitem_with_indexer(indexer, value, self.name)
17:15:19,INFO:::Making subsamples from counts data.
17:15:19,INFO:::Image derived features not found, will use pseudo image features.
17:35:47,ERROR:::Traceback (most recent call last):
17:35:47,ERROR:::  File "/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/SPROD/sprod.py", line 356, in <module>
17:35:47,ERROR:::subsample_patches(input_path, intermediate_path, feature_fn, pn, pb)
17:35:47,ERROR:::  File "/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/SPROD/sprod/slideseq_make_patches.py", line 124, in subsample_patches
17:35:47,ERROR:::_ = p.map(mpl_writer, patches)
17:35:47,ERROR:::  File "/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/lib/python3.7/multiprocessing/pool.py", line 268, in map
17:35:47,ERROR:::return self._map_async(func, iterable, mapstar, chunksize).get()
17:35:47,ERROR:::  File "/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/lib/python3.7/multiprocessing/pool.py", line 657, in get
17:35:47,ERROR:::raise self._value
17:35:47,ERROR:::  File "/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
17:35:47,ERROR:::put(task)
17:35:47,ERROR:::  File "/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/lib/python3.7/multiprocessing/connection.py", line 206, in send
17:35:47,ERROR:::self._send_bytes(_ForkingPickler.dumps(obj))
17:35:47,ERROR:::  File "/jdfssz1/ST_SUPERCELLS/P21Z10200N0134/USER/lizeyu/00.software/Miniconda3/envs/sprod/lib/python3.7/multiprocessing/connection.py", line 393, in _send_bytes
17:35:47,ERROR:::header = struct.pack("!i", n)
17:35:47,ERROR:::struct.error: 'i' format requires -2147483648 <= number <= 2147483647

The formats of my input files, Counts.txt and Spot_metadata.csv, are shown in the screenshots below:
image
image

Could you give a hint about it? Thanks a lot
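
Editorial note on the traceback above: the final frames show `multiprocessing/connection.py` calling `struct.pack("!i", n)`, where `n` is the byte length of the pickled task being sent to a worker process. The `"!i"` format is a signed 4-byte integer, so the call fails as soon as one pickled task exceeds 2147483647 bytes. Since `Pool.map` groups items into chunks before sending them, the oversized object may be a chunk of several patches rather than a single one. The sketch below only reproduces that limit and shows one way to measure a pickled size; `fits_in_int32_header` and `pickled_size` are hypothetical helper names, not part of SPROD, and the measured size is an approximation of what multiprocessing would actually send.

```python
import pickle
import struct

INT32_MAX = 2**31 - 1  # largest length a signed 4-byte "!i" header can hold


def fits_in_int32_header(n_bytes: int) -> bool:
    """Return True if a message of n_bytes can be length-prefixed with "!i"."""
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        # Same exception type as the last line of the traceback.
        return False


def pickled_size(obj) -> int:
    """Approximate number of bytes needed to send obj to a worker process."""
    return len(pickle.dumps(obj))


# A payload of exactly 2**31 - 1 bytes still fits; one byte more does not.
print(fits_in_int32_header(INT32_MAX))      # True
print(fits_in_int32_header(INT32_MAX + 1))  # False
```

Calling `pickled_size` on one element of the `patches` list passed to `p.map` (the name comes from the traceback) would indicate whether the objects being sent are close to this limit.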

@yunguan-wang
Owner

The row names of both the counts file and Spot_metadata.csv should be the cell names, not numbers.
Also, the number of batches should be larger, so that each batch has only a few thousand cells.
Can you try removing the first column and running sprod again?
Thanks
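
Editorial note: the two checks suggested above can be sketched in a few lines. This is an illustration under stated assumptions, not SPROD code: `min_batches` and `rowname_mismatches` are hypothetical helpers, the cap of 5000 cells per batch is just one reading of "a few thousand", and the arithmetic assumes batches split the spots roughly evenly.

```python
import math


def min_batches(n_spots: int, max_spots_per_batch: int) -> int:
    """Smallest number of equal-sized batches that keeps each under the cap."""
    return math.ceil(n_spots / max_spots_per_batch)


def rowname_mismatches(count_rows, meta_rows):
    """Return (names only in counts, names only in metadata), each sorted."""
    count_set, meta_set = set(count_rows), set(meta_rows)
    return sorted(count_set - meta_set), sorted(meta_set - count_set)


# 88097 spots with an assumed cap of 5000 spots per batch.
print(min_batches(88097, 5000))  # 18

# Made-up row names: "CELL.3" is missing from the metadata, and a bare
# line number "3" appears there instead.
print(rowname_mismatches(["CELL.1", "CELL.2", "CELL.3"],
                         ["CELL.1", "CELL.2", "3"]))
# (['CELL.3'], ['3'])
```

Two empty lists from `rowname_mismatches` mean both files use the same set of row names.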

@ZeroLi-Bio
Author

@yunguan-wang Hi, thank you for your advice.
The first column actually contains the cell names, in "CELL.xx" format. The numbers 1234 in the screenshot appear because I was viewing the file with less -N, which displays line numbers.
I also tried increasing --num_of_batches to 100, but it produced the same error as above.

@yunguan-wang
Owner

can you share the pseudo_image_features.csv with me @ [email protected]? I will try to figure out what is causing the issue.

@ZeroLi-Bio
Author

> can you share the pseudo_image_features.csv with me @ [email protected]? I will try to figure out what is causing the issue.

Thanks a lot. I've sent the file to you.

@yunguan-wang
Owner

I inspected the feature file and it was fine.
Can you check whether any of the individual batch log files contain an error?
If possible, can you zip all log.txt and share it with me?
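
Editorial note: one way to carry out this request is sketched below. It is a sketch under assumptions: the thread does not say how the per-batch logs are named or where they are stored, so the `*log*.txt` pattern and the helper names `find_logs`, `lines_with_errors`, and `zip_logs` are hypothetical. Note that sprod_log.txt above also records pandas warnings at ERROR level, so each hit still needs to be read rather than counted.

```python
import zipfile
from pathlib import Path


def find_logs(output_dir, pattern="*log*.txt"):
    """Recursively collect log files under output_dir (pattern is assumed)."""
    return sorted(Path(output_dir).rglob(pattern))


def lines_with_errors(log_path, markers=("ERROR", "Traceback")):
    """Return (line number, text) for each line containing any marker."""
    hits = []
    with open(log_path, errors="replace") as fh:
        for lineno, line in enumerate(fh, 1):
            if any(marker in line for marker in markers):
                hits.append((lineno, line.rstrip("\n")))
    return hits


def zip_logs(output_dir, archive_path, pattern="*log*.txt"):
    """Write every matching log into one zip, keeping relative paths."""
    base = Path(output_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for log in find_logs(base, pattern):
            zf.write(log, arcname=log.relative_to(base).as_posix())
    return archive_path
```

For example, looping over `find_logs(output_dir)` and printing `lines_with_errors(path)` for each file lists the candidate error lines, and `zip_logs(output_dir, "logs.zip")` produces a single archive to share.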

@ZeroLi-Bio
Author

> I inspected the feature file and it was fine. Can you check whether any of the individual batch log files contain an error? If possible, can you zip all log.txt and share it with me?

Hello.
I checked the output path, but no individual log files were produced. The intermediate folder in the output directory is empty.
The files in the input and output directories are shown below.
image
