Replies: 1 comment
-
The two plots compare topics in quite different ways. The intertopic distance map uses UMAP to reduce the topic representations to 2D, which discards much of the information they carry. In contrast, the hierarchical cluster tree works on the full embeddings and therefore tends to be much more accurate.
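To make that concrete, here is a toy sketch in plain Python. It is not BERTopic's or UMAP's actual code: the topic vectors are made up, and simply keeping the first two coordinates stands in for UMAP, which is a far more sophisticated reduction. It only shows how a topic's nearest neighbour can change once dimensions are thrown away.

```python
import math

# Hypothetical 4-dimensional topic vectors (made up for illustration).
topic_vectors = {
    "A": [1.0, 0.0, 1.0, 0.0],
    "B": [1.0, 0.1, -1.0, 0.0],
    "C": [0.6, 0.6, 0.9, 0.0],
}

# Crude stand-in for a 2D reduction: keep only the first two coordinates.
# (UMAP does something much smarter, but it still has to discard information.)
projected = {name: vec[:2] for name, vec in topic_vectors.items()}


def nearest_topic(vectors, name):
    """Return the topic closest to `name` by Euclidean distance."""
    others = [other for other in vectors if other != name]
    return min(others, key=lambda other: math.dist(vectors[name], vectors[other]))


nearest_full = nearest_topic(topic_vectors, "A")  # uses all four dimensions
nearest_2d = nearest_topic(projected, "A")        # uses the reduced vectors

print("Nearest to A in the full space:", nearest_full)
print("Nearest to A after reduction:  ", nearest_2d)
```

In this toy data, A and B differ mostly in the third coordinate, so they look like neighbours once that coordinate is dropped, while the full vectors put C closer to A. The same kind of disagreement can show up between the two plots.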
-
Hello Maarten and BERTopic community,
I'm trying to use the hierarchical cluster tree and the intertopic distance map to identify potential topics for merging (topic reduction). However, I'm noticing that these two graphs show different results in terms of what is considered similar. For instance, topics that appear very close in the hierarchy tree are shown quite far apart in the intertopic distance map, and vice versa.
Does anyone know why this might be happening, or have a good resource that could help me understand it?
Just to provide some context, I'm using BERTopic on a small dataset of 3000 documents, all short text (sentences). The content of these sentences is quite field-specific.
Thank you!