Exporting doc information to use in models like SBERT #1271

Open
mikesol opened this issue Aug 22, 2024 · 0 comments

mikesol commented Aug 22, 2024

I'm experimenting with a different way of indexing the docs, using SBERT and FAISS. This is a popular approach to indexing and querying data these days, and while lots of companies offer paid versions of it, it's also fairly straightforward to do with open-source tools.

I'm wondering if it's possible to tweak doc search to dump the entire index in XML form. For example:

<entry>
  <package>purescript-lists</package>
  <module>Data.List</module>
  <id>singleton</id>
  <def>singleton :: forall a. a -> List a</def>
  <doc>Create a list with a single element.

Running time: `O(1)`</doc>
</entry>

It's easier to construct a data set for SBERT when data is formatted like this.
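To make the consuming side concrete, here is a rough sketch of how I'd read such a dump in Python and turn each entry into one text string for the embedding model. The `<entries>` wrapper element and the names `DocEntry`, `parse_entries` and `to_text` are my own assumptions for illustration; nothing like them exists in the repo today. One caveat: signatures containing `<` or `&` (operators like `<>`) would have to be escaped in the dump for it to be valid XML.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical dump: the <entries> root is an assumption, the <entry>
# layout is the one proposed in this issue.
SAMPLE = """<entries>
  <entry>
    <package>purescript-lists</package>
    <module>Data.List</module>
    <id>singleton</id>
    <def>singleton :: forall a. a -> List a</def>
    <doc>Create a list with a single element.

Running time: `O(1)`</doc>
  </entry>
</entries>"""


@dataclass
class DocEntry:
    package: str
    module: str
    id: str
    definition: str
    doc: str


def parse_entries(xml_text: str) -> list[DocEntry]:
    """Parse every <entry> element in the dump into a DocEntry."""
    root = ET.fromstring(xml_text)
    entries = []
    for node in root.iter("entry"):
        def field(tag: str) -> str:
            # Missing tags become empty strings rather than errors.
            return (node.findtext(tag) or "").strip()

        entries.append(
            DocEntry(
                package=field("package"),
                module=field("module"),
                id=field("id"),
                definition=field("def"),
                doc=field("doc"),
            )
        )
    return entries


def to_text(entry: DocEntry) -> str:
    """Flatten one entry into a single string suitable for embedding."""
    doc = " ".join(entry.doc.split())  # collapse newlines and repeated spaces
    return f"{entry.module}.{entry.id} ({entry.package})\n{entry.definition}\n{doc}"


texts = [to_text(e) for e in parse_entries(SAMPLE)]
```

The resulting `texts` list is what I would then hand to the SBERT model (for example `SentenceTransformer.encode` in the sentence-transformers package) before adding the vectors to a FAISS index.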

A lot of the plumbing for something like this seems to be there already, but since I don't know the code well, it's tough to come up with a clean way to do it.

If you have pointers on how to hack at the repo to get there, I can give it a shot!
