Client Side Embeddings Search #119
Comments
@sshivaditya2019 rfc on time estimate and spec |
This can be accomplished using the natural library, which is highly optimized. The main challenge would be generating the 1024-dimensional embeddings on the client side. Rather than retrieving embeddings from the database, we could use the wink-js embedding model to generate embeddings for both the query and the entries. These embeddings could be computed at load time and then used in the search process, though this could increase page load time by 15 to 35 seconds or more in some cases. Vector-based search may not be particularly beneficial here; instead, heuristic-based retrieval, tuned against a ranking metric such as NDCG, along with a more effective search algorithm, would likely yield better results. |
Let's go with your recommendation. |
I think it will take around a day to set up the heuristic-based search functionality. I'm not sure if there's a gating mechanism for tasks or something similar, but I can incorporate that into this task to create an integrated task recommender, if that's a requirement. @0x4007 rfc |
This is not implemented anywhere yet. However, it will soon be implemented based on contributor/collaborator status and priority level (or time level). I think that will only be on GitHub and not in our UI, though. We still need to figure that out.
An integrated task recommender sounds very cool at the UI level. I'm on board with exploring this, although right now the implementation details are not clear to me. |
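As a rough illustration of what heuristic-based retrieval could look like on the client, here is a minimal sketch that ranks entries by keyword overlap. The `Entry` shape, the function names, and the title/body weights are all assumptions made for this example; nothing here reflects the project's actual data model or the natural or wink-js APIs.

```typescript
// Hypothetical shape of a searchable entry; the real UI data model may differ.
interface Entry {
  title: string;
  body: string;
}

// Lowercase the text and split it on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Score an entry against the query tokens.
// Title hits count more than body hits; the 3:1 weighting is an arbitrary assumption.
function scoreEntry(queryTokens: string[], entry: Entry): number {
  const titleTokens = new Set(tokenize(entry.title));
  const bodyTokens = new Set(tokenize(entry.body));
  let score = 0;
  for (const token of queryTokens) {
    if (titleTokens.has(token)) score += 3;
    if (bodyTokens.has(token)) score += 1;
  }
  return score;
}

// Return only entries with a positive score, best match first.
function search(query: string, entries: Entry[]): Entry[] {
  const queryTokens = tokenize(query);
  return entries
    .map((entry) => ({ entry, score: scoreEntry(queryTokens, entry) }))
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((result) => result.entry);
}
```

Because this only tokenizes and compares strings, it avoids the load-time cost of generating embeddings; a stemmer or fuzzy matching could be layered on later if exact token matches turn out to be too strict.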
/start |
@0x4007 could you assign this issue to me? |
@sshivaditya2019 the deadline is at Sun, Oct 27, 5:30 PM UTC |
@gentlementlegen Not working again. Start is officially our most unreliable plugin. |
The error was:
{
  "message": "Validation Failed",
  "errors": [
    {
      "message": "The listed users cannot be searched either because the users do not exist or you do not have permission to view the users.",
      "resource": "Search",
      "field": "q",
      "code": "invalid"
    }
  ],
  "documentation_url": "https://docs.github.com/v3/search/",
  "status": "422"
}
with search arguments like:
{
  "q": "org:ubiquity author:sshivaditya2019 state:open",
  "per_page": 100,
  "order": "desc",
  "sort": "created"
}
URL for reference |
Okay, you should figure out the root problem and fix it. |
I've mentioned this before regarding user privacy settings affecting our requests via GQL and REST: the root problem is that shiv's account has restricted privacy settings, which unfortunately we don't control. So perhaps we should assume defaults in this situation and apply the lowest contributor limits, then use an alternative search query for PRs/issues in the network and filter by username, since those would be public under our org settings. I assume it's the assigned-issues query that caused it here. |
If it's something the contributor can fix, then the solution is to write a detailed error explaining that they can't self-assign until they fix their settings, explain exactly what to fix, and provide a link to where they can fix it. |
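The fallback described above could be sketched roughly as follows. This is not the plugin's actual code: the `Issue` shape, the function names, and the two injected fetchers (`searchByAuthor`, `listOrgOpenIssues`) are hypothetical, and the only thing taken from the thread is that the failing search reports status 422.

```typescript
// Minimal issue shape used by this sketch (hypothetical).
interface Issue {
  url: string;
  author: string;
}

// True when the thrown value carries HTTP status 422, as a number or a string.
function isValidationError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  return Number((err as { status?: unknown }).status) === 422;
}

// Keep only issues opened by the given login (GitHub logins are case-insensitive).
function filterByAuthor(issues: Issue[], author: string): Issue[] {
  const wanted = author.toLowerCase();
  return issues.filter((issue) => issue.author.toLowerCase() === wanted);
}

// Try the author-scoped search first; on a 422, list the org's open issues
// and filter locally. Both fetchers are injected so the sketch stays
// independent of any particular GitHub client.
async function findOpenIssuesByAuthor(
  author: string,
  searchByAuthor: (author: string) => Promise<Issue[]>,
  listOrgOpenIssues: () => Promise<Issue[]>
): Promise<{ issues: Issue[]; usedFallback: boolean }> {
  try {
    return { issues: await searchByAuthor(author), usedFallback: false };
  } catch (err) {
    if (!isValidationError(err)) throw err; // unrelated failures still surface
    const all = await listOrgOpenIssues();
    return { issues: filterByAuthor(all, author), usedFallback: true };
  }
}
```

A caller could treat `usedFallback: true` as the signal to apply the lowest contributor limits, since the user's own data could not be read directly.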
It still seems odd to me that a user's privacy settings affect a search when their profile is public. Can we consider using GQL with an issues query instead of the search API? Something like
query($organization: String!, $author: String!) {
  organization(login: $organization) {
    repositories(first: 100) {
      nodes {
        issues(first: 100, states: OPEN, filterBy: {createdBy: $author}) {
          nodes {
            title
            url
            createdAt
          }
        }
      }
    }
  }
}
with
{
  "organization": "ubiquity",
  "author": "sshivaditya2019"
}
would achieve the same result. I don't know if that would resolve the issue, but it's worth a try. |
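If the query above does return data, the nested result still has to be flattened into one list to match what the search API gave us. A minimal sketch, assuming the response contains exactly the fields selected in the query (the type and function names are made up for this example):

```typescript
// Types mirror only the fields selected by the query in the comment above.
interface IssueNode {
  title: string;
  url: string;
  createdAt: string;
}

interface OrgIssuesResponse {
  data: {
    organization: {
      repositories: { nodes: Array<{ issues: { nodes: IssueNode[] } }> };
    } | null;
  } | null;
}

// Collect issues from every repository into one list, newest first.
// Returns an empty list when the organization could not be read.
function flattenIssues(response: OrgIssuesResponse): IssueNode[] {
  const organization = response.data ? response.data.organization : null;
  if (!organization) return [];
  const issues: IssueNode[] = [];
  for (const repo of organization.repositories.nodes) {
    for (const issue of repo.issues.nodes) {
      issues.push(issue);
    }
  }
  return issues.sort((a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt));
}
```

Note that `first: 100` caps both the repositories and the issues per repository, so this sketch would miss anything beyond the first page; handling pagination is left out here.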
You can test and verify pretty quickly. I suggest you do that and let us know. |
Using the GraphQL Explorer, authenticated with my own login, I get: {
"data": {
"organization": null
},
"errors": [
{
"type": "FORBIDDEN",
"path": [
"organization",
"repositories"
],
"extensions": {
"saml_failure": false
},
"locations": [
{
"line": 3,
"column": 5
}
],
"message": "Although you appear to have the correct authorization credentials, the `ubiquity` organization has enabled OAuth App access restrictions, meaning that data access to third-parties is limited. For more information on these restrictions, including how to enable this app, visit https://docs.github.com/articles/restricting-access-to-your-organization-s-data/" |
If OAuth app access is required to read user data, let's use the app for logging in on devpool.directory. The error can explain that the user needs to sign in on devpool.directory if there is a problem reading their data. |
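To make the suggestion above concrete, here is a minimal sketch of turning a `FORBIDDEN` GraphQL error like the one shown earlier into a user-facing hint. The interface names, the function name, and the message wording are assumptions for illustration only; the `type`, `path`, and `message` fields are taken from the error response in this thread.

```typescript
// Only the error fields that appear in the response shown in this thread.
interface GraphqlError {
  type?: string;
  message: string;
  path?: string[];
}

interface GraphqlResponse {
  data?: unknown;
  errors?: GraphqlError[];
}

// Return a hint for the user when access was forbidden, otherwise null.
// The message text is a placeholder, not final copy.
function explainAccessError(response: GraphqlResponse): string | null {
  const errors = response.errors || [];
  const forbidden = errors.filter((error) => error.type === "FORBIDDEN");
  if (forbidden.length === 0) return null;
  return "We could not read your GitHub data. Please sign in on devpool.directory and try again.";
}
```

Returning `null` for every other case keeps unrelated errors out of this path, so they can still be reported as ordinary failures.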
It'd be better to just test locally. Sorry, I didn't have time to do so today. |
Aye, it likely would be. Sorry, bud.
Perhaps we can improve our search experience by:
If performance is bad when running all of these calculations, we could potentially compile to WASM.