Fair issue with providing such personal data to a company. I'm guessing this adoption constraint is just for Khoj Chat? Context
-
A way to limit what data get sent to OpenAI would definitely be ideal. Another alternative would be a
-
Is this a chat model that could be used? I came across this project right before learning about Khoj.

Edit: Actually, I think this goes back to OpenAI as well, so never mind me. My bad.
-
I had a privacy-oriented question about the `/chat` endpoint, and I wanted to document my explorations here.

Does OpenAI's ChatGPT API store my data?

Yes, for 30 days, after which it is deleted (as of March 1, 2023). Data passed to the API is not used for training.
With that information, I'm hesitant to share sensitive data with Khoj. When everything is hosted locally and runs directly on my machine, I have no concerns about sensitive information. This creates a slight adoption constraint for me with the `/chat` endpoint, as I'd have to split out my more sensitive data to prevent it from being indexed and subsequently passed to the ChatGPT API.

What data is sent to the ChatGPT API?
Khoj first executes the search against the embeddings generated locally. It then takes the top two results (see the `/chat` endpoint in `api.py`) and passes them to ChatGPT as context. As such, GPT is exposed to whatever information the top two Khoj results contain.

Solution
To continue using Khoj and integrate the `/chat` endpoint into my workflow, I'll remove any sensitive data from the notes that are being indexed.