Can this model be used for Generative Question Answering? #197
It sounds like you are looking for something like https://github.com/sanjeevanahilan/nanoChatGPT
Prompt / completion tasks are usually trained with a 0.01 weight in the loss for the prompt tokens. At least, this is the default in the OpenAI API. I do not see such a parameter in the fine-tuning here.
@arivero so you mean this cannot be fine-tuned for Question Answering?
@AayushSameerShah my guess is that the loss function must be customised to decide how to evaluate the prediction of prompt tokens.
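In case it helps, a minimal sketch of what that customised loss could look like in PyTorch: per-token cross-entropy with prompt tokens down-weighted by the 0.01 factor mentioned above. The function name, tensor shapes, and the `prompt_mask` argument are my own illustration, not something that exists in this repo.

```python
import torch
import torch.nn.functional as F

def weighted_lm_loss(logits, labels, prompt_mask, prompt_weight=0.01):
    """Cross-entropy with completion tokens at full weight and prompt tokens down-weighted.

    logits: (batch, seq, vocab); labels: (batch, seq);
    prompt_mask: (batch, seq) bool, True where the token belongs to the prompt.
    """
    # Standard causal-LM shift: tokens < n predict token n
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    shift_mask = prompt_mask[:, 1:].contiguous()

    # Per-token loss, no reduction yet
    per_token = F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        reduction="none",
    ).view(shift_labels.shape)

    # 0.01 weight on prompt tokens, 1.0 on completion tokens
    weights = torch.where(
        shift_mask,
        torch.full_like(per_token, prompt_weight),
        torch.ones_like(per_token),
    )
    return (per_token * weights).sum() / weights.sum()
```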
@arivero Thanks for the response. I have followed some of the threads in this library, but now I am thinking of shifting to Hugging Face. I realise I am taking the discussion out of the context of this thread, so please pardon me. I am less experienced with Hugging Face Transformers, but from my findings I have observed that it provides 2 types of pipelines which could be helpful in my case:
The text-generation pipeline simply continues the text from the prompt, like "When I was 12 I went to", and the rest is filled in by the model. There is no question answering; even if I put a question in the prompt, it continues the question instead of answering it. That makes sense.

Then there is the question-answering pipeline, which takes 2 inputs: 1st the question and 2nd the context. Based on these two it "extracts" the answer from the context. This seems to work, but it fails to generalise because it "extracts" the answer and does not generate it. Additionally, we need to supply the context along with the question to get the answer, which is not intuitive.

What I am asking for is: are there models which can be fine-tuned on some specific dataset (say medical) and can then answer questions by themselves? I am not sure whether, once fine-tuned, such a model still requires the context to be given, but either way it should return some response to the question by generating it, like how "DaVinci" does, for example, in the notebooks. Do you have any idea how I can move forward with this? I have found these models on Hugging Face to work with because they seem promising:
These could be my open-source starting points, but I don't know how to solve my problem with these models or how to get the training done. It would really be a huge help from you, buddy.
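For reference, a small sketch of the two pipelines being compared above, using the Hugging Face `pipeline` API. The model names (`gpt2`, `distilbert-base-cased-distilled-squad`) are just common public checkpoints used for illustration:

```python
from transformers import pipeline

# Free-form generation: continues the prompt, does not "answer" it
generator = pipeline("text-generation", model="gpt2")
print(generator("When I was 12 I went to", max_new_tokens=30)[0]["generated_text"])

# Extractive QA: needs a context passage and returns a span copied from it
extractive_qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",
)
result = extractive_qa(
    question="Which country won the most medals?",
    context="The USA won the highest number of medals at the 2022 games.",
)
print(result["answer"])  # a span extracted from the context, not generated
```

The second call only ever copies a span out of `context`, which is exactly the "extractive" behaviour described above.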
Hi @AayushSameerShah, it seems like you want to replicate what BioGPT is doing. You can check out their paper:
Hello @timothylimyl So it is the generative QA approach, and the way to do it is to use Haystack. This framework has many QA pipelines, and I am interested in 2 of them:
I have found that RAG generates short (one-liner) answers, while LFQA gives answers as a passage, which is really what I was looking for. So, after storing a bunch of different documents in the document store, the pipeline retrieves documents based on the question embeddings, the reader reads those documents, and the answers are then generated. Though the answer quality isn't on the level of GPT-3, it can work pretty well.

Now I have another query. Along with my unstructured data (wiki pages, blogs, ...), I also want to feed in structured data in a tabular fashion. But there the LFQA pipeline fails, because it is unable to find meaning in the tables! For that, Haystack has a "TableReader" which can take tables as inputs, so my hopes rose! But when I tried it, it returned a single-word, "extractive" response. For example, on asking "Which country won the highest number of medals in the Olympics 2022?" it returns "USA". I am looking for a generative response here, with some explanation. Is it possible? Please direct me.

To summarise my whole situation: I am looking for a "generative way of answering" where I can supply unstructured + structured data as the context, and then, on querying, the model should generate an answer. Thanks 🙏
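For context, the LFQA-style setup described above looks roughly like this in Haystack 1.x. The class names follow Haystack's generative QA components, but the embedding model and generator checkpoint (`vblagoje/bart_lfqa`) are recalled from the LFQA tutorial and should be treated as placeholders:

```python
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import EmbeddingRetriever, Seq2SeqGenerator
from haystack.pipelines import GenerativeQAPipeline

# Store the unstructured documents (wiki pages, blogs, ...)
document_store = InMemoryDocumentStore(embedding_dim=768)
document_store.write_documents([
    {"content": "Long-form question answering generates free-text answers ..."},
])

# The retriever fetches relevant passages from the question embedding
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
)
document_store.update_embeddings(retriever)

# The generator writes a long-form answer instead of extracting a span
generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa")

pipe = GenerativeQAPipeline(generator=generator, retriever=retriever)
result = pipe.run(
    query="What is long-form question answering?",
    params={"Retriever": {"top_k": 3}},
)
print(result["answers"][0].answer)
```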
seems like a prompt engineering problem, you can try out different instruction prompts for starters
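As one illustration of that suggestion: a way to get a generative, explained answer from tabular data is to linearise the table into the prompt and ask an instruction-tuned model to answer in full sentences. Everything below, including the model choice (`google/flan-t5-base`), the prompt wording, and the medal numbers (placeholders, not real figures), is only an example of the idea, not a recommendation from this thread:

```python
from transformers import pipeline

# Linearise the table into plain text (numbers are placeholder values)
table_as_text = (
    "Country | Gold | Silver | Bronze | Total\n"
    "USA | 39 | 41 | 33 | 113\n"
    "China | 38 | 32 | 18 | 88\n"
)
question = "Which country won the highest number of medals, and by how much?"

# Instruction prompt that explicitly asks for an explained, full-sentence answer
prompt = (
    "Answer the question using the table below. "
    "Explain your answer in two or three full sentences.\n\n"
    f"{table_as_text}\nQuestion: {question}\nAnswer:"
)

generator = pipeline("text2text-generation", model="google/flan-t5-base")
print(generator(prompt, max_new_tokens=100)[0]["generated_text"])
```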
Hi @AayushSameerShah: I've just stumbled upon this page, since I am trying to do exactly what you're describing :) We're now 5 months ahead:
thanks very much for your help ;)
Hi @laurentm255 👋, luckily I have already provided a comprehensive response, which you can find by clicking my response link, where I tried to explore the options we currently have and gave examples. Hopefully that gives you a bit of direction.
I am looking to fine-tune this model on my own data (such as medical science), and after the training I want it to be able to answer questions. I am not looking for "extractive answers" where it returns a start and end sequence (which is very much tied to the given context), but a "generative" case: I train the model with my data, then ask it questions, and from its own understanding of my data it should be able to give me the answers.
Please let me know if anybody knows how to achieve that with this model!
Thank you so much 🤗
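For anyone landing here later, a minimal sketch of the "generative" fine-tuning idea described in this issue, using Hugging Face Transformers (as discussed in the comments) rather than this repo's own training loop. The toy QA pair, model choice (`gpt2`), and hyperparameters are all illustrative:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

# Toy QA pair standing in for a real medical dataset (illustrative only)
pairs = [
    {"question": "What is hypertension?",
     "answer": "Hypertension is chronically elevated blood pressure."},
]

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def to_text(example):
    # Concatenate question and answer into one training sequence
    return {"text": f"Question: {example['question']}\n"
                    f"Answer: {example['answer']}{tokenizer.eos_token}"}

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

dataset = (Dataset.from_list(pairs)
           .map(to_text)
           .map(tokenize, remove_columns=["question", "answer", "text"]))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="qa-finetune", num_train_epochs=3,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# After training, ask a question with no context passage
prompt = "Question: What is hypertension?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=60)[0]))
```

Note that this sketch trains on the full sequence; combining it with a prompt-weighted loss like the one earlier in this thread would focus the training signal on the answer tokens.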