Comments and tips for the prompt #55
Thank you very much for your recognition of our work. We will make revisions in the next version.
Thank you very much. My name is Salvatore Raieli.
Thank you, I have seen the updated version, and I would suggest some new research that could be interesting to mention. Mistral 7B has been released together with a technical report. It claims better performance than LLaMA 2 (the 7B and 13B versions). It is also interesting to note the techniques they use to reduce inference cost (grouped-query and sliding-window attention). They created this system prompt as guardrail enforcement (to prevent the model from generating dangerous answers):
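For reference, the guardrail is enforced simply by prepending the safety prompt as the system message of every request. A minimal sketch, assuming an OpenAI-compatible endpoint serving Mistral 7B (the endpoint, the model name, and the exact prompt wording, which I quote from memory, are assumptions on my part):

```python
# Illustrative sketch: enforcing a guardrail by prepending a system prompt to
# every request. The prompt text below is quoted from memory and may differ
# slightly from the exact wording in the Mistral 7B report.
from openai import OpenAI

GUARDRAIL_SYSTEM_PROMPT = (
    "Always assist with care, respect, and truth. Respond with utmost utility yet "
    "securely. Avoid harmful, unethical, prejudiced, or negative content. Ensure "
    "replies promote fairness and positivity."
)

# Assumed: a local OpenAI-compatible server hosting the model.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def guarded_chat(user_message: str) -> str:
    """Send the user message with the guardrail prompt enforced as the system role."""
    response = client.chat.completions.create(
        model="mistral-7b-instruct",  # assumed model name on the local server
        messages=[
            {"role": "system", "content": GUARDRAIL_SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content
```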
They claim this does not impact performance while still keeping the model from responding to unsafe prompts.

Another interesting topic that could be incorporated is machine unlearning (which is important since many LLMs are trained on copyrighted data, harmful content, or personal data). Microsoft recently showed how to make LLaMA 7B "forget" Harry Potter. The authors used a reinforced model that is trained further on the target data, so that it better identifies the tokens related to the topic you want the model to forget. They then compare the logits of the baseline model and the reinforced model on the target data. They replace idiosyncratic expressions in the target data with generic counterparts (using GPT-4 to build the mapping dictionary) and use the model to generate alternative labels for the tokens (a rough sketch of this step is at the end of this comment). Lastly, they fine-tune the model on these alternative labels, erasing the memory of the target data from the model. This approach seems to work without affecting the model's performance on reasoning benchmarks.

Another interesting article shows some surprising failures of generalization in LLMs: a model trained on sentences of the form "A is B" often fails to generalize to the reverse direction, "B is A".

Finally, the authors of another work find an approach that searches for a suffix which, when attached to a wide range of queries asking an LLM to produce objectionable content, maximizes the probability that the model produces an affirmative response. The approach worked with different models (ChatGPT, Bard, and Claude, but also the open-source LLaMA-2-Chat, Pythia, and Falcon). They developed an algorithm able to hijack the model's constraints without the need for manual prompt engineering.
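For the unlearning paper mentioned above, the key logit-comparison step can be sketched roughly as follows. The combination rule and the alpha weight are my reading of the paper, so treat this as an approximation rather than the authors' code:

```python
# Rough sketch of the logit-comparison step: combine baseline and "reinforced"
# logits to get "generic" target labels, which the baseline model is then
# fine-tuned on. Not the authors' code; the exact rule is my recollection.
import torch

def generic_labels(baseline_logits: torch.Tensor,
                   reinforced_logits: torch.Tensor,
                   alpha: float = 1.0) -> torch.Tensor:
    """
    baseline_logits, reinforced_logits: (seq_len, vocab_size) logits computed on
    the same target-data tokens by the baseline model and the reinforced model.
    Tokens whose scores the reinforced model pushes up (i.e. topic-specific
    tokens) are suppressed, and the argmax gives the "generic" alternative label.
    """
    combined = baseline_logits - alpha * torch.relu(reinforced_logits - baseline_logits)
    return combined.argmax(dim=-1)  # alternative labels used for fine-tuning
```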
Thanks again for your continued interest and valuable suggestions regarding our survey. We are currently working on a new revision, which we expect to release in about a month.
Your work is really amazing; thank you for keeping that incredible survey updated.
Hi,
very solid and useful work.
In this repository, they suggest an approach similar to tree-of-thoughts, but carried out in a single prompt.
An example of this type of prompt:
Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realises they're wrong at any point then they leave. The question is...
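Unlike the original tree-of-thoughts, which needs multiple generation calls and an explicit search, this variant is a single chat call with the template wrapped around the question. A minimal sketch, assuming the openai Python client and a placeholder model name:

```python
# Illustrative sketch: the single-prompt tree-of-thoughts variant is just one
# chat call with the template above wrapped around the question.
from openai import OpenAI

client = OpenAI()  # assumes an API key in the environment

TOT_TEMPLATE = (
    "Imagine three different experts are answering this question. All experts will "
    "write down 1 step of their thinking, then share it with the group. Then all "
    "experts will go on to the next step, etc. If any expert realises they're wrong "
    "at any point then they leave. The question is... {question}"
)

def tot_single_prompt(question: str, model: str = "gpt-4") -> str:
    response = client.chat.completions.create(
        model=model,  # placeholder model name
        messages=[{"role": "user", "content": TOT_TEMPLATE.format(question=question)}],
    )
    return response.choices[0].message.content
```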
Another interesting approach is described in the paper Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models, where the authors collected an impressive dataset of college-level questions. They tested different techniques, such as self-critique, chain of thought, and few-shot prompting, to see how these impact performance. In addition, they tested a new approach they call expert prompting: in short, they ask the model to nominate experts for the question and to state what the response of each of these experts would be, and finally, based on these responses, to make a collective decision.
An example of expert prompting could look like the sketch below.
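This is my paraphrase of the idea rather than the paper's exact prompt; the question is just a placeholder:

```python
# Illustrative expert-prompting template (my paraphrase of the idea described
# above, not the paper's exact prompt).
EXPERT_PROMPT = (
    "Question: {question}\n\n"
    "1. Nominate three experts whose expertise is most relevant to this question.\n"
    "2. For each expert, write the answer that expert would give, with a brief justification.\n"
    "3. Combine the three expert answers into a single collective final answer."
)

print(EXPERT_PROMPT.format(question="What is the determinant of a 2x2 rotation matrix?"))
```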
Regarding CoT, an interesting article, Measuring Faithfulness in Chain-of-Thought Reasoning, has just been published by Anthropic and could be interesting to include in the review.