I am trying to add a feature that steers model output towards a specific style, such as a sentiment, emotion, or writing style, by adding style vectors to the activations of hidden layers during text generation (as demonstrated in https://github.com/DLR-SC/style-vectors-for-steering-llms).
How can I access and modify the model layers in the code to apply the style vector to them?
I attempted this using the model object in models.py, but there doesn't seem to be an interface for accessing the layers.
Any guidance or suggestions would be greatly appreciated.
Thanks!
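
For reference, a minimal sketch of the kind of layer access the question is after, assuming the underlying model is a PyTorch `nn.Module` (the post does not say which framework `models.py` wraps). `ToyBlock`, `ToyModel`, and `add_style_vector` are hypothetical names made up for illustration, not part of this project or of the linked repository. The sketch uses a forward hook to add a fixed vector to one layer's output, then removes the hook to restore the original behaviour.

```python
import torch
from torch import nn


class ToyBlock(nn.Module):
    """Stand-in for one hidden layer (hypothetical, for illustration only)."""

    def __init__(self, hidden_size):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states):
        return torch.tanh(self.linear(hidden_states))


class ToyModel(nn.Module):
    """Stand-in for a model that exposes its layers as `model.layers`."""

    def __init__(self, hidden_size=8, num_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            [ToyBlock(hidden_size) for _ in range(num_layers)]
        )

    def forward(self, hidden_states):
        for layer in self.layers:
            hidden_states = layer(hidden_states)
        return hidden_states


def add_style_vector(layer, style_vector, scale=1.0):
    """Attach a forward hook that adds scale * style_vector to the layer output.

    Returns the hook handle; call handle.remove() to undo the steering.
    """

    def hook(module, inputs, output):
        # A non-None return value from a forward hook replaces the output.
        # Assumption: the layer returns a plain tensor. Layers that return a
        # tuple would need the hidden-state element unpacked and repacked.
        return output + scale * style_vector

    return layer.register_forward_hook(hook)


torch.manual_seed(0)
model = ToyModel().eval()
x = torch.randn(1, 4, 8)        # (batch, sequence, hidden)
style_vector = torch.ones(8)    # placeholder; a real one is derived from data

with torch.no_grad():
    baseline = model(x)
    handle = add_style_vector(model.layers[-1], style_vector, scale=0.5)
    steered = model(x)           # last layer's output shifted by 0.5 * style_vector
    handle.remove()
    restored = model(x)          # identical to baseline once the hook is removed
```

With a real model, the same pattern would apply to whichever attribute holds the list of layers; that attribute name depends on the architecture and has to be looked up in the model's definition.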