Releases: lucidrains/PaLM-rlhf-pytorch

0.0.23

20 Dec 00:36
rename to ActorCritic and cleanup

0.0.22

19 Dec 22:18
critic model could be completely different if need be

0.0.21

19 Dec 22:14
only calculate if kl div loss weight set to greater than 0

0.0.20

19 Dec 21:38
critic now has its own LoRA parameters

0.0.18

19 Dec 20:46
fix a bunch of things

0.0.17

19 Dec 20:37
make a guess, think this is what the blogpost meant

0.0.16

19 Dec 20:25
everything in readme runs at least

0.0.15

19 Dec 20:24
everything in readme runs at least

0.0.14

19 Dec 19:32
get everything prepped for actual ppo code

0.0.11

18 Dec 19:10
take care of calculating values, rewards, entropies, kl div under var…