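Discussion 21 touched on opponents' tedashi (discarding from hand) versus tsumogiri (discarding the tile just drawn). What I want to raise here is whether the way the model itself handles tedashi and tsumogiri could have a negative effect on its online training.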
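Hypothesis
If every opponent the model faces during the online phase is itself a Mortal-based model, it may learn some wrong priors (for example, that an opponent's tedashi is never a karagiri, so the opponent's hand shape must have changed after every tedashi).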
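Experiment
In mortal.rs, I made the agent choose tedashi with a certain probability.
The model then plays against itself. model(p) denotes the model choosing tedashi with probability p whenever both tsumogiri and tedashi are possible.
Model A used here has gone through a period of offline and online training and has reasonable playing strength.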
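The model(p) rule described above can be sketched roughly as follows. This is a minimal illustration and not the actual change made to mortal.rs: `DiscardKind`, `XorShift64`, and `choose_discard_kind` are hypothetical names, and a small built-in generator stands in for whatever random source the real code uses. It assumes "both are possible" means the tile being discarded matches the drawn tile and the hand also holds a copy of it.

```rust
/// How the discard is physically performed (hypothetical type, not from mortal.rs).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiscardKind {
    /// Discard the tile that was just drawn.
    Tsumogiri,
    /// Discard an identical copy from the hand and keep the drawn tile (karagiri).
    Tedashi,
}

/// Tiny xorshift64* generator so the sketch needs no external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The internal state must never be zero.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    /// Returns a value in [0, 1).
    pub fn next_f64(&mut self) -> f64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        let r = x.wrapping_mul(0x2545_F491_4F6C_DD1D);
        // Keep the top 53 bits and scale them into [0, 1).
        (r >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// model(p): when the chosen discard matches the drawn tile and the hand also
/// holds a copy (`has_copy_in_hand`), pick tedashi with probability `p`;
/// otherwise fall back to tsumogiri.
pub fn choose_discard_kind(has_copy_in_hand: bool, p: f64, rng: &mut XorShift64) -> DiscardKind {
    if has_copy_in_hand && rng.next_f64() < p {
        DiscardKind::Tedashi
    } else {
        DiscardKind::Tsumogiri
    }
}
```

With p = 0 this always discards the drawn tile whenever the choice exists, and with p = 1 it always discards the copy from hand, so different values of p can be compared in self-play.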
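Results:
I don't know whether this result is statistically significant, but I think adding random tedashi/tsumogiri choices should benefit the model's training.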
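One rough way to check significance would be to compare the per-game ranks of two settings with a Welch-style z statistic. The sketch below is only an illustration under assumptions: `mean_and_var` and `welch_z` are hypothetical helper names, the inputs are per-game ranks (1.0 to 4.0) collected separately for each setting, and the samples are treated as independent, which is only approximately true when the models share a table.

```rust
/// Sample mean and unbiased sample variance; expects at least two values.
pub fn mean_and_var(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch-style z statistic for the difference in mean rank between
/// setting `a` and setting `b`. With many games, |z| above roughly 1.96
/// corresponds to significance at about the 5% level (two-sided).
pub fn welch_z(a: &[f64], b: &[f64]) -> f64 {
    let (mean_a, var_a) = mean_and_var(a);
    let (mean_b, var_b) = mean_and_var(b);
    // Standard error of the difference between the two sample means.
    let se = (var_a / a.len() as f64 + var_b / b.len() as f64).sqrt();
    (mean_a - mean_b) / se
}
```

A negative value means setting `a` has the lower (better) average rank. Differences in average rank between mahjong models tend to be small, so a large number of games is usually needed before the statistic becomes meaningful.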