VerifiableRewardsForScalableLogicalReasoning / VerifiableRewardsForScalableLogicalReasoning.py

Commit History

reduce timer back to 5 seconds
a0f954f

LukasHug committed on

typo
7fd2051

LukasHug committed on

increase timeout for parallel
4260a70

LukasHug committed on

allow multiple rules
e4484f6

LukasHug committed on

update reward, prevent reward hacking
88c2435

LukasHug committed on

only accept one rule as a solution (the first one is selected); do not allow groundings
999258b

LukasHug committed on

use multiprocessing only for 500 or more samples
c8de9ce

LukasHug committed on

make eval config obligatory
0f1c352

Lukas Helff committed on

make eval config not obligatory
1fe4885

Lukas Helff committed on

update error message
ac97ee4

Lukas Helff committed on

extract rule from NL
58596cd

Lukas Helff committed on