VerifiableRewardsForScalableLogicalReasoning / VerifiableRewardsForScalableLogicalReasoning.py

Commit History

fix validation program to avoid shortcut learning
f6ddc92

LukasHug committed on

select last rule
62ba87c

LukasHug committed on

test last rule only
bff577f

LukasHug committed on

rule validation post extraction
bf1b494

LukasHug committed on

fix single eval
78670b1

LukasHug committed on

timer back to 10 sec, remove logging messages
499cdbd

LukasHug committed on

reduce timer back to 5 seconds
a0f954f

LukasHug committed on

typo
7fd2051

LukasHug committed on

increase timeout for parallel
4260a70

LukasHug committed on

allow multiple rules
e4484f6

LukasHug committed on

update reward, prevent reward hacking
88c2435

LukasHug committed on

only accept one rule as a solution, we select the first one. Do not allow groundings
999258b

LukasHug committed on

use multiprocessing only for 500 or more samples
c8de9ce

LukasHug committed on

make eval config obligatory
0f1c352

Lukas Helff committed on

make eval config not obligatory
1fe4885

Lukas Helff committed on

update error message
ac97ee4

Lukas Helff committed on

extract rule from NL
58596cd

Lukas Helff committed on