Spaces: AIML-TUDA / VerifiableRewardsForScalableLogicalReasoning

VerifiableRewardsForScalableLogicalReasoning / VerifiableRewardsForScalableLogicalReasoning.py

Commit History

fix validation program to avoid shortcut learning

f6ddc92

LukasHug committed about 7 hours ago

select last rule

62ba87c

LukasHug committed 10 days ago

test last rule only

bff577f

LukasHug committed 10 days ago

rule validation post extraction

bf1b494

LukasHug committed 10 days ago

fix single eval

78670b1

LukasHug committed 10 days ago

timer back to 10 sec, remove logging messages

499cdbd

LukasHug committed 10 days ago

reduce timer back to 5 seconds

a0f954f

LukasHug committed 10 days ago

typo

7fd2051

LukasHug committed 10 days ago

increase timeout for parallel

4260a70

LukasHug committed 10 days ago

allow multiple rules

e4484f6

LukasHug committed 10 days ago

update reward, prevent reward hacking

88c2435

LukasHug committed 10 days ago

only accept one rule as a solution, we select the first one. Do not allow groundings

999258b

LukasHug committed 10 days ago

use multiprocessing only for 500 or more samples

c8de9ce

LukasHug committed 24 days ago

make eval config obligatory

0f1c352

Lukas Helff committed on Jun 25

make eval config not obligatory

1fe4885

Lukas Helff committed on Jun 25

update error message

ac97ee4

Lukas Helff committed on Jun 25

extract rule from NL

58596cd

Lukas Helff committed on Jun 25

rename

ad4ae3a

LukasHug committed on Jun 24
