Autoformalization Alignment Evaluation
Introduction
The Autoformalization Alignment Evaluation repository provides a complete framework for training and evaluating models that align informal mathematical descriptions with formal statements and proofs. It supports data preparation, model training and fine-tuning, and evaluation on standard theorem-proving datasets.
Repository Structure
The repository is organized into the following main directories:
CLIP/
├── configs                 # Configuration files for training and evaluation
├── data                    # Dataset directories for various experiments
│   ├── minif2f
│   ├── mma
│   ├── model_generation
│   ├── test
│   └── train
├── eval_results            # Storage for evaluation results
│   ├── math_train
│   ├── generator_with_verifier
│   ├── test
│   ├── lean_val
│   └── lean_val_clip
├── miniF2F                 # Mini formal-to-formal dataset and relevant scripts
│   ├── hollight
│   ├── informal
│   ├── isabelle
│   ├── lean
│   └── metamath
├── theorem_proving         # Core directory for theorem proving experiments
│   ├── generate_result
│   ├── log
│   ├── miniF2F-lean4       # MiniF2F subset for the Lean 4 theorem prover
│   └── pass_rate_results
└── utils                   # Utility scripts and libraries
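Both the training and evaluation scripts read JSON/JSONL files from the data directory (for example data/mma/lean_train.jsonl and data/minif2f/valid_lean4_half.json). The exact record fields are defined by the loading code in train_clip.py and eval_with_clip.py; to see what a particular file provides, you can pretty-print a single record, for example:
# Inspect the first record of the MMA Lean training data (run from the top-level CLIP/ directory)
head -n 1 data/mma/lean_train.jsonl | python -m json.tool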
Pre-installation
Run the following command first:
pip install -r requirements.txt
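The training and evaluation scripts below rely on CUDA and Hugging Face Accelerate for multi-GPU launches. A quick sanity check of the environment, assuming PyTorch and Accelerate were installed via requirements.txt:
# Confirm that PyTorch can see your GPUs and that Accelerate is set up
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"
accelerate env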
Training Script
To start training an autoformalization model, run the following script from the top-level CLIP directory (the script changes into the theorem_proving directory itself):
#!/bin/bash
# Move to the theorem proving directory
cd ./theorem_proving
# Define the training configuration parameters
n_solution=10
generator_id=mistral
verifier_id=mma_lean
checkpoint_dir=../Mistral-7B-v0.1/ # Base model directory
# Set the unique run identifier and output model directory
final_id=aae_test
save_dir=/opt/tiger/models/${final_id}
# Activate multi-GPU training using CUDA and Accelerate
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch \
--main_process_port=20114 \
--config_file ../configs/zero1.yaml \
train_clip.py \
--model_name_or_path ${checkpoint_dir} \
--project_dim 4096 \
--data_dir ../data/mma/lean_train.jsonl \
--data_id lean4test \
--target_set train \
--save_dir ${save_dir} \
--generator_id ${generator_id} \
--verifier_id ${verifier_id} \
--dedup True \
--per_problem_sampling_solution ${n_solution} \
--loss_level token \
--loss_on_llm True \
--num_train_epoches 1 \
--eval_steps 10000 \
--per_device_train_batch_size 16 \
--per_device_eval_batch_size 64 \
--gradient_accumulation_steps 1 \
--gradient_checkpointing True \
--learning_rate 2e-6 \
--weight_decay 0 \
--save_steps 1000 \
--lr_scheduler_type "linear" \
--warmup_steps 0 \
--save_epoches 1 \
--save_best False \
--save_total_limit 0 \
--logging_dir None \
--logging_steps 1 \
--seed 42
Before running the script, ensure that you have the necessary dependencies installed, including CUDA and Accelerate if you plan to perform multi-GPU training.
Note: In your training script, make sure that checkpoint_dir
is properly set to the directory containing your base model checkpoints. This directory serves as the starting point for further fine-tuning in your task.
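If you save the commands above as a script, for example train_aae.sh (the file name is illustrative), you can launch training from the top-level CLIP directory and keep a copy of the output:
# Launch training and mirror stdout/stderr into a log file (script name is illustrative)
bash train_aae.sh 2>&1 | tee theorem_proving/log/train_aae_test.log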
Evaluation Script
To evaluate with a trained verifier model, set the configuration variables below and run the loop:
generator_id=lean4_random_15k
verifier_id=mma_lean
data_dirs=(data/minif2f/valid_lean4_half.json)
verifier_model_name_or_path=../models/{trained_verifier}
dataset=math_train
acc_thres=0.85
# Evaluate each dataset in turn with the current generator ID
for data_dir in "${data_dirs[@]}"; do
echo "eval on dataset $data_dir"
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch \
--main_process_port=20114 \
eval_with_clip.py \
--model_name_or_path ${verifier_model_name_or_path} \
--data_dir ${data_dir} \
--verifier_output_dir eval_results/${dataset}/verifier \
--generator_metric_dir eval_results/${dataset}/generator_with_verifier \
--generator_id ${generator_id} \
--target_set test \
--acc_thres ${acc_thres} \
--batch_size 8 \
--seed 42
done
Note: In your evaluation script, make sure that verifier_model_name_or_path is properly set to the directory containing your trained verifier model checkpoints; this is the verifier model loaded for evaluation.
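As with training, you can save the loop above as a script, e.g. eval_aae.sh (the name is illustrative), and run it from the directory where the relative data/ and eval_results/ paths resolve. With dataset=math_train, verifier outputs are written under eval_results/math_train/verifier and generator metrics under eval_results/math_train/generator_with_verifier, as set by the --verifier_output_dir and --generator_metric_dir flags:
# Run the evaluation loop and keep a log (script name is illustrative)
bash eval_aae.sh 2>&1 | tee eval_math_train.log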
Getting Started
- Clone the repository with git clone or download the ZIP archive.
- Follow the installation instructions in the INSTALL.md file to set up your environment.
- Explore the configurations under the configs directory to customize your training and evaluation.
- Train your model using the training script and evaluate it with the evaluation script above; results are written under eval_results.
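A condensed version of these steps, where the repository URL and the script file names from the sections above are placeholders:
# Quick start (URL and script names are placeholders)
git clone <repository-url>
cd CLIP                          # or the name of the cloned directory
pip install -r requirements.txt
bash train_aae.sh                # training script from the Training Script section
bash eval_aae.sh                 # evaluation loop from the Evaluation Script section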
For more information, refer to the specific README files within each subdirectory. Thank you for considering this framework for your research on theorem proving and autoformalization.