LM-Combiner

All the code and model weights have been released. Thank you for your patience!

Model Weight

  • cbart_large.zip

    • Weight of Bart baseline model.
  • lm_combiner.zip

    • Weight of LM-Combiner for Bart baseline on FCGEC dataset.

Requirements

The model is implemented with the Hugging Face framework, and the required environment is as follows:

  • Python
  • torch
  • transformers
  • datasets
  • tqdm

For the evaluation, we refer to the relevant environment configurations of ChERRANT.
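As a quick sanity check before training, the Python packages listed above can be probed for importability. The snippet below is a minimal sketch that uses only the standard library; the helper name `missing_packages` is ours and is not part of the repository, and no version constraints are checked because the list above does not state any.

```python
from importlib.util import find_spec

# Package names taken from the requirements list; "Python" itself is implied.
REQUIRED = ["torch", "transformers", "datasets", "tqdm"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing packages:", ", ".join(missing) if missing else "none")
```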

Training Stage

Preprocessing

Baseline Model

  • First, we train a baseline model (Chinese-Bart-large) for LM-Combiner on the FCGEC dataset in Seq2Seq format.
sh ./script/run_bart_baseline.sh

Candidate Datasets

  1. Candidate Sentence Generation
  • We use the baseline model to generate candidate sentences for the training and test sets.
  • On tasks where the baseline fits the training data closely (spelling correction, etc.), we recommend using the K-fold cross-inference from the paper, so that candidate sentences for each fold are generated separately.
python ./src/predict_bl_tsv.py
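The idea behind K-fold cross-inference can be sketched as follows: each training sentence receives its candidate from a model that never saw that sentence during training, so the candidates still contain realistic errors. This is an illustration only, not the code in `predict_bl_tsv.py`; the function `k_fold_candidates`, the `train_fn` callback, and the round-robin fold assignment are assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, gold target)


def k_fold_candidates(
    pairs: Sequence[Pair],
    k: int,
    train_fn: Callable[[List[Pair]], Callable[[str], str]],
) -> List[str]:
    """Train on k-1 folds, predict the held-out fold, repeat for every fold."""
    if not 2 <= k <= len(pairs):
        raise ValueError("k must be between 2 and the number of pairs")
    candidates: List[str] = [""] * len(pairs)
    for fold in range(k):
        # Round-robin split: item i belongs to fold i % k (an assumption).
        train = [pairs[i] for i in range(len(pairs)) if i % k != fold]
        predict = train_fn(train)
        for i in range(fold, len(pairs), k):
            candidates[i] = predict(pairs[i][0])
    return candidates


def memorising_train_fn(train: List[Pair]) -> Callable[[str], str]:
    """Toy stand-in for a model that overfits: it memorises its training pairs."""
    table = dict(train)
    return lambda source: table.get(source, source)


toy_pairs = [("a", "A"), ("b", "B"), ("c", "C"), ("d", "D")]
toy_candidates = k_fold_candidates(toy_pairs, 2, memorising_train_fn)
```

With the memorising toy model, every candidate equals its source sentence, because the held-out sources are never in the training table. Predicting on the training data directly would instead return the gold targets and leave nothing for the rewriting model to learn from.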
  2. Golden Labels Merging
  • We use the ChERRANT tool to fully decouple the error correction task and the rewriting task by merging the correct labels.
python ./scorer_wapper/golden_label_merging.py
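One way to picture the merging step is as edit-set intersection: keep only those candidate edits that also appear in the gold edits, and apply them to the source. The rewriting target then removes over-corrections without asking the model to find errors the baseline missed. The sketch below reflects our reading of the step, not the repository's implementation; it uses Python's `difflib` as a stand-in for ChERRANT's edit extraction, works on whitespace tokens for readability, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def extract_edits(source, target):
    """Return {(start, end, replacement)} edits that turn source into target."""
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    return {
        (i1, i2, tuple(target[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    }


def merge_golden_label(source, candidate, gold):
    """Apply to the source only the candidate edits that the gold label confirms."""
    kept = extract_edits(source, candidate) & extract_edits(source, gold)
    merged = list(source)
    # Apply from right to left so earlier indices stay valid.
    for start, end, replacement in sorted(kept, reverse=True):
        merged[start:end] = list(replacement)
    return merged


source = "She go to school every days".split()
candidate = "She goes to the school every days".split()  # one correct edit, one over-correction
gold = "She goes to school every day".split()
merged = merge_golden_label(source, candidate, gold)
```

Here the merged label keeps the candidate's correct edit ("go" to "goes"), drops the over-correction (the inserted "the"), and does not add the gold edit the candidate missed ("days" to "day").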

LM-combiner (gpt2)

  • Subsequently, we train LM-Combiner on the constructed candidate dataset.
  • In particular, we extend the gpt2 vocab (mainly with double quotes) to better fit the FCGEC dataset; see ./pt_model/gpt2-base/vocab.txt for details.
sh ./script/run_lm_combiner.py
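To check whether a given vocab file already covers the quote characters, a small helper can compare a token list against it. This is a sketch under assumptions: the vocab file is taken to hold one token per line, the curly quotes “ and ” are our guess at the "double quotes" meant above, and `missing_vocab_tokens` is a hypothetical helper rather than a script in the repository.

```python
import tempfile
from pathlib import Path

# Assumed tokens of interest: left and right curly double quotes.
QUOTE_TOKENS = ["\u201c", "\u201d"]


def missing_vocab_tokens(vocab_path, tokens):
    """Return the tokens that do not appear as a line in the vocab file."""
    with open(vocab_path, encoding="utf-8") as handle:
        vocab = {line.rstrip("\n") for line in handle}
    return [token for token in tokens if token not in vocab]


# Demo on a throwaway vocab that contains only the left quote.
with tempfile.TemporaryDirectory() as tmp:
    toy_vocab = Path(tmp) / "vocab.txt"
    toy_vocab.write_text("[PAD]\n[UNK]\n\u201c\n", encoding="utf-8")
    demo_missing = missing_vocab_tokens(toy_vocab, QUOTE_TOKENS)
```

For the released vocab, the same call would take `./pt_model/gpt2-base/vocab.txt` as the path.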

Evaluation

  • We use the official ChERRANT script to evaluate the model on the FCGEC dev set.
sh ./script/compute_score.sh
method          Prec    Rec     F0.5
bart_baseline   28.88   38.95   40.46
+lm_combiner    52.15   37.41   48.34
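For reference, F0.5 is the F-beta score with beta = 0.5, which weights precision more heavily than recall. The snippet below is a minimal sketch of the standard formula, applied to the +lm_combiner row above; it is not the ChERRANT scorer, and the function name `f_beta` is ours.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall of the +lm_combiner row, in percentage points.
lm_combiner_f05 = round(f_beta(52.15, 37.41), 2)
```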

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{wang-etal-2024-lm-combiner,
    title = "{LM}-Combiner: A Contextual Rewriting Model for {C}hinese Grammatical Error Correction",
    author = "Wang, Yixuan  and
      Wang, Baoxin  and
      Liu, Yijun  and
      Wu, Dayong  and
      Che, Wanxiang",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.934",
    pages = "10675--10685",
}