balhafni committed
Commit 2cdbade · verified
1 Parent(s): 8db3ae3

Update README.md

  1. README.md +28 -1
README.md CHANGED
@@ -5,4 +5,31 @@ language:
  base_model:
  - aubmindlab/bert-base-arabertv02
  pipeline_tag: token-classification
- ---
+ ---
+
+ # SWEET MADAR CODA Model
+
+ ## Model Description
+ `CAMeL-Lab/text-editing-coda` is a text editing model tailored for grammatical error correction (GEC) in dialectal Arabic (DA).
+ The model is based on [AraBERTv02](https://huggingface.co/aubmindlab/bert-base-arabertv02), which we fine-tuned using the MADAR CODA corpus.
+ This model was introduced in our ACL 2025 paper, [Enhancing Text Editing for Grammatical Error Correction: Arabic as a Case Study](https://arxiv.org/abs/2503.00985), where we refer to it as SWEET (Subword Edit Error Tagger).
+ It achieved state-of-the-art (SOTA) performance on the MADAR CODA dataset. Details about the training procedure, data preprocessing, and hyperparameters are available in the paper.
+ The fine-tuning code and associated resources are publicly available on our GitHub repository: https://github.com/CAMeL-Lab/text-editing.
+
+
+
+ ## Citation
+ ```bibtex
+ @misc{alhafni-habash-2025-enhancing,
+ title={Enhancing Text Editing for Grammatical Error Correction: Arabic as a Case Study},
+ author={Bashar Alhafni and Nizar Habash},
+ year={2025},
+ eprint={2503.00985},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2503.00985},
+ }
+ ```
+
+
+
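
The README describes the model as a token-classification edit tagger built on AraBERTv02. A minimal sketch of loading it with the Hugging Face `transformers` Auto classes and reading off one predicted label per subword might look like the following. The function names `pair_subwords_with_tags` and `tag_sentence` are hypothetical, and the whitespace pre-splitting is an assumption; the authors' own preprocessing may differ.

```python
MODEL_ID = "CAMeL-Lab/text-editing-coda"  # model ID as written in the README


def pair_subwords_with_tags(subwords, pred_ids, id2label):
    """Pair each subword with the label name of its predicted class ID."""
    return [(sw, id2label[i]) for sw, i in zip(subwords, pred_ids)]


def tag_sentence(sentence, model_id=MODEL_ID):
    """Return (subword, predicted label) pairs for one sentence.

    Needs `torch` and `transformers` installed, plus network access
    the first time the model is downloaded.
    """
    # Imported lazily so the pure helper above works without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: the input is pre-split on whitespace.
    encoded = tokenizer(
        sentence.split(), is_split_into_words=True, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**encoded).logits  # (1, num_subwords, num_labels)

    pred_ids = logits.argmax(dim=-1)[0].tolist()
    subwords = tokenizer.convert_ids_to_tokens(encoded["input_ids"][0].tolist())
    # The output includes special tokens such as [CLS] and [SEP].
    return pair_subwords_with_tags(subwords, pred_ids, model.config.id2label)
```

This only surfaces the predicted labels. Turning them into corrected text requires the edit-application code from the authors' GitHub repository (https://github.com/CAMeL-Lab/text-editing), which is not reproduced here.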