balhafni committed on
Commit 63d0c56 · verified · 1 Parent(s): df3a50b

Update README.md

Files changed (1)
  1. README.md +35 -0
README.md CHANGED
@@ -18,6 +18,41 @@ The fine-tuning code and associated resources are publicly available on our GitH
+ ## Intended uses
+ To use the `CAMeL-Lab/text-editing-coda` model, you must clone our text editing [GitHub repository](https://github.com/CAMeL-Lab/text-editing) and follow its installation instructions.
+ We used this `SWEET` model to report results on the MADAR CODA dev and test sets in our [paper](https://arxiv.org/abs/2503.00985).
+
+ ## How to use
+ Clone our text editing [GitHub repository](https://github.com/CAMeL-Lab/text-editing) and follow its installation instructions.
+
+ ```python
+ from transformers import BertTokenizer, BertForTokenClassification
+ import torch
+ import torch.nn.functional as F
+ from gec.tag import rewrite  # provided by the cloned text-editing repository
+
+ tokenizer = BertTokenizer.from_pretrained('CAMeL-Lab/text-editing-coda')
+ model = BertForTokenClassification.from_pretrained('CAMeL-Lab/text-editing-coda')
+ edits_map = model.config.id2label  # label id -> edit tag
+
+ text = 'أنا بعطيك رقم تلفونو و عنوانو'.split()
+
+ tokenized_text = tokenizer(text, return_tensors="pt", is_split_into_words=True)
+
+ with torch.no_grad():
+     logits = model(**tokenized_text).logits
+     preds = F.softmax(logits.squeeze(), dim=-1)
+     preds = torch.argmax(preds, dim=-1).cpu().numpy()
+     # Drop the predictions for the [CLS] and [SEP] positions
+     edits = [edits_map[p] for p in preds[1:-1]]
+     assert len(edits) == len(tokenized_text['input_ids'][0][1:-1])
+
+ subwords = tokenizer.convert_ids_to_tokens(tokenized_text['input_ids'][0][1:-1])
+ output_sent = rewrite(subwords=[subwords], edits=[edits])[0][0]
+ print(output_sent)  # انا باعطيك رقم تلفونه وعنوانه
+ ```
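
The decoding step in the example above (pick the highest-scoring edit per subword, then drop the first and last positions, which hold `[CLS]` and `[SEP]`) can be illustrated without loading the model. The following is a minimal sketch in plain Python; the label names and logit values are hypothetical and stand in for `model.config.id2label` and the real model output. It also shows that the softmax does not change which label is selected, since softmax preserves the ordering of the logits.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(row):
    """Index of the largest value in the row."""
    return max(range(len(row)), key=row.__getitem__)


# Hypothetical label map; the real one comes from model.config.id2label.
id2label = {0: "KEEP", 1: "DELETE", 2: "REPLACE"}

# Hypothetical logits, one row per position: [CLS], two subwords, [SEP].
logits = [
    [2.0, 0.1, -1.0],  # [CLS]
    [0.3, 1.5, 0.2],   # subword 1
    [-0.5, 0.0, 2.2],  # subword 2
    [1.0, 0.9, 0.8],   # [SEP]
]

# Highest-scoring label id at every position.
pred_ids = [argmax(row) for row in logits]

# Softmax is monotonic, so applying it first selects the same ids.
assert pred_ids == [argmax(softmax(row)) for row in logits]

# Drop the [CLS] and [SEP] positions, then map ids to edit labels.
edits = [id2label[i] for i in pred_ids[1:-1]]
print(edits)  # ['DELETE', 'REPLACE']
```

The resulting list has one edit per subword, which is the alignment the `assert` in the README example checks before the edits are passed to `rewrite`.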
+
+
+
  ## Citation
  ```bibtex
  @inter{alhafni-habash-2025-enhancing,