---
license: afl-3.0
datasets:
- WillHeld/hinglish_top
language:
- en
- hi
metrics:
- accuracy
library_name: transformers
pipeline_tag: fill-mask
widget:
- text: please <mask> ko cancel kardo
  example_title: Example 1
- text: New York me <mask> kesa he?
  example_title: Example 2
- text: Thoda <mask> bajao
  example_title: Example 3
tags:
- Hinglish
- MaskedLM
---

### HingMaskedLM
Masked language modeling (MLM) is a pre-training technique used in Natural Language Processing (NLP) for deep-learning models such as Transformers. A portion of the input tokens is masked, and the model is trained to predict the masked tokens from the context provided by the unmasked ones. This model is trained with masked language modeling on Hinglish (code-mixed Hindi-English) data.
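The masking step can be sketched as follows. This is a minimal illustration of the standard BERT-style 80/10/10 masking rule, not necessarily the exact recipe used to train this model; the token IDs, `mask_id`, and `vocab_size` values are made up:

```python
import random

def mask_tokens(token_ids, mask_id, vocab_size, mask_prob=0.15, seed=0):
    """Return (inputs, labels) for masked-LM training.

    ~15% of positions are selected; of those, 80% become the mask token,
    10% become a random token, and 10% are left unchanged.
    Labels are -100 (ignored by the loss) at unselected positions.
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok  # the model must predict the original token here
            roll = rng.random()
            if roll < 0.8:
                inputs[i] = mask_id           # replace with [MASK]
            elif roll < 0.9:
                inputs[i] = rng.randrange(vocab_size)  # replace with random token
            # else: keep the original token unchanged
    return inputs, labels

inputs, labels = mask_tokens([5, 12, 7, 42, 3, 9], mask_id=103, vocab_size=1000)
```

In practice this corruption is applied on the fly by the training data collator, so the model sees different masks on each epoch.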

### Dataset
Hinglish-Top [Dataset](https://huggingface.co/datasets/WillHeld/hinglish_top) columns
- en_query
- cs_query
- en_parse 
- cs_parse 
- domain 
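For MLM pre-training only the raw Hinglish text is needed; presumably the `cs_query` column (the code-switched query) supplies it. A minimal sketch, with two made-up rows standing in for the actual dataset:

```python
# Two invented rows mimicking the hinglish_top column layout
rows = [
    {"en_query": "please cancel the alarm",
     "cs_query": "please alarm ko cancel kardo",
     "en_parse": "...", "cs_parse": "...", "domain": "alarm"},
    {"en_query": "play some music",
     "cs_query": "thoda music bajao",
     "en_parse": "...", "cs_parse": "...", "domain": "music"},
]

# Keep only the code-switched text as the MLM training corpus
corpus = [row["cs_query"] for row in rows]
```

With the `datasets` library, the same selection would be done on the loaded `WillHeld/hinglish_top` dataset rather than on literal dicts.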

### Training
|Epoch|Loss|
|:--:|:--:|
|1|0.0465|
|2|0.0262|
|3|0.0116|
|4|0.00385|
|5|0.0103|
|6|0.00738|
|7|0.00892|
|8|0.00379|
|9|0.00126|
|10|0.000684|


### Inference 
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("SRDdev/HingMaskedLM")
model = AutoModelForMaskedLM.from_pretrained("SRDdev/HingMaskedLM")

# Build a fill-mask pipeline from the loaded components
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
```
```python
# fill.tokenizer.mask_token inserts the model's own mask string (e.g. <mask>)
fill(f"please {fill.tokenizer.mask_token} ko cancel kardo")
```

### Citation
Author: @[SRDdev](https://huggingface.co/SRDdev)
```
Name: Shreyas Dixit
Framework: PyTorch
Year: Jan 2023
Pipeline: fill-mask
GitHub: https://github.com/SRDdev
LinkedIn: https://www.linkedin.com/in/srddev/
```