Korean-PII-Masking-BERT

GitHub Repository: alphagyuu/Korean-PII-Masking-BERT

Korean-PII-Masking-BERT is a token classification model for masking personally identifiable information (PII) in Korean text, fine-tuned from KcBERT's token classifier on a processed version of the "Korean SNS" dataset from AI-Hub.

🖥️ Python Implementation

  • Tokenizer:

    from transformers import BertTokenizer
    tokenizer = BertTokenizer.from_pretrained('beomi/kcbert-base', do_lower_case=False)
    
  • Model:

    from transformers import TFBertForTokenClassification
    model = TFBertForTokenClassification.from_pretrained('alphagyuu/Korean-PII-Masking-BertForTokenClassification', num_labels=len(tag2idx))
    
  • LabelMap:

    LabelMAP = {
      'O': 'LABEL0',         # outside any PII span
      'B-URL': 'LABEL1',
      'I-URL': 'LABEL2',
      'B-계정': 'LABEL3',    # 계정: account
      'I-계정': 'LABEL4',
      'B-금융': 'LABEL5',    # 금융: finance
      'I-금융': 'LABEL6',
      'B-번호': 'LABEL7',    # 번호: number
      'I-번호': 'LABEL8',
      'B-소속': 'LABEL9',    # 소속: affiliation
      'I-소속': 'LABEL10',
      'B-신원': 'LABEL11',   # 신원: identity
      'I-신원': 'LABEL12',
      'B-이름': 'LABEL13',   # 이름: name
      'I-이름': 'LABEL14',
      'B-주소': 'LABEL15',   # 주소: address
      'I-주소': 'LABEL16'
    }
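
The sketch below shows one way the label map could be used after inference: model labels are mapped back to BIO tags, and each tagged span is collapsed into a single placeholder. It is a minimal illustration, not the author's masking code. The helpers `label_to_tag` and `mask_tokens`, the `[type]` placeholder format, and the word-level example tokens are all assumptions; real tokenizer output is WordPiece sub-tokens that would need to be merged first. The helper also accepts labels written with an underscore (for example 'LABEL_13'), in case the model's actual output uses that form.

```python
# Illustrative sketch only: label_to_tag and mask_tokens are hypothetical helpers.

# Tag order taken from the LabelMap above: index i corresponds to 'LABEL<i>'.
TAGS = ['O', 'B-URL', 'I-URL', 'B-계정', 'I-계정', 'B-금융', 'I-금융',
        'B-번호', 'I-번호', 'B-소속', 'I-소속', 'B-신원', 'I-신원',
        'B-이름', 'I-이름', 'B-주소', 'I-주소']


def label_to_tag(label):
    """Map a model label such as 'LABEL13' (or 'LABEL_13') to its BIO tag."""
    index = int(label.replace('LABEL', '').lstrip('_'))
    return TAGS[index]


def mask_tokens(tokens, labels):
    """Replace each tagged span with one placeholder such as '[이름]'."""
    masked = []
    previous_type = None
    for token, label in zip(tokens, labels):
        tag = label_to_tag(label)
        if tag == 'O':
            masked.append(token)
            previous_type = None
            continue
        prefix, entity_type = tag.split('-', 1)
        # A B- tag, or a change of entity type, starts a new placeholder;
        # an I- tag continuing the same type is absorbed into the previous one.
        if prefix == 'B' or entity_type != previous_type:
            masked.append('[' + entity_type + ']')
        previous_type = entity_type
    return masked


# Hypothetical word-level input: "My name is Hong Gildong."
tokens = ['제', '이름은', '홍길동', '입니다']
labels = ['LABEL0', 'LABEL0', 'LABEL13', 'LABEL0']
print(' '.join(mask_tokens(tokens, labels)))  # prints: 제 이름은 [이름] 입니다
```

Collapsing a whole B-/I- span into one placeholder keeps the masked text readable and avoids revealing how many tokens the original value had.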
    
Base model: beomi/kcbert-base