update README text color
Browse files
README.md CHANGED
@@ -12,13 +12,13 @@ tags:
 license: cc-by-nc-sa-4.0
 ---
 
-# BertForSequenceClassification model (Classical Chinese)
+# <font color="IndianRed"> BertForSequenceClassification model (Classical Chinese) </font>
 [](https://colab.research.google.com/drive/1jVu2LrNwkLolItPALKGNjeT6iCfzF8Ic?usp=sharing/)
 
-This BertForSequenceClassification Classical Chinese model is intended to predict whether a Classical Chinese sentence is a letter title (书信标题) or not. The model inherits from the BERT base Chinese model (MLM), was fine-tuned on a large Classical Chinese corpus (a 3GB textual dataset), and was then extended with the BertForSequenceClassification architecture to perform a binary classification task.
+This BertForSequenceClassification Classical Chinese model is intended to predict whether a Classical Chinese sentence is <font color="IndianRed"> a letter title (书信标题) </font> or not. The model inherits from the BERT base Chinese model (MLM), was fine-tuned on a large Classical Chinese corpus (a 3GB textual dataset), and was then extended with the BertForSequenceClassification architecture to perform a binary classification task.
-* Labels: 0 = non-letter, 1 = letter
+* <font color="Salmon"> Labels: 0 = non-letter, 1 = letter </font>
 
-## Model description
+## <font color="IndianRed"> Model description </font>
 
 The BertForSequenceClassification model architecture inherits the BERT base model and appends a fully-connected linear layer to perform a binary classification task. More precisely, it
 was pretrained with two objectives:
@@ -27,17 +27,17 @@ was pretrained with two objectives:
 
 - Sequence classification: the model appends a fully-connected linear layer to output the probability of each class. In our binary classification task, the final linear layer has two classes.
 
-## Intended uses & limitations
+## <font color="IndianRed"> Intended uses & limitations </font>
 
 Note that this model is primarily aimed at predicting whether a Classical Chinese sentence is a letter title (书信标题) or not.
 
-### How to use
+### <font color="IndianRed"> How to use </font>
 
 Note that this model is primarily aimed at predicting whether a Classical Chinese sentence is a letter title (书信标题) or not.
 
 Here is how to use this model to get the features of a given text in PyTorch:
 
-1. Import model and packages
+<font color="cornflowerblue"> 1. Import model and packages </font>
 ```python
 from transformers import BertTokenizer
 from transformers import BertForSequenceClassification
@@ -51,7 +51,7 @@ model = BertForSequenceClassification.from_pretrained('cbdb/ClassicalChineseLett
 output_hidden_states=False)
 ```
 
-2. Make a prediction
+<font color="cornflowerblue"> 2. Make a prediction </font>
 ```python
 max_seq_len = 512
 
@@ -86,7 +86,7 @@ label2idx = {'not-letter': 0,'letter': 1}
 idx2label = {v:k for k,v in label2idx.items()}
 ```
 
-3. Change your sentence here
+<font color="cornflowerblue"> 3. Change your sentence here </font>
 ```python
 label2idx = {'not-letter': 0,'letter': 1}
 idx2label = {v:k for k,v in label2idx.items()}
@@ -97,8 +97,10 @@ print(f'The predicted probability for the {list(pred_class_proba.keys())[0]} cla
 print(f'The predicted probability for the {list(pred_class_proba.keys())[1]} class: {list(pred_class_proba.values())[1]}')
 >>> The predicted probability for the not-letter class: 0.002029061783105135
 >>> The predicted probability for the letter class: 0.9979709386825562
-
+```
+```python
 pred_class = idx2label[np.argmax(list(pred_class_proba.values()))]
 print(f'The predicted class is: {pred_class}')
 >>> The predicted class is: letter
-```
+```
+
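The step-2 hunk shows `max_seq_len = 512` but elides how the sequence budget is applied before inference. As a minimal sketch of the standard BERT-style truncation this implies — using a hypothetical token-id list in place of real `BertTokenizer` output, and the conventional BERT special-token ids:

```python
# Hypothetical token ids standing in for BertTokenizer output on a long sentence;
# 101 and 102 are the conventional BERT ids for [CLS] and [SEP].
CLS_ID, SEP_ID = 101, 102
max_seq_len = 512

content_ids = list(range(1000, 1600))  # 600 content tokens: over the budget

# Reserve two slots for [CLS] and [SEP], then truncate the content tokens
# so the full input fits within max_seq_len positions.
truncated = content_ids[:max_seq_len - 2]
input_ids = [CLS_ID] + truncated + [SEP_ID]
print(len(input_ids))  # 512
```

In practice `BertTokenizer` handles this itself when called with `truncation=True` and `max_length=512`; the sketch only makes the budget arithmetic explicit.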
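The hunks also elide the code between the model call and the printed probabilities in step 3. As a self-contained sketch of the post-processing that step relies on — a softmax over the classifier's two logits, then the README's `label2idx`/`idx2label` mapping — using hypothetical logit values in place of real `BertForSequenceClassification` output:

```python
import math

# Hypothetical logits standing in for the model's two-class output on one
# sentence; real values come from the fine-tuned classification head.
logits = [-3.1, 3.1]

# Softmax turns the two raw scores into probabilities that sum to 1.
m = max(logits)
exps = [math.exp(x - m) for x in logits]
probs = [e / sum(exps) for e in exps]

# Same label mapping as in the README.
label2idx = {'not-letter': 0, 'letter': 1}
idx2label = {v: k for k, v in label2idx.items()}
pred_class_proba = {idx2label[i]: p for i, p in enumerate(probs)}

# np.argmax over the probability list in the README reduces to picking
# the highest-probability label.
pred_class = max(pred_class_proba, key=pred_class_proba.get)
print(pred_class)  # with these example logits: letter
```

The dict insertion order follows `label2idx`, which is why the README can index `list(pred_class_proba.keys())[0]` for the not-letter class and `[1]` for the letter class.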