---
license: apache-2.0
language:
- en
pipeline_tag: text-classification
library_name: transformers
---

# yizhao-risk-en-scorer 
## Introduction
This is a BERT model fine-tuned on a high-quality English financial dataset. It assigns each input text a security risk score, which can be used to identify and remove risky samples from financial datasets and so reduce the proportion of illegal or undesirable data. For the complete data cleaning process, please refer to [YiZhao](https://github.com/HITsz-TMG/YiZhao).
## Quickstart
The following snippet generates a security risk score for a single text with this model.
```python
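import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

text = "You are a smart robot"
# Local path or Hub ID of the risk scorer checkpoint
risk_model_name = "risk-model-en-v0.1"

risk_tokenizer = AutoTokenizer.from_pretrained(risk_model_name)
risk_model = AutoModelForSequenceClassification.from_pretrained(risk_model_name)

# Tokenize the text, truncating it to the model's maximum input length
risk_inputs = risk_tokenizer(text, return_tensors="pt", padding="longest", truncation=True)
# Inference only, so gradient tracking is not needed
with torch.no_grad():
    risk_outputs = risk_model(**risk_inputs)
# Squeeze the trailing label dimension and convert to a NumPy array
risk_logits = risk_outputs.logits.squeeze(-1).float().numpy()

# Convert the one-element array to a Python float
risk_score = risk_logits.item()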

result = {
    "text": text,
    "risk_score": risk_score
}

print(result)
# {'text': 'You are a smart robot', 'risk_score': 0.11226219683885574}
```
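
Since the score is intended for removing risky samples, the sketch below shows one way the filtering step might look. It is illustrative only: `filter_by_risk`, the `0.5` threshold, and the `fake_score` stand-in are hypothetical names and values not taken from this model or the YiZhao pipeline, and it assumes that a higher score means higher risk. In practice `score_fn` would wrap the model call from the snippet above.

```python
from typing import Callable, Iterable, List, Tuple

Scored = Tuple[str, float]


def filter_by_risk(
    texts: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cutoff; tune it on your own data
) -> Tuple[List[Scored], List[Scored]]:
    """Split texts into (kept, dropped) by comparing each score to the threshold.

    Assumes higher scores indicate higher risk.
    """
    kept: List[Scored] = []
    dropped: List[Scored] = []
    for text in texts:
        score = score_fn(text)
        if score > threshold:
            dropped.append((text, score))
        else:
            kept.append((text, score))
    return kept, dropped


# Stand-in scorer so the sketch runs without downloading the model.
# Replace it with a function that returns the model's risk_score.
def fake_score(text: str) -> float:
    return 0.9 if "scam" in text.lower() else 0.1


samples = ["Quarterly revenue rose 4%.", "Guaranteed returns scam"]
kept, dropped = filter_by_risk(samples, fake_score)
print(kept)
print(dropped)
```

Keeping the scorer as a parameter makes the threshold logic easy to check in isolation before running it over a full dataset.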