---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- advertising
---

# Tiny BERT Domain Advertising Classifier

## Overview

AdTargetingBERTClassifier is a small-scale BERT-based classifier designed for the task of ad targeting classification. The model is trained to predict the multi-class labels associated with domains in the DAC693K dataset.

## Model Architecture

The classifier is built on the BERT (Bidirectional Encoder Representations from Transformers) architecture. It takes domain text as input and outputs logits for each class, enabling multi-class classification for ad targeting.

## Model Training

The model is trained on the DAC693K dataset using a supervised learning approach. Training optimizes a categorical cross-entropy loss, and the model is fine-tuned on the specific ad targeting classes associated with each domain.
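
For context, here is a minimal fine-tuning sketch of this kind of setup, under stated assumptions: the `domains.csv` file and its `domain`/`label` columns are hypothetical, and `num_labels=26`, `max_length=32`, and the hyperparameters are illustrative, not the actual training configuration.

```python
from datasets import load_dataset
from transformers import (
    BertForSequenceClassification,
    BertTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# num_labels is an assumption; set it to the real number of
# ad-targeting classes in the dataset.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=26
)

# Hypothetical CSV with "domain" and "label" columns.
dataset = load_dataset("csv", data_files="domains.csv")["train"]
# Domains are short strings, so a small fixed max_length suffices.
dataset = dataset.map(
    lambda batch: tokenizer(
        batch["domain"], truncation=True, padding="max_length", max_length=32
    ),
    batched=True,
)
dataset = dataset.rename_column("label", "labels")

# BertForSequenceClassification applies categorical cross-entropy
# internally when integer labels are supplied.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="checkpoints", num_train_epochs=3),
    train_dataset=dataset,
)
trainer.train()
```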

## Usage

### Loading the Model

To use the trained classifier in your Python environment, you can load it using the following code:

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load the pre-trained model and tokenizer
model = BertForSequenceClassification.from_pretrained("ansi-code/bert-domain-advertising-classifier")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model.eval()  # disable dropout for inference

# Example inference
text = "google.com"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():  # no gradients needed at inference time
    outputs = model(**inputs)
logits = outputs.logits
```

### Prediction

To make predictions with the loaded model, convert the logits to probabilities and take the class with the highest probability as the prediction.

```python
# Convert logits to class probabilities, then pick the most likely class
probabilities = torch.nn.functional.softmax(logits, dim=-1)
predicted_class = torch.argmax(probabilities, dim=-1).item()
```
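
If the repository's config populates `id2label` (a standard `transformers` config field; whether this model sets it is an assumption), the predicted index can be mapped back to a readable label:

```python
# Map the class index back to a label name, if the model config
# defines the id2label mapping (assumption: the mapping is populated).
label = model.config.id2label.get(predicted_class, str(predicted_class))
print(label)
```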

## Model Evaluation

The model's performance can be assessed using standard evaluation metrics such as accuracy, precision, recall, and F1-score on a separate validation set or through cross-validation.
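
As a sketch of such an evaluation (scikit-learn is not a stated dependency here, and the `y_true`/`y_pred` arrays are placeholders for held-out labels and the model's argmax predictions):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder labels; in practice these come from a held-out
# validation split and the model's predictions on it.
y_true = [0, 2, 1, 2]
y_pred = [0, 2, 1, 1]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  "
      f"recall={recall:.3f}  f1={f1:.3f}")
```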

## License

This model is released under the Apache 2.0 License.

## Citation

If you use this model in your work, please cite it using the following BibTeX entry:

```bibtex
@misc{silvi_2023_bert-domain-advertising-classifier,
  title  = {bert-domain-advertising-classifier},
  author = {Andrea Silvi},
  year   = {2023},
}
```

## Acknowledgements

We would like to thank the developers of the Hugging Face Transformers library for providing the BERT model implementation.