Update README.md
README.md
CHANGED
@@ -1,11 +1,11 @@
 ---
 # Model Card generated based on AutoTrain run
-# Date: 2025-04-
+# Date: 2025-04-07
 language:
 - en # Primarily English from EDGAR
-- multilingual #
+- multilingual # Assumed multilingual from European sources & XLM-R base
 library_name: transformers
-license: apache-2.0 # Or appropriate license
+license: apache-2.0 # Or appropriate license
 tags:
 - text-classification
 - financial-filings
@@ -19,7 +19,7 @@ widget:
 datasets:
 - custom # Combined Labelbox and EDGAR data
 model-index:
-- name:
+- name: FinancialReports/filing-classification-xlmr # Model Repo ID
   results:
   - task:
       type: text-classification
@@ -28,37 +28,36 @@ model-index:
       type: custom
       name: Combined Financial Filings (Labelbox + EDGAR)
       split: validation
-    # Corrected metrics format (array of objects, removed config object)
     metrics:
     - type: accuracy
       value: 0.9617
       name: Accuracy
     - type: f1
       value: 0.6470
-      name: F1 (Macro)
+      name: F1 (Macro)
     - type: f1
       value: 0.9597
-      name: F1 (Weighted)
+      name: F1 (Weighted)
     - type: loss
       value: 0.1687
       name: Loss
 ---
 
-# Model Card:
+# Model Card: FinancialReports Filing Classifier
 
 ## Model Details
 
-* **Model Name:** `
+* **Model Name:** `FinancialReports/filing-classification-xlmr` (Assumed Repo ID based on AutoTrain project & org)
 * **Description:** This model is a fine-tuned version of `FacebookAI/xlm-roberta-large` designed for multi-class text classification of financial filing documents. It classifies input text (expected in markdown format) into one of 37 predefined filing type categories.
 * **Base Model:** [FacebookAI/xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large)
-* **Developed by:** [
-* **Model Version:** 1.0
+* **Developed by:** FinancialReports ([financialreports.eu](https://financialreports.eu))
+* **Model Version:** 1.0
 * **Fine-tuning Framework:** Hugging Face AutoTrain
 
 ## Intended Use
 
 * **Primary Use:** To automatically classify financial filing documents based on their textual content into one of 37 categories (e.g., Annual Report, Quarterly Report, Directors' Dealings, etc.).
-* **Primary Users:** Financial analysts, data providers, regulatory compliance teams, researchers.
+* **Primary Users:** Financial analysts, data providers, regulatory compliance teams, researchers associated with FinancialReports.
 * **Out-of-Scope Uses:** This model is not designed for sentiment analysis, named entity recognition, or classification tasks outside the defined 37 financial filing types. Performance on filing types significantly different from those in the training data is not guaranteed.
 
 ## Training Data
@@ -113,8 +112,8 @@ You can use this model via the Hugging Face `transformers` library:
 ```python
 from transformers import pipeline
 
-# Load the classifier pipeline (replace with your actual model repo ID)
-model_repo_id = "
+# Load the classifier pipeline (replace with your actual model repo ID on the Hub)
+model_repo_id = "FinancialReports/filing-classification-xlmr"
 classifier = pipeline("text-classification", model=model_repo_id)
 
 # Example usage
@@ -144,12 +143,11 @@ print(predictions)
 # results = [{"label": model.config.id2label[i], "score": prob.item()} for i, prob in enumerate(probabilities)]
 # results.sort(key=lambda x: x["score"], reverse=True)
 # print(results)
-
-
-author = {[Your Name/Organization]},
+Citation@misc{financialreports_filing_classifier_2025,
+author = {FinancialReports},
 title = {XLM-RoBERTa-Large Financial Filing Classifier},
 year = {2025},
 publisher = {Hugging Face},
 journal = {Hugging Face Model Hub},
-howpublished = {\url{[https://huggingface.co/](https://huggingface.co/
+howpublished = {\url{[https://huggingface.co/FinancialReports/filing-classification-xlmr](https://www.google.com/search?q=https://huggingface.co/FinancialReports/filing-classification-xlmr)}}, # Assumed URL
 }
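
Note on the metrics in this diff: the `model-index` block reports a macro F1 of 0.6470 next to a weighted F1 of 0.9597 and an accuracy of 0.9617. A gap of that size is typical when a few frequent classes dominate the validation split and some rare classes are predicted poorly, although the card's evaluation data is not shown here, so that reading is an assumption. The sketch below illustrates the difference between the two averages in plain Python. The toy labels and the helper names (`f1_per_class`, `macro_f1`, `weighted_f1`) are hypothetical and are not taken from the model's evaluation code.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: (f1, support)} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears.
        f1 = 2 * tp / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true examples per class."""
    scores = f1_per_class(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Hypothetical toy data: one frequent class, one rare class that is always missed.
y_true = ["Annual Report"] * 18 + ["Directors' Dealings"] * 2
y_pred = ["Annual Report"] * 20

print(f"macro F1:    {macro_f1(y_true, y_pred):.4f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

In this toy case the rare class contributes an F1 of zero, which halves the macro average but barely moves the weighted one. When reading the card's numbers, per-class scores would be the more informative thing to inspect for rare filing types.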