---
language:
- en
tags:
- financial NLP
- named entity recognition
- sequence labeling
- structured extraction
- hierarchical taxonomy
- XBRL
- iXBRL
- SEC filings
- financial-information-extraction
datasets:
- AAU-NLP/HiFi-KPI
model_name: "Pre-BERT-SL1000"
library_name: "transformers"
pipeline_tag: "token-classification"
base_model: "bert-base-uncased"
task_categories:
- token-classification
task_ids:
- named-entity-recognition
- financial-information-extraction
pretty_name: "Pre-BERT-SL1000: Sequence Labeling for Presentation Taxonomy KPI Extraction"
size_categories: "1M<n<10M"
languages:
- en
dataset_name: "HiFi-KPI"
model_description: |
  Pre-BERT-SL1000 is a **BERT-based sequence labeling model** fine-tuned on the **HiFi-KPI dataset** for extracting
  **financial key performance indicators (KPIs)** from **SEC earnings filings (10-K & 10-Q)**. It specializes in identifying
  entities that are one level up the **presentation taxonomy**, such as revenueAbstract, earnings, and financial ratios, using **token classification**.

  This model is trained specifically on n=1 with the **presentation taxonomy labels** from **HiFi-KPI**, focusing on entity identification.
dataset_link: "https://huggingface.co/datasets/AAU-NLP/HiFi-KPI"
repo_link: "https://github.com/rasmus393/HiFi-KPI"
---

## **Pre-BERT-SL1000**

### **Model Description**

Pre-BERT-SL1000 is a **BERT-based sequence labeling model** fine-tuned on the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** to extract **financial key performance indicators (KPIs)** from **SEC earnings filings (10-K & 10-Q)**. It uses **token classification** to identify entities such as revenue and earnings.

The model is trained on **presentation taxonomy** labels with **n=1**, that is, on entities one level up the presentation taxonomy, such as `revenueAbstract`.
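
To make the **n=1** setting concrete, the sketch below climbs a toy hierarchy by `n` levels. The `PARENT` map and its element names are hypothetical and only illustrate the idea of moving up a presentation taxonomy; they are not taken from HiFi-KPI, and the dataset's own label construction may differ.

```python
# Hypothetical child -> parent links in a presentation taxonomy.
# The element names are illustrative, not taken from HiFi-KPI.
PARENT = {
    "Revenues": "RevenuesAbstract",
    "RevenuesAbstract": "IncomeStatementAbstract",
    "NetIncomeLoss": "IncomeStatementAbstract",
}


def ancestor(tag: str, n: int = 1) -> str:
    """Return the element n levels above `tag`, stopping early at the root."""
    current = tag
    for _ in range(n):
        parent = PARENT.get(current)
        if parent is None:  # no parent recorded: treat `current` as the root
            break
        current = parent
    return current


# n=1 maps a tag to its direct parent; n=2 climbs one level further.
print(ancestor("Revenues", n=1))
print(ancestor("Revenues", n=2))
```

Under this reading, a model trained with n=1 predicts the direct parent element rather than the fine-grained tag itself.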

### **Use Cases**

- Extracting **financial KPIs** labeled with the **iXBRL presentation taxonomy**
- **Financial document parsing** with entity recognition
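
A minimal sketch of how the use cases above might look with the `transformers` token-classification pipeline. The model id `AAU-NLP/Pre-BERT-SL1000` is an assumption inferred from the dataset namespace and should be checked against the actual repository; the helper names `load_tagger`, `to_kpis`, and `demo`, the example sentence, and the score threshold are all hypothetical.

```python
def load_tagger(model_id: str = "AAU-NLP/Pre-BERT-SL1000"):
    """Build a token-classification pipeline.

    The default model id is an assumption; replace it with the real one.
    """
    # Imported lazily so the helpers below work without `transformers`.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into entity spans
    )


def to_kpis(entities, min_score: float = 0.5):
    """Reduce aggregated pipeline output to (label, text) pairs."""
    return [
        (e["entity_group"], e["word"])
        for e in entities
        if e["score"] >= min_score
    ]


def demo():
    """Tag one sentence; downloads the model weights on first use."""
    tagger = load_tagger()
    sentence = "Total revenue was $12.5 million for the quarter."
    for label, text in to_kpis(tagger(sentence)):
        print(label, text)
```

Calling `demo()` requires `transformers` to be installed and the model id to resolve on the Hub. The labels it prints depend on the checkpoint's own label set.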

### **Performance**

- Trained on the **1,000 most frequent labels** from the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** with n=1 in the **presentation taxonomy**
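
Restricting training to the most frequent labels can be sketched as follows. This is an illustration, not the authors' preprocessing: the function names are hypothetical, and mapping out-of-vocabulary labels to `"O"` is an assumption about how rare labels might be handled.

```python
from collections import Counter


def build_label_vocab(tag_sequences, k=1000):
    """Return the k most frequent labels, ignoring the outside tag "O"."""
    counts = Counter(
        tag for seq in tag_sequences for tag in seq if tag != "O"
    )
    return [tag for tag, _ in counts.most_common(k)]


def restrict(seq, vocab):
    """Map labels outside the vocabulary to "O" (an assumed convention)."""
    keep = set(vocab)
    return [tag if tag in keep else "O" for tag in seq]


# Toy data with k=2: "A" occurs 3 times, "B" twice, "C" once.
sequences = [["O", "A", "A", "B"], ["A", "B", "C", "O"]]
vocab = build_label_vocab(sequences, k=2)
print(vocab)
print(restrict(["A", "C", "O", "B"], vocab))
```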

### **Dataset & Code**

- **Dataset**: [HiFi-KPI on Hugging Face](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)
- **Code example**: [HiFi-KPI GitHub Repository](https://github.com/rasmus393/HiFi-KPI)