---
license: apache-2.0
language:
- en
base_model:
- nasa-impact/nasa-smd-ibm-v0.1
pipeline_tag: token-classification
tags:
- astronomy
---

# INDUS - NER-DEAL 
Indus-NER-DEAL (`nasa-smd-ibm-v0.1_NER_DEAL`) is a RoBERTa-based, encoder-only transformer model, domain-adapted for NASA Science Mission Directorate (SMD) applications. It is fine-tuned on scientific journals and articles relevant to NASA SMD, with the aim of improving natural language technologies such as information retrieval and intelligent search.  
This specific fork was fine-tuned on proprietary data from the SciX Digital Library (https://scixplorer.org/, formerly NASA-ADS) to label text with DEAL labels (https://ui.adsabs.harvard.edu/WIESP/2022/LabelDefinitions).

## Usage
```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

MODEL_NAME = "adsabs/nasa-smd-ibm-v0.1_NER_DEAL"

# Load the fine-tuned token-classification model (revision=None uses the default branch).
INDUS_NER_DEAL = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, revision=None)

# Load the matching tokenizer; do_lower_case=False keeps the input casing.
INDUS_tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, do_lower_case=False)
```
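
Token-classification models emit one label per token, so predictions usually need to be merged into entity spans before use. The sketch below shows one way to do that in plain Python. It assumes the predicted labels follow the IOB2 scheme (`B-<label>`, `I-<label>`, `O`); the helper name `group_iob2`, the example sentence, and its tags are illustrative and are not output from this model.

```python
from typing import List, Tuple


def group_iob2(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-token IOB2 tags into (label, text) entity spans.

    Hypothetical helper: assumes tags look like "B-Telescope", "I-Telescope", "O".
    """
    entities: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A new entity starts here (a stray "I-" tag is treated as a start).
            if current_label is not None:
                entities.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label, [token]
        elif prefix == "I":
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" (outside): close any open entity.
            if current_label is not None:
                entities.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []

    # Flush an entity that runs to the end of the sequence.
    if current_label is not None:
        entities.append((current_label, " ".join(current_tokens)))
    return entities


# Illustrative input, hand-tagged for this example (not real model output).
tokens = ["We", "thank", "the", "Hubble", "Space", "Telescope", "team", "at", "NASA", "."]
tags = ["O", "O", "O", "B-Telescope", "I-Telescope", "I-Telescope", "O", "O", "B-Organization", "O"]
print(group_iob2(tokens, tags))
# [('Telescope', 'Hubble Space Telescope'), ('Organization', 'NASA')]
```

If you prefer not to write this step yourself, the `transformers` `pipeline("token-classification", ...)` helper accepts an `aggregation_strategy` argument that performs a similar grouping on the model's sub-word predictions.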
## Model Details
- **Base Model**: RoBERTa
- **Tokenizer**: Custom
- **Parameters**: 125M

## Training Data
- 5K acknowledgements and full-text fragments from astronomy papers, provided by NASA-SciX, with manually tagged astronomical facilities and other entities of interest (e.g., celestial objects).
- Approximately 1.6M words


<!-- ## Note -->

<!-- ## Citation -->
<!-- If you find this work useful, please cite using the following bibtex citation: -->

<!-- ## Disclaimer -->