AcroBERT can perform end-to-end acronym linking (see the demo). Given a sentence, our framework first recognizes acronyms with MadDog and then disambiguates them with AcroBERT:

from inference.acrobert import acronym_linker

# input sentence with acronyms; the maximum length is 400 sub-tokens
sentence = "This new genome assembly and the annotation are tagged as a RefSeq genome by NCBI."

# mode = ['acrobert', 'pop']
# 'acrobert' gives better accuracy; 'pop' is faster but less accurate.
results = acronym_linker(sentence, mode='acrobert')
print(results)

## expected output: [('NCBI', 'National Center for Biotechnology Information')]
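
The same function accepts mode='pop' for the faster but less accurate baseline. A minimal sketch reusing the sentence above (the result has the same format, a list of (acronym, long form) pairs):

# the 'pop' mode trades some accuracy for speed on the same input
results = acronym_linker(sentence, mode='pop')
print(results)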

GitHub: https://github.com/tigerchen52/GLADIS

Model: https://zenodo.org/record/7568937#.Y9vtrXaZMuU

Apart from AcroBERT, we constructed a new benchmark named GLADIS to accelerate research on acronym disambiguation. It contains the following data:

| | Source | Description |
| --- | --- | --- |
| Acronym Dictionary | Pile (MIT license), Wikidata, UMLS | 1.6 million acronyms and 6.4 million long forms |
| Three Datasets | WikilinksNED Unseen, SciAD (CC BY-NC-SA 4.0), MedMentions (CC0 1.0) | three AD datasets that cover the general, scientific, and biomedical domains |
| A Pre-training Corpus | Pile (MIT license) | 160 million sentences with acronyms |

Usage

  1. git clone https://github.com/tigerchen52/GLADIS.git
  2. download the acronym dictionary and AcroBERT, and put them into this path: input/
  3. use the function inference.acrobert.acronym_linker() to do end-to-end acronym linking (see the example below).
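
Once the dictionary and model files are in input/, a minimal batch sketch could look like this (the second input sentence is hypothetical; each call returns a list of (acronym, long form) pairs, as in the example above):

from inference.acrobert import acronym_linker

# sentences to link; the first is the documented example, the second is hypothetical
sentences = [
    "This new genome assembly and the annotation are tagged as a RefSeq genome by NCBI.",
    "The raw reads were deposited in the SRA by the authors.",
]

for sentence in sentences:
    # each call returns pairs such as
    # [('NCBI', 'National Center for Biotechnology Information')]
    print(acronym_linker(sentence, mode='acrobert'))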

Citation

@inproceedings{chen2023gladis,
  title={GLADIS: A General and Large Acronym Disambiguation Benchmark},
  author={Chen, Lihu and Varoquaux, Ga{\"e}l and Suchanek, Fabian M},
  booktitle={EACL 2023-The 17th Conference of the European Chapter of the Association for Computational Linguistics},
  year={2023}
}