dmis-lab/TinySapBERT-from-TinyPubMedBERT-v1.0

	This model repository presents "TinySapBERT", a tiny-sized language model for biomedical entity representations, trained using the [official SapBERT code and instructions (Liu et al., NAACL 2021)](https://github.com/cambridgeltl/sapbert).
	We used our [TinyPubMedBERT](https://huggingface.co/dmis-lab/TinyPubMedBERT-v1.0), a tiny-sized LM, as the starting point for training with the SapBERT scheme.
	<br>
	Note: TinyPubMedBERT is a distilled [PubMedBERT (Gu et al., 2021)](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract), open-sourced along with the release of the KAZU (Korea University and AstraZeneca) framework.

	* For details, please visit the [KAZU framework](https://github.com/AstraZeneca/KAZU) or see our paper, "Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework" (EMNLP 2022 Industry Track).
	* For a demo of the KAZU framework, please visit http://kazu.korea.ac.kr
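
As a rough illustration of how the model might be used for feature extraction, the sketch below loads it with the Hugging Face `transformers` library and compares entity names by cosine similarity. It is a minimal sketch, not the authors' official usage: taking the `[CLS]` token as the entity vector and truncating names to 25 tokens are assumptions borrowed from common SapBERT usage, and `embed`, `cosine`, `rank_candidates`, and `demo` are hypothetical helper names introduced only for this example.

```python
import math
from typing import List, Sequence

MODEL_ID = "dmis-lab/TinySapBERT-from-TinyPubMedBERT-v1.0"


def embed(names: Sequence[str], model_id: str = MODEL_ID) -> List[List[float]]:
    """Return one vector per entity name (downloads the model on first use)."""
    # Imported lazily so the pure-Python helpers below work without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()
    # Assumption: entity names are short, so a small max_length is enough.
    batch = tokenizer(list(names), padding=True, truncation=True,
                      max_length=25, return_tensors="pt")
    with torch.no_grad():
        output = model(**batch)
    # Assumption: the [CLS] token (position 0) serves as the entity representation.
    return output.last_hidden_state[:, 0, :].tolist()


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(query_vec: Sequence[float],
                    candidate_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def demo() -> None:
    """Embed a few names and rank the candidates against the first one."""
    names = ["heart attack", "myocardial infarction", "type 2 diabetes"]
    vecs = embed(names)
    for i in rank_candidates(vecs[0], vecs[1:]):
        print(names[1:][i])
```

Calling `demo()` requires `torch` and `transformers` to be installed and network access to download the model. The ranking helpers are kept in pure Python so they can be reused with embeddings produced by any other pipeline.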

	### Citation info
	Richard Jackson (AstraZeneca) and WonJin Yoon (Korea University) are joint first authors.
	<br>Please cite the paper using the simplified entry below, or find the [full citation information here](https://aclanthology.org/2022.emnlp-industry.63.bib).
	```
	@inproceedings{YoonAndJackson2022BiomedicalNER,
	title = "Biomedical {NER} for the Enterprise with Distillated {BERN}2 and the Kazu Framework",
	author = "Yoon, Wonjin and Jackson, Richard and Ford, Elliot and Poroshin, Vladimir and Kang, Jaewoo",
	booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track",
	month = dec,
	year = "2022",
	address = "Abu Dhabi, UAE",
	publisher = "Association for Computational Linguistics",
	url = "https://aclanthology.org/2022.emnlp-industry.63",
	pages = "619--626",
	}
	```

	This model was trained using resources from the [SapBERT paper](https://aclanthology.org/2021.naacl-main.334.pdf). We thank the authors for making these resources publicly available!
	```
	Liu, Fangyu, et al. "Self-Alignment Pretraining for Biomedical Entity Representations."
	Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021.
	```

	### Contact Information
	For help or issues using the code or model (NER module of KAZU) in this repository, please contact WonJin Yoon (wonjin.info (at) gmail.com) or submit a GitHub issue.