# Named-Entity Recognition

Related spaces: https://huggingface.co/spaces/rajistics/biobert_ner_demo, https://huggingface.co/spaces/abidlabs/ner, https://huggingface.co/spaces/rajistics/Financial_Analyst_AI
Tags: NER, TEXT, HIGHLIGHT

## Introduction
Named-entity recognition (NER), also known as token classification or text tagging, is the task of taking a sentence and classifying every word (or "token") into different categories, such as names of people, names of locations, or parts of speech.

For example, given the sentence:

> Does Chicago have any Pakistani restaurants?

A named-entity recognition algorithm may identify:

- "Chicago" as a **location**
- "Pakistani" as an **ethnicity**

and so on.

Using `gradio` (specifically the `HighlightedText` component), you can easily build a web demo of your NER model and share it with the rest of your team.

Here is an example of a demo that you'll be able to build:

$demo_ner_pipeline

This tutorial will show how to take a pretrained NER model and deploy it with a Gradio interface. We will show two different ways to use the `HighlightedText` component -- depending on your NER model, one of these two approaches may be more convenient than the other!
### Prerequisites

Make sure you have the `gradio` Python package already [installed](/getting_started). You will also need a pretrained named-entity recognition model. You can use your own; in this tutorial, we will use one from the `transformers` library.
### Approach 1: List of Entity Dictionaries

Many named-entity recognition models output a list of dictionaries. Each dictionary consists of an _entity_, a "start" index, and an "end" index. This is, for example, how NER models in the `transformers` library operate:
```py
from transformers import pipeline

ner_pipeline = pipeline("ner")

ner_pipeline("Does Chicago have any Pakistani restaurants")
```

Output:
```bash
[{'entity': 'I-LOC',
  'score': 0.9988978,
  'index': 2,
  'word': 'Chicago',
  'start': 5,
  'end': 12},
 {'entity': 'I-MISC',
  'score': 0.9958592,
  'index': 5,
  'word': 'Pakistani',
  'start': 22,
  'end': 31}]
```
If you have such a model, it is very easy to hook it up to Gradio's `HighlightedText` component. All you need to do is pass this **list of entities**, along with the **original text**, to the component as a single dictionary, with the keys being `"entities"` and `"text"` respectively.
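
To make the shape of that dictionary concrete, here is a minimal sketch of the wiring, using the `transformers` pipeline from above (the `ner` function name and the placeholder text are just illustrative choices; the complete, runnable example is embedded below):

```py
from transformers import pipeline

import gradio as gr

ner_pipeline = pipeline("ner")

def ner(text):
    entities = ner_pipeline(text)
    # HighlightedText accepts the original string plus the list of entity dicts
    return {"text": text, "entities": entities}

demo = gr.Interface(
    ner,
    gr.Textbox(placeholder="Enter a sentence here..."),
    gr.HighlightedText(),
)

demo.launch()
```
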
Here is a complete example:

$code_ner_pipeline
$demo_ner_pipeline
### Approach 2: List of Tuples

An alternative way to pass data into the `HighlightedText` component is as a list of tuples. The first element of each tuple should be the word or words that are being classified into a particular entity. The second element should be the entity label (or `None` if the word should be left unlabeled). The `HighlightedText` component automatically strings together the words and labels to display the entities.
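
For instance, the earlier sentence could be represented as `[("Does", None), ("Chicago", "LOC"), ("have", None), ...]`. Below is a minimal, self-contained sketch of this approach -- the `tag` function is a toy rule used purely for illustration, not a real tagger:

```py
import gradio as gr

def tag(text):
    # Toy rule, purely for illustration: label the word "Chicago" as a location
    # and leave every other word unlabeled. The (" ", None) tuples keep the
    # spaces when the component stitches the words back together.
    tuples = []
    for word in text.split():
        tuples.append((word, "LOC" if word == "Chicago" else None))
        tuples.append((" ", None))
    return tuples

demo = gr.Interface(tag, gr.Textbox(), gr.HighlightedText())

demo.launch()
```
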
In some cases, this can be easier than the first approach. Here is a demo showing this approach using spaCy's part-of-speech tagger:

$code_text_analysis
$demo_text_analysis

---
And you're done! That's all you need to know to build a web-based GUI for your NER model.

Fun tip: you can share your NER demo instantly with others simply by setting `share=True` in `launch()`.
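
For example, assuming your interface is stored in a variable called `demo`, as in the examples above:

```py
# Creates a temporary public URL that anyone can open in a browser
demo.launch(share=True)
```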