BERT
----------------------------------------------------
Overview
~~~~~~~~~~~~~~~~~~~~~

The BERT model was proposed in `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__
by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. It is a bidirectional transformer
pre-trained using a combination of a masked language modeling objective and next sentence prediction
on a large corpus comprising the Toronto Book Corpus and Wikipedia.

The abstract from the paper is the following:

*We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations
from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional
representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result,
the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models
for a wide range of tasks, such as question answering and language inference, without substantial task-specific
architecture modifications.*

*BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural
language processing tasks, including pushing the GLUE score to 80.5% (7.7% absolute improvement), MultiNLI
accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute
improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).*
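
For orientation, here is a minimal sketch of how a pretrained BERT checkpoint and its tokenizer can be loaded and run
with the classes documented below. The ``bert-base-uncased`` checkpoint name is only an illustrative assumption; any
compatible BERT checkpoint works the same way.

.. code-block:: python

    import torch
    from transformers import BertModel, BertTokenizer

    # Illustrative checkpoint; swap in any BERT checkpoint you have access to.
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    # encode() adds the [CLS] and [SEP] special tokens around the sentence.
    input_ids = torch.tensor([tokenizer.encode("Hello, my dog is cute")])
    with torch.no_grad():
        outputs = model(input_ids)

    # The first element of the output is the last layer's hidden states,
    # of shape (batch_size, sequence_length, hidden_size).
    last_hidden_states = outputs[0]
    print(last_hidden_states.shape)
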
Tips:

- BERT is a model with absolute position embeddings, so it is usually advised to pad the inputs on
  the right rather than the left.
- BERT was trained with a masked language modeling (MLM) objective. It is therefore efficient at predicting masked
  tokens and at NLU in general, but it is not optimal for text generation; models trained with a causal language
  modeling (CLM) objective are better in that regard. The first sketch after this list illustrates masked-token
  prediction with right-padded inputs.
- Alongside MLM, BERT was trained with a next sentence prediction (NSP) objective, using the [CLS] token as an
  approximate summary of the sequence. This token (the first token in a sequence built with special tokens) can be
  used to get a sequence-level prediction rather than a token-level prediction. However, averaging the token
  representations over the sequence may yield better results than using the [CLS] token alone, as shown in the
  second sketch after this list.
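
Assuming the illustrative ``bert-base-uncased`` checkpoint from above, the following sketch shows masked-token
prediction with ``BertForMaskedLM``, padding the input on the right as recommended for absolute position embeddings.
The exact padding argument names may differ slightly between library versions.

.. code-block:: python

    import torch
    from transformers import BertForMaskedLM, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    # Pad on the right (the tokenizer's default behaviour), as advised in the tips.
    encoding = tokenizer.encode_plus(
        "The capital of France is [MASK].",
        max_length=16,
        pad_to_max_length=True,
        return_tensors="pt",
    )
    input_ids = encoding["input_ids"]
    attention_mask = encoding["attention_mask"]

    with torch.no_grad():
        logits = model(input_ids, attention_mask=attention_mask)[0]  # (batch, seq_len, vocab)

    # Find the position of the [MASK] token and take the highest-scoring prediction.
    mask_position = (input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    predicted_id = logits[0, mask_position].argmax(-1).item()
    print(tokenizer.decode([predicted_id]))
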
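And here is a second minimal sketch, again assuming ``bert-base-uncased``, comparing the two ways of obtaining a
sequence-level representation mentioned in the last tip: taking the [CLS] hidden state versus averaging the hidden
states over all tokens.

.. code-block:: python

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    input_ids = torch.tensor([tokenizer.encode("BERT builds deep bidirectional representations.")])
    with torch.no_grad():
        sequence_output = model(input_ids)[0]  # (batch_size, seq_len, hidden_size)

    # Option 1: the hidden state of the [CLS] token (first position).
    cls_embedding = sequence_output[:, 0]

    # Option 2: average over all token positions; this often gives a better
    # sequence representation than the [CLS] vector alone.
    mean_embedding = sequence_output.mean(dim=1)
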
BertConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertConfig
    :members:


BertTokenizer
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertTokenizer
    :members:


BertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertModel
    :members:


BertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForPreTraining
    :members:


BertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMaskedLM
    :members:


BertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForNextSentencePrediction
    :members:


BertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForSequenceClassification
    :members:


BertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForMultipleChoice
    :members:


BertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForTokenClassification
    :members:


BertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BertForQuestionAnswering
    :members:


TFBertModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertModel
    :members:


TFBertForPreTraining
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForPreTraining
    :members:


TFBertForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMaskedLM
    :members:


TFBertForNextSentencePrediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForNextSentencePrediction
    :members:


TFBertForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForSequenceClassification
    :members:


TFBertForMultipleChoice
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForMultipleChoice
    :members:


TFBertForTokenClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForTokenClassification
    :members:


TFBertForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBertForQuestionAnswering
    :members: