---
language: uz
license: apache-2.0
tags:
- uzbek
- dependency-parsing
- universal-dependencies
- nlp
datasets:
- universal_dependencies
metrics:
- accuracy
- f1
---

# Uzbek Dependency Parser

This model predicts Universal Dependencies (UD) dependency relation labels for Uzbek text.

## Model details

The model was fine-tuned on a Universal Dependencies treebank containing approximately 600 annotated Uzbek sentences. It is based on the [XLM-RoBERTa base model](https://huggingface.co/xlm-roberta-base) and adapted for token classification, so dependency relation prediction is treated as per-word labeling.
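
Because parsing is framed as token classification, the label inventory lives in the model configuration. Here is a minimal sketch to inspect it, assuming the fine-tuned checkpoint stores the UD relation names in `id2label`:

```python
from transformers import AutoConfig

# Download only the config (no weights) and list the relation labels the model can emit
config = AutoConfig.from_pretrained("Arofat/uzbek-dependency-parser")
print(sorted(config.id2label.values()))
```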

## Usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Arofat/uzbek-dependency-parser")
model = AutoModelForTokenClassification.from_pretrained("Arofat/uzbek-dependency-parser")

# Prepare text ("Men O'zbekistonda yashayman." = "I live in Uzbekistan.")
text = "Men O'zbekistonda yashayman."
tokens = text.split()  # naive whitespace tokenization

# Get predictions
inputs = tokenizer(tokens, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring label for every sub-token
predictions = torch.argmax(outputs.logits, dim=2)
id2label = model.config.id2label

# Keep the label of the first sub-token of each word so tags align with the input words
dep_tags = []
word_ids = inputs.word_ids(batch_index=0)
prev_word_id = None
for idx, word_id in enumerate(word_ids):
    if word_id is None or word_id == prev_word_id:
        continue  # skip special tokens and word continuations
    dep_tags.append(id2label[predictions[0, idx].item()])
    prev_word_id = word_id

# Print one relation label per word
for token, tag in zip(tokens, dep_tags):
    print(f"{token}: {tag}")
```
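
This prints one relation label per word, e.g. lines of the form `Men: nsubj` (illustrative only; actual labels depend on the model's predictions). For quick experiments, the generic `token-classification` pipeline can handle the sub-token aggregation for you; a sketch, assuming the checkpoint works with the pipeline's default pre-tokenization (`aggregation_strategy="first"` keeps the first sub-token's label per word, mirroring the loop above):

```python
from transformers import pipeline

# Wrap the model in a token-classification pipeline
parser = pipeline(
    "token-classification",
    model="Arofat/uzbek-dependency-parser",
    aggregation_strategy="first",
)

# One dict per word; "entity_group" holds the predicted relation label
for word in parser("Men O'zbekistonda yashayman."):
    print(word["word"], word["entity_group"])
```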

## Limitations

This model was trained on a relatively small dataset (roughly 600 annotated sentences) and may not generalize well to all domains of Uzbek text. It also predicts only the dependency relation labels, not the tree structure (head indices), so a complete dependency parse requires an additional head-prediction step.
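
Until such a head predictor is added, one way to make the gap explicit is to serialize the predictions as partial CoNLL-U with the HEAD column left unfilled. A hypothetical helper (not part of this model), reusing `tokens` and `dep_tags` from the usage example above:

```python
def to_partial_conllu(tokens, dep_tags):
    """Format words and predicted relations as CoNLL-U rows with unknown heads."""
    lines = []
    for i, (token, tag) in enumerate(zip(tokens, dep_tags), start=1):
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        # HEAD stays "_" because this model does not predict it.
        lines.append(f"{i}\t{token}\t_\t_\t_\t_\t_\t{tag}\t_\t_")
    return "\n".join(lines)

print(to_partial_conllu(tokens, dep_tags))
```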