ruT5-base_headline_generation

Model Details

T5 Base for Russian news headline generation. The model is fine-tuned for best performance on short news texts (128 words or fewer), but it also achieves decent metrics on longer articles. It generates abstractive headlines that are typically 6 to 11 words long.

Base Model: ai-forever/ruT5-base

Training Details

Training Data: 247 000 news articles in Russian

Training Procedure: 6 epochs; all details and hyperparameters are in this Google Colab notebook

Testing Metrics

  • Rouge1: 40.24
  • Rouge2: 23.05
  • RougeL: 37.57

How to Use

from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "wanderer-msk/ruT5-base_headline_generation"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

news_text = """Земляне продолжают осваивать Марс.
Колонисты уже посадили на красной планете 42 яблони."""

model_input = tokenizer(
    news_text,
    truncation=True,
    max_length=1024,
    return_tensors="pt"
)
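# Unpack the tokenizer output so that generate() receives both
# input_ids and attention_mask.
model_output = model.generate(**model_input)
news_headline = tokenizer.decode(
    model_output[0],  # the first (and only) sequence in the batch
    skip_special_tokens=True
)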

print(news_headline)
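
The example above handles one article at a time. A minimal sketch of batched generation follows, assuming the same `tokenizer` and `model` objects are passed in. The helper name `generate_headlines` is hypothetical, and the `num_beams=4` and `max_new_tokens=32` defaults are illustrative assumptions, not settings documented by the model author.

```python
def generate_headlines(texts, tokenizer, model, num_beams=4, max_new_tokens=32):
    """Generate one headline per input text in a single padded batch.

    `tokenizer` and `model` are expected to be the objects loaded with
    AutoTokenizer / T5ForConditionalGeneration as in the example above.
    The decoding defaults are illustrative assumptions, not tuned values.
    """
    texts = list(texts)
    if not texts:
        return []

    # Pad to the longest text in the batch; the attention mask produced
    # by the tokenizer tells the model which positions are padding.
    batch = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=1024,
        return_tensors="pt",
    )

    # Beam search with a cap on the number of generated tokens.
    output_ids = model.generate(
        **batch,
        num_beams=num_beams,
        max_new_tokens=max_new_tokens,
    )

    headlines = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return [headline.strip() for headline in headlines]
```

With the objects from the example above, `generate_headlines([news_text], tokenizer, model)` would return a one-element list. If the model has been moved to a GPU, the tokenized batch must be moved to the same device before calling `generate`.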