---
library_name: transformers
tags: []
---

# Model Card for siddheshtv/bart-multi-lexsum

Source code: [Google Colab](https://colab.research.google.com/drive/1qnocYiNrF3udkxx1YRwyxTSaeN7F35DK)

## Model Details

### Model Description

Performs abstractive summarization of legal and contractual documents. Fine-tuned from BART-LARGE-CNN.

- **Developed by:** [Siddhesh Kulthe](https://huggingface.co/siddheshtv)
- **License:** MIT
- **Finetuned from model:** [Facebook/BART-LARGE-CNN](https://huggingface.co/facebook/bart-large-cnn)

## Uses

- Abstractive summarization of legal documents (banking, legal, contractual, etc.)

## Sample Usage

Load the model weights and tokenizer:

```python
from transformers import BartForConditionalGeneration, BartTokenizer
import torch

model_name = "siddheshtv/bart-multi-lexsum"
model = BartForConditionalGeneration.from_pretrained(model_name)
tokenizer = BartTokenizer.from_pretrained(model_name)

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
```

Define the summary generation function:

```python
def generate_summary(model, tokenizer, text, max_length=512):
    # Run on whichever device the model already lives on.
    device = next(model.parameters()).device

    # Inputs longer than 1024 tokens are truncated.
    inputs = tokenizer.encode(
        "summarize: " + text,
        return_tensors="pt",
        max_length=1024,
        truncation=True,
    )
    inputs = inputs.to(device)

    # Beam search with a length penalty and no repeated trigrams.
    summary_ids = model.generate(
        inputs,
        max_length=max_length,
        min_length=40,
        length_penalty=2.0,
        num_beams=4,
        early_stopping=True,
        no_repeat_ngram_size=3,
        forced_bos_token_id=0,
        forced_eos_token_id=2,
    )

    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary
```

Generate a summary:

```python
# Placeholder input: replace with the document to summarize.
example_text = "Paste the text of a legal or contractual document here."

generated_summary = generate_summary(model, tokenizer, example_text)
print("Generated Summary:")
print(generated_summary)
```

## Training Data

- **Dataset:** [Multi-LexSum](https://multilexsum.github.io/)
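
## Longer Inputs

The `generate_summary` function above truncates its input to 1024 tokens, so text beyond that limit never reaches the model. One possible workaround, sketched here as an assumption rather than as part of the original training or evaluation setup, is to split a long document into chunks, summarize each chunk, and join the partial summaries. The helper names `split_into_chunks` and `summarize_long_text` and the 600-word default are hypothetical, and word count is only a rough proxy for token count.

```python
def split_into_chunks(text, max_words=600):
    """Split text into consecutive chunks of at most max_words words.

    Assumption: whitespace-separated words stand in for tokens; the real
    token count per chunk depends on the tokenizer and may be higher.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


def summarize_long_text(text, summarize_fn, max_words=600):
    """Summarize each chunk with summarize_fn and join the partial summaries.

    summarize_fn is any callable that maps a string to a summary string.
    """
    chunks = split_into_chunks(text, max_words=max_words)
    return " ".join(summarize_fn(chunk) for chunk in chunks)
```

With the model loaded as in the sample usage, this could be called as `summarize_long_text(example_text, lambda chunk: generate_summary(model, tokenizer, chunk))`. Summarizing chunks independently can lose context that spans chunk boundaries, so the joined result should be treated as a rough draft.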