---
datasets:
- HuggingFaceFW/fineweb
language:
- en
---

# Encoder-Decoder model with DeBERTa encoder

## Pre-trained models

- `deliciouscat/deberta-v3-base-encoder-decoder-v0.2` -> 297,511,524 (~298M) params

## Data used

- `HuggingFaceFW/fineweb`
- AiHub ko-en translation corpus (English part)
- some papers I had collected

## Training hparams

- optimizer: AdamW, lr=3e-5, betas=(0.875, 0.997)
- batch size: 12
- trained for 29,523 steps on a BART-style denoising objective (see the sketch at the end of this card)

## How to use

```python
from transformers import AutoTokenizer, EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.3")
tokenizer = AutoTokenizer.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.3")
```

A generation example is sketched at the end of this card.

## Future work!

- train on more scientific data
- fine-tune on a keyword extraction task
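
## Appendix: what the denoising objective looks like

The card only names the objective ("denoising, BART-style"), so here is a minimal, illustrative sketch of BART text infilling: contiguous spans are collapsed into a single `[MASK]` token on the encoder side, and the decoder is trained to reproduce the original text. The masking ratio and Poisson span length below are BART's published defaults, not confirmed values for this checkpoint.

```python
import numpy as np

rng = np.random.default_rng(0)

def bart_text_infilling(tokens, mask_token="[MASK]", mask_ratio=0.3, poisson_lam=3.0):
    """Replace sampled spans with a single mask token (BART text infilling).

    Illustrative only: the hyperparameters are assumptions, not this card's values.
    """
    target_masked = int(len(tokens) * mask_ratio)
    out, i, masked = [], 0, 0
    while i < len(tokens):
        if masked < target_masked and rng.random() < 0.15:
            span = max(1, int(rng.poisson(poisson_lam)))  # span length ~ Poisson(3)
            out.append(mask_token)  # the whole span collapses into one mask token
            i += span
            masked += span
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = "the quick brown fox jumps over the lazy dog".split()
print(bart_text_infilling(tokens))
# encoder input: the corrupted tokens; decoder target: the original sentence
```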
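
## Appendix: inference sketch

Extending the "How to use" snippet above, the sketch below runs mask infilling with `generate()`. It assumes the saved config may not define `decoder_start_token_id`, in which case it falls back to the tokenizer's CLS token; both that fallback and the single-`[MASK]` input format are assumptions, not documented behavior of this checkpoint.

```python
from transformers import AutoTokenizer, EncoderDecoderModel

model = EncoderDecoderModel.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.3")
tokenizer = AutoTokenizer.from_pretrained("deliciouscat/deberta-v3-base-encoder-decoder-v0.3")

# Fallbacks in case the checkpoint's generation config is incomplete (assumption).
if model.config.decoder_start_token_id is None:
    model.config.decoder_start_token_id = tokenizer.cls_token_id
if model.config.pad_token_id is None:
    model.config.pad_token_id = tokenizer.pad_token_id

text = "The quick brown fox [MASK] over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

# Beam-search reconstruction of the corrupted input.
output_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```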