neavo/modern_bert_multilingual

neavo committed on Jan 31

Commit 437c78f (verified) · 1 Parent(s): 2b82e96

Create README.md

README.md +69 -0

	@@ -0,0 +1,69 @@

+---
+language:
+- zh
+- en
+- ja
+- ko
+pipeline_tag: fill-mask
+---
+### Overview
+- ModernBertMultilingual is a multilingual model trained from scratch, using the [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) architecture.
+- It supports four languages and their variants: `Simplified Chinese`, `Traditional Chinese`, `English`, `Japanese`, and `Korean`.
+- It handles mixed-language East Asian text well.
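
Since the card declares `pipeline_tag: fill-mask`, a quick way to try the model is the `fill-mask` pipeline from the Transformers library. The following is a minimal sketch, not the author's own usage code: the helper names `insert_mask` and `top_predictions` and the `___` placeholder are hypothetical, and it assumes a Transformers release recent enough to include ModernBERT support plus network access to download the weights.

```python
def insert_mask(text, mask_token, placeholder="___"):
    """Replace a single placeholder with the tokenizer's mask token (hypothetical helper)."""
    if text.count(placeholder) != 1:
        raise ValueError("expected exactly one placeholder in the input text")
    return text.replace(placeholder, mask_token)


def top_predictions(text, top_k=3, model_id="neavo/modern_bert_multilingual"):
    """Return (token, score) pairs for the masked position (hypothetical helper)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    # Read the mask token from the tokenizer instead of hard-coding it,
    # because the card does not state which mask token this model uses.
    masked = insert_mask(text, fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(masked, top_k=top_k)]
```

Calling `top_predictions("東京は日本の___です。")` would download the model and return the three highest-scoring candidates for the blank; no output is shown here because it depends on the released weights.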
+### Technical Metrics
+- Trained for approximately `100` hours on `7 × L40` GPUs, covering about `60B` tokens.
+- Main training parameters:
+  - Batch Size: 1792
+  - Learning Rate: 4e-05
+  - Maximum Sequence Length: 512
+  - Optimizer: adamw_torch
+  - LR Scheduler: warmup_stable_decay
+  - Training Precision: bf16 mixed precision
+- For other technical details, refer to the original release information and paper for [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base).
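
The figures above can be cross-checked with some rough arithmetic. This sketch assumes that the batch size of 1792 counts sequences per optimizer step across all GPUs and that every sequence is a full 512 tokens; neither is stated in the card, so the step count is a lower bound and the throughput numbers are only as precise as the "approximately" in the source.

```python
# Approximate figures quoted in the model card.
TOTAL_TOKENS = 60e9
TRAIN_HOURS = 100
NUM_GPUS = 7
BATCH_SIZE = 1792    # assumed: sequences per optimizer step, summed over all GPUs
MAX_SEQ_LEN = 512    # assumed: every sequence is padded or packed to full length


def training_estimates(total_tokens=TOTAL_TOKENS, hours=TRAIN_HOURS, gpus=NUM_GPUS,
                       batch_size=BATCH_SIZE, max_seq_len=MAX_SEQ_LEN):
    """Derive rough step-count and throughput estimates from the quoted figures."""
    max_tokens_per_step = batch_size * max_seq_len
    tokens_per_second = total_tokens / (hours * 3600)
    return {
        # Upper bound on tokens consumed by one optimizer step.
        "max_tokens_per_step": max_tokens_per_step,
        # Lower bound on steps: shorter sequences would require more steps.
        "min_optimizer_steps": total_tokens / max_tokens_per_step,
        "tokens_per_second": tokens_per_second,
        "tokens_per_second_per_gpu": tokens_per_second / gpus,
    }


estimates = training_estimates()
for name, value in estimates.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions, one step covers at most 917,504 tokens, which puts the run at roughly 65 thousand optimizer steps and on the order of 24 thousand tokens per second per GPU.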
+### Release Versions
+- Three different weight versions are provided:
+  - base: Fully trained on general-purpose base data; suitable for text from a wide range of domains (default).
+  - nodecay: The checkpoint taken before the annealing phase begins; you can anneal it on domain-specific data to adapt it to a target domain.
+  - keyword_gacha_multilingual: The version annealed on ACGN (anime, comics, games, novels) texts, e.g., `light novels`, `game scripts`, and `comic scripts`.
+| Model | Version | Variant |
+| :--: | :--: | :--: |
+| [modern_bert_multilingual](https://huggingface.co/neavo/modern_bert_multilingual) | 20250128 | base |
+| [modern_bert_multilingual_nodecay](https://huggingface.co/neavo/modern_bert_multilingual_nodecay) | 20250128 | nodecay |
+| [keyword_gacha_base_multilingual](https://huggingface.co/neavo/keyword_gacha_base_multilingual) | 20250128 | keyword_gacha_multilingual |
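
To see where the `nodecay` checkpoint sits relative to the other two, it helps to look at the shape of a warmup-stable-decay learning-rate schedule. The sketch below is illustrative only: the function name `wsd_lr`, the warmup and decay lengths, and the linear decay shape are assumptions, since the card gives only the scheduler name and the peak learning rate of `4e-05`.

```python
def wsd_lr(step, total_steps, warmup_steps, decay_steps, peak_lr=4e-5):
    """Learning rate at `step` for a warmup-stable-decay schedule (illustrative sketch).

    The phase lengths and the linear decay are assumptions, not the author's settings.
    """
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    if step < decay_start or decay_steps == 0:
        # Stable: hold the peak learning rate.
        return peak_lr
    # Decay (annealing): ramp linearly from the peak down to 0.
    remaining = max(total_steps - step, 0)
    return peak_lr * remaining / decay_steps


# Hypothetical phase lengths chosen only to make the three phases visible.
TOTAL, WARMUP, DECAY = 1000, 100, 200
for s in (0, 50, 100, 799, 900, 1000):
    print(s, wsd_lr(s, TOTAL, WARMUP, DECAY))
```

In this picture, a checkpoint like `nodecay` would be saved at the end of the stable phase (step 800 in the toy numbers), while the `base` and `keyword_gacha_multilingual` variants would each run the decay phase from that point on different data.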
+### Others
+- The training script is available on [GitHub](https://github.com/neavo/KeywordGachaModel).