neavo committed on
Commit 437c78f · verified · 1 Parent(s): 2b82e96

Create README.md

Files changed (1)
  1. README.md +69 -0
README.md ADDED
@@ -0,0 +1,69 @@
+ ---
+ language:
+ - zh
+ - en
+ - ja
+ - ko
+ pipeline_tag: fill-mask
+ ---
+
+ ### Overview
+ - ModernBertMultilingual is a multilingual model trained from scratch on the [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) architecture.
+ - It supports four languages and their variants: Chinese (`Simplified Chinese` and `Traditional Chinese`), `English`, `Japanese`, and `Korean`.
+ - It handles mixed-language text tasks in East Asian languages effectively.
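
Since the model card declares `pipeline_tag: fill-mask`, a masked-token query is the most direct way to try the model. The following is a minimal sketch, not the author's documented usage: the helper names `mask_span` and `top_predictions` are hypothetical, and it assumes the checkpoint loads through the standard `transformers` `fill-mask` pipeline with a release recent enough to include ModernBERT support. The mask token is read from the tokenizer rather than hardcoded, because the README does not state it.

```python
def mask_span(text: str, span: str, mask_token: str) -> str:
    """Replace the first occurrence of `span` in `text` with the mask token."""
    if span not in text:
        raise ValueError(f"{span!r} does not occur in the input text")
    return text.replace(span, mask_token, 1)


def top_predictions(text: str, span: str,
                    model_id: str = "neavo/modern_bert_multilingual",
                    top_k: int = 5):
    """Mask `span` in `text` and return (token, score) pairs for the gap.

    Assumption: `model_id` is loadable by the fill-mask pipeline.
    The import is deferred so the pure helper above works without transformers.
    """
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    masked = mask_span(text, span, fill.tokenizer.mask_token)
    # With a single mask, the pipeline returns a list of dicts that include
    # "token_str" and "score".
    return [(r["token_str"], r["score"]) for r in fill(masked, top_k=top_k)]
```

Calling `top_predictions("东京は日本の首都です", "首都")` would download the weights and query the model on a mixed-script sentence; the span should correspond to a single token in the model's vocabulary for the comparison to be meaningful.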
+
+ ### Technical Metrics
+ - Trained for approximately `100` hours on `L40*7` (seven L40 GPUs), with a training volume of about `60B` tokens.
+ - Main training parameters:
+     - Batch Size: 1792
+     - Learning Rate: 4e-05
+     - Maximum Sequence Length: 512
+     - Optimizer: adamw_torch
+     - LR Scheduler: warmup_stable_decay
+     - Train Precision: bf16 mixed precision
+ - For additional technical metrics, please refer to the original release information and paper of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base).
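
The figures above can be cross-checked with some back-of-the-envelope arithmetic. This sketch makes two assumptions that the README does not confirm: that `1792` is the global batch size (summed over all devices), and that every sequence is filled to the full 512 tokens, so the step count is a lower bound and the throughput an upper-bound style estimate. The function names are illustrative only.

```python
def tokens_per_step(batch_size: int, max_seq_len: int) -> int:
    """Upper bound on tokens consumed per optimizer step (full-length sequences)."""
    return batch_size * max_seq_len


def estimated_steps(total_tokens: float, batch_size: int, max_seq_len: int) -> float:
    """Approximate number of optimizer steps needed to see `total_tokens`."""
    return total_tokens / tokens_per_step(batch_size, max_seq_len)


def tokens_per_second(total_tokens: float, hours: float, num_gpus: int = 1) -> float:
    """Average per-GPU throughput over the whole run."""
    return total_tokens / (hours * 3600) / num_gpus


# Values quoted in the README; all are approximate ("about 60B", "about 100 hours").
TOTAL_TOKENS = 60e9
steps = estimated_steps(TOTAL_TOKENS, batch_size=1792, max_seq_len=512)
per_gpu = tokens_per_second(TOTAL_TOKENS, hours=100, num_gpus=7)
print(f"tokens/step = {tokens_per_step(1792, 512)}")
print(f"steps ~ {steps:,.0f}, per-GPU throughput ~ {per_gpu:,.0f} tokens/s")
```

Under these assumptions the run comes to roughly 65 thousand optimizer steps at a little under 24 thousand tokens per second per GPU.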
+
+ ### Release Versions
+ - Three weight versions are provided:
+     - base: Fully trained on a general-purpose base corpus; suitable for text from a wide range of domains (default).
+     - nodecay: The checkpoint saved before the annealing phase begins; you can anneal it on domain-specific data to better adapt it to a target domain.
+     - keyword_gacha_multilingual: Annealed on ACGN-related texts (e.g., `light novels`, `game scripts`, `comic scripts`).
+
+ | Model | Version | Description |
+ | :--: | :--: | :--: |
+ | [modern_bert_multilingual](https://huggingface.co/neavo/modern_bert_multilingual) | 20250128 | base |
+ | [modern_bert_multilingual_nodecay](https://huggingface.co/neavo/modern_bert_multilingual_nodecay) | 20250128 | nodecay |
+ | [keyword_gacha_base_multilingual](https://huggingface.co/neavo/keyword_gacha_base_multilingual) | 20250128 | keyword_gacha_multilingual |
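
The `nodecay` checkpoint makes most sense in light of the `warmup_stable_decay` scheduler: the learning rate ramps up, holds at its peak, and only drops during a final annealing phase, so a checkpoint taken just before that phase can be annealed on different data. The sketch below illustrates the shape of such a schedule. The phase fractions (`warmup_frac`, `decay_frac`), the linear decay, and the final rate of zero are assumptions chosen for illustration; the README gives only the peak rate of `4e-05`, and the real training script may differ.

```python
def wsd_lr(step: int, total_steps: int, peak_lr: float = 4e-5,
           warmup_frac: float = 0.01, decay_frac: float = 0.1,
           min_lr: float = 0.0) -> float:
    """Learning rate at `step` for a warmup-stable-decay schedule.

    warmup_frac and decay_frac are hypothetical values, not taken from the README.
    """
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    stable_end = total_steps - decay_steps  # a "nodecay"-style checkpoint sits here

    if warmup_steps and step < warmup_steps:
        # Linear warmup from peak_lr / warmup_steps up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    if step < stable_end:
        # Stable phase: constant peak learning rate.
        return peak_lr
    # Decay (annealing) phase: linear interpolation from peak_lr to min_lr.
    progress = min(1.0, (step - stable_end) / max(1, decay_steps))
    return peak_lr + (min_lr - peak_lr) * progress
```

In this picture, annealing on domain data amounts to resuming from the `stable_end` checkpoint and running the decay phase on the new corpus, which is how the `keyword_gacha_multilingual` variant is described.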
+
+ ### Others
+ - The training script is available on [GitHub](https://github.com/neavo/KeywordGachaModel).
+
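+ ### Overview
+ - ModernBertMultilingual is a multilingual model trained from scratch, using the [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) architecture.
+ - It supports four languages and their variants (`Simplified Chinese`, `Traditional Chinese`, `English`, `Japanese`, `Korean`) and handles mixed-text tasks in East Asian languages well.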
+
+ ### Technical Metrics
+ - Trained for about `100` hours on `L40*7` devices, with a training volume of about `60B` tokens.
+ - Main training parameters:
+     - Batch Size: 1792
+     - Learning Rate: 4e-05
+     - Maximum Sequence Length: 512
+     - Optimizer: adamw_torch
+     - LR Scheduler: warmup_stable_decay
+     - Train Precision: bf16 mixed precision
+ - For the remaining technical metrics, see the original release information and paper of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base).
+
+ ### Release Versions
+ - Three weight versions are provided:
+     - base: The version fully trained on a general base corpus; it works well on text from many different domains (default).
+     - nodecay: The checkpoint from before the annealing phase begins; you can anneal from these weights with added domain corpora to better fit a target domain.
+     - keyword_gacha_multilingual: The version annealed on ACGN texts (e.g., `light novels`, `game scripts`, `comic scripts`).
+
+ | Model | Version | Description |
+ | :--: | :--: | :--: |
+ | [modern_bert_multilingual](https://huggingface.co/neavo/modern_bert_multilingual) | 20250128 | base |
+ | [modern_bert_multilingual_nodecay](https://huggingface.co/neavo/modern_bert_multilingual_nodecay) | 20250128 | nodecay |
+ | [keyword_gacha_base_multilingual](https://huggingface.co/neavo/keyword_gacha_base_multilingual) | 20250128 | keyword_gacha_multilingual |
+
+ ### Others
+ - Training script: [GitHub](https://github.com/neavo/KeywordGachaModel)