Commit ecf10ed
1 Parent(s): 570e635
Add vision tag
README.md CHANGED
@@ -113,6 +113,33 @@ af = torch.mean(torch.stack([bella, sarah]), dim=0)
 assert torch.equal(af, torch.load('voices/af.pt', weights_only=True))
 ```
 
+### Pretraining
+
+For all pre-training related hyperparameters, we refer to page 15 of the [original paper](https://arxiv.org/abs/2106.08254).
+
+## Evaluation results
+
+For evaluation results on several image classification benchmarks, we refer to tables 1 and 2 of the original paper. Note that for fine-tuning, the best results are obtained with a higher resolution. Of course, increasing the model size will result in better performance.
+
+### BibTeX entry and citation info
+
+```@article{DBLP:journals/corr/abs-2106-08254,
+author = {Hangbo Bao and
+Li Dong and
+Furu Wei},
+title = {BEiT: {BERT} Pre-Training of Image Transformers},
+journal = {CoRR},
+volume = {abs/2106.08254},
+year = {2021},
+url = {https://arxiv.org/abs/2106.08254},
+archivePrefix = {arXiv},
+eprint = {2106.08254},
+timestamp = {Tue, 29 Jun 2021 16:55:04 +0200},
+biburl = {https://dblp.org/rec/journals/corr/abs-2106-08254.bib},
+bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+```
+
 ### Training Details
 
 **Compute:** Darwin-AI v0.19 was trained on A100 80GB vRAM instances for approximately 500 total GPU hours. The average cost for each GPU hour was around $0.80, so the total cost was around $400.