The CrystalChat-7B based multi-modal large language model (MLLM) mimics the training recipe used for the Vicuna-7B based [LLaVa-v1.5](https://huggingface.co/docs/transformers/main/model_doc/llava). CrystalChat-7B based MLLMs are fully transparent: all materials, including code, data, model checkpoints, intermediate results, and more, are open-sourced at [TODO: Add paper link](). CrystalChat-7B-Web2Code is an MLLM specialized in webpage image-to-HTML code generation.
## Evaluations

| LLM Backbone | DWCG | DWU | DWCG<sub>R</sub> | DWU<sub>R</sub> | VSA ↑ | CAD ↑ | TCC ↑ | UII ↑ | Overall ↑ |
|------------------------|------|-----|------------------|-----------------|-----------|-----------|-----------|-----------|------------|
| **CrystalChat-7B**     |      |     |                  |                 | 4.714     | 4.572     | 4.865     | 5.147     | 4.825      |
|                        | ✓    |     |                  |                 | 7.900     | 8.001     | 8.204     | 8.215     | 8.080      |
|                        | ✓    | ✓   |                  |                 | 7.900     | 8.001     | 8.204     | 8.215     | 8.080      |
|                        | ✓    | ✓   | ✓                | ✓               | **8.384** | **8.287** | **8.417** | **8.488** | **8.394**  |
| **Vicuna-7B**          |      |     |                  |                 | 3.042     | 3.250     | 3.333     | 3.167     | 3.198      |
|                        | ✓    |     |                  |                 | 6.871     | 6.660     | 6.589     | 6.897     | 6.754      |
|                        |      | ✓   |                  |                 | 3.898     | 3.489     | 3.340     | 3.651     | 3.595      |
|                        | ✓    | ✓   | ✓                | ✓               | **7.876** | **7.687** | **7.267** | **7.563** | **7.598**  |

**Table:** Performance comparison of various LLM backbones trained on different combinations of the DWCG, DWU, DWCG<sub>R</sub>, and DWU<sub>R</sub> datasets. A check mark (✓) indicates that the dataset was used; the arrows (↑) indicate that higher values are better.
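The Overall column appears to be the unweighted mean of the four metric scores (VSA, CAD, TCC, UII). A short sanity check over a few rows confirms this; the row labels and the mean assumption below are ours, not from the source:

```python
# Check that Overall ≈ mean(VSA, CAD, TCC, UII) for rows taken from
# the table above. Row labels are descriptive guesses, not official names.
rows = {
    "CrystalChat-7B, no extra data":  ([4.714, 4.572, 4.865, 5.147], 4.825),
    "CrystalChat-7B, all 4 datasets": ([8.384, 8.287, 8.417, 8.488], 8.394),
    "Vicuna-7B, no extra data":       ([3.042, 3.250, 3.333, 3.167], 3.198),
    "Vicuna-7B, all 4 datasets":      ([7.876, 7.687, 7.267, 7.563], 7.598),
}

for name, (metrics, reported_overall) in rows.items():
    mean = sum(metrics) / len(metrics)
    print(f"{name}: mean={mean:.4f}, reported={reported_overall}")
    # Allow for rounding to three decimals in the reported column.
    assert abs(mean - reported_overall) < 1e-3
```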
### About CrystalChat-7B-Web2Code:
* 7 billion parameter LLM
* CLIP ViT-L/14-336px vision encoder