Commit 69812b4 by Xin Lai · 1 parent: c85567c
Update README.md
Former-commit-id: 83bd1145fdd2f20f02446944fbd36633fe982e6d
README.md
CHANGED
@@ -1,10 +1,10 @@
1 |   | # LISA: Reasoning Segmentation via Large Language Model
2 |   |
3 | - | <
4 |   |
5 | - | <font size=50><div align='center'
6 |   |
7 | - |
8 |   |
9 |   | <p align="center"> <img src="imgs/teaser.png" width="100%"> </p>
10 |   |
@@ -17,10 +17,6 @@
17 |   | - [ ] ReasonSeg Dataset Release
18 |   | - [ ] Training Code Release
19 |   |
20 | - | ## Abstract
21 | - | In this work, we propose a new segmentation task --- ***reasoning segmentation***. The task is designed to output a segmentation mask given a complex and implicit query text. We establish a benchmark comprising over one thousand image-instruction pairs, incorporating intricate reasoning and world knowledge for evaluation purposes. Finally, we present LISA: Large-language Instructed Segmentation Assistant, which inherits the language generation capabilities of the multi-modal Large Language Model (LLM) while also possessing the ability to produce segmentation masks.
22 | - | For more details, please refer to:
23 | - |
24 |   | **LISA: Reasoning Segmentation Via Large Language Model [[Paper](https://arxiv.org/pdf/2308.00692.pdf)]** <br />
25 |   | [Xin Lai](https://scholar.google.com/citations?user=tqNDPA4AAAAJ&hl=zh-CN),
26 |   | [Zhuotao Tian](https://scholar.google.com/citations?user=mEjhz-IAAAAJ&hl=en),
@@ -30,7 +26,9 @@ For more details, please refer to:
30 |   | [Shu Liu](https://scholar.google.com.hk/citations?user=BUEDUFkAAAAJ&hl=zh-CN),
31 |   | [Jiaya Jia](https://scholar.google.com/citations?user=XPAkzTEAAAAJ&hl=en)<br />
32 |   |
33 | - |
34 |   |
35 |   | ## Highlights
36 |   | **LISA** unlocks the new segmentation capabilities of multi-modal LLMs, and can handle cases involving:
1 |   | # LISA: Reasoning Segmentation via Large Language Model
2 |   |
3 | + | <div align='center'><b>LISA</b>: Large <b>L</b>anguage <b>I</b>nstructed <b>S</b>egmentation <b>A</b>ssistant</div>
4 |   |
5 | + | <font size=50><div align='center' > <a href=https://arxiv.org/abs/2308.00692>Paper</a> | <a href=https://huggingface.co/xinlai/LISA-13B-llama2-v0>Model</a> | <a>Demo (Comming Soon)</a> </div></font>
6 |   |
7 | + | <p align="center"> <img src="imgs/fig_overview.png" width="100%"> </p>
8 |   |
9 |   | <p align="center"> <img src="imgs/teaser.png" width="100%"> </p>
10 |   |
17 |   | - [ ] ReasonSeg Dataset Release
18 |   | - [ ] Training Code Release
19 |   |
20 |   | **LISA: Reasoning Segmentation Via Large Language Model [[Paper](https://arxiv.org/pdf/2308.00692.pdf)]** <br />
21 |   | [Xin Lai](https://scholar.google.com/citations?user=tqNDPA4AAAAJ&hl=zh-CN),
22 |   | [Zhuotao Tian](https://scholar.google.com/citations?user=mEjhz-IAAAAJ&hl=en),
26 |   | [Shu Liu](https://scholar.google.com.hk/citations?user=BUEDUFkAAAAJ&hl=zh-CN),
27 |   | [Jiaya Jia](https://scholar.google.com/citations?user=XPAkzTEAAAAJ&hl=en)<br />
28 |   |
29 | + | ## Abstract
30 | + | In this work, we propose a new segmentation task --- ***reasoning segmentation***. The task is designed to output a segmentation mask given a complex and implicit query text. We establish a benchmark comprising over one thousand image-instruction pairs, incorporating intricate reasoning and world knowledge for evaluation purposes. Finally, we present LISA: Large-language Instructed Segmentation Assistant, which inherits the language generation capabilities of the multi-modal Large Language Model (LLM) while also possessing the ability to produce segmentation masks.
31 | + | For more details, please refer to:
32 |   |
33 |   | ## Highlights
34 |   | **LISA** unlocks the new segmentation capabilities of multi-modal LLMs, and can handle cases involving: