x-lai
committed on
Commit
·
077a7c2
1
Parent(s):
ac3ddc8
update README.md
Former-commit-id: 553af2f52342569706b7c82b005f6f66617f0934
README.md
CHANGED
@@ -1,7 +1,19 @@
|
|
1 |
# LISA: Reasoning Segmentation Via Large Language Model
|
2 |
|
3 |
-
This is the official implementation of ***LISA (
|
4 |
|
5 |
In this work, we propose a new segmentation task --- ***reasoning segmentation***. The task is designed to output a segmentation mask given a complex and implicit query text. We establish a benchmark comprising over one thousand image-instruction pairs, incorporating intricate reasoning and world knowledge for evaluation purposes. Finally, we present LISA: Large-language Instructed Segmentation Assistant, which inherits the language generation capabilities of the multi-modal Large Language Model (LLM) while also possessing the ability to produce segmentation masks.
|
6 |
For more details, please refer to:
|
7 |
|
@@ -16,15 +28,9 @@ For more details, please refer to:
|
|
16 |
|
17 |
<p align="center"> <img src="imgs/fig_overview_v6_crop.png" width="100%"> </p>
|
18 |
|
19 |
-
|
20 |
-
LISA can handle cases involving: 1) complex reasoning; 2) world knowledge; 3) explanatory answers; 4) multi-turn conversation. It demonstrates robust zero-shot capability when trained exclusively on reasoning-free datasets.
|
21 |
-
<p align="center"> <img src="imgs/fig_teaser4_crop.png" width="100%"> </p>
|
22 |
-
|
23 |
<p align="center"> <img src="imgs/Table1.png" width="80%"> </p>
|
24 |
|
25 |
-
### Others
|
26 |
-
Code and models will be released in the future.
|
27 |
-
|
28 |
## Citation
|
29 |
If you find this project useful in your research, please consider citing:
|
30 |
|
@@ -38,6 +44,5 @@ If you find this project useful in your research, please consider citing:
|
|
38 |
|
39 |
```
|
40 |
|
41 |
-
|
42 |
## Acknowledgement
|
43 |
- This work is built upon [LLaMA](https://github.com/facebookresearch/llama), [SAM](https://github.com/facebookresearch/segment-anything), and [LLaVA](https://github.com/haotian-liu/LLaVA).
|
|
|
1 |
# LISA: Reasoning Segmentation Via Large Language Model
|
2 |
|
3 |
+
This is the official implementation of ***LISA (Large Language Instructed Segmentation Assistant)***.
|
4 |
|
5 |
+
## News
|
6 |
+
- [x] [2023.8.2] Paper is released and GitHub repo is created.
|
7 |
+
|
8 |
+
## TODO
|
9 |
+
- [ ] Huggingface Demo
|
10 |
+
- [ ] ReasonSeg Dataset Release
|
11 |
+
- [ ] Code and Models Release
|
12 |
+
|
13 |
+
LISA can handle cases involving: 1) complex reasoning; 2) world knowledge; 3) explanatory answers; 4) multi-turn conversation. It demonstrates robust zero-shot capability when trained exclusively on reasoning-free datasets.
|
14 |
+
<p align="center"> <img src="imgs/fig_teaser4_crop.png" width="100%"> </p>
|
15 |
+
|
16 |
+
## Abstract
|
17 |
In this work, we propose a new segmentation task --- ***reasoning segmentation***. The task is designed to output a segmentation mask given a complex and implicit query text. We establish a benchmark comprising over one thousand image-instruction pairs, incorporating intricate reasoning and world knowledge for evaluation purposes. Finally, we present LISA: Large-language Instructed Segmentation Assistant, which inherits the language generation capabilities of the multi-modal Large Language Model (LLM) while also possessing the ability to produce segmentation masks.
|
18 |
For more details, please refer to:
|
19 |
|
|
|
28 |
|
29 |
<p align="center"> <img src="imgs/fig_overview_v6_crop.png" width="100%"> </p>
|
30 |
|
31 |
+
## Experimental results
|
32 |
<p align="center"> <img src="imgs/Table1.png" width="80%"> </p>
|
33 |
|
34 |
## Citation
|
35 |
If you find this project useful in your research, please consider citing:
|
36 |
|
|
|
44 |
|
45 |
```
|
46 |
|
|
|
47 |
## Acknowledgement
|
48 |
- This work is built upon [LLaMA](https://github.com/facebookresearch/llama), [SAM](https://github.com/facebookresearch/segment-anything), and [LLaVA](https://github.com/haotian-liu/LLaVA).
|