Update app.py
app.py
CHANGED
@@ -93,36 +93,39 @@ def predict(img):
 
 
 with gr.Blocks() as demo:
-gr.Markdown(
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+gr.Markdown(
+"""
+# Crowd Counting based on SASNet
+<p>
+This space implements crowd counting following the paper of Song et al. (2021). The model is a VGG16 base with MultiBranch-Channels. For more details see the official publication at AAAI.
+Training data is the Shanghai-Tech A/B data set with Gaussian augmentation for density map creation. The data set annotates more than 300k people.
+</p>
+
+## Abstract
+<p>
+In this paper, we address the large scale variation problem in crowd counting by taking full advantage of the multi-scale feature representations in a multi-level network. We
+implement such an idea by keeping the counting error of a patch as small as possible with a proper feature level selection strategy, since a specific feature level tends to perform
+better for a certain range of scales. However, without scale annotations, it is sub-optimal and error-prone to manually assign the predictions for heads of different scales to
+specific feature levels. Therefore, we propose a Scale-Adaptive Selection Network (SASNet), which automatically learns the internal correspondence between the scales and the feature
+levels. Instead of directly using the predictions from the most appropriate feature level as the final estimation, our SASNet also considers the predictions from other feature
+levels via weighted average, which helps to mitigate the gap between discrete feature levels and continuous scale variation. Since the heads in a local patch share roughly a same
+scale, we conduct the adaptive selection strategy in a patch-wise style. However, pixels within a patch contribute different counting errors due to the various difficulty degrees of
+learning. Thus, we further propose a Pyramid Region Awareness Loss (PRA Loss) to recursively select the most hard sub-regions within a patch until reaching the pixel level. With
+awareness of whether the parent patch is over-estimated or under-estimated, the fine-grained optimization with the PRA Loss for these region-aware hard pixels helps to alleviate the
+inconsistency problem between training target and evaluation metric. The state-of-the-art results on four datasets demonstrate the superiority of our approach.
+</p>
+
+## Demo
+"""
+)
 with gr.Row():
 with gr.Column():
 gr.Markdown(
-
+"""
 Upload an image or use one of the examples to let the model count your crowd. The estimated density map is plotted as well. Have fun!
 Visit my [**GitHub**](https://github.com/MalteLeuschner/CrowdCounting_SASNet) for more!
-
+"""
+)
 with gr.Column():
 text_output = gr.Label()
 with gr.Row():
@@ -140,11 +143,12 @@ with gr.Blocks() as demo:
 
 gr.Examples(["IMG_1.jpg", "IMG_2.jpg", "IMG_3.jpg"], image_input)
 
-gr.Markdown(
-
-
-
-
+gr.Markdown(
+"""
+## References
+The code will be available at: https://github.com/TencentYoutuResearch/CrowdCounting-SASNet.
+
+Song, Q., Wang, C., Wang, Y., Tai, Y., Wang, C., Li, J., … Ma, J. (2021). To Choose or to Fuse? Scale Selection for Crowd Counting. The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21).
 """)
 
 image_button.click(predict, inputs=image_input, outputs=[text_output, image_output])
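The description added in this commit mentions Gaussian density maps built from head annotations. As a rough illustration of that idea only, the sketch below places one normalised Gaussian per annotated head and sums the resulting map to recover the count. It is not taken from `app.py` or from the SASNet code: the function names, the fixed `sigma`, and the border renormalisation are all assumptions made for this example.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a (2*radius+1) x (2*radius+1) Gaussian kernel that sums to 1."""
    size = 2 * radius + 1
    kernel = [
        [math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def density_map(points, height, width, sigma=2.0):
    """Build a density map from (row, col) head annotations.

    Hypothetical helper: a fixed sigma is assumed for every head.
    """
    radius = int(3 * sigma)
    kernel = gaussian_kernel(sigma, radius)
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        # Keep only the kernel cells that fall inside the image.
        cells = []
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                y, x = r + dy, c + dx
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy + radius][dx + radius]))
        if not cells:
            continue  # annotation lies entirely outside the image
        # Renormalise so each head contributes exactly 1, even near a border.
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / mass
    return dmap


def count_from_density(dmap):
    """The estimated count is the sum over all density-map pixels."""
    return sum(map(sum, dmap))


heads = [(5, 5), (10, 20), (0, 0)]  # made-up annotations for illustration
dmap = density_map(heads, height=32, width=32)
estimated_count = count_from_density(dmap)
```

Because every head contributes a total mass of one, summing the map gives back the number of annotated heads. A count can be read off a predicted density map in the same way.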