leuschnm committed on
Commit cc56f52 · 1 Parent(s): a555162

Update app.py

Files changed (1)
  1. app.py +34 -30
app.py CHANGED
@@ -93,36 +93,39 @@ def predict(img):
 
 
  with gr.Blocks() as demo:
- gr.Markdown("""
- # Crowd Counting based on SASNet
- <p>
- This space implements crowd counting following the paper of Song et. al (2021). The model is a VGG16 base with MultiBranch-Channels. For more details see the official publication on AAAI.
- Training data is the Shanghai-Tech A/B data set with Gaussian augmentation for density map creation. The data set annotates more than 300k people.
- </p>
-
- ## Abstract
- <p>
- In this paper, we address the large scale variation problem in crowd counting by taking full advantage of the multi-scale feature representations in a multi-level network. We
- implement such an idea by keeping the counting error of a patch as small as possible with a proper feature level selection strategy, since a specific feature level tends to perform
- better for a certain range of scales. However, without scale annotations, it is sub-optimal and error-prone to manually assign the predictions for heads of different scales to
- specific feature levels. Therefore, we propose a Scale-Adaptive Selection Network (SASNet), which automatically learns the internal correspondence between the scales and the feature
- levels. Instead of directly using the predictions from the most appropriate feature level as the final estimation, our SASNet also considers the predictions from other feature
- levels via weighted average, which helps to mitigate the gap between discrete feature levels and continuous scale variation. Since the heads in a local patch share roughly a same
- scale, we conduct the adaptive selection strategy in a patch-wise style. However, pixels within a patch contribute different counting errors due to the various difficulty degrees of
- learning. Thus, we further propose a Pyramid Region Awareness Loss (PRA Loss) to recursively select the most hard sub-regions within a patch until reaching the pixel level. With
- awareness of whether the parent patch is over-estimated or under-estimated, the fine-grained optimization with the PRA Loss for these region-aware hard pixels helps to alleviate the
- inconsistency problem between training target and evaluation metric. The state-of-the-art results on four datasets demonstrate the superiority of our approach.
- </p>
-
- ## Demo
- """)
+ gr.Markdown(
+ """
+ # Crowd Counting based on SASNet
+ <p>
+ This space implements crowd counting following the paper of Song et. al (2021). The model is a VGG16 base with MultiBranch-Channels. For more details see the official publication on AAAI.
+ Training data is the Shanghai-Tech A/B data set with Gaussian augmentation for density map creation. The data set annotates more than 300k people.
+ </p>
+
+ ## Abstract
+ <p>
+ In this paper, we address the large scale variation problem in crowd counting by taking full advantage of the multi-scale feature representations in a multi-level network. We
+ implement such an idea by keeping the counting error of a patch as small as possible with a proper feature level selection strategy, since a specific feature level tends to perform
+ better for a certain range of scales. However, without scale annotations, it is sub-optimal and error-prone to manually assign the predictions for heads of different scales to
+ specific feature levels. Therefore, we propose a Scale-Adaptive Selection Network (SASNet), which automatically learns the internal correspondence between the scales and the feature
+ levels. Instead of directly using the predictions from the most appropriate feature level as the final estimation, our SASNet also considers the predictions from other feature
+ levels via weighted average, which helps to mitigate the gap between discrete feature levels and continuous scale variation. Since the heads in a local patch share roughly a same
+ scale, we conduct the adaptive selection strategy in a patch-wise style. However, pixels within a patch contribute different counting errors due to the various difficulty degrees of
+ learning. Thus, we further propose a Pyramid Region Awareness Loss (PRA Loss) to recursively select the most hard sub-regions within a patch until reaching the pixel level. With
+ awareness of whether the parent patch is over-estimated or under-estimated, the fine-grained optimization with the PRA Loss for these region-aware hard pixels helps to alleviate the
+ inconsistency problem between training target and evaluation metric. The state-of-the-art results on four datasets demonstrate the superiority of our approach.
+ </p>
+
+ ## Demo
+ """
+ )
  with gr.Row():
  with gr.Column():
  gr.Markdown(
- """
+ """
  Upload an image or use some of the example to let the model count your crowd. The estimated density map is plotted as well. Have fun!
  Visit my [**github**](https://github.com/MalteLeuschner/CrowdCounting_SASNet) for more!
- """
+ """
+ )
  with gr.Column():
  text_output = gr.Label()
  with gr.Row():
@@ -140,11 +143,12 @@ with gr.Blocks() as demo:
 
  gr.Examples(["IMG_1.jpg", "IMG_2.jpg", "IMG_3.jpg"], image_input)
 
- gr.Markdown("""
- ## References
- The code will be available at: https://github.com/TencentYoutuResearch/CrowdCounting-SASNet.
-
- Song, Q., Wang, C., Wang, Y., Tai, Y., Wang, C., Li, J., … Ma, J. (2021). To Choose or to Fuse? Scale Selection for Crowd Counting. The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21).
+ gr.Markdown(
+ """
+ ## References
+ The code will be available at: https://github.com/TencentYoutuResearch/CrowdCounting-SASNet.
+
+ Song, Q., Wang, C., Wang, Y., Tai, Y., Wang, C., Li, J., … Ma, J. (2021). To Choose or to Fuse? Scale Selection for Crowd Counting. The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21).
  """)
 
  image_button.click(predict, inputs=image_input, outputs=[text_output, image_output])
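
Note on the abstract quoted in the Markdown text: it contrasts using only the most appropriate feature level with fusing all levels "via weighted average". The snippet below is a minimal sketch of that contrast for a single patch. It is not taken from app.py or from the SASNet code; the function names, the per-level confidence scores, and the use of a softmax to turn scores into weights are all assumptions made for illustration.

```python
import math


def fuse_level_predictions(level_counts, level_scores):
    """Fuse per-level count predictions for one patch by a weighted average.

    Hypothetical helper: level_counts[i] is the count predicted by feature
    level i, level_scores[i] is an assumed confidence score for that level.
    """
    if not level_counts or len(level_counts) != len(level_scores):
        raise ValueError("need one score per level and at least one level")
    # Softmax over the scores (shifted by the max for numerical stability).
    top = max(level_scores)
    exps = [math.exp(s - top) for s in level_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted average of the per-level counts.
    fused = sum(w * c for w, c in zip(weights, level_counts))
    return fused, weights


def select_level_prediction(level_counts, level_scores):
    """Hard selection: keep only the prediction of the highest-scoring level."""
    best = max(range(len(level_scores)), key=level_scores.__getitem__)
    return level_counts[best]


if __name__ == "__main__":
    counts = [10.0, 20.0, 30.0]   # illustrative per-level counts for one patch
    scores = [0.1, 2.0, 0.5]      # illustrative confidence scores
    fused, weights = fuse_level_predictions(counts, scores)
    print("weights:", weights)
    print("fused count:", fused)
    print("hard selection:", select_level_prediction(counts, scores))
```

With equal scores the fused value is the plain mean of the level predictions. As one score grows relative to the others, the fused value approaches the hard selection.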