fffiloni committed on
Commit
7ab5a05
·
verified ·
1 Parent(s): cd2500a

Update app.py

  1. app.py +2 -0
app.py CHANGED
@@ -29,6 +29,8 @@ For example, if visual elements is like this:
  An older man wearing a brown hat and glasses, with a beard and a mustache, is looking directly at the camera.
  An older man wearing a brown hat and glasses, with a beard and a beard on his chin, is looking at the camera."
  It does not mean there are 3 older men, but it means this is the same man. Because we have extracted close frames from the video sequence.
+ So with his in mind, what you must understand from this list is actually :
+ "The older man wearing a brown hat and glasses, with a beard is doing some stuff"

  Audio events are actually the entire scene description based on the audio of the video.
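
The added prompt lines ask the model to read near-identical captions from close frames as one subject. The same idea could also be approximated before the prompt is built, by collapsing consecutive captions that are almost the same. The following is only a minimal sketch under that assumption and is not part of app.py: `collapse_similar_captions`, the sample `captions` list (including the third caption), and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def collapse_similar_captions(captions, threshold=0.8):
    """Keep one caption per run of near-identical consecutive captions.

    Hypothetical helper: the name and the default threshold are assumptions,
    not taken from app.py.
    """
    kept = []
    for caption in captions:
        text = caption.strip()
        if not text:
            continue
        # Compare only with the last kept caption, since close frames are adjacent.
        if kept:
            ratio = SequenceMatcher(None, kept[-1].lower(), text.lower()).ratio()
            if ratio >= threshold:
                continue
        kept.append(text)
    return kept


# Illustrative input: the first two captions come from the prompt text,
# the third is made up for the example.
captions = [
    "An older man wearing a brown hat and glasses, with a beard and a mustache, is looking directly at the camera.",
    "An older man wearing a brown hat and glasses, with a beard and a beard on his chin, is looking at the camera.",
    "A dog runs across a field.",
]
unique_captions = collapse_similar_captions(captions)
print(unique_captions)
```

Comparing each caption only with the previously kept one keeps the sketch simple and matches the "close frames" assumption. A subject that reappears later in the video would be kept again, which may or may not be desirable.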