fffiloni committed on
Commit cbef701 · verified · 1 Parent(s): 918d818

Update app.py

Files changed (1)
  1. app.py +8 -30
app.py CHANGED
@@ -23,36 +23,14 @@ You will be provided a list of visual events, and an audio description. All thes
  List of visual events are actually extracted from this video every 12 frames.
  These visual infos are extracted from the video that is usually a short sequence.
 
- As a smart assistant, you must understand that Repetitive visual element of the same person or group of subject means that it is the same person/subject, filmed without cut.
- For example, if visual elements is like this:
- "A man wearing glasses sits on a train and looks out the window.
- A man with glasses and a beard is looking to the right.
- A man is wearing glasses and a beret.
- A man is wearing glasses and a beret.
- A man wearing glasses and a beret is looking to the right.
- A man with glasses and a beard is looking to the left.
- A man with glasses and a beard is looking to the right.
- A man with glasses is wearing a brown jacket.
- A man is wearing glasses and a beret.
- A man with glasses and a beard is looking to the left.
- A man with glasses and a beard is looking to the left."
-
- What you MUST understand from this list is actually :
- "The man wearing glasses sits on a train and looks out the window.
- The man with glasses and a beard is looking to the right.
- The man is wearing glasses and a beret.
- The man is wearing glasses and a beret.
- The man wearing glasses and a beret is looking to the right.
- The man with glasses and a beard is looking to the left.
- The man with glasses and a beard is looking to the right.
- The man with glasses is wearing a brown jacket.
- The man is wearing glasses and a beret.
- The man with glasses and a beard is looking to the left.
- The man with glasses and a beard is looking to the left."
-
- Audio events are actually the entire scene description based on the audio of the video.
-
- Your job is to use these informations to smartly deduce and provide a very short resume about what is happening in the video.
+ Please note that the following list of image descriptions (visual events) was generated by extracting individual frames from a continuous video featuring one or more subjects.
+ Depending on the case, all depicted individuals may correspond to the same person(s), with minor variations due to changes in lighting, angle, and facial expressions over time.
+ Alternatively, the video may show distinct individuals who share similarities within the given set of descriptors.
+ Regardless, assume temporal continuity among the frames unless otherwise specified.
+
+ Audio events are actually the entire scene description based only on the audio of the video.
+
+ Your job is to use these informations to smartly deduce and provide a very short resume about what is happening in the origin video.
  Provide a short resume about what you understood.
 
  """