Update app.py
app.py
CHANGED
@@ -23,36 +23,14 @@ You will be provided a list of visual events, and an audio description. All thes
|
|
23 |
List of visual events are actually extracted from this video every 12 frames.
|
24 |
These visual infos are extracted from the video that is usually a short sequence.
|
25 |
|
26 |
-
|
27 |
-
|
28 |
-
|
29 |
-
|
30 |
-
|
31 |
-
|
32 |
-
|
33 |
-
|
34 |
-
A man with glasses and a beard is looking to the right.
|
35 |
-
A man with glasses is wearing a brown jacket.
|
36 |
-
A man is wearing glasses and a beret.
|
37 |
-
A man with glasses and a beard is looking to the left.
|
38 |
-
A man with glasses and a beard is looking to the left."
|
39 |
-
|
40 |
-
What you MUST understand from this list is actually :
|
41 |
-
"The man wearing glasses sits on a train and looks out the window.
|
42 |
-
The man with glasses and a beard is looking to the right.
|
43 |
-
The man is wearing glasses and a beret.
|
44 |
-
The man is wearing glasses and a beret.
|
45 |
-
The man wearing glasses and a beret is looking to the right.
|
46 |
-
The man with glasses and a beard is looking to the left.
|
47 |
-
The man with glasses and a beard is looking to the right.
|
48 |
-
The man with glasses is wearing a brown jacket.
|
49 |
-
The man is wearing glasses and a beret.
|
50 |
-
The man with glasses and a beard is looking to the left.
|
51 |
-
The man with glasses and a beard is looking to the left."
|
52 |
-
|
53 |
-
Audio events are actually the entire scene description based on the audio of the video.
|
54 |
-
|
55 |
-
Your job is to use these informations to smartly deduce and provide a very short resume about what is happening in the video.
|
56 |
Provide a short resume about what you understood.
|
57 |
|
58 |
"""
|
|
|
23 |
List of visual events are actually extracted from this video every 12 frames.
|
24 |
These visual infos are extracted from the video that is usually a short sequence.
|
25 |
|
26 |
+
Please note that the following list of image descriptions (visual events) was generated by extracting individual frames from a continuous video featuring one or more subjects.
|
27 |
+
Depending on the case, all depicted individuals may correspond to the same person(s), with minor variations due to changes in lighting, angle, and facial expressions over time.
|
28 |
+
Alternatively, the video may show distinct individuals who share similarities within the given set of descriptors.
|
29 |
+
Regardless, assume temporal continuity among the frames unless otherwise specified.
|
30 |
+
|
31 |
+
Audio events are actually the entire scene description based only on the audio of the video.
|
32 |
+
|
33 |
+
Your job is to use this information to smartly deduce and provide a very short resume about what is happening in the original video.
|
34 |
Provide a short resume about what you understood.
|
35 |
|
36 |
"""
|
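The prompt in this change expects a list of per-frame captions (one every 12 frames) plus a single audio description. A minimal sketch of how such an input might be assembled is shown here. It is not taken from app.py: the helper names `sampled_frame_indices`, `collapse_repeats`, and `build_user_message` are hypothetical, and collapsing consecutive duplicate captions is an assumption motivated by the repeated captions in the removed example, not something the commit does.

```python
from itertools import groupby

# From the prompt text: visual events are extracted every 12 frames.
FRAME_STEP = 12


def sampled_frame_indices(total_frames, step=FRAME_STEP):
    """Return the indices of the frames that would be captioned."""
    return list(range(0, total_frames, step))


def collapse_repeats(captions):
    """Drop consecutive duplicate captions while keeping temporal order.

    Assumption: identical captions on adjacent sampled frames describe
    the same moment, so one copy is enough for the summarizer.
    """
    return [caption for caption, _ in groupby(c.strip() for c in captions)]


def build_user_message(captions, audio_description):
    """Format captions and the audio description as one text block."""
    visual = "\n".join(collapse_repeats(captions))
    return (
        f"Visual events:\n{visual}\n\n"
        f"Audio events:\n{audio_description.strip()}"
    )


if __name__ == "__main__":
    captions = [
        "A man with glasses and a beard is looking to the left.",
        "A man with glasses and a beard is looking to the left.",
        "A man is wearing glasses and a beret.",
    ]
    print(sampled_frame_indices(30))
    print(build_user_message(captions, "A train is moving."))
```

The resulting string would be sent as the user message alongside the system prompt edited in this diff. Whether deduplication helps depends on the captioning model, so treat it as optional.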