Spaces: facebook/vggt

JianyuanWang committed on Mar 15

Commit 0008a58

1 Parent(s): 908a5b7

update head

Files changed (1)

app.py +20 -17

@@ -379,41 +379,44 @@ with gr.Blocks(
     is_example = gr.Textbox(label="is_example", visible=False, value="None")
     num_images = gr.Textbox(label="num_images", visible=False, value="None")
-    gr.Markdown(
-        """
-    # 🏛️ VGGT: Visual Geometry Grounded Transformer
-    [🐙 GitHub Repository](https://github.com/facebookresearch/vggt) | [Project Page]()
     <div style="font-size: 16px; line-height: 1.5;">
-    <p>Upload a video or a set of images to create a 3D reconstruction of a scene or object.  VGGT takes these images and generates a 3D point cloud, along with estimated camera poses.</p>
     <h3>Getting Started:</h3>
     <ol>
-        <li><strong>Upload Your Data:</strong> Use the "Upload Video" or "Upload Images" buttons on the left to provide your input.  Videos will be automatically split into individual frames (one frame per second).</li>
-        <li><strong>Preview:</strong>  Your uploaded images will appear in the gallery on the left.</li>
-        <li><strong>Reconstruct:</strong> Click the "Reconstruct" button to start the 3D reconstruction process.</li>
-        <li><strong>Visualize:</strong>  The 3D reconstruction will appear in the viewer on the right.  You can rotate, pan, and zoom to explore the model, and download the GLB file. Note the visualization of 3D points may be slow for large number of input images. </li>
-    <li>
         <strong>Adjust Visualization (Optional):</strong>
-        After reconstruction, you can fine-tune the visualization using the options below
         <details style="display:inline;">
-        <summary style="display:inline;">(<strong>click to expand</strong>):</summary>
-        <ul>
             <li><em>Confidence Threshold:</em> Adjust the filtering of points based on confidence.</li>
             <li><em>Show Points from Frame:</em> Select specific frames to display in the point cloud.</li>
             <li><em>Show Camera:</em> Toggle the display of estimated camera positions.</li>
             <li><em>Filter Sky / Filter Black Background:</em> Remove sky or black-background points.</li>
-            <li><em>Select a Prediction Mode:</em> Choose between "Depthmap and Camera Branch" or "Pointmap Branch."</li>
-        </ul>
         </details>
-    </li>
     </ol>
     <p><strong>Please note:</strong> Our method usually only needs less than 1 second to reconstruct a scene, but the visualization of 3D points may take tens of seconds, especially when the number of images is large. Please be patient or, for faster visualization, use a local machine to run our demo from our <a href="https://github.com/facebookresearch/vggt">GitHub repository</a>.</p>
     </div>
     """
     )
     target_dir_output = gr.Textbox(label="Target Dir", visible=False, value="None")
     with gr.Row():

     is_example = gr.Textbox(label="is_example", visible=False, value="None")
     num_images = gr.Textbox(label="num_images", visible=False, value="None")
+    gr.HTML(
+    """
+    <h1>🏛️ VGGT: Visual Geometry Grounded Transformer</h1>
+    <p>
+    <a href="https://github.com/facebookresearch/vggt">🐙 GitHub Repository</a> |
+    <a href="#">Project Page</a>
+    </p>
     <div style="font-size: 16px; line-height: 1.5;">
+    <p>Upload a video or a set of images to create a 3D reconstruction of a scene or object. VGGT takes these images and generates a 3D point cloud, along with estimated camera poses.</p>
     <h3>Getting Started:</h3>
     <ol>
+        <li><strong>Upload Your Data:</strong> Use the “Upload Video” or “Upload Images” buttons on the left to provide your input. Videos will be automatically split into individual frames (one frame per second).</li>
+        <li><strong>Preview:</strong> Your uploaded images will appear in the gallery on the left.</li>
+        <li><strong>Reconstruct:</strong> Click the “Reconstruct” button to start the 3D reconstruction process.</li>
+        <li><strong>Visualize:</strong> The 3D reconstruction will appear in the viewer on the right. You can rotate, pan, and zoom to explore the model, and download the GLB file. Note the visualization of 3D points may be slow for a large number of input images.</li>
+        <li>
         <strong>Adjust Visualization (Optional):</strong>
+        After reconstruction, you can fine-tune the visualization using the options below
         <details style="display:inline;">
+            <summary style="display:inline;">(<strong>click to expand</strong>):</summary>
+            <ul>
             <li><em>Confidence Threshold:</em> Adjust the filtering of points based on confidence.</li>
             <li><em>Show Points from Frame:</em> Select specific frames to display in the point cloud.</li>
             <li><em>Show Camera:</em> Toggle the display of estimated camera positions.</li>
             <li><em>Filter Sky / Filter Black Background:</em> Remove sky or black-background points.</li>
+            <li><em>Select a Prediction Mode:</em> Choose between “Depthmap and Camera Branch” or “Pointmap Branch.”</li>
+            </ul>
         </details>
+        </li>
     </ol>
     <p><strong>Please note:</strong> Our method usually only needs less than 1 second to reconstruct a scene, but the visualization of 3D points may take tens of seconds, especially when the number of images is large. Please be patient or, for faster visualization, use a local machine to run our demo from our <a href="https://github.com/facebookresearch/vggt">GitHub repository</a>.</p>
     </div>
     """
     )
     target_dir_output = gr.Textbox(label="Target Dir", visible=False, value="None")
     with gr.Row():
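
The new header is a hand-written HTML string with nested `<ol>`, `<li>`, `<details>`, and `<ul>` elements, and part of this change re-nests the `<li>` and `<ul>` tags. A minimal sketch of a strict tag-balance check for such a string, using only the standard library, might look like the following. `TagBalanceChecker` and `is_balanced` are hypothetical names and are not part of `app.py` or Gradio. The check is stricter than real HTML parsing, since it treats implicitly closed tags as errors.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks open tags and records mismatched closes."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.errors = []  # closing tags that did not match the innermost open tag

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> open and close at once; nothing to track.
        pass

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def is_balanced(html_text):
    """Return True if every opened tag is closed in strict nesting order."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    return not checker.stack and not checker.errors


# A reduced version of the nested list structure from the header string.
snippet = (
    "<ol><li><strong>Adjust Visualization (Optional):</strong>"
    '<details style="display:inline;"><summary>(click to expand)</summary>'
    "<ul><li><em>Show Camera:</em> Toggle cameras.</li></ul>"
    "</details></li></ol>"
)
print(is_balanced(snippet))
```

Running a check like this on the string before passing it to `gr.HTML` would flag a stray or missing `</li>` early, instead of leaving the browser to repair the markup silently.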