Deepseek-Multimodal


luigi12345 committed on Feb 8

Commit d6e21f9 (verified)

1 Parent(s): b9ebc0b

Update app.py

Files changed (1)

app.py +62 -3


@@ -63,7 +63,7 @@ def multimodal_understanding(image, question, seed, top_p, temperature):
         pad_token_id=tokenizer.eos_token_id,
         bos_token_id=tokenizer.bos_token_id,
         eos_token_id=tokenizer.eos_token_id,
-        max_new_tokens=512,
         do_sample=False if temperature == 0 else True,
         use_cache=True,
         temperature=temperature,
@@ -193,8 +193,67 @@ with gr.Blocks() as demo:
                 "doge.png",
             ],
             [
-                "Convert the formula into latex code.",
-                "equation.png",
             ],
         ],
         inputs=[question_input, image_input],

         pad_token_id=tokenizer.eos_token_id,
         bos_token_id=tokenizer.bos_token_id,
         eos_token_id=tokenizer.eos_token_id,
+        max_new_tokens=4000,
         do_sample=False if temperature == 0 else True,
         use_cache=True,
         temperature=temperature,
                 "doge.png",
             ],
             [
+                """Analyze the provided fundus image in exhaustive detail, following the standard ophthalmological protocol for fundus examination.  Output an HTML report structured as a formal medical document.  The report MUST:
+1.  **Image Quality Assessment:** Begin with a concise assessment of image quality, noting focus, illumination, field of view, and any artifacts (and their impact on assessability).
+2.  **Detailed Clinical Findings:**  Describe each of the following areas with the utmost precision and specificity, using proper ophthalmological terminology:
+    *   **Optic Disc:**
+        *   Size and shape (including any abnormalities).
+        *   Color (specifically noting any pallor and its location).
+        *   Cup-to-Disc Ratio (CDR), providing both vertical and horizontal estimates.
+        *   Neuroretinal Rim:  Assess rim thickness in all quadrants (superior, inferior, nasal, temporal).  Explicitly state whether the ISNT rule is followed or violated.  Describe any notching or focal thinning.
+        *   Peripapillary Region:  Describe the presence/absence of peripapillary atrophy (PPA), differentiating between alpha and beta zones. Note any hemorrhages.
+    *   **Retinal Vasculature:**
+        *   Arterioles:  Describe caliber (narrowing, dilation), tortuosity, and any focal abnormalities.
+        *   Venules:  Describe caliber, tortuosity, and any abnormalities.
+        *   Arteriovenous (A/V) Ratio: Estimate the A/V ratio.
+        *   Crossing Changes:  Note any arteriovenous nicking or other crossing abnormalities.
+        *  Vessel Course: Describe the course of the major vessels, and check for abnormalities.
+    *   **Macula:**
+        *   Foveal Reflex:  Describe the presence/absence and quality of the foveal reflex.
+        *   Pigment Changes: Note any pigmentary abnormalities, drusen, or other lesions.
+        *   Edema/Exudates:  Describe any signs of macular edema or exudates.
+    *   **Peripheral Retina:**
+        *   Mid-Periphery: Describe any abnormalities (hemorrhages, exudates, tears, etc.).
+        *   Far Periphery: Note the extent of visualization and any findings.
+3.  **Differential Diagnosis:**  Based solely on the image findings, provide a prioritized differential diagnosis.  Include the most likely diagnosis and any other plausible possibilities.  For each diagnosis, explain the reasoning based on the observed features.
+4.  **Diagnostic Confidence:** Indicate the confidence level for the primary diagnosis.  List the key image findings that support the diagnosis.
+5.  **Simulated AI Attention Metrics:**  Create a table representing a *simulated* AI attention distribution.  This should reflect the expected focus areas for the most likely diagnosis, based on the known importance of different features.  Provide percentages for:
+    *   Optic Disc (Total)
+        *   Cup
+        *   Neuroretinal Rim (subdivided by region if significant differences exist)
+    *   Peripapillary Atrophy
+    *   Vessels
+    *   Macula
+    *   Periphery
+6.  **Summary and Impression:**  Provide a concise summary of the key findings and the overall impression.
+7. **Recommendations:**
+     *   Provide specific, actionable recommendations based on the image findings.
+     *    If referral is warranted, clearly state the urgency and the type of specialist.
+     *   List any recommended investigations (e.g., OCT, visual fields).
+8.  **Disclaimer:** Include a disclaimer stating that the report is based on image analysis alone and does not replace a full clinical examination.
+9. **HTML Structure:**  Use semantic HTML elements (h1-h3, p, ol, ul, table, div) to create a well-structured, readable report.  Include:
+    *  A report header with a title ("EyeUnit.ai | AI for Ophthalmology") and a logo placeholder.
+    *  An image comparison section displaying the original fundus image and a placeholder for a heatmap (a canvas element with id "heatmapCanvas"). No actual heatmap generation is required; the canvas is a placeholder.
+    * A placeholder for patient information(PATIENT ID, NAME, AGE, DATE OF EXAM)
+    *  Clearly labeled sections for each part of the analysis.
+    *  Tables for the "Overall Analysis Coverage" and "AI-Driven Attention Metrics."
+10. **CSS Styling:**  Apply CSS styles to make the report visually appealing and professional.  The report should be suitable for both screen viewing and printing (use a `@media print` block to optimize for print).
+     * **Crucial Details:**
+     * **PATIENT ID, NAME, AGE and DATE OF EXAM**
+11. **Crucial Details:** Output ONLY the complete HTML code. Do not provide any surrounding text or explanations. Focus solely on generating the HTML report.
+12. **IMG SOURCE:** Use this image as the image source: `<img src="https://i.imgur.com/kZ35oQV.jpg" alt="Original Fundus
+"""      ,          "equation.png",
             ],
         ],
         inputs=[question_input, image_input],
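
The first hunk raises `max_new_tokens` from 512 to 4000 and leaves the sampling flags as they were. Below is a minimal sketch of how those arguments could be gathered in one place. `build_generation_kwargs` is a hypothetical helper, not part of this commit, and passing `top_p` alongside `temperature` is an assumption based only on the `multimodal_understanding` signature. Unlike the committed code, the sketch leaves out the sampling parameters when `temperature == 0`, since greedy decoding does not use them.

```python
def build_generation_kwargs(temperature: float, top_p: float,
                            max_new_tokens: int = 4000) -> dict:
    """Hypothetical helper: collect the generation arguments seen in the diff."""
    # Equivalent to `False if temperature == 0 else True` in app.py.
    do_sample = temperature != 0
    kwargs = {
        "max_new_tokens": max_new_tokens,  # 4000 after this commit, 512 before
        "do_sample": do_sample,
        "use_cache": True,
    }
    if do_sample:
        # Assumption: sampling parameters are only passed when sampling is on.
        kwargs["temperature"] = temperature
        kwargs["top_p"] = top_p
    return kwargs


if __name__ == "__main__":
    print(build_generation_kwargs(temperature=0, top_p=0.95))
    print(build_generation_kwargs(temperature=0.1, top_p=0.95))
```

The resulting dictionary could be unpacked into the existing `generate(...)` call next to the token-id arguments shown in the hunk.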