Spaces:

ronakreddy18
/

Zerotoheroinmachinelearning


ronakreddy18 committed on Dec 18, 2024

Commit

d3d0026


1 Parent(s): 3ae2c67

Update pages/LIFE_CYCLE_OF_MACHINE_LEARNING.py


Files changed (1)

pages/LIFE_CYCLE_OF_MACHINE_LEARNING.py +52 -61


@@ -132,44 +132,38 @@ print(excel_file.sheet_names)
         st.session_state.page = "structured_data"
 # ----------------- Unstructured Data Page -----------------
 def unstructured_data_page():
     st.title(":blue[Unstructured Data]")
     st.markdown("""
-    **Unstructured data** does not have a predefined format. It consists of various data types like text, images, videos, and audio files.
     Examples include:
-    - Text documents (e.g., .txt, .docx)
     - Images (e.g., .jpg, .png)
     - Videos (e.g., .mp4, .avi)
-    - Audio files (e.g., .mp3, .wav)
     - Social media posts
     """)
-    st.header("📄 Handling Text Data")
-    st.markdown("""
-    Text data can be analyzed using Natural Language Processing (NLP) techniques.
-    """)
-    st.code("""
-# Reading text data
-with open('sample.txt', 'r') as file:
-    text = file.read()
-    print(text)
-# Basic text processing using NLTK
-import nltk
-from nltk.tokenize import word_tokenize
-nltk.download('punkt')
-tokens = word_tokenize(text)
-print(tokens)
-    """, language='python')
     st.header("🖼️ Handling Image Data")
     st.markdown("""
-    Image data can be processed using libraries like OpenCV and PIL (Pillow).
     """)
     st.code("""
 from PIL import Image
 # Open an image file
 image = Image.open('sample_image.jpg')
@@ -178,64 +172,61 @@ image.show()
 # Convert image to grayscale
 gray_image = image.convert('L')
 gray_image.show()
-    """, language='python')
-    st.header("🎥 Handling Video Data")
-    st.markdown("""
-    Videos can be processed frame by frame using OpenCV.
-    """)
-    st.code("""
-import cv2
-# Capture video
-video = cv2.VideoCapture('sample_video.mp4')
-while video.isOpened():
-    ret, frame = video.read()
-    if not ret:
-        break
-    cv2.imshow('Frame', frame)
-    if cv2.waitKey(25) & 0xFF == ord('q'):
-        break
-video.release()
-cv2.destroyAllWindows()
     """, language='python')
-    st.header("🔊 Handling Audio Data")
     st.markdown("""
-    Audio data can be handled using libraries like librosa.
     """)
-    st.code("""
-import librosa
-import librosa.display
-import matplotlib.pyplot as plt
-# Load audio file
-y, sr = librosa.load('sample_audio.mp3')
-librosa.display.waveshow(y, sr=sr)
-plt.title('Waveform')
-plt.show()
-    """, language='python')
     st.markdown("### Challenges with Unstructured Data")
     st.write("""
-    - **Noise and Inconsistency**: Data is often incomplete or noisy.
-    - **Storage Requirements**: Large size and variability in data types.
-    - **Processing Time**: Analyzing unstructured data is computationally expensive.
     """)
     st.markdown("### Solutions")
     st.write("""
-    - **Data Cleaning**: Preprocess data to remove noise.
-    - **Efficient Storage**: Use NoSQL databases (e.g., MongoDB) or cloud storage.
-    - **Parallel Processing**: Utilize frameworks like Apache Spark.
     """)
-    # Back to Data Collection
     if st.button("Back to Data Collection"):
         st.session_state.page = "data_collection"
 # ----------------- Semi-Structured Data Page -----------------
 def semi_structured_data_page():
     st.title(":orange[Semi-Structured Data]")

         st.session_state.page = "structured_data"
 # ----------------- Unstructured Data Page -----------------
+from PIL import Image
+import numpy as np
+import matplotlib.pyplot as plt
 def unstructured_data_page():
     st.title(":blue[Unstructured Data]")
     st.markdown("""
+    *Unstructured data* does not have a predefined format. It consists of various data types like text, images, videos, and audio files.
     Examples include:
     - Images (e.g., .jpg, .png)
     - Videos (e.g., .mp4, .avi)
     - Social media posts
     """)
+    ### Handling Image Data Section
     st.header("🖼️ Handling Image Data")
     st.markdown("""
+    Image data can be processed using libraries like OpenCV and PIL (Pillow). Images often need to be preprocessed for tasks like analysis, classification, or feature extraction. Common operations include:
+    - **Reading and displaying images**
+    - **Converting to grayscale**
+    - **Resizing and cropping**
+    - **Rotating and flipping**
+    - **Applying filters**
+    - **Edge detection and other transformations**
     """)
     st.code("""
 from PIL import Image
+import numpy as np
+import matplotlib.pyplot as plt
 # Open an image file
 image = Image.open('sample_image.jpg')
 # Convert image to grayscale
 gray_image = image.convert('L')
 gray_image.show()
+# Resize the image
+resized_image = image.resize((200, 200))
+resized_image.show()
+# Rotate the image by 90 degrees; expand=True grows the canvas so a non-square image is not cropped
+rotated_image = image.rotate(90, expand=True)
+rotated_image.show()
+# Convert the image to a NumPy array and display its shape
+image_array = np.array(image)
+print(image_array.shape)
+# Display the image array as a plot
+plt.imshow(image_array)
+plt.title("Original Image")
+plt.axis('off')
+plt.show()
     """, language='python')
     st.markdown("""
+    **Common Image Processing Techniques:**
+    - **Resizing**: Adjust the dimensions of an image for uniformity in models.
+    - **Cropping**: Extract a region of interest (ROI) from an image.
+    - **Grayscale Conversion**: Simplify image data by reducing it to a single channel.
+    - **Rotation/Flipping**: Augment the training set with transformed copies of existing images.
+    - **Edge Detection**: Identify edges in images using operators such as Sobel or Canny.
     """)
+    ### Challenges and Solutions Section
     st.markdown("### Challenges with Unstructured Data")
     st.write("""
+    - *Noise and Inconsistency*: Data is often incomplete, noisy, or inconsistently formatted.
+    - *Storage Requirements*: Files are large and come in many different formats.
+    - *Processing Time*: Analyzing unstructured data is computationally expensive.
     """)
     st.markdown("### Solutions")
     st.write("""
+    - *Data Cleaning*: Preprocess data to remove noise.
+    - *Efficient Storage*: Use NoSQL databases (e.g., MongoDB) or cloud storage.
+    - *Parallel Processing*: Utilize frameworks like Apache Spark.
     """)
+    # Button to Navigate to Introduction to Image
+    if st.button("Introduction to Image"):
+        st.session_state.page = "introduction_to_image"
+    # Navigation Button
     if st.button("Back to Data Collection"):
         st.session_state.page = "data_collection"
 # ----------------- Semi-Structured Data Page -----------------
 def semi_structured_data_page():
     st.title(":orange[Semi-Structured Data]")
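
Note on the "Common Image Processing Techniques" list in this commit: it mentions cropping, flipping, and edge detection, but the embedded snippet only demonstrates grayscale conversion, resizing, and rotation. A minimal sketch of the three missing operations with Pillow might look like the block below. It is not part of the commit. The helper names (`make_demo_image`, `crop_region`, `detect_edges`) are hypothetical, a synthetic image stands in for `sample_image.jpg` so the example runs without any file on disk, and edge detection uses Pillow's built-in `ImageFilter.FIND_EDGES` kernel rather than the Sobel or Canny operators named in the list.

```python
from PIL import Image, ImageFilter, ImageOps
import numpy as np


def make_demo_image(size=64):
    """Hypothetical helper: black background with an off-center white rectangle."""
    pixels = np.zeros((size, size), dtype=np.uint8)
    pixels[16:48, 8:32] = 255  # rows 16..47, columns 8..31 are white
    return Image.fromarray(pixels)  # a 2-D uint8 array becomes a mode 'L' image


def crop_region(image, left, top, right, bottom):
    """Extract a region of interest; the box is (left, top, right, bottom) in pixels."""
    return image.crop((left, top, right, bottom))


def detect_edges(image):
    """Edge map from Pillow's built-in FIND_EDGES kernel (not Sobel or Canny)."""
    return image.convert("L").filter(ImageFilter.FIND_EDGES)


image = make_demo_image()

# Cropping: keep only the white rectangle
roi = crop_region(image, 8, 16, 32, 48)

# Flipping: mirror left-to-right, a common augmentation
mirrored = ImageOps.mirror(image)

# Edge detection: flat regions go to 0, intensity boundaries stay bright
edges = detect_edges(image)

print(roi.size, mirrored.size, edges.size)
```

In the Streamlit page itself, `Image.open('sample_image.jpg')` would replace `make_demo_image()`, and a true Sobel or Canny result would need OpenCV or a similar library.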