Spaces: ronakreddy18/Zerotoheroinmachinelearning

Build error


ronakreddy18 committed on Dec 18, 2024

Commit 27ba035 (verified) · 1 Parent(s): d3d0026

Update pages/LIFE_CYCLE_OF_MACHINE_LEARNING.py


pages/LIFE_CYCLE_OF_MACHINE_LEARNING.py +248 -1


@@ -227,6 +227,252 @@ plt.show()
 # ----------------- Semi-Structured Data Page -----------------
 def semi_structured_data_page():
     st.title(":orange[Semi-Structured Data]")
@@ -273,7 +519,8 @@ import json
 data = {
     "name": "Alice",
     "age": 25,
-    "skills": ["Python", "Machine Learning"]
 }
 with open('data.json', 'w') as file:
     json.dump(data, file, indent=4)

+# ----------------- Semi-Structured Data Page -----------------
+def semi_structured_data_page():
+    st.title(":orange[Semi-Structured Data]")
+    st.markdown("""
+    Semi-structured data does not follow the rigid table schema of a relational database, but it still carries organizational markers such as keys or tags. Examples include:
+    - JSON files
+    - XML files
+    """)
+    if st.button(":green[💾 JSON]"):
+        st.session_state.page = "json"
+    if st.button(":green[📄 CSV]"):
+        st.session_state.page = "csv"
+    if st.button(":green[📄 XML]"):
+        st.session_state.page = "xml"
+import streamlit as st
+import pandas as pd
+import json
+import xml.etree.ElementTree as ET
+# Inject custom CSS to style the buttons
+st.markdown("""
+    <style>
+    .stButton>button {
+        background-color: #4CAF50;
+        color: white;
+        width: 100%;
+    }
+    </style>
+    """, unsafe_allow_html=True)
+# Initialize page navigation state
+if 'page' not in st.session_state:
+    st.session_state.page = "home"  # Default page is "home"
+# ----------------- Home Page -----------------
+def home_page():
+    st.title(":green[Lifecycle of a Machine Learning Project]")
+    st.markdown("Click on a stage to learn more about it.")
+    # Buttons for various stages of the ML project lifecycle
+    if st.button(":blue[📊 Data Collection]"):
+        st.session_state.page = "data_collection"
+    if st.button(":blue[🌟 Problem Statement]"):
+        st.markdown("### Problem Statement\nIdentify the problem you want to solve and set clear objectives and success criteria.")
+    if st.button(":blue[🛠️ Simple EDA]"):
+        st.markdown("### Simple EDA\nPerform exploratory data analysis to understand data distributions and relationships.")
+    if st.button(":blue[🧹 Data Pre-Processing]"):
+        st.markdown("### Data Pre-Processing\nConvert raw data into cleaned data.")
+    if st.button(":blue[📈 Exploratory Data Analysis (EDA)]"):
+        st.markdown("### Exploratory Data Analysis (EDA)\nVisualize and analyze the data to understand its distributions and relationships.")
+    if st.button(":blue[🏋️ Feature Engineering]"):
+        st.markdown("### Feature Engineering\nCreate new features from existing data.")
+    if st.button(":blue[🤖 Model Training]"):
+        st.markdown("### Model Training\nTrain the model using the training data and optimize its parameters.")
+    if st.button(":blue[🔧 Model Testing]"):
+        st.markdown("### Model Testing\nAssess the model's performance using various metrics and cross-validation techniques.")
+    if st.button(":blue[🚀 Model Deployment]"):
+        st.markdown("### Model Deployment\nIntegrate the trained model into a production environment and monitor its performance.")
+    if st.button(":blue[📝 Monitoring]"):
+        st.markdown("### Monitoring\nTrack the deployed model's performance over time, and periodically retrain it with new data and update features as needed.")
+# ----------------- Data Collection Page -----------------
+def data_collection_page():
+    st.title(":red[Data Collection]")
+    st.markdown("### Data Collection\nThis page discusses the process of Data Collection.")
+    st.markdown("Types of Data: **Structured**, **Unstructured**, **Semi-Structured**")
+    if st.button(":blue[🌟 Structured Data]"):
+        st.session_state.page = "structured_data"
+    if st.button(":blue[📷 Unstructured Data]"):
+        st.session_state.page = "unstructured_data"
+    if st.button(":blue[🗃️ Semi-Structured Data]"):
+        st.session_state.page = "semi_structured_data"
+    if st.button("Back to Home"):
+        st.session_state.page = "home"
+# ----------------- Structured Data Page -----------------
+def structured_data_page():
+    st.title(":blue[Structured Data]")
+    st.markdown("""
+    Structured data is highly organized and typically stored in tables like spreadsheets or databases. It is easy to search and analyze.
+    """)
+    st.markdown("### Examples: Excel files")
+    if st.button(":green[📊 Excel]"):
+        st.session_state.page = "excel"
+    if st.button("Back to Data Collection"):
+        st.session_state.page = "data_collection"
+# ----------------- Excel Data Page -----------------
+def excel_page():
+    st.title(":green[Excel Data Format]")
+    st.write("### What is Excel?")
+    st.write("Excel is a spreadsheet tool for storing data in tabular format with rows and columns. Common file extensions: .xls, .xlsx.")
+    st.write("### How to Read Excel Files")
+    st.code("""
+import pandas as pd
+# Read an Excel file
+df = pd.read_excel('data.xlsx', sheet_name='Sheet1')
+print(df)
+    """, language='python')
+    st.write("### Issues Encountered")
+    st.write("""
+- **File not found**: Incorrect file path.
+- **Sheet name error**: Specified sheet doesn't exist.
+- **Missing libraries**: pandas needs openpyxl to read .xlsx files and xlrd to read legacy .xls files; either may be missing.
+""")
+    st.write("### Solutions to These Issues")
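+    st.code("""
+import pandas as pd
+# Install the Excel engines first:
+# pip install openpyxl xlrd   (openpyxl reads .xlsx, xlrd reads legacy .xls)
+try:
+    # List the available sheet names before choosing one
+    excel_file = pd.ExcelFile('data.xlsx')
+    print(excel_file.sheet_names)
+    df = pd.read_excel(excel_file, sheet_name='Sheet1')
+except FileNotFoundError:
+    print("File not found. Check the file path.")
+except ValueError as error:
+    # Recent pandas versions raise ValueError when the requested sheet does not exist
+    print(f"Sheet name error: {error}")
+    """, language='python')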
+    st.markdown('[Jupyter Notebook](https://colab.research.google.com/drive/1Dv68m9hcRzXsLRlRit0uZc-8CB8U6VV3?usp=sharing)')
+    if st.button("Back to Structured Data"):
+        st.session_state.page = "structured_data"
+# ----------------- Unstructured Data Page -----------------
+from PIL import Image
+import numpy as np
+import matplotlib.pyplot as plt
+def unstructured_data_page():
+    st.title(":blue[Unstructured Data]")
+    st.markdown("""
+    *Unstructured data* does not have a predefined format. It consists of various data types like text, images, videos, and audio files.
+    Examples include:
+    - Images (e.g., .jpg, .png)
+    - Videos (e.g., .mp4, .avi)
+    - Social media posts
+    """)
+    # Handling Image Data Section
+    st.header("🖼️ Handling Image Data")
+    st.markdown("""
+    Image data can be processed using libraries like OpenCV and PIL (Pillow). Images often need to be preprocessed for tasks like analysis, classification, or feature extraction. Common operations include:
+    - **Reading and displaying images**
+    - **Converting to grayscale**
+    - **Resizing and cropping**
+    - **Rotating and flipping**
+    - **Applying filters**
+    - **Edge detection and other transformations**
+    """)
+    st.code("""
+from PIL import Image
+import numpy as np
+import matplotlib.pyplot as plt
+# Open an image file
+image = Image.open('sample_image.jpg')
+image.show()
+# Convert image to grayscale
+gray_image = image.convert('L')
+gray_image.show()
+# Resize the image
+resized_image = image.resize((200, 200))
+resized_image.show()
+# Rotate the image by 90 degrees
+rotated_image = image.rotate(90)
+rotated_image.show()
+# Convert the image to a NumPy array and display its shape
+image_array = np.array(image)
+print(image_array.shape)
+# Display the image array as a plot
+plt.imshow(image_array)
+plt.title("Original Image")
+plt.axis('off')
+plt.show()
+    """, language='python')
+    st.markdown("""
+    **Common Image Processing Techniques:**
+    - **Resizing**: Adjust the dimensions of an image for uniformity in models.
+    - **Cropping**: Extract a region of interest (ROI) from an image.
+    - **Grayscale Conversion**: Simplify image data by reducing it to a single channel.
+    - **Rotation/Flipping**: Perform augmentations to increase the dataset for model training.
+    - **Edge Detection**: Identify edges in images using operators such as Sobel or the Canny edge detector.
+    """)
+    # Challenges and Solutions Section
+    st.markdown("### Challenges with Unstructured Data")
+    st.write("""
+    - *Noise and Inconsistency*: Data is often incomplete or noisy.
+    - *Storage Requirements*: Large size and variability in data types.
+    - *Processing Time*: Analyzing unstructured data is computationally expensive.
+    """)
+    st.markdown("### Solutions")
+    st.write("""
+    - *Data Cleaning*: Preprocess data to remove noise.
+    - *Efficient Storage*: Use NoSQL databases (e.g., MongoDB) or cloud storage.
+    - *Parallel Processing*: Utilize frameworks like Apache Spark.
+    """)
+    # Button to Navigate to Introduction to Image
+    if st.button("Introduction to Image"):
+        st.session_state.page = "introduction_to_image"
+    # Navigation Button
+    if st.button("Back to Data Collection"):
+        st.session_state.page = "data_collection"
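Note on navigation: the page functions in this diff only assign a key to `st.session_state.page`; the code that maps each key to a page function is not visible in these hunks. The sketch below shows one possible way to do that mapping, under stated assumptions: a plain dict stands in for `st.session_state`, the page functions are stubs that return their own name, and `PAGES` and `render_current_page` are hypothetical names that do not appear in the commit. It also falls back to the home page when a key (for example `"introduction_to_image"`) has no registered page.

```python
from typing import Callable, Dict


# Stub page functions; in the app these would call st.title, st.markdown, etc.
def home_page() -> str:
    return "home"


def data_collection_page() -> str:
    return "data_collection"


# Hypothetical registry mapping page keys to page functions.
PAGES: Dict[str, Callable[[], str]] = {
    "home": home_page,
    "data_collection": data_collection_page,
}


def render_current_page(session_state: dict, pages: Dict[str, Callable[[], str]] = PAGES) -> str:
    """Call the page function for the current key, falling back to home."""
    key = session_state.get("page", "home")
    page = pages.get(key)
    if page is None:
        # Unknown key: reset the state so the user is not stuck on a blank page.
        session_state["page"] = "home"
        page = pages["home"]
    return page()


# A key that is set by a button but has no registered page function.
state = {"page": "introduction_to_image"}
result = render_current_page(state)
```

In the Streamlit app itself, the same lookup would presumably run once at the bottom of the script, after all page functions are defined, with `st.session_state` in place of the dict.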
 # ----------------- Semi-Structured Data Page -----------------
 def semi_structured_data_page():
     st.title(":orange[Semi-Structured Data]")
 data = {
     "name": "Alice",
     "age": 25,
+    "skills": ["Python", "Machine Learning"]
 }
 with open('data.json', 'w') as file:
     json.dump(data, file, indent=4)
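Note on the JSON sample: the snippet above only writes the file. A minimal sketch of the matching read step follows, using only the standard library. It writes to a temporary directory rather than `data.json` so it leaves nothing behind, and `parse_json_text` is a hypothetical helper added here to illustrate handling malformed input; neither is part of the commit.

```python
import json
import os
import tempfile

data = {
    "name": "Alice",
    "age": 25,
    "skills": ["Python", "Machine Learning"],
}

# Write the dictionary to a JSON file, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.json")
    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=4)
    with open(path, "r", encoding="utf-8") as file:
        loaded = json.load(file)


def parse_json_text(text):
    """Return the parsed object, or None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# A value is missing after "age", so parsing fails and the helper returns None.
broken = parse_json_text('{"name": "Alice", "age": }')
```

Returning `None` is only one option; a page that teaches error handling might prefer to show the `JSONDecodeError` message to the reader instead.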