Spaces: ronakreddy18 / Zerotoheroinmachinelearning

ronakreddy18 committed on Dec 12, 2024

Commit 49fc73f (verified) · 1 Parent(s): 10e71ae

Update pages/LIFE_CYCLE_OF_MACHINE_LEARNING.py

Files changed (1)

pages/LIFE_CYCLE_OF_MACHINE_LEARNING.py +38 -183

@@ -78,11 +78,17 @@ def structured_data_page():
     st.markdown("""
     Structured data is highly organized and typically stored in tables like spreadsheets or databases. It is easy to search and analyze.
     """)
-    st.markdown("### Examples: Excel files, CSV files")
     if st.button(":green[📊 Excel]"):
         st.session_state.page = "excel"
     if st.button("Back to Data Collection"):
         st.session_state.page = "data_collection"
@@ -125,204 +131,54 @@ excel_file = pd.ExcelFile('data.xlsx')
 print(excel_file.sheet_names)
     """, language='python')
-    # Download button for a sample Jupyter notebook
-    # with open("excel_handling_guide.ipynb", "rb") as file:
-    #     st.download_button(
-    #         label="Download Jupyter Notebook",
-    #         data = file,
-    #         file_name="excel_handling_guide.ipynb",
-    #         mime="application/octet-stream")
-    #test
-    # with open("excel_handling_guide.ipynb", "rb") as file:
-    #     st.download_button("Download Jupyter Notebook",file)
-    #test-2
-    st.link_button("Jupyter Notebook","https://colab.research.google.com/drive/1ZTKWTknL-4IQ9QbAfcyKzIP-_lNxmz2P?usp=sharing")
     if st.button("Back to Structured Data"):
         st.session_state.page = "structured_data"
-# ----------------- Unstructured Data Page -----------------
-def unstructured_data_page():
-    st.title(":blue[Unstructured Data]")
-    st.markdown("""
-    **Unstructured data** does not have a predefined format. It consists of various data types like text, images, videos, and audio files.
-    Examples include:
-    - Text documents (e.g., .txt, .docx)
-    - Images (e.g., .jpg, .png)
-    - Videos (e.g., .mp4, .avi)
-    - Audio files (e.g., .mp3, .wav)
-    - Social media posts
-    """)
-    st.header("📄 Handling Text Data")
-    st.markdown("""
-    Text data can be analyzed using Natural Language Processing (NLP) techniques.
-    """)
-    st.code("""
-# Reading text data
-with open('sample.txt', 'r') as file:
-    text = file.read()
-    print(text)
-# Basic text processing using NLTK
-import nltk
-from nltk.tokenize import word_tokenize
-nltk.download('punkt')
-tokens = word_tokenize(text)
-print(tokens)
-    """, language='python')
-    st.header("🖼️ Handling Image Data")
-    st.markdown("""
-    Image data can be processed using libraries like OpenCV and PIL (Pillow).
-    """)
     st.code("""
-from PIL import Image
-# Open an image file
-image = Image.open('sample_image.jpg')
-image.show()
-# Convert image to grayscale
-gray_image = image.convert('L')
-gray_image.show()
     """, language='python')
-    st.header("🎥 Handling Video Data")
-    st.markdown("""
-    Videos can be processed frame by frame using OpenCV.
-    """)
-    st.code("""
-import cv2
-# Capture video
-video = cv2.VideoCapture('sample_video.mp4')
-while video.isOpened():
-    ret, frame = video.read()
-    if not ret:
-        break
-    cv2.imshow('Frame', frame)
-    if cv2.waitKey(25) & 0xFF == ord('q'):
-        break
-video.release()
-cv2.destroyAllWindows()
-    """, language='python')
-    st.header("🔊 Handling Audio Data")
-    st.markdown("""
-    Audio data can be handled using libraries like librosa.
-    """)
-    st.code("""
-import librosa
-import librosa.display
-import matplotlib.pyplot as plt
-# Load audio file
-y, sr = librosa.load('sample_audio.mp3')
-librosa.display.waveshow(y, sr=sr)
-plt.title('Waveform')
-plt.show()
-    """, language='python')
-    st.markdown("### Challenges with Unstructured Data")
-    st.write("""
-    - **Noise and Inconsistency**: Data is often incomplete or noisy.
-    - **Storage Requirements**: Large size and variability in data types.
-    - **Processing Time**: Analyzing unstructured data is computationally expensive.
-    """)
-    st.markdown("### Solutions")
     st.write("""
-    - **Data Cleaning**: Preprocess data to remove noise.
-    - **Efficient Storage**: Use NoSQL databases (e.g., MongoDB) or cloud storage.
-    - **Parallel Processing**: Utilize frameworks like Apache Spark.
-    """)
-    # Back to Data Collection
-    if st.button("Back to Data Collection"):
-        st.session_state.page = "data_collection"
-# ----------------- Semi-Structured Data Page -----------------
-def semi_structured_data_page():
-    st.title(":blue[Semi-Structured Data]")
-    st.markdown("""
-    **Semi-structured data** does not conform strictly to a tabular structure but contains tags or markers to separate elements. Examples include:
-    - JSON (JavaScript Object Notation) files
-    - XML (Extensible Markup Language) files
-    - YAML (Yet Another Markup Language)
     """)
-    st.header("🔹 JSON Data")
-    st.markdown("""
-    JSON is a popular format for storing and exchanging data.
-    """)
     st.code("""
-# Sample JSON data
-data = '''
-{
-    "name": "Alice",
-    "age": 25,
-    "skills": ["Python", "Machine Learning"]
-}
-'''
-# Parse JSON
-parsed_data = json.loads(data)
-print(parsed_data['name'])  # Output: Alice
-    """, language='python')
-    st.header("🔹 Reading JSON Files")
-    st.code("""
-# Reading a JSON file
 with open('data.json', 'r') as file:
     data = json.load(file)
     print(data)
     """, language='python')
-    st.header("🔹 XML Data")
-    st.markdown("""
-    XML is a markup language that defines a set of rules for encoding documents.
-    """)
-    st.code("""
-import xml.etree.ElementTree as ET
-# Sample XML data
-xml_data = '''
-<person>
-    <name>Bob</name>
-    <age>30</age>
-    <city>New York</city>
-</person>
-'''
-# Parse XML
-root = ET.fromstring(xml_data)
-print(root.find('name').text)  # Output: Bob
-    """, language='python')
-    st.markdown("### Challenges with Semi-Structured Data")
-    st.write("""
-    - **Complex Parsing**: Requires specialized parsers.
-    - **Nested Data**: Can be deeply nested, making it harder to process.
-    """)
-    st.markdown("### Solutions")
-    st.write("""
-    - **Libraries**: Use libraries like json, xml.etree.ElementTree, and yaml for parsing.
-    - **Validation**: Validate data formats to avoid parsing errors.
-    """)
-    # Back to Data Collection
-    if st.button("Back to Data Collection"):
-        st.session_state.page = "data_collection"
 # ----------------- Router -----------------
 def router():
@@ -334,11 +190,10 @@ def router():
         structured_data_page()
     elif st.session_state.page == "excel":
         excel_page()
-    elif st.session_state.page == "unstructured_data":
-        unstructured_data_page()
-    elif st.session_state.page == "semi_structured_data":
-        semi_structured_data_page()
-# Run te router function
-if __name__ == "__main__":
-   router()

     st.markdown("""
     Structured data is highly organized and typically stored in tables like spreadsheets or databases. It is easy to search and analyze.
     """)
+    st.markdown("### Examples: Excel files, CSV files, JSON files")
     if st.button(":green[📊 Excel]"):
         st.session_state.page = "excel"
+    if st.button(":green[📄 CSV]"):
+        st.session_state.page = "csv"
+    if st.button(":green[🔹 JSON]"):
+        st.session_state.page = "json"
     if st.button("Back to Data Collection"):
         st.session_state.page = "data_collection"
 print(excel_file.sheet_names)
     """, language='python')
+    st.link_button("Jupyter Notebook", "https://colab.research.google.com/drive/1ZTKWTknL-4IQ9QbAfcyKzIP-_lNxmz2P?usp=sharing")
     if st.button("Back to Structured Data"):
         st.session_state.page = "structured_data"
+# ----------------- CSV Data Page -----------------
+def csv_page():
+    st.title(":green[CSV Data Format]")
+    st.write("### What is CSV?")
+    st.write("CSV (Comma-Separated Values) files store tabular data in plain text, where each line is a data record and columns are separated by commas.")
+    st.write("### How to Read CSV Files")
     st.code("""
+import pandas as pd
+# Read a CSV file
+df = pd.read_csv('data.csv')
+print(df)
     """, language='python')
+    st.link_button("Jupyter Notebook", "https://colab.research.google.com/drive/your_csv_guide_link")
+    if st.button("Back to Structured Data"):
+        st.session_state.page = "structured_data"
+# ----------------- JSON Data Page -----------------
+def json_page():
+    st.title(":green[JSON Data Format]")
+    st.write("### What is JSON?")
     st.write("""
+    JSON (JavaScript Object Notation) is a lightweight data-interchange format.
     """)
     st.code("""
+import json
+# Read a JSON file
 with open('data.json', 'r') as file:
     data = json.load(file)
     print(data)
     """, language='python')
+    st.link_button("Jupyter Notebook", "https://colab.research.google.com/drive/your_json_guide_link")
+    if st.button("Back to Structured Data"):
+        st.session_state.page = "structured_data"
 # ----------------- Router -----------------
 def router():
         structured_data_page()
     elif st.session_state.page == "excel":
         excel_page()
+    elif st.session_state.page == "csv":
+        csv_page()
+    elif st.session_state.page == "json":
+        json_page()
+# Run the router function
+router()
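
Review note on the router hunk: this commit removes the `unstructured_data` and `semi_structured_data` branches, so if any other page still sets `st.session_state.page` to one of those values, the `if`/`elif` chain would match nothing and render an empty page. One possible way to guard against that is a dictionary dispatch with a fallback. The sketch below is only an illustration, not the author's implementation: the page functions are hypothetical stand-ins that return their own name so the logic can be checked without Streamlit, and both the `route` helper and the choice of `structured_data` as the fallback page are assumptions.

```python
# Hypothetical stand-ins for the Streamlit page functions in the diff;
# each returns its name so the dispatch can be checked without Streamlit.
def structured_data_page():
    return "structured_data"

def excel_page():
    return "excel"

def csv_page():
    return "csv"

def json_page():
    return "json"

# Map each st.session_state.page value to its page function.
PAGES = {
    "structured_data": structured_data_page,
    "excel": excel_page,
    "csv": csv_page,
    "json": json_page,
}

def route(page, pages=PAGES, default="structured_data"):
    """Call the handler for `page`, falling back to `default` for unknown keys."""
    handler = pages.get(page, pages[default])
    return handler()

print(route("csv"))                # known key: dispatches to csv_page
print(route("unstructured_data"))  # key removed in this commit: falls back
```

In the real app, `router()` would call something like `route(st.session_state.page)` with the actual page functions, and adding a new format page would then mean adding one dictionary entry rather than another `elif` branch.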