yuvarajareddy001 committed
Commit f77de56 · verified · 1 Parent(s): cd218ce

Update README.md

Files changed (1)
  1. README.md +0 -108
README.md CHANGED
@@ -8,111 +8,3 @@ sdk_version: 1.42.2
  app_file: app.py
  pinned: false
  ---
-
- # Optimized Log Classification Using LLMs
- ---
- A comprehensive framework for hybrid log classification that integrates multiple analytical techniques to effectively process and categorize log data.
- This system leverages different methods to handle simple, complex, and sparsely labeled log patterns.
- ---
-
- ## Overview
-
- This project combines three primary classification strategies:
-
- - **Regex-based Classification**
- Captures predictable patterns using predefined regular expressions.
-
- - **Embedding-based Classification**
- Uses Sentence Transformers to generate embeddings followed by Logistic Regression for nuanced pattern recognition.
-
- - **LLM-assisted Classification**
- Employs large language models to classify data when traditional methods struggle due to limited labeled samples.
-
- ![System Architecture](resources/arch.png)
-
- ---
-
- ## Directory Structure
-
- - **`training/`**
- Contains notebooks and scripts for training the models and experimenting with different approaches.
-
- - **`models/`**
- Stores pre-trained models such as the logistic regression classifier and embedding models.
-
- - **`resources/`**
- Holds auxiliary files like CSV datasets, output samples, and images.
-
- - **Root Directory**
- Includes the main API server (`server.py`) and the command-line classification utility (`classify.py`).
-
- ---
-
- ## Installation & Setup
-
- 1. **Clone the Repository**
- ```bash
- git clone <your_repository_url>
- ```
-
- 2. **Install Dependencies**
- Ensure Python is installed and run:
- ```bash
- pip install -r requirements.txt
- ```
-
- 3. **Train the Model (if needed)**
- Open and run the training notebook:
- ```bash
- jupyter notebook training/log_classification.ipynb
- ```
-
- 4. **Run the API Server**
- Start the server using one of the following methods:
- - Direct execution:
- ```bash
- python server.py
- ```
- - With Uvicorn:
- ```bash
- uvicorn server:app --reload
- ```
- Access the API documentation at:
- - Main Endpoint: [http://127.0.0.1:8000/](http://127.0.0.1:8000/)
- - Swagger UI: [http://127.0.0.1:8000/docs](http://127.0.0.1:8000/docs)
- - Redoc: [http://127.0.0.1:8000/redoc](http://127.0.0.1:8000/redoc)
-
- 5. **Running the Streamlit App**
- To start the Streamlit application for log classification:
- ```bash
- streamlit run app.py
- ```
- This command will launch the app in your browser at a URL like http://localhost:8501.
- ---
-
- ## Usage Instructions
-
- - **Input Data**
- Upload a CSV file with the following columns:
- - `source`
- - `log_message`
-
- - **Output**
- The system processes the logs and returns a CSV file with an additional `target_label` column indicating the classification result.
-
- ---
-
- ## Customization
-
- Feel free to modify and extend the classification logic in the following modules:
- - `processor_bert.py`
- - `processor_llm.py`
- - `processor_regex.py`
-
- These modules are designed to be flexible, allowing you to tailor the classification approaches to your specific needs.
-
- ---
-
- ## Contributions
- Contributions, feedback, and feature requests are welcome.
- Please open an issue or submit a pull request in your GitHub repository.
 
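
For readers of this commit who no longer have the removed README text at hand, the hybrid flow it described (regex first, then an embedding-based classifier, then an LLM) and its CSV contract (`source` and `log_message` in, `target_label` added on output) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the repository's code: the rule patterns, the label names, the `min_confidence` threshold, the fallback order, and the function names `classify_with_regex`, `classify_log`, and `classify_csv` are all hypothetical, and the embedding and LLM stages are passed in as plain callables so the sketch runs without any model.

```python
import csv
import io
import re
from typing import Callable, Optional, Tuple

# Hypothetical rules and labels for illustration; not taken from processor_regex.py.
REGEX_RULES = [
    (re.compile(r"User \S+ logged (in|out)", re.IGNORECASE), "User Action"),
    (re.compile(r"Backup (started|completed)", re.IGNORECASE), "System Notification"),
]


def classify_with_regex(log_message: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if no rule matches."""
    for pattern, label in REGEX_RULES:
        if pattern.search(log_message):
            return label
    return None


def classify_log(
    log_message: str,
    embedding_classifier: Callable[[str], Tuple[str, float]],
    llm_classifier: Callable[[str], str],
    min_confidence: float = 0.5,  # assumed threshold, not from the README
) -> str:
    """Try regex, then the embedding classifier, then fall back to the LLM."""
    label = classify_with_regex(log_message)
    if label is not None:
        return label
    # The embedding stage is assumed to return (label, confidence).
    label, confidence = embedding_classifier(log_message)
    if confidence >= min_confidence:
        return label
    return llm_classifier(log_message)


def classify_csv(
    csv_text: str,
    embedding_classifier: Callable[[str], Tuple[str, float]],
    llm_classifier: Callable[[str], str],
) -> str:
    """Read rows with `source` and `log_message`; return CSV text with `target_label` added."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["source", "log_message", "target_label"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(
            {
                "source": row["source"],
                "log_message": row["log_message"],
                "target_label": classify_log(
                    row["log_message"], embedding_classifier, llm_classifier
                ),
            }
        )
    return out.getvalue()
```

In a real setup the `embedding_classifier` callable would presumably wrap a Sentence Transformers encoder plus the trained logistic regression model, and `llm_classifier` would wrap a model call; passing them in as arguments keeps the cascade testable with simple stubs.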