julsCadenas committed
Commit d6d625b · verified · 1 Parent(s): a389ab8

Update README.md

  1. README.md +29 -0
README.md CHANGED
@@ -88,6 +88,35 @@ To get started, you need to install the required dependencies. You can do this b
  2. Add the *URL* of your preferred Reddit post on main.py.
  3. Run ```src/main.py```
 
+ ### **Formatted JSON Output**
+
+ The model outputs its responses in JSON format, which may not be fully formatted properly. For instance, the output could look like [this](https://github.com/julsCadenas/summarize-reddit/blob/master/data/test_output.json).
+
+ You can see that the output contains escaped quotes within the values. This data should be properly formatted for easier consumption. To fix this, you can use the following function to clean and format the JSON:
+ ```python
+ def fix_json(raw_data, fixed_path):
+     if not isinstance(raw_data, dict):
+         raise ValueError(f"Expected a dictionary, but got: {type(raw_data)}")
+
+     try:
+         formatted_data = {
+             "post_summary": json.loads(raw_data["post_summary"]),
+             "comments_summary": json.loads(raw_data["comments_summary"])
+         }
+     except json.JSONDecodeError as e:
+         print("Error decoding JSON:", e)
+         return
+
+     with open(fixed_path, "w") as file:
+         json.dump(formatted_data, file, indent=4)
+
+     print(f"Formatted JSON saved to {fixed_path}")
+ ```
+ After using the fix_json() function to clean and format the data, the data will now look like [this](https://github.com/julsCadenas/summarize-reddit/blob/master/data/formatted_output.json).
+
+ You can view the full notebook on formatting the output [here](https://github.com/julsCadenas/summarize-reddit/blob/master/notebooks/testing.ipynb).
+
+
  <br>
 
  # **Model Evaluation**
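
The escaped quotes described in the added README section are what you would expect if `post_summary` and `comments_summary` each hold a JSON-encoded string rather than a nested object, so that `json.loads` on each value is enough to undo it. The sketch below illustrates that idea in memory, without writing a file. The sample data and the `decode_nested_json` helper are hypothetical and are not taken from the repository. Note also that the README snippet uses `json` without showing `import json`; the sketch includes the import.

```python
import json

# Hypothetical raw output: each summary is itself a JSON-encoded string,
# which is why its quotes appear escaped when the outer object is dumped.
raw_output = {
    "post_summary": json.dumps({"summary": "Example post summary."}),
    "comments_summary": json.dumps({"summary": "Example comments summary."}),
}


def decode_nested_json(raw_data, keys=("post_summary", "comments_summary")):
    """Return a copy of raw_data with the JSON-encoded string values parsed."""
    if not isinstance(raw_data, dict):
        raise ValueError(f"Expected a dictionary, but got: {type(raw_data)}")
    decoded = dict(raw_data)
    for key in keys:
        # json.loads turns the embedded JSON string into a real object.
        decoded[key] = json.loads(raw_data[key])
    return decoded


before = json.dumps(raw_output, indent=4)
after = json.dumps(decode_nested_json(raw_output), indent=4)

print(before)  # nested quotes are escaped with backslashes
print(after)   # nested values are rendered as ordinary JSON objects
```

Unlike `fix_json()` in the diff, this version lets a `json.JSONDecodeError` propagate to the caller instead of printing it and returning `None`; which behaviour is preferable depends on how the result is consumed.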