georgeck/Hacker-News-Comments-Summarization-Llama-3.1-8B-Instruct

Summarization

Safetensors

English

llama

hacker-news

hn-companion


georgeck committed 26 days ago

Commit

2d9aa59

1 Parent(s): c1e0b35

Updated Model Card

Files changed (1)

README.md +8 -6


@@ -119,30 +119,31 @@ print(summary)
 This model was fine-tuned on the [georgeck/hacker-news-discussion-summarization-large](https://huggingface.co/datasets/georgeck/hacker-news-discussion-summarization-large) dataset, which contains 14,531 records of Hacker News front-page stories and their associated discussion threads.
 The dataset includes:
-- 13,077 training examples
-- 1,454 test examples
 - Structured representations of hierarchical comment threads
 - Normalized scoring system that represents comment importance
 - Comprehensive metadata about posts and comments
-Each example includes a post title, author information, timestamps, and a structured representation of the comment thread with information about comment scores, reply counts, and downvotes.
 ### Training Procedure
 #### Preprocessing
 - The hierarchical comment structure was preserved using a standardized format
-- Comments were filtered based on downvote counts, with heavily downvoted content (4+ downvotes) excluded
 - A normalized scoring system (1-1000) was applied to represent each comment's relative importance
 - Comments were organized to maintain their hierarchical relationships
 ## Evaluation
 ### Testing Data, Factors & Metrics
 #### Testing Data
-The model was evaluated on the test split of the georgeck/hacker-news-discussion-summarization-large dataset, comprising 1,454 examples of Hacker News discussions and summaries.
 #### Factors
@@ -157,7 +158,8 @@ Evaluation considered:
 ### Model Architecture and Objective
-This model is based on Llama-3.1-8B-Instruct, a causal language model. The primary training objective was to generate structured summaries of hierarchical discussion threads that capture the most important themes, perspectives, and insights while maintaining proper attribution.
 The model was trained to specifically understand and process the hierarchical structure of Hacker News comments, including their scoring system, reply counts, and downvote information to appropriately weight content importance.

 This model was fine-tuned on the [georgeck/hacker-news-discussion-summarization-large](https://huggingface.co/datasets/georgeck/hacker-news-discussion-summarization-large) dataset, which contains 14,531 records of Hacker News front-page stories and their associated discussion threads.
 The dataset includes:
+- 6,300 training examples
+- 700 test examples
 - Structured representations of hierarchical comment threads
 - Normalized scoring system that represents comment importance
 - Comprehensive metadata about posts and comments
+Each example includes a post title and a structured representation of the comment thread with information about comment scores, reply counts, and downvotes.
 ### Training Procedure
 #### Preprocessing
 - The hierarchical comment structure was preserved using a standardized format
 - A normalized scoring system (1-1000) was applied to represent each comment's relative importance
 - Comments were organized to maintain their hierarchical relationships
+Training was run on [OpenPipe](https://openpipe.ai/) infrastructure.
 ## Evaluation
 ### Testing Data, Factors & Metrics
 #### Testing Data
+The model was evaluated on the test split of the georgeck/hacker-news-discussion-summarization-large dataset.
 #### Factors
 ### Model Architecture and Objective
+This model is based on Llama-3.1-8B-Instruct, a causal language model.
+The primary training objective was to generate structured summaries of hierarchical discussion threads that capture the most important themes, perspectives, and insights while maintaining proper attribution.
 The model was trained to specifically understand and process the hierarchical structure of Hacker News comments, including their scoring system, reply counts, and downvote information to appropriately weight content importance.
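
The preprocessing bullets in this diff (a normalized 1-1000 score per comment, with the hierarchy preserved) can be illustrated with a minimal Python sketch. The model card does not state the normalization formula or the thread format, so everything here is an assumption: the min-max scaling, the `Comment` fields, and the bracketed `[1.1]` path labels are hypothetical and are not taken from the dataset or the training pipeline.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Comment:
    # Hypothetical record; field names are illustrative, not the dataset schema.
    author: str
    text: str
    raw_score: float
    replies: List["Comment"] = field(default_factory=list)
    score: int = 0  # filled in by normalize_scores()


def flatten(comments: List[Comment]) -> Iterator[Comment]:
    """Yield every comment in the tree, depth-first."""
    for c in comments:
        yield c
        yield from flatten(c.replies)


def normalize_scores(comments: List[Comment], low: int = 1, high: int = 1000) -> None:
    """Map raw scores onto [low, high] in place using min-max scaling (assumed formula)."""
    everything = list(flatten(comments))
    if not everything:
        return
    smallest = min(c.raw_score for c in everything)
    largest = max(c.raw_score for c in everything)
    span = largest - smallest
    for c in everything:
        if span == 0:
            c.score = high  # all comments tie, so give each the top score
        else:
            c.score = round(low + (c.raw_score - smallest) / span * (high - low))


def render(comments: List[Comment], path: tuple = ()) -> List[str]:
    """Render one line per comment; the dotted path label (1, 1.1, 2, ...) keeps the hierarchy."""
    lines = []
    for i, c in enumerate(comments, start=1):
        here = path + (i,)
        label = ".".join(str(n) for n in here)
        lines.append(
            f"[{label}] (score: {c.score}) <replies: {len(c.replies)}> {c.author}: {c.text}"
        )
        lines.extend(render(c.replies, here))
    return lines


if __name__ == "__main__":
    thread = [
        Comment("alice", "Top-level comment", 10, [Comment("bob", "A reply", 0)]),
        Comment("carol", "Another top-level comment", 0),
    ]
    normalize_scores(thread)
    print("\n".join(render(thread)))
```

Min-max scaling is only one plausible way to reach a 1-1000 range; a rank-based or position-weighted score would fit the same description in the model card.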