---
license: apache-2.0
language: en
tags:
- sentence similarity
library_name: sentence-transformers
pipeline_tag: sentence-similarity
---
# Dataset Collection:
* The news dataset was collected from this Kaggle [dataset](https://www.kaggle.com/competitions/fake-news/data).
* The dataset contains the news title, the news content, and a label (the cosine similarity between the news title and the news content).
* Different strategies have been followed during the data gathering phase.
# Sentence transformer fine-tuned for semantic search and sentence similarity
* The model is fine-tuned on the dataset.
* This model can be used for semantic search, sentence similarity, and recommendation systems.
* This model can also be used directly for inference.
# Data Fields:
* **label**: cosine similarity between the news title and the news content
* **news title**: the title of the news
* **news content**: the content of the news
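
Since the label is a cosine similarity, a minimal sketch of that measure may help. The function name `cosine_similarity` and the three-dimensional vectors below are hypothetical and only illustrate the arithmetic; real embeddings from the model have many more dimensions, and this is not necessarily how the dataset labels were produced.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", for illustration only.
title_vec = [1.0, 2.0, 0.0]
content_vec = [2.0, 4.0, 0.0]    # points in the same direction as title_vec
unrelated_vec = [0.0, 0.0, 3.0]  # orthogonal to title_vec

print(cosine_similarity(title_vec, content_vec))    # close to 1: same direction
print(cosine_similarity(title_vec, unrelated_vec))  # 0: no overlap
```

Values near 1 indicate that the two vectors point in nearly the same direction, and values near 0 indicate little similarity.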
# Application:
* This model is useful for semantic search, sentence similarity, and recommendation systems.
* You can fine-tune this model for your particular use cases.
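
To make the semantic search use case concrete, here is a rough sketch of ranking a small corpus against a query by cosine similarity. The helper `rank_by_similarity`, the corpus strings, and the toy vectors are assumptions made for illustration; in practice the vectors would come from encoding text with the model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus_vecs, top_k=3):
    """Return up to top_k (score, corpus_index) pairs, highest score first."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical corpus and hand-made 3-dimensional vectors, for illustration only.
corpus = ["doc about horses", "doc about music", "doc about food"]
corpus_vecs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 0.9]]
query_vec = [0.8, 0.2, 0.0]  # closest in direction to the first document

results = rank_by_similarity(query_vec, corpus_vecs, top_k=2)
for score, idx in results:
    print(f"{score:.4f}\t{corpus[idx]}")
```

The same ranking idea carries over to recommendation: treat an item the user liked as the query and return the nearest items.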
# Model Implementation
Install the library first: `pip install -U sentence-transformers`
```python
from sentence_transformers import SentenceTransformer, util

# Load the fine-tuned model from the Hugging Face Hub
model_name = "Sakil/sentence_similarity_semantic_search"
model = SentenceTransformer(model_name)

sentences = [
    'A man is eating food.',
    'A man is eating a piece of bread.',
    'The girl is carrying a baby.',
    'A man is riding a horse.',
    'A woman is playing violin.',
    'Two men pushed carts through the woods.',
    'A man is riding a white horse on an enclosed ground.',
    'A monkey is playing drums.',
    'Someone in a gorilla costume is playing a set of drums.',
]

# Encode all sentences
embeddings = model.encode(sentences)

# Compute cosine similarity between all pairs
cos_sim = util.cos_sim(embeddings, embeddings)

# Add all unique pairs (i < j) to a list with their cosine similarity score
all_sentence_combinations = []
for i in range(len(cos_sim) - 1):
    for j in range(i + 1, len(cos_sim)):
        all_sentence_combinations.append([cos_sim[i][j], i, j])

# Sort the list by the highest cosine similarity score
all_sentence_combinations = sorted(all_sentence_combinations, key=lambda x: x[0], reverse=True)

print("Top-5 most similar pairs:")
for score, i, j in all_sentence_combinations[0:5]:
    print("{} \t {} \t {:.4f}".format(sentences[i], sentences[j], cos_sim[i][j]))
```
# GitHub: [Sakil Ansari](https://github.com/Sakil786/sentence_similarity_semantic_search)