{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "YG4fX89xbPeQ"
},
"source": [
"### EDA"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "I0-UozYqPKzD"
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 678
},
"id": "likh-JLqO50o",
"outputId": "82a49b50-f756-4cbf-c440-700d20b384dd"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Shape of the Dataset: (12194, 18)\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"summary": "{\n \"name\": \"df_anime\",\n \"rows\": 12194,\n \"fields\": [\n {\n \"column\": \"anime_id\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 11437,\n \"min\": 1,\n \"max\": 34527,\n \"num_unique_values\": 12194,\n \"samples\": [\n 3834,\n 32936,\n 8792\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"genres\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 3256,\n \"samples\": [\n \"Mystery, Sci-Fi, Space\",\n \"Music, Sci-Fi\",\n \"Magic, Romance\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"name\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 12194,\n \"samples\": [\n \"Hoshi no Ko Chobin\",\n \"Gin no Guardian\",\n \"Madobe Nanami no Windows 7 de PC Jisaku Ouen Commercial!!\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"average_rating\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 552,\n \"samples\": [\n \"4.06\",\n \"8.03\",\n \"7.32\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"overview\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 11689,\n \"samples\": [\n \"Serebii, a Legendary Pok\\u00e9mon known for its ability to traverse time, is hunted by an unnamed Pok\\u00e9mon poacher seeking to capture it. Yukinari, a young Pok\\u00e9mon trainer who enjoys drawing portraits of Pok\\u00e9mon, tries to protect Serebii after it stumbles upon him; but in the middle of its escape, both vanish without a trace.\\n\\nForty years later, ambitious Pok\\u00e9mon trainer Satoshi hopes to sight rare Pok\\u00e9mon around his local area. White, a boat driver, takes Satoshi to his village. 
Satoshi is accompanied by his three closest friends: Takeshi, a former gym leader training to be a great Pok\\u00e9mon breeder; Kasumi, a young girl wanting to become a skilled Water-type trainer; and Pikachu, Satoshi's Pok\\u00e9mon partner and first comrade.\\n\\nMeanwhile, Yukinari and Serebii reappear in Satoshi's present time and run into his group. Masked Lord Vicious, the strongest executive staff member of Team Rocket, desires to capture Serebii. Using the Dark Ball, a variant of Monster Ball that corrupts the Pok\\u00e9mon caught within and draws out their maximum power, Vicious can transform innocent Pok\\u00e9mon into powerful and frightening obstacles\\u2014including Serebii itself! Placed in a tough position, Satoshi, Yukinari, and their friends must work together to defeat Vicious and save Serebii and themselves.\",\n \"East Force meets West Force and all Hell breaks loose. The Solonoids, that lovable race of female warriors, are at it again, fighting amongst themselves. During a heated battle, however, it looks like the leaders of the two factions have hung their warriors out to dry. In the middle of all this fighting and chaos, East Force detects a transmission from an unidentified planet. The Gall Force gals leave their posts to go. Lufy, the West Force's Ace Pilot, who is after Rabby, follows them to the planet.\\n\\n(Source: AnimeNfo)\",\n \"Netto's father Yuuichirou Hikari has made a scientific breakthrough by introducing the \\\"synchro chips\\\". If an operator and his or her navi are in a special enviroment known as a \\\"dimensional area\\\", they can fuse together in the real world via a technique called \\\"cross fusion\\\"! Yuuichirou's first test subject, Misaki Gorou, attempts the process and sadly fails. Netto offers to try with Rockman, but his father forbids it. Cross Fusion puts enormous strain on the operator's health, and battling in the real world could mean death. 
\\n\\n(Source: Official Site)\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"type\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 7,\n \"samples\": [\n \"TV\",\n \"Movie\",\n \"ONA\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"episodes\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 200,\n \"samples\": [\n \"93\",\n \"27\",\n \"110\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"producers\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 3070,\n \"samples\": [\n \"AIC, Lantis, Media Factory, Pony Canyon, Rakuonsha, AT-X, KlockWorx, Ryukyu Asahi Broadcasting\",\n \"Tama Production, Tokyo MX\",\n \"KAGAYA Studio\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"licensors\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 253,\n \"samples\": [\n \"ADV Films, Media Blasters\",\n \"Funimation\",\n \"Central Park Media, Maiden Japan\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"studios\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 941,\n \"samples\": [\n \"Group TAC, Ginga Ya\",\n \"J.C.Staff, Production I.G\",\n \"drop\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"source\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 16,\n \"samples\": [\n \"Visual novel\",\n \"Manga\",\n \"Novel\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"anime_rating\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 7,\n \"samples\": [\n \"PG-13 - Teens 13 or older\",\n \"R - 17+ (violence & profanity)\",\n \"UNKNOWN\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"rank\",\n \"properties\": {\n \"dtype\": 
\"string\",\n \"num_unique_values\": 8846,\n \"samples\": [\n \"12549\",\n \"7778\",\n \"7761\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"popularity\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 5211,\n \"min\": 1,\n \"max\": 19844,\n \"num_unique_values\": 10304,\n \"samples\": [\n 5880,\n 7617,\n 17889\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"favorites\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 5821,\n \"min\": 0,\n \"max\": 217606,\n \"num_unique_values\": 1357,\n \"samples\": [\n 46556,\n 1136,\n 1570\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"scored by\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 6497,\n \"samples\": [\n \"19239\",\n \"1926\",\n \"300\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"members\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 191364,\n \"min\": 142,\n \"max\": 3744541,\n \"num_unique_values\": 8192,\n \"samples\": [\n 9390,\n 8376,\n 1998\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"image url\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 12135,\n \"samples\": [\n \"https://cdn.myanimelist.net/images/anime/1195/111544.jpg\",\n \"https://cdn.myanimelist.net/images/anime/6/26448.jpg\",\n \"https://cdn.myanimelist.net/images/anime/1405/112400.jpg\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
"type": "dataframe",
"variable_name": "df_anime"
},
"text/html": [
"\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
anime_id
\n",
"
genres
\n",
"
name
\n",
"
average_rating
\n",
"
overview
\n",
"
type
\n",
"
episodes
\n",
"
producers
\n",
"
licensors
\n",
"
studios
\n",
"
source
\n",
"
anime_rating
\n",
"
rank
\n",
"
popularity
\n",
"
favorites
\n",
"
scored by
\n",
"
members
\n",
"
image url
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
4181
\n",
"
Drama, Fantasy, Romance, Slice of Life, Supern...
\n",
"
Clannad: After Story
\n",
"
8.93
\n",
"
Clannad: After Story, the sequel to the critic...
\n",
"
TV
\n",
"
24
\n",
"
Pony Canyon, TBS, Rakuonsha, Animation Do
\n",
"
Sentai Filmworks
\n",
"
Kyoto Animation
\n",
"
Visual novel
\n",
"
PG-13 - Teens 13 or older
\n",
"
19
\n",
"
114
\n",
"
68949
\n",
"
639729
\n",
"
1149886
\n",
"
https://cdn.myanimelist.net/images/anime/1299/...
\n",
"
\n",
"
\n",
"
1
\n",
"
28735
\n",
"
Drama, Historical, Josei
\n",
"
Shouwa Genroku Rakugo Shinjuu
\n",
"
8.57
\n",
"
Yotarou is a former yakuza member fresh out of...
\n",
"
TV
\n",
"
13
\n",
"
Starchild Records, Mainichi Broadcasting Syste...
\n",
"
UNKNOWN
\n",
"
Studio Deen
\n",
"
Manga
\n",
"
PG-13 - Teens 13 or older
\n",
"
93
\n",
"
804
\n",
"
5711
\n",
"
91359
\n",
"
281445
\n",
"
https://cdn.myanimelist.net/images/anime/1354/...
\n",
"
\n",
"
\n",
"
2
\n",
"
5205
\n",
"
Action, Mystery, Romance, Supernatural, Thriller
\n",
"
Kara no Kyoukai Movie 7: Satsujin Kousatsu (Go)
\n",
"
8.39
\n",
"
In February 1999, a string of murders has Shik...
\n",
"
Movie
\n",
"
1
\n",
"
Notes
\n",
"
Aniplex of America
\n",
"
ufotable
\n",
"
Light novel
\n",
"
R - 17+ (violence & profanity)
\n",
"
182
\n",
"
1115
\n",
"
2261
\n",
"
108703
\n",
"
200492
\n",
"
https://cdn.myanimelist.net/images/anime/9/566...
\n",
"
\n",
"
\n",
"
3
\n",
"
170
\n",
"
Comedy, Drama, School, Shounen, Sports
\n",
"
Slam Dunk
\n",
"
8.54
\n",
"
Hanamichi Sakuragi, infamous for his temper, m...
\n",
"
TV
\n",
"
101
\n",
"
TV Asahi, Animax
\n",
"
Flatiron Film Company, Geneon Entertainment USA
\n",
"
Toei Animation
\n",
"
Manga
\n",
"
PG-13 - Teens 13 or older
\n",
"
108
\n",
"
797
\n",
"
6879
\n",
"
128920
\n",
"
283226
\n",
"
https://cdn.myanimelist.net/images/anime/12/86...
\n",
"
\n",
"
\n",
"
4
\n",
"
10162
\n",
"
Josei, Slice of Life
\n",
"
Usagi Drop
\n",
"
8.36
\n",
"
Daikichi Kawachi is a 30-year-old bachelor wor...
\n",
"
TV
\n",
"
11
\n",
"
Dentsu, Fuji TV, Toho, Tohokushinsha Film Corp...
\n",
"
NIS America, Inc.
\n",
"
Production I.G
\n",
"
Manga
\n",
"
PG-13 - Teens 13 or older
\n",
"
202
\n",
"
425
\n",
"
5975
\n",
"
237156
\n",
"
479967
\n",
"
https://cdn.myanimelist.net/images/anime/2/296...
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" \n",
"\n",
" \n",
"
\n",
"\n",
"\n",
"
\n",
" \n",
"\n",
"\n",
"\n",
" \n",
"
\n",
"\n",
"
\n",
"
\n"
],
"text/plain": [
" anime_id genres \\\n",
"0 4181 Drama, Fantasy, Romance, Slice of Life, Supern... \n",
"1 28735 Drama, Historical, Josei \n",
"2 5205 Action, Mystery, Romance, Supernatural, Thriller \n",
"3 170 Comedy, Drama, School, Shounen, Sports \n",
"4 10162 Josei, Slice of Life \n",
"\n",
" name average_rating \\\n",
"0 Clannad: After Story 8.93 \n",
"1 Shouwa Genroku Rakugo Shinjuu 8.57 \n",
"2 Kara no Kyoukai Movie 7: Satsujin Kousatsu (Go) 8.39 \n",
"3 Slam Dunk 8.54 \n",
"4 Usagi Drop 8.36 \n",
"\n",
" overview type episodes \\\n",
"0 Clannad: After Story, the sequel to the critic... TV 24 \n",
"1 Yotarou is a former yakuza member fresh out of... TV 13 \n",
"2 In February 1999, a string of murders has Shik... Movie 1 \n",
"3 Hanamichi Sakuragi, infamous for his temper, m... TV 101 \n",
"4 Daikichi Kawachi is a 30-year-old bachelor wor... TV 11 \n",
"\n",
" producers \\\n",
"0 Pony Canyon, TBS, Rakuonsha, Animation Do \n",
"1 Starchild Records, Mainichi Broadcasting Syste... \n",
"2 Notes \n",
"3 TV Asahi, Animax \n",
"4 Dentsu, Fuji TV, Toho, Tohokushinsha Film Corp... \n",
"\n",
" licensors studios \\\n",
"0 Sentai Filmworks Kyoto Animation \n",
"1 UNKNOWN Studio Deen \n",
"2 Aniplex of America ufotable \n",
"3 Flatiron Film Company, Geneon Entertainment USA Toei Animation \n",
"4 NIS America, Inc. Production I.G \n",
"\n",
" source anime_rating rank popularity favorites \\\n",
"0 Visual novel PG-13 - Teens 13 or older 19 114 68949 \n",
"1 Manga PG-13 - Teens 13 or older 93 804 5711 \n",
"2 Light novel R - 17+ (violence & profanity) 182 1115 2261 \n",
"3 Manga PG-13 - Teens 13 or older 108 797 6879 \n",
"4 Manga PG-13 - Teens 13 or older 202 425 5975 \n",
"\n",
" scored by members image url \n",
"0 639729 1149886 https://cdn.myanimelist.net/images/anime/1299/... \n",
"1 91359 281445 https://cdn.myanimelist.net/images/anime/1354/... \n",
"2 108703 200492 https://cdn.myanimelist.net/images/anime/9/566... \n",
"3 128920 283226 https://cdn.myanimelist.net/images/anime/12/86... \n",
"4 237156 479967 https://cdn.myanimelist.net/images/anime/2/296... "
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_anime=pd.read_csv(\"/content/Animes.csv\")\n",
"print(\"Shape of the Dataset:\", df_anime.shape)\n",
"df_anime.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 488
},
"id": "Wuxs0dgVI1UL",
"outputId": "fc2c7b6e-8214-4ff5-8ded-54451869e89b"
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import Image, display\n",
"anime_name = 'One Piece' # Replace with the desired anime name\n",
"anime_row = df_anime[df_anime['name'] == anime_name].iloc[0]\n",
"\n",
"image_url = anime_row['image url']\n",
"display(Image(url=image_url, width=300))"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 224
},
"id": "shOuDaGzNfIl",
"outputId": "c2d140e4-955f-42d0-8c48-c1d361ed7abf"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Shape of the dataset: (1112830, 4)\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "dataframe",
"variable_name": "df_rating"
},
"text/html": [
"\n",
"
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
"
],
"text/plain": [
"NearestNeighbors(algorithm='brute', metric='cosine')"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from scipy.sparse import csr_matrix\n",
"from sklearn.neighbors import NearestNeighbors\n",
"\n",
"item_user_matrix = csr_matrix(anime_pivot.values)\n",
"knn_item_based = NearestNeighbors(metric = 'cosine', algorithm = 'brute')\n",
"knn_item_based.fit(item_user_matrix)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"id": "dQkd4zIdXTxK"
},
"outputs": [],
"source": [
"def get_item_based_recommendations(anime_name, n_recommendations=5):\n",
" # Find the index of the anime title\n",
" if anime_name not in anime_pivot.index:\n",
" return f\"Anime title '{anime_name}' not found in the dataset.\"\n",
"\n",
" query_index = anime_pivot.index.get_loc(anime_name)\n",
"\n",
" # Use the KNN model to find the nearest neighbors\n",
" distances, indices = knn_item_based.kneighbors(anime_pivot.iloc[query_index,:].values.reshape(1, -1), n_neighbors=n_recommendations + 1)\n",
"\n",
" recommendations = []\n",
" for i in range(1, len(distances.flatten())):\n",
" anime_title = anime_pivot.index[indices.flatten()[i]]\n",
" distance = distances.flatten()[i]\n",
" recommendations.append((anime_title, distance))\n",
"\n",
" return recommendations"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "K7wvMvI0M3cw",
"outputId": "9e9604d5-a806-41f6-990e-9176f024b785"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Based on user rating, If you like 'One Piece' you will definetly like below recommendations....\n",
"1: Bleach, with distance of 0.2537\n",
"2: Naruto, with distance of 0.2768\n",
"3: One Piece Film: Strong World, with distance of 0.2836\n",
"4: Fairy Tail, with distance of 0.2905\n",
"5: Soul Eater, with distance of 0.2945\n",
"6: Death Note, with distance of 0.2964\n",
"7: Code Geass: Hangyaku no Lelouch, with distance of 0.3043\n",
"8: Naruto: Shippuuden, with distance of 0.3101\n",
"9: Code Geass: Hangyaku no Lelouch R2, with distance of 0.3106\n",
"10: Fullmetal Alchemist: Brotherhood, with distance of 0.3122\n"
]
}
],
"source": [
"anime_name = \"One Piece\"\n",
"item_based_recommendations = get_item_based_recommendations(anime_name,10)\n",
"\n",
"if isinstance(item_based_recommendations, str):\n",
" print(item_based_recommendations)\n",
"else:\n",
" print(f\"Based on user rating, If you like '{anime_name}' you will definetly like below recommendations....\")\n",
" for i, (title, distance) in enumerate(item_based_recommendations, 1):\n",
" print(f\"{i}: {title}, with distance of {distance:.4f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "XXKBq-x9qALF"
},
"source": [
"#### 4) User-Based Collaborative Filtering using KNN\n",
"\n",
"\n",
"User-Based Collaborative Filtering (UBCF) is a recommendation approach that suggests items to users based on the preferences of similar users. Unlike **item-based filtering**, which focuses on item similarities, **user-based filtering** identifies users with similar rating patterns and recommends items they have liked. \n",
"\n",
"- How It Works:\n",
" 1. **Compute User Similarity**: Using **K-Nearest Neighbors (KNN)**, we measure similarity between users based on their anime rating history. \n",
" 2. **Find Nearest Neighbors**: Identify users who have rated anime similarly. \n",
" 3. **Generate Recommendations**: Recommend anime that similar users have highly rated but the target user has not watched yet. \n"
]
},
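{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three steps above can be sketched on a toy rating matrix. This is a minimal illustration, not the notebook's own implementation: `toy_pivot`, `recommend_for_user`, and the ratings are hypothetical, and a rating of `0` is assumed to mean that the anime is not rated. Neighbors are found with cosine distance, and unseen anime are scored by the neighbors' mean rating.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"from sklearn.neighbors import NearestNeighbors\n",
"\n",
"# Hypothetical toy ratings: rows are users, columns are anime, 0 means not rated.\n",
"toy_pivot = pd.DataFrame(\n",
"    [[9, 8, 0, 0], [8, 9, 7, 0], [9, 7, 8, 0], [0, 0, 2, 9]],\n",
"    index=['u1', 'u2', 'u3', 'u4'],\n",
"    columns=['A', 'B', 'C', 'D'],\n",
")\n",
"\n",
"def recommend_for_user(pivot, user, k=2, top_n=2):\n",
"    # Steps 1 and 2: fit a cosine KNN on the user rows and find the k most similar users.\n",
"    knn = NearestNeighbors(metric='cosine', algorithm='brute').fit(pivot.values)\n",
"    _, idx = knn.kneighbors(pivot.loc[[user]].values, n_neighbors=k + 1)\n",
"    neighbors = [pivot.index[i] for i in idx.flatten() if pivot.index[i] != user][:k]\n",
"    # Step 3: average the neighbors' ratings, ignoring unrated (0) entries.\n",
"    ratings = pivot.loc[neighbors]\n",
"    mean_scores = ratings.where(ratings > 0).mean(axis=0)\n",
"    # Keep only the anime the target user has not rated yet.\n",
"    row = pivot.loc[user]\n",
"    unseen = row[row == 0].index\n",
"    return mean_scores[unseen].dropna().sort_values(ascending=False).head(top_n)\n",
"\n",
"recs = recommend_for_user(toy_pivot, 'u1')\n",
"print(recs)\n",
"```\n",
"\n",
"Averaging the neighbors' ratings is only one possible scoring choice; counting how many neighbors rated each anime is another."
]
},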
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 391
},
"id": "btDeSD4ap384",
"outputId": "683cc21a-fc4e-4324-aa73-a37eb9430b5d"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "dataframe",
"variable_name": "user_pivot"
},
"text/html": [
"\n",
"
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
"
],
"text/plain": [
"NearestNeighbors(algorithm='brute', metric='cosine')"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from scipy.sparse import csr_matrix\n",
"from sklearn.neighbors import NearestNeighbors\n",
"\n",
"user_item_matrix = csr_matrix(user_pivot.values)\n",
"knn_user_based = NearestNeighbors(metric = 'cosine', algorithm = 'brute')\n",
"knn_user_based.fit(user_item_matrix)"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"id": "W_lzzcUr2Qoq"
},
"outputs": [],
"source": [
"def get_user_based_recommendations(user_id, n_recommendations=5):\n",
" # Convert user_id to the same type as user_pivot index\n",
" user_id = float(user_id)\n",
"\n",
" # Check if the user exists in the matrix\n",
" if user_id not in user_pivot.index:\n",
" return f\"User '{user_id}' not found in the dataset.\"\n",
"\n",
" # Find the user index\n",
" user_idx = user_pivot.index.get_loc(user_id)\n",
"\n",
" # Find the nearest neighbors (most similar users)\n",
" distances, indices = knn_user_based.kneighbors(user_pivot.iloc[user_idx, :].values.reshape(1, -1), n_neighbors=n_recommendations + 1)\n",
"\n",
" # Get the list of anime this user has already rated\n",
" user_rated_anime = set(user_pivot.columns[user_pivot.iloc[user_idx, :] > 100])\n",
"\n",
" # Gather all anime rated by nearest neighbors\n",
" all_neighbor_ratings = []\n",
" for i in range(1, len(distances.flatten())):\n",
" neighbor_idx = indices.flatten()[i]\n",
" neighbor_rated_anime = user_pivot.iloc[neighbor_idx, :]\n",
" neighbor_ratings = neighbor_rated_anime[neighbor_rated_anime > 0]\n",
" all_neighbor_ratings.extend(neighbor_ratings.index)\n",
"\n",
" # Count the frequency of each anime rated by the neighbors\n",
" from collections import Counter\n",
" anime_counter = Counter(all_neighbor_ratings)\n",
"\n",
" # Recommend the most common anime among the neighbors that the user hasn't rated yet\n",
" recommendations = [(anime, count) for anime, count in anime_counter.items() if anime not in user_rated_anime]\n",
" recommendations.sort(key=lambda x: x[1], reverse=True)\n",
"\n",
" # Return the top N recommendations\n",
" return recommendations[:n_recommendations]"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Pwenx_vZ2TSz",
"outputId": "b98596e5-cab9-4cdf-d59f-4be10d4f23c9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"User-based recommendations for user ID 817:\n",
"Accel World, Recommended by 10 similar users\n",
"Ai Yori Aoshi, Recommended by 10 similar users\n",
"Amagami SS, Recommended by 10 similar users\n",
"Angel Beats!, Recommended by 10 similar users\n",
"Ano Hi Mita Hana no Namae wo Bokutachi wa Mada Shiranai., Recommended by 10 similar users\n",
"Ano Natsu de Matteru, Recommended by 10 similar users\n",
"Another, Recommended by 10 similar users\n",
"Ao no Exorcist, Recommended by 10 similar users\n",
"Arakawa Under the Bridge, Recommended by 10 similar users\n",
"Asura Cryin' 2, Recommended by 10 similar users\n"
]
}
],
"source": [
"user_id = 817\n",
"user_recommendations = get_user_based_recommendations(user_id,10)\n",
"\n",
"# Check if recommendations is a string (indicating an error)\n",
"if isinstance(user_recommendations, str):\n",
" print(user_recommendations)\n",
"else:\n",
" print(f\"User-based recommendations for user ID {user_id}:\")\n",
" for anime, count in user_recommendations:\n",
" print(f\"{anime}, Recommended by {count} similar users\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "lTIbtd3uL4Tm"
},
"source": [
"## Content based Filtering"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Content-Based Filtering (CBF) also referred to as cognitive filtering, recommends anime based on their content attributes, such as **genre, synopsis, or other metadata**. Unlike collaborative filtering, which relies on user interactions, CBF focuses on **similarities between anime** using textual data and machine learning techniques. \n",
"\n",
"1. **TF-IDF with Sigmoid Kernel** \n",
" - **TF-IDF (Term Frequency-Inverse Document Frequency)** transforms text into numerical features. \n",
" - The **Sigmoid Kernel** measures similarity between anime based on their textual descriptions. \n",
"\n",
"2. **CountVectorizer with Linear Kernel** \n",
" - **CountVectorizer** converts text into a bag-of-words representation. \n",
" - The **Linear Kernel** computes the similarity between anime using these feature vectors. \n",
"\n",
"- Why Use These Methods?\n",
" - **TF-IDF + Sigmoid Kernel** captures the importance of words while smoothing similarity scores. \n",
" - **CountVectorizer + Linear Kernel** is efficient for comparing anime descriptions at a high level. \n"
]
},
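{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two pairings above can be compared on a few toy overviews. This is a minimal sketch that assumes scikit-learn's `TfidfVectorizer`, `CountVectorizer`, `sigmoid_kernel`, and `linear_kernel`; the `toy_overviews` strings are hypothetical and not taken from the dataset.\n",
"\n",
"```python\n",
"from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer\n",
"from sklearn.metrics.pairwise import linear_kernel, sigmoid_kernel\n",
"\n",
"# Hypothetical toy overviews, used only for illustration.\n",
"toy_overviews = [\n",
"    'pirates sail the sea searching for treasure',\n",
"    'a pirate crew hunts treasure across the sea',\n",
"    'high school students form a music club',\n",
"]\n",
"\n",
"# TF-IDF features scored with the sigmoid kernel: tanh(gamma * <x, y> + coef0).\n",
"tfidf_matrix = TfidfVectorizer(stop_words='english').fit_transform(toy_overviews)\n",
"sig_scores = sigmoid_kernel(tfidf_matrix, tfidf_matrix)\n",
"\n",
"# Raw term counts scored with the linear kernel: the plain dot product <x, y>.\n",
"count_matrix = CountVectorizer(stop_words='english').fit_transform(toy_overviews)\n",
"lin_scores = linear_kernel(count_matrix, count_matrix)\n",
"\n",
"print(sig_scores.round(3))\n",
"print(lin_scores)\n",
"```\n",
"\n",
"In both matrices, entry `[i, j]` is the similarity between overview `i` and overview `j`, so sorting a row in descending order gives the closest titles for that anime."
]
},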
{
"cell_type": "markdown",
"metadata": {
"id": "zzorVg94k4HF"
},
"source": [
"#### 1) Using TFIDF"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PShKOncdVfbJ"
},
"source": [
"##### a. Sigmoid Kernel"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"id": "rVvjDyI9MKSr"
},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 678
},
"id": "cO7tLpcxjLh0",
"outputId": "69147217-a12c-484f-9573-c63a3bf4175c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Shape of the Dataset: (12194, 18)\n"
]
},
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"summary": "{\n \"name\": \"df\",\n \"rows\": 12194,\n \"fields\": [\n {\n \"column\": \"anime_id\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 11437,\n \"min\": 1,\n \"max\": 34527,\n \"num_unique_values\": 12194,\n \"samples\": [\n 3834,\n 32936,\n 8792\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"genres\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 3256,\n \"samples\": [\n \"Mystery, Sci-Fi, Space\",\n \"Music, Sci-Fi\",\n \"Magic, Romance\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"name\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 12194,\n \"samples\": [\n \"Hoshi no Ko Chobin\",\n \"Gin no Guardian\",\n \"Madobe Nanami no Windows 7 de PC Jisaku Ouen Commercial!!\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"average_rating\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 552,\n \"samples\": [\n \"4.06\",\n \"8.03\",\n \"7.32\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"overview\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 11689,\n \"samples\": [\n \"Serebii, a Legendary Pok\\u00e9mon known for its ability to traverse time, is hunted by an unnamed Pok\\u00e9mon poacher seeking to capture it. Yukinari, a young Pok\\u00e9mon trainer who enjoys drawing portraits of Pok\\u00e9mon, tries to protect Serebii after it stumbles upon him; but in the middle of its escape, both vanish without a trace.\\n\\nForty years later, ambitious Pok\\u00e9mon trainer Satoshi hopes to sight rare Pok\\u00e9mon around his local area. White, a boat driver, takes Satoshi to his village. 
Satoshi is accompanied by his three closest friends: Takeshi, a former gym leader training to be a great Pok\\u00e9mon breeder; Kasumi, a young girl wanting to become a skilled Water-type trainer; and Pikachu, Satoshi's Pok\\u00e9mon partner and first comrade.\\n\\nMeanwhile, Yukinari and Serebii reappear in Satoshi's present time and run into his group. Masked Lord Vicious, the strongest executive staff member of Team Rocket, desires to capture Serebii. Using the Dark Ball, a variant of Monster Ball that corrupts the Pok\\u00e9mon caught within and draws out their maximum power, Vicious can transform innocent Pok\\u00e9mon into powerful and frightening obstacles\\u2014including Serebii itself! Placed in a tough position, Satoshi, Yukinari, and their friends must work together to defeat Vicious and save Serebii and themselves.\",\n \"East Force meets West Force and all Hell breaks loose. The Solonoids, that lovable race of female warriors, are at it again, fighting amongst themselves. During a heated battle, however, it looks like the leaders of the two factions have hung their warriors out to dry. In the middle of all this fighting and chaos, East Force detects a transmission from an unidentified planet. The Gall Force gals leave their posts to go. Lufy, the West Force's Ace Pilot, who is after Rabby, follows them to the planet.\\n\\n(Source: AnimeNfo)\",\n \"Netto's father Yuuichirou Hikari has made a scientific breakthrough by introducing the \\\"synchro chips\\\". If an operator and his or her navi are in a special enviroment known as a \\\"dimensional area\\\", they can fuse together in the real world via a technique called \\\"cross fusion\\\"! Yuuichirou's first test subject, Misaki Gorou, attempts the process and sadly fails. Netto offers to try with Rockman, but his father forbids it. Cross Fusion puts enormous strain on the operator's health, and battling in the real world could mean death. 
\\n\\n(Source: Official Site)\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"type\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 7,\n \"samples\": [\n \"TV\",\n \"Movie\",\n \"ONA\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"episodes\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 200,\n \"samples\": [\n \"93\",\n \"27\",\n \"110\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"producers\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 3070,\n \"samples\": [\n \"AIC, Lantis, Media Factory, Pony Canyon, Rakuonsha, AT-X, KlockWorx, Ryukyu Asahi Broadcasting\",\n \"Tama Production, Tokyo MX\",\n \"KAGAYA Studio\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"licensors\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 253,\n \"samples\": [\n \"ADV Films, Media Blasters\",\n \"Funimation\",\n \"Central Park Media, Maiden Japan\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"studios\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 941,\n \"samples\": [\n \"Group TAC, Ginga Ya\",\n \"J.C.Staff, Production I.G\",\n \"drop\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"source\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 16,\n \"samples\": [\n \"Visual novel\",\n \"Manga\",\n \"Novel\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"anime_rating\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 7,\n \"samples\": [\n \"PG-13 - Teens 13 or older\",\n \"R - 17+ (violence & profanity)\",\n \"UNKNOWN\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"rank\",\n \"properties\": {\n \"dtype\": 
\"string\",\n \"num_unique_values\": 8846,\n \"samples\": [\n \"12549\",\n \"7778\",\n \"7761\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"popularity\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 5211,\n \"min\": 1,\n \"max\": 19844,\n \"num_unique_values\": 10304,\n \"samples\": [\n 5880,\n 7617,\n 17889\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"favorites\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 5821,\n \"min\": 0,\n \"max\": 217606,\n \"num_unique_values\": 1357,\n \"samples\": [\n 46556,\n 1136,\n 1570\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"scored by\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 6497,\n \"samples\": [\n \"19239\",\n \"1926\",\n \"300\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"members\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 191364,\n \"min\": 142,\n \"max\": 3744541,\n \"num_unique_values\": 8192,\n \"samples\": [\n 9390,\n 8376,\n 1998\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"image url\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 12135,\n \"samples\": [\n \"https://cdn.myanimelist.net/images/anime/1195/111544.jpg\",\n \"https://cdn.myanimelist.net/images/anime/6/26448.jpg\",\n \"https://cdn.myanimelist.net/images/anime/1405/112400.jpg\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
"type": "dataframe",
"variable_name": "df"
},
"text/html": [
"\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
anime_id
\n",
"
genres
\n",
"
name
\n",
"
average_rating
\n",
"
overview
\n",
"
type
\n",
"
episodes
\n",
"
producers
\n",
"
licensors
\n",
"
studios
\n",
"
source
\n",
"
anime_rating
\n",
"
rank
\n",
"
popularity
\n",
"
favorites
\n",
"
scored by
\n",
"
members
\n",
"
image url
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
4181
\n",
"
Drama, Fantasy, Romance, Slice of Life, Supern...
\n",
"
Clannad: After Story
\n",
"
8.93
\n",
"
Clannad: After Story, the sequel to the critic...
\n",
"
TV
\n",
"
24
\n",
"
Pony Canyon, TBS, Rakuonsha, Animation Do
\n",
"
Sentai Filmworks
\n",
"
Kyoto Animation
\n",
"
Visual novel
\n",
"
PG-13 - Teens 13 or older
\n",
"
19
\n",
"
114
\n",
"
68949
\n",
"
639729
\n",
"
1149886
\n",
"
https://cdn.myanimelist.net/images/anime/1299/...
\n",
"
\n",
"
\n",
"
1
\n",
"
28735
\n",
"
Drama, Historical, Josei
\n",
"
Shouwa Genroku Rakugo Shinjuu
\n",
"
8.57
\n",
"
Yotarou is a former yakuza member fresh out of...
\n",
"
TV
\n",
"
13
\n",
"
Starchild Records, Mainichi Broadcasting Syste...
\n",
"
UNKNOWN
\n",
"
Studio Deen
\n",
"
Manga
\n",
"
PG-13 - Teens 13 or older
\n",
"
93
\n",
"
804
\n",
"
5711
\n",
"
91359
\n",
"
281445
\n",
"
https://cdn.myanimelist.net/images/anime/1354/...
\n",
"
\n",
"
\n",
"
2
\n",
"
5205
\n",
"
Action, Mystery, Romance, Supernatural, Thriller
\n",
"
Kara no Kyoukai Movie 7: Satsujin Kousatsu (Go)
\n",
"
8.39
\n",
"
In February 1999, a string of murders has Shik...
\n",
"
Movie
\n",
"
1
\n",
"
Notes
\n",
"
Aniplex of America
\n",
"
ufotable
\n",
"
Light novel
\n",
"
R - 17+ (violence & profanity)
\n",
"
182
\n",
"
1115
\n",
"
2261
\n",
"
108703
\n",
"
200492
\n",
"
https://cdn.myanimelist.net/images/anime/9/566...
\n",
"
\n",
"
\n",
"
3
\n",
"
170
\n",
"
Comedy, Drama, School, Shounen, Sports
\n",
"
Slam Dunk
\n",
"
8.54
\n",
"
Hanamichi Sakuragi, infamous for his temper, m...
\n",
"
TV
\n",
"
101
\n",
"
TV Asahi, Animax
\n",
"
Flatiron Film Company, Geneon Entertainment USA
\n",
"
Toei Animation
\n",
"
Manga
\n",
"
PG-13 - Teens 13 or older
\n",
"
108
\n",
"
797
\n",
"
6879
\n",
"
128920
\n",
"
283226
\n",
"
https://cdn.myanimelist.net/images/anime/12/86...
\n",
"
\n",
"
\n",
"
4
\n",
"
10162
\n",
"
Josei, Slice of Life
\n",
"
Usagi Drop
\n",
"
8.36
\n",
"
Daikichi Kawachi is a 30-year-old bachelor wor...
\n",
"
TV
\n",
"
11
\n",
"
Dentsu, Fuji TV, Toho, Tohokushinsha Film Corp...
\n",
"
NIS America, Inc.
\n",
"
Production I.G
\n",
"
Manga
\n",
"
PG-13 - Teens 13 or older
\n",
"
202
\n",
"
425
\n",
"
5975
\n",
"
237156
\n",
"
479967
\n",
"
https://cdn.myanimelist.net/images/anime/2/296...
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" \n",
"\n",
" \n",
"
\n",
"\n",
"\n",
"
\n",
" \n",
"\n",
"\n",
"\n",
" \n",
"
\n",
"\n",
"
\n",
"
\n"
],
"text/plain": [
" anime_id genres \\\n",
"0 4181 Drama, Fantasy, Romance, Slice of Life, Supern... \n",
"1 28735 Drama, Historical, Josei \n",
"2 5205 Action, Mystery, Romance, Supernatural, Thriller \n",
"3 170 Comedy, Drama, School, Shounen, Sports \n",
"4 10162 Josei, Slice of Life \n",
"\n",
" name average_rating \\\n",
"0 Clannad: After Story 8.93 \n",
"1 Shouwa Genroku Rakugo Shinjuu 8.57 \n",
"2 Kara no Kyoukai Movie 7: Satsujin Kousatsu (Go) 8.39 \n",
"3 Slam Dunk 8.54 \n",
"4 Usagi Drop 8.36 \n",
"\n",
" overview type episodes \\\n",
"0 Clannad: After Story, the sequel to the critic... TV 24 \n",
"1 Yotarou is a former yakuza member fresh out of... TV 13 \n",
"2 In February 1999, a string of murders has Shik... Movie 1 \n",
"3 Hanamichi Sakuragi, infamous for his temper, m... TV 101 \n",
"4 Daikichi Kawachi is a 30-year-old bachelor wor... TV 11 \n",
"\n",
" producers \\\n",
"0 Pony Canyon, TBS, Rakuonsha, Animation Do \n",
"1 Starchild Records, Mainichi Broadcasting Syste... \n",
"2 Notes \n",
"3 TV Asahi, Animax \n",
"4 Dentsu, Fuji TV, Toho, Tohokushinsha Film Corp... \n",
"\n",
" licensors studios \\\n",
"0 Sentai Filmworks Kyoto Animation \n",
"1 UNKNOWN Studio Deen \n",
"2 Aniplex of America ufotable \n",
"3 Flatiron Film Company, Geneon Entertainment USA Toei Animation \n",
"4 NIS America, Inc. Production I.G \n",
"\n",
" source anime_rating rank popularity favorites \\\n",
"0 Visual novel PG-13 - Teens 13 or older 19 114 68949 \n",
"1 Manga PG-13 - Teens 13 or older 93 804 5711 \n",
"2 Light novel R - 17+ (violence & profanity) 182 1115 2261 \n",
"3 Manga PG-13 - Teens 13 or older 108 797 6879 \n",
"4 Manga PG-13 - Teens 13 or older 202 425 5975 \n",
"\n",
" scored by members image url \n",
"0 639729 1149886 https://cdn.myanimelist.net/images/anime/1299/... \n",
"1 91359 281445 https://cdn.myanimelist.net/images/anime/1354/... \n",
"2 108703 200492 https://cdn.myanimelist.net/images/anime/9/566... \n",
"3 128920 283226 https://cdn.myanimelist.net/images/anime/12/86... \n",
"4 237156 479967 https://cdn.myanimelist.net/images/anime/2/296... "
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(\"/content/Animes.csv\")\n",
"print(\"Shape of the Dataset:\", df.shape)\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"id": "8-vUA38AMMCS"
},
"outputs": [],
"source": [
"df.dropna(inplace = True)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "9GvI7884MMCS",
"outputId": "96c483b6-b506-489d-f0ef-f20cdf7fcfa2"
},
"outputs": [
{
"data": {
"text/plain": [
"Index(['anime_id', 'genres', 'name', 'average_rating', 'overview', 'type',\n",
" 'episodes', 'producers', 'licensors', 'studios', 'source',\n",
" 'anime_rating', 'rank', 'popularity', 'favorites', 'scored by',\n",
" 'members', 'image url'],\n",
" dtype='object')"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.columns"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"id": "syXBYlBqL9O_"
},
"outputs": [],
"source": [
"from sklearn.feature_extraction.text import TfidfVectorizer\n",
"\n",
"tfv = TfidfVectorizer(min_df=3,\n",
" strip_accents='unicode', analyzer='word',token_pattern=r'\\w{1,}',\n",
" ngram_range=(1, 3),\n",
" stop_words = 'english')\n",
"\n",
"tfv_matrix = tfv.fit_transform(df['genres'])"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "EDUM2vu7L99g",
"outputId": "69816148-37e1-48f4-b6af-d69337b8bfeb"
},
"outputs": [
{
"data": {
"text/plain": [
"(12133, 1552)"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tfv_matrix.shape"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"id": "0jVvemqYL95s"
},
"outputs": [],
"source": [
"from sklearn.metrics.pairwise import sigmoid_kernel\n",
"sig = sigmoid_kernel(tfv_matrix, tfv_matrix)"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"id": "Qy4uip8AL92D"
},
"outputs": [],
"source": [
"indices = pd.Series(df.index, index=df['name']).drop_duplicates()"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"id": "ezJ6vl80L9xo"
},
"outputs": [],
"source": [
"def get_rec_sig(title, sig=sig,n_recommendations = 10):\n",
" if title not in indices.index:\n",
" print(f\"Anime title '{title}' not found in the dataset.\")\n",
" # Get the index corresponding to original_title\n",
" idx = indices[title]\n",
"\n",
" # Get the pairwsie similarity scores\n",
" sig_scores = list(enumerate(sig[idx]))\n",
"\n",
" # Sort the movies\n",
" sig_scores = sorted(sig_scores, key=lambda x: x[1], reverse=True)\n",
"\n",
" sig_scores = sig_scores[1:n_recommendations+1]\n",
"\n",
" # Movie indices\n",
" anime_indices = [i[0] for i in sig_scores]\n",
" return pd.DataFrame({'Anime name': df['name'].iloc[anime_indices].values,\n",
" 'Rating': df['average_rating'].iloc[anime_indices].values})"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 519
},
"id": "1sktZ0lWL9uO",
"outputId": "a09176a6-dc66-48ec-9b63-a46f2afdec48"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"summary": "{\n \"name\": \"get_rec_sig('Naruto', n_recommendations=15)\",\n \"rows\": 15,\n \"fields\": [\n {\n \"column\": \"Anime name\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 15,\n \"samples\": [\n \"Kyutai Panic Adventure!\",\n \"Ben-To\",\n \"Boruto: Naruto the Movie\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Rating\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 14,\n \"samples\": [\n \"7.49\",\n \"7.68\",\n \"7.4\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
"type": "dataframe"
},
"text/html": [
"\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Anime name
\n",
"
Rating
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Boruto: Naruto the Movie
\n",
"
7.4
\n",
"
\n",
"
\n",
"
1
\n",
"
Naruto
\n",
"
7.99
\n",
"
\n",
"
\n",
"
2
\n",
"
Naruto x UT
\n",
"
7.37
\n",
"
\n",
"
\n",
"
3
\n",
"
Boruto: Naruto the Movie - Naruto ga Hokage ni...
\n",
"
7.33
\n",
"
\n",
"
\n",
"
4
\n",
"
Naruto: Shippuuden Movie 4 - The Lost Tower
\n",
"
7.42
\n",
"
\n",
"
\n",
"
5
\n",
"
Naruto: Shippuuden Movie 3 - Hi no Ishi wo Tsu...
\n",
"
7.33
\n",
"
\n",
"
\n",
"
6
\n",
"
Naruto: Shippuuden - Sunny Side Battle
\n",
"
7.56
\n",
"
\n",
"
\n",
"
7
\n",
"
Naruto Soyokazeden Movie: Naruto to Mashin to ...
\n",
"
6.96
\n",
"
\n",
"
\n",
"
8
\n",
"
Battle Spirits: Ryuuko no Ken
\n",
"
4.78
\n",
"
\n",
"
\n",
"
9
\n",
"
Kyutai Panic Adventure!
\n",
"
4.67
\n",
"
\n",
"
\n",
"
10
\n",
"
Ranma ½: Akumu! Shunmin Kou
\n",
"
7.49
\n",
"
\n",
"
\n",
"
11
\n",
"
Ben-To
\n",
"
7.2
\n",
"
\n",
"
\n",
"
12
\n",
"
Naruto: Shippuuden Movie 6 - Road to Ninja
\n",
"
7.68
\n",
"
\n",
"
\n",
"
13
\n",
"
Rekka no Honoo
\n",
"
7.34
\n",
"
\n",
"
\n",
"
14
\n",
"
Naruto: Honoo no Chuunin Shiken! Naruto vs. Ko...
\n",
"
7.17
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" \n",
"\n",
" \n",
"
\n",
"\n",
"\n",
"
\n",
" \n",
"\n",
"\n",
"\n",
" \n",
"
\n",
"\n",
"
\n",
"
\n"
],
"text/plain": [
" Anime name Rating\n",
"0 Boruto: Naruto the Movie 7.4\n",
"1 Naruto 7.99\n",
"2 Naruto x UT 7.37\n",
"3 Boruto: Naruto the Movie - Naruto ga Hokage ni... 7.33\n",
"4 Naruto: Shippuuden Movie 4 - The Lost Tower 7.42\n",
"5 Naruto: Shippuuden Movie 3 - Hi no Ishi wo Tsu... 7.33\n",
"6 Naruto: Shippuuden - Sunny Side Battle 7.56\n",
"7 Naruto Soyokazeden Movie: Naruto to Mashin to ... 6.96\n",
"8 Battle Spirits: Ryuuko no Ken 4.78\n",
"9 Kyutai Panic Adventure! 4.67\n",
"10 Ranma ½: Akumu! Shunmin Kou 7.49\n",
"11 Ben-To 7.2\n",
"12 Naruto: Shippuuden Movie 6 - Road to Ninja 7.68\n",
"13 Rekka no Honoo 7.34\n",
"14 Naruto: Honoo no Chuunin Shiken! Naruto vs. Ko... 7.17"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_rec_sig('Naruto', n_recommendations=15)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9JTNMQdTWDQV"
},
"source": [
"##### b. Linear Kernel"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"id": "6D9pKHkBWDQX"
},
"outputs": [],
"source": [
"from sklearn.metrics.pairwise import linear_kernel\n",
"\n",
"lin = linear_kernel(tfv_matrix, tfv_matrix)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"id": "xFlCrJOeWDQX"
},
"outputs": [],
"source": [
"indices = pd.Series(df.index, index=df['name']).drop_duplicates()"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"id": "nikXuR1fWDQY"
},
"outputs": [],
"source": [
"def get_rec_lin(title, lin=lin,n_recommendations = 10):\n",
" if title not in indices.index:\n",
" print(f\"Anime title '{title}' not found in the dataset.\")\n",
" # Get the index corresponding to original_title\n",
" idx = indices[title]\n",
"\n",
" # Get the pairwsie similarity scores\n",
" lin_scores = list(enumerate(lin[idx]))\n",
"\n",
" # Sort the movies\n",
" lin_scores = sorted(lin_scores, key=lambda x: x[1], reverse=True)\n",
"\n",
" lin_scores = lin_scores[1:n_recommendations+1]\n",
"\n",
" # Movie indices\n",
" anime_indices = [i[0] for i in lin_scores]\n",
" return pd.DataFrame({'Anime name': df['name'].iloc[anime_indices].values,\n",
" 'Rating': df['average_rating'].iloc[anime_indices].values})"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 519
},
"id": "sRJM0gdqWDQY",
"outputId": "8feb0c8b-9193-4be8-f731-6e81c346e9da"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"summary": "{\n \"name\": \"get_rec_lin('Naruto', n_recommendations=15)\",\n \"rows\": 15,\n \"fields\": [\n {\n \"column\": \"Anime name\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 15,\n \"samples\": [\n \"Kyutai Panic Adventure!\",\n \"Ben-To\",\n \"Boruto: Naruto the Movie\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Rating\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 14,\n \"samples\": [\n \"7.49\",\n \"7.68\",\n \"7.4\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
"type": "dataframe"
},
"text/html": [
"\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Anime name
\n",
"
Rating
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Boruto: Naruto the Movie
\n",
"
7.4
\n",
"
\n",
"
\n",
"
1
\n",
"
Naruto
\n",
"
7.99
\n",
"
\n",
"
\n",
"
2
\n",
"
Naruto x UT
\n",
"
7.37
\n",
"
\n",
"
\n",
"
3
\n",
"
Boruto: Naruto the Movie - Naruto ga Hokage ni...
\n",
"
7.33
\n",
"
\n",
"
\n",
"
4
\n",
"
Naruto: Shippuuden Movie 4 - The Lost Tower
\n",
"
7.42
\n",
"
\n",
"
\n",
"
5
\n",
"
Naruto: Shippuuden Movie 3 - Hi no Ishi wo Tsu...
\n",
"
7.33
\n",
"
\n",
"
\n",
"
6
\n",
"
Naruto: Shippuuden - Sunny Side Battle
\n",
"
7.56
\n",
"
\n",
"
\n",
"
7
\n",
"
Naruto Soyokazeden Movie: Naruto to Mashin to ...
\n",
"
6.96
\n",
"
\n",
"
\n",
"
8
\n",
"
Battle Spirits: Ryuuko no Ken
\n",
"
4.78
\n",
"
\n",
"
\n",
"
9
\n",
"
Kyutai Panic Adventure!
\n",
"
4.67
\n",
"
\n",
"
\n",
"
10
\n",
"
Ranma ½: Akumu! Shunmin Kou
\n",
"
7.49
\n",
"
\n",
"
\n",
"
11
\n",
"
Ben-To
\n",
"
7.2
\n",
"
\n",
"
\n",
"
12
\n",
"
Naruto: Shippuuden Movie 6 - Road to Ninja
\n",
"
7.68
\n",
"
\n",
"
\n",
"
13
\n",
"
Rekka no Honoo
\n",
"
7.34
\n",
"
\n",
"
\n",
"
14
\n",
"
Naruto: Honoo no Chuunin Shiken! Naruto vs. Ko...
\n",
"
7.17
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" \n",
"\n",
" \n",
"
\n",
"\n",
"\n",
"
\n",
" \n",
"\n",
"\n",
"\n",
" \n",
"
\n",
"\n",
"
\n",
"
\n"
],
"text/plain": [
" Anime name Rating\n",
"0 Boruto: Naruto the Movie 7.4\n",
"1 Naruto 7.99\n",
"2 Naruto x UT 7.37\n",
"3 Boruto: Naruto the Movie - Naruto ga Hokage ni... 7.33\n",
"4 Naruto: Shippuuden Movie 4 - The Lost Tower 7.42\n",
"5 Naruto: Shippuuden Movie 3 - Hi no Ishi wo Tsu... 7.33\n",
"6 Naruto: Shippuuden - Sunny Side Battle 7.56\n",
"7 Naruto Soyokazeden Movie: Naruto to Mashin to ... 6.96\n",
"8 Battle Spirits: Ryuuko no Ken 4.78\n",
"9 Kyutai Panic Adventure! 4.67\n",
"10 Ranma ½: Akumu! Shunmin Kou 7.49\n",
"11 Ben-To 7.2\n",
"12 Naruto: Shippuuden Movie 6 - Road to Ninja 7.68\n",
"13 Rekka no Honoo 7.34\n",
"14 Naruto: Honoo no Chuunin Shiken! Naruto vs. Ko... 7.17"
]
},
"execution_count": 68,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_rec_lin('Naruto', n_recommendations=15)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "rtPCq8UMkyE-"
},
"source": [
"#### 2) Using count vectorizer"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {
"id": "DY4p2FYCk1PT"
},
"outputs": [],
"source": [
"from sklearn.feature_extraction.text import CountVectorizer\n",
"from sklearn.metrics.pairwise import cosine_similarity\n",
"# Initialize CountVectorizer\n",
"count_vectorizer = CountVectorizer(analyzer='word', token_pattern=r'\\w{1,}', ngram_range=(1, 3), stop_words='english',max_features=5000)\n",
"\n",
"# Fit and transform the genre data\n",
"count_matrix = count_vectorizer.fit_transform(df['genres'])\n",
"\n",
"# Compute the cosine similarity matrix\n",
"cosine_sim = cosine_similarity(count_matrix, count_matrix)\n",
"\n",
"# Create a reverse mapping of anime titles to indices\n",
"indices = pd.Series(df.index, index=df['name']).drop_duplicates()\n",
"\n",
"# Function to get recommendations\n",
"def get_rec(title, cosine_sim=cosine_sim, n_recommendations=5):\n",
" if title not in indices.index:\n",
" return f\"Anime title '{title}' not found in the dataset.\"\n",
"\n",
" # Get the index of the anime that matches the title\n",
" idx = indices[title]\n",
"\n",
" # Get the pairwise similarity scores of all anime with that anime\n",
" sim_scores = list(enumerate(cosine_sim[idx]))\n",
"\n",
" # Sort the anime based on the similarity scores\n",
" sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)\n",
"\n",
" # Get the scores of the most similar anime (excluding the input anime itself)\n",
" sim_scores = sim_scores[1:n_recommendations+1]\n",
"\n",
" # Get the anime indices\n",
" anime_indices = [i[0] for i in sim_scores]\n",
"\n",
" # Return the top n most similar anime\n",
" return pd.DataFrame({'Anime name': df['name'].iloc[anime_indices].values,\n",
" 'Rating': df['average_rating'].iloc[anime_indices].values})"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 519
},
"id": "PXxkJk2qleU0",
"outputId": "fab7b7ec-d72b-4d22-e714-b4e14be2ce00"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"summary": "{\n \"name\": \"get_rec( \\\"Naruto\\\", n_recommendations=15)\",\n \"rows\": 15,\n \"fields\": [\n {\n \"column\": \"Anime name\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 15,\n \"samples\": [\n \"Kyutai Panic Adventure!\",\n \"Naruto: Shippuuden Movie 6 - Road to Ninja\",\n \"Boruto: Naruto the Movie\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Rating\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 14,\n \"samples\": [\n \"7.49\",\n \"7.34\",\n \"7.4\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
"type": "dataframe"
},
"text/html": [
"\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Anime name
\n",
"
Rating
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Boruto: Naruto the Movie
\n",
"
7.4
\n",
"
\n",
"
\n",
"
1
\n",
"
Naruto
\n",
"
7.99
\n",
"
\n",
"
\n",
"
2
\n",
"
Naruto x UT
\n",
"
7.37
\n",
"
\n",
"
\n",
"
3
\n",
"
Boruto: Naruto the Movie - Naruto ga Hokage ni...
\n",
"
7.33
\n",
"
\n",
"
\n",
"
4
\n",
"
Naruto: Shippuuden Movie 4 - The Lost Tower
\n",
"
7.42
\n",
"
\n",
"
\n",
"
5
\n",
"
Naruto: Shippuuden Movie 3 - Hi no Ishi wo Tsu...
\n",
"
7.33
\n",
"
\n",
"
\n",
"
6
\n",
"
Naruto: Shippuuden - Sunny Side Battle
\n",
"
7.56
\n",
"
\n",
"
\n",
"
7
\n",
"
Naruto Soyokazeden Movie: Naruto to Mashin to ...
\n",
"
6.96
\n",
"
\n",
"
\n",
"
8
\n",
"
Battle Spirits: Ryuuko no Ken
\n",
"
4.78
\n",
"
\n",
"
\n",
"
9
\n",
"
Kyutai Panic Adventure!
\n",
"
4.67
\n",
"
\n",
"
\n",
"
10
\n",
"
Ranma ½: Akumu! Shunmin Kou
\n",
"
7.49
\n",
"
\n",
"
\n",
"
11
\n",
"
Naruto: Shippuuden Movie 6 - Road to Ninja
\n",
"
7.68
\n",
"
\n",
"
\n",
"
12
\n",
"
Rekka no Honoo
\n",
"
7.34
\n",
"
\n",
"
\n",
"
13
\n",
"
Naruto: Honoo no Chuunin Shiken! Naruto vs. Ko...
\n",
"
7.17
\n",
"
\n",
"
\n",
"
14
\n",
"
Street Fighter Zero The Animation
\n",
"
6.51
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" \n",
"\n",
" \n",
"
\n",
"\n",
"\n",
"
\n",
" \n",
"\n",
"\n",
"\n",
" \n",
"
\n",
"\n",
"
\n",
"
\n"
],
"text/plain": [
" Anime name Rating\n",
"0 Boruto: Naruto the Movie 7.4\n",
"1 Naruto 7.99\n",
"2 Naruto x UT 7.37\n",
"3 Boruto: Naruto the Movie - Naruto ga Hokage ni... 7.33\n",
"4 Naruto: Shippuuden Movie 4 - The Lost Tower 7.42\n",
"5 Naruto: Shippuuden Movie 3 - Hi no Ishi wo Tsu... 7.33\n",
"6 Naruto: Shippuuden - Sunny Side Battle 7.56\n",
"7 Naruto Soyokazeden Movie: Naruto to Mashin to ... 6.96\n",
"8 Battle Spirits: Ryuuko no Ken 4.78\n",
"9 Kyutai Panic Adventure! 4.67\n",
"10 Ranma ½: Akumu! Shunmin Kou 7.49\n",
"11 Naruto: Shippuuden Movie 6 - Road to Ninja 7.68\n",
"12 Rekka no Honoo 7.34\n",
"13 Naruto: Honoo no Chuunin Shiken! Naruto vs. Ko... 7.17\n",
"14 Street Fighter Zero The Animation 6.51"
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_rec( \"Naruto\", n_recommendations=15)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ny0Ub9tJmZk6"
},
"source": [
"As you can see both Count Vectorizer and TF-IDF are giving same recommendations. We will lean towards TfidfVectorizer for more accurate and meaningful recommendations, especially in content-based systems that handle large or complex textual data. CountVectorizer might be used in scenarios where quick and simple solutions are sufficient or for smaller datasets."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QSPSoR1RH9ZE"
},
"source": [
"## Popularity-Based Filtering"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"id": "5r8qPk9PzL_S"
},
"outputs": [],
"source": [
"import seaborn as sns"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {
"id": "ppgmlYpA20D3"
},
"outputs": [],
"source": [
"df = df_anime.copy()"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 340
},
"id": "GK_FQCX53Hth",
"outputId": "9a55ecdc-302e-409f-d68d-f39ba7437943"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"summary": "{\n \"name\": \"df\",\n \"rows\": 12194,\n \"fields\": [\n {\n \"column\": \"anime_id\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 11437,\n \"min\": 1,\n \"max\": 34527,\n \"num_unique_values\": 12194,\n \"samples\": [\n 3834,\n 32936,\n 8792\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"genres\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 3256,\n \"samples\": [\n \"Mystery, Sci-Fi, Space\",\n \"Music, Sci-Fi\",\n \"Magic, Romance\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"name\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 12194,\n \"samples\": [\n \"Hoshi no Ko Chobin\",\n \"Gin no Guardian\",\n \"Madobe Nanami no Windows 7 de PC Jisaku Ouen Commercial!!\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"average_rating\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 0.9422757691869167,\n \"min\": 1.85,\n \"max\": 9.1,\n \"num_unique_values\": 551,\n \"samples\": [\n 4.06,\n 8.03,\n 7.32\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"overview\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 11689,\n \"samples\": [\n \"Serebii, a Legendary Pok\\u00e9mon known for its ability to traverse time, is hunted by an unnamed Pok\\u00e9mon poacher seeking to capture it. Yukinari, a young Pok\\u00e9mon trainer who enjoys drawing portraits of Pok\\u00e9mon, tries to protect Serebii after it stumbles upon him; but in the middle of its escape, both vanish without a trace.\\n\\nForty years later, ambitious Pok\\u00e9mon trainer Satoshi hopes to sight rare Pok\\u00e9mon around his local area. White, a boat driver, takes Satoshi to his village. 
Satoshi is accompanied by his three closest friends: Takeshi, a former gym leader training to be a great Pok\\u00e9mon breeder; Kasumi, a young girl wanting to become a skilled Water-type trainer; and Pikachu, Satoshi's Pok\\u00e9mon partner and first comrade.\\n\\nMeanwhile, Yukinari and Serebii reappear in Satoshi's present time and run into his group. Masked Lord Vicious, the strongest executive staff member of Team Rocket, desires to capture Serebii. Using the Dark Ball, a variant of Monster Ball that corrupts the Pok\\u00e9mon caught within and draws out their maximum power, Vicious can transform innocent Pok\\u00e9mon into powerful and frightening obstacles\\u2014including Serebii itself! Placed in a tough position, Satoshi, Yukinari, and their friends must work together to defeat Vicious and save Serebii and themselves.\",\n \"East Force meets West Force and all Hell breaks loose. The Solonoids, that lovable race of female warriors, are at it again, fighting amongst themselves. During a heated battle, however, it looks like the leaders of the two factions have hung their warriors out to dry. In the middle of all this fighting and chaos, East Force detects a transmission from an unidentified planet. The Gall Force gals leave their posts to go. Lufy, the West Force's Ace Pilot, who is after Rabby, follows them to the planet.\\n\\n(Source: AnimeNfo)\",\n \"Netto's father Yuuichirou Hikari has made a scientific breakthrough by introducing the \\\"synchro chips\\\". If an operator and his or her navi are in a special enviroment known as a \\\"dimensional area\\\", they can fuse together in the real world via a technique called \\\"cross fusion\\\"! Yuuichirou's first test subject, Misaki Gorou, attempts the process and sadly fails. Netto offers to try with Rockman, but his father forbids it. Cross Fusion puts enormous strain on the operator's health, and battling in the real world could mean death. 
\\n\\n(Source: Official Site)\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"type\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 7,\n \"samples\": [\n \"TV\",\n \"Movie\",\n \"ONA\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"episodes\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 200,\n \"samples\": [\n \"93\",\n \"27\",\n \"110\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"producers\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 3070,\n \"samples\": [\n \"AIC, Lantis, Media Factory, Pony Canyon, Rakuonsha, AT-X, KlockWorx, Ryukyu Asahi Broadcasting\",\n \"Tama Production, Tokyo MX\",\n \"KAGAYA Studio\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"licensors\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 253,\n \"samples\": [\n \"ADV Films, Media Blasters\",\n \"Funimation\",\n \"Central Park Media, Maiden Japan\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"studios\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 941,\n \"samples\": [\n \"Group TAC, Ginga Ya\",\n \"J.C.Staff, Production I.G\",\n \"drop\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"source\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 16,\n \"samples\": [\n \"Visual novel\",\n \"Manga\",\n \"Novel\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"anime_rating\",\n \"properties\": {\n \"dtype\": \"category\",\n \"num_unique_values\": 7,\n \"samples\": [\n \"PG-13 - Teens 13 or older\",\n \"R - 17+ (violence & profanity)\",\n \"UNKNOWN\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"rank\",\n \"properties\": {\n \"dtype\": 
\"string\",\n \"num_unique_values\": 8846,\n \"samples\": [\n \"12549\",\n \"7778\",\n \"7761\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"popularity\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 5211,\n \"min\": 1,\n \"max\": 19844,\n \"num_unique_values\": 10304,\n \"samples\": [\n 5880,\n 7617,\n 17889\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"favorites\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 5821,\n \"min\": 0,\n \"max\": 217606,\n \"num_unique_values\": 1357,\n \"samples\": [\n 46556,\n 1136,\n 1570\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"scored by\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 6497,\n \"samples\": [\n \"19239\",\n \"1926\",\n \"300\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"members\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 191364,\n \"min\": 142,\n \"max\": 3744541,\n \"num_unique_values\": 8192,\n \"samples\": [\n 9390,\n 8376,\n 1998\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"image url\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 12135,\n \"samples\": [\n \"https://cdn.myanimelist.net/images/anime/1195/111544.jpg\",\n \"https://cdn.myanimelist.net/images/anime/6/26448.jpg\",\n \"https://cdn.myanimelist.net/images/anime/1405/112400.jpg\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
"type": "dataframe",
"variable_name": "df"
},
"text/html": [
"\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
anime_id
\n",
"
genres
\n",
"
name
\n",
"
average_rating
\n",
"
overview
\n",
"
type
\n",
"
episodes
\n",
"
producers
\n",
"
licensors
\n",
"
studios
\n",
"
source
\n",
"
anime_rating
\n",
"
rank
\n",
"
popularity
\n",
"
favorites
\n",
"
scored by
\n",
"
members
\n",
"
image url
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
4181
\n",
"
Drama, Fantasy, Romance, Slice of Life, Supern...
\n",
"
Clannad: After Story
\n",
"
8.93
\n",
"
Clannad: After Story, the sequel to the critic...
\n",
"
TV
\n",
"
24
\n",
"
Pony Canyon, TBS, Rakuonsha, Animation Do
\n",
"
Sentai Filmworks
\n",
"
Kyoto Animation
\n",
"
Visual novel
\n",
"
PG-13 - Teens 13 or older
\n",
"
19
\n",
"
114
\n",
"
68949
\n",
"
639729
\n",
"
1149886
\n",
"
https://cdn.myanimelist.net/images/anime/1299/...
\n",
"
\n",
"
\n",
"
1
\n",
"
28735
\n",
"
Drama, Historical, Josei
\n",
"
Shouwa Genroku Rakugo Shinjuu
\n",
"
8.57
\n",
"
Yotarou is a former yakuza member fresh out of...
\n",
"
TV
\n",
"
13
\n",
"
Starchild Records, Mainichi Broadcasting Syste...
\n",
"
UNKNOWN
\n",
"
Studio Deen
\n",
"
Manga
\n",
"
PG-13 - Teens 13 or older
\n",
"
93
\n",
"
804
\n",
"
5711
\n",
"
91359
\n",
"
281445
\n",
"
https://cdn.myanimelist.net/images/anime/1354/...
\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"\n",
"
\n",
" \n",
"\n",
" \n",
"\n",
" \n",
"
\n",
"\n",
"\n",
"
\n",
" \n",
"\n",
"\n",
"\n",
" \n",
"
\n",
"\n",
"
\n",
"
\n"
],
"text/plain": [
" anime_id genres \\\n",
"0 4181 Drama, Fantasy, Romance, Slice of Life, Supern... \n",
"1 28735 Drama, Historical, Josei \n",
"\n",
" name average_rating \\\n",
"0 Clannad: After Story 8.93 \n",
"1 Shouwa Genroku Rakugo Shinjuu 8.57 \n",
"\n",
" overview type episodes \\\n",
"0 Clannad: After Story, the sequel to the critic... TV 24 \n",
"1 Yotarou is a former yakuza member fresh out of... TV 13 \n",
"\n",
" producers licensors \\\n",
"0 Pony Canyon, TBS, Rakuonsha, Animation Do Sentai Filmworks \n",
"1 Starchild Records, Mainichi Broadcasting Syste... UNKNOWN \n",
"\n",
" studios source anime_rating rank popularity \\\n",
"0 Kyoto Animation Visual novel PG-13 - Teens 13 or older 19 114 \n",
"1 Studio Deen Manga PG-13 - Teens 13 or older 93 804 \n",
"\n",
" favorites scored by members \\\n",
"0 68949 639729 1149886 \n",
"1 5711 91359 281445 \n",
"\n",
" image url \n",
"0 https://cdn.myanimelist.net/images/anime/1299/... \n",
"1 https://cdn.myanimelist.net/images/anime/1354/... "
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head(2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CYqncb_s3t5n"
},
"source": [
"#### 1) Top n animes based on Popularity"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 363
},
"id": "6f9X0QLq3NhC",
"outputId": "f9ec7fbe-4fa6-4188-8906-cbeae62e48f2"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"summary": "{\n \"name\": \"popular_animes(n=10)\",\n \"rows\": 10,\n \"fields\": [\n {\n \"column\": \"name\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 10,\n \"samples\": [\n \"Hunter x Hunter (2011)\",\n \"Death Note\",\n \"Boku no Hero Academia\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"popularity\",\n \"properties\": {\n \"dtype\": \"number\",\n \"std\": 3,\n \"min\": 1,\n \"max\": 11,\n \"num_unique_values\": 10,\n \"samples\": [\n 10,\n 2,\n 6\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
"type": "dataframe"
},
"text/html": [
"\n",
"