{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "UhZuaTM3YjJ-" }, "source": [ "# Midterm - Spring 2023\n", "\n", "## Problem 1: Take-at-home (45 points total)\n", "\n", "You are applying for a position on the data science team of the USDA and you are given data associated with determining appropriate parasite treatment of canines. The suggested treatment options are determined based on a **logistic regression** model that predicts if the canine is infected with a parasite. \n", "\n", "The data is available at https://data.world/ehales/grls-parasite-study/workspace/file?filename=CBC_data.csv, specifically in the CBC_data.csv file. Log in using your University Google account to access the data and the description, which includes a paper on the study (**you don't need to read the paper to solve this problem**). Your target variable $y$ column is titled `PARASITE_STATUS`. \n", "\n", "\n" ] }, { "cell_type": "markdown", "source": [ "- https://pantelis.github.io/artificial-intelligence/intro.html" ], "metadata": { "id": "AWSv8yOxyXYD" } }, { "cell_type": "markdown", "metadata": { "id": "1THcWuqiYjJ_" }, "source": [ "### Question 1 - Feature Engineering (5 points)\n", "\n", "Write the posterior probability expressions for logistic regression for the problem you are given to solve." ] }, { "cell_type": "markdown", "metadata": { "id": "MckwhLbUYjJ_" }, "source": [ "$$p(y=1| \\mathbf{x}, \\mathbf w)$$ \n", "\n", "$$p(y=0| \\mathbf{x}, \\mathbf w)$$ " ] }, { "cell_type": "markdown", "source": [ "$$p(y = 1|\\mathbf{x}, \\mathbf{w}) = \\sigma (\\mathbf{w}^T\\mathbf{x}) = {1 \\over 1 + e^{-\\mathbf{w}^T\\mathbf{x}}}$$\n", "\n", "$$p(y = 0|\\mathbf{x}, \\mathbf{w}) = 1 - \\sigma (\\mathbf{w}^T\\mathbf{x}) = 1 - {1 \\over 1 + e^{-\\mathbf{w}^T\\mathbf{x}}}$$\n", "\n" ], "metadata": { "id": "Dof11_sUofVi" } }, { "cell_type": "markdown", "metadata": { "id": "_cHO1w6HYjJ_" }, "source": [ "\n", "### Question 2 - Decision Boundary (5 points)\n", "\n", "Write the expression for the decision boundary assuming that $p(y=1)=p(y=0)$. 
The decision boundary is the line that separates the two classes." ] }, { "cell_type": "markdown", "metadata": { "id": "vKseaYyfYjKA" }, "source": [ "$$p(y=1) + p(y=0) = 1$$\n", "\n", "Linear decision function:\n", "\n", "$$f(\\mathbf{x}) = \\mathbf{w}^T\\mathbf{x} + \\alpha$$\n", "\n", "Decision Boundary:\n", "\n", "$$H = \\{\\mathbf{x} : \\mathbf{w}^T\\mathbf{x} = - \\alpha\\}$$\n", "\n", "$$\\hat y = \\sigma(f(\\mathbf{x})) = 0.5$$" ] }, { "cell_type": "markdown", "metadata": { "id": "nGDtm1LWYjKA" }, "source": [ "\n", "\n", "### Question 3 - Loss function (5 points)\n", "\n", "Write the expression of the loss as a function of $\\mathbf w$ that makes sense for you to use in this problem. \n", "\n", "NOTE: The loss will be a function that will include this function: \n", "\n", "$$\\sigma(a) = \\frac{1}{1+e^{-a}}$$\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "ytoLWwasYjKA" }, "source": [ "$$a_i = \\mathbf{w}^T\\mathbf{x}_i$$\n", "\n", "$$L_{CE}(\\mathbf{w}) = - \\sum_{i=1}^n \\{y_i \\ln(\\sigma(a_i)) + (1 - y_i) \\ln(1 - \\sigma (a_i))\\}$$" ] }, { "cell_type": "markdown", "metadata": { "id": "M_0ufZQtYjKA" }, "source": [ "\n", "### Question 4 - Gradient (5 points)\n", "\n", "Write the expression of the gradient of the loss with respect to the parameters - show all your work.\n", "\n" ] }, { "cell_type": "markdown", "source": [ "$${d \\over da} \\sigma (a) = {d \\over da} \\left(\\frac{1}{1+e^{-a}}\\right)$$\n", "\n", "$$= {e^{-a} \\over (1 + e^{-a})^2}$$\n", "\n", "$$= {1 \\over 1 + e^{-a}} \\cdot {e^{-a} \\over 1 + e^{-a}}$$\n", "\n", "$$= {1 \\over 1 + e^{-a}} \\cdot {1 + e^{-a} - 1 \\over 1 + e^{-a}}$$\n", "\n", "$$= {1 \\over 1 + e^{-a}} \\left( {1 + e^{-a} \\over 1 + e^{-a}} - {1 \\over 1 + e^{-a}} \\right)$$\n", "\n", "$$= \\sigma(a)(1 - \\sigma(a))$$\n", "\n" ], "metadata": { "id": "j7JwYBU5c2Oz" } }, { "cell_type": "markdown", "metadata": { "id": "mM9vu8WnYjKA" }, "source": [ "$$ \\nabla_\\mathbf w L_{CE} = - \\sum_{i=1}^n \\left( {y_i \\over \\sigma(a_i)} - {1 - y_i \\over 1 - \\sigma(a_i)} \\right) \\sigma(a_i)(1 - \\sigma(a_i))\\,\\mathbf{x}_i$$\n", "\n", "$$= \\sum_{i=1}^n (\\hat y_i - y_i)\\mathbf{x}_i, \\quad \\hat y_i = \\sigma(a_i)$$\n" ] }, { "cell_type": "markdown", "metadata": { "id": "mbKlMmtMYjKB" }, "source": [ "### Question 5 - Imbalanced dataset (10 points)\n", "\n", "You are 
now told that in the dataset \n", "\n", "$$p(y=0) >> p(y=1)$$\n", "\n", "Can you comment if the accuracy of Logistic Regression will be affected by such imbalance?\n", "\n" ] }, { "cell_type": "code", "source": [ "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "\n", "np.random.seed(0)\n", "sns.set_theme(style='whitegrid', palette='pastel')\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "# Need to manually import to execute\n", "\n", "df = pd.read_csv('CBC_data.csv')\n", "df.info()" ], "metadata": { "id": "c6XkzsjVR5cx", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "8535c7b6-c17e-42d3-ad3a-cfcad4e34f25" }, "execution_count": 537, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "\n", "RangeIndex: 3018 entries, 0 to 3017\n", "Data columns (total 15 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 ID 3018 non-null object \n", " 1 SEX 3018 non-null object \n", " 2 TYPEAREA 3018 non-null object \n", " 3 SEX.REPRO 3018 non-null object \n", " 4 REPRO.STATUS 3018 non-null object \n", " 5 AGE 3018 non-null int64 \n", " 6 PARASITE_STATUS 3018 non-null object \n", " 7 RBC 2995 non-null float64\n", " 8 HGB 2995 non-null float64\n", " 9 WBC 2996 non-null float64\n", " 10 EOS.CNT 2995 non-null float64\n", " 11 MONO.CNT 2995 non-null float64\n", " 12 NUT.CNT 2995 non-null float64\n", " 13 PL.CNT 2995 non-null float64\n", " 14 LYMP.CNT 2995 non-null float64\n", "dtypes: float64(8), int64(1), object(6)\n", "memory usage: 353.8+ KB\n" ] } ] }, { "cell_type": "code", "source": [ "def label_function(val):\n", " return f'{val / 100 * len(df):.0f}\\n{val:.0f}%'\n", "\n", "df.groupby('PARASITE_STATUS').size().plot(kind='pie', autopct=label_function)\n", "\n", "plt.ylabel('') \n", "plt.title('Parasite Status')\n", "plt.show()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 264 }, "id": 
"_ykAJfjMSEnm", "outputId": "77e9250b-b988-4cfd-b733-9b0b88321622" }, "execution_count": 538, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "
" ], "image/png": "iVBORw0KGgoAAAANSUhEUgAAASUAAAD3CAYAAABb5kLnAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/NK7nSAAAACXBIWXMAAAsTAAALEwEAmpwYAAArAklEQVR4nO3deXgURf7H8Xf3HLnJDYQzXOFQkEjCISCSeHKKyHogeOy6oKuI16og3rsosrLrIorHruv+VFZRRBGPVYEgyiksiALhSgLhSEjIPZmZ7vr9MRiJgCSQSTqZ7+t5eB4y091TPZn5pKq6qlpTSimEEMIi9IYugBBCHE9CSQhhKRJKQghLkVASQliKhJIQwlIklIQQliKhJGps+PDhrFmzpqGLIZo4TcYpNQ5paWnk5+djs9kICQnhwgsvZMaMGYSFhTVIef7+97+TlZXF7Nmzz2j/4uJiZs6cSUZGBuXl5TRv3pyxY8fy+9//HoCuXbvy+eef0759+xodb8KECYwaNYpx48adUXmEdUhNqRF56aWX2LhxI4sWLeL777/nxRdfrNX+SilM0/RT6Wpn5syZlJeXs3TpUjZs2MC8efNo165dQxdLWICEUiPUokULBg8eTGZmJkVFRUyaNIn+/fuTmprKpEmTOHjwYNW2EyZMYM6cOVx77bWcd9555OTk8N5773HFFVeQnJxMeno6CxYsqNq+oKCASZMmkZKSQt++fbn++uurgiwtLY1vvvmGjIwM5s+fzyeffEJycjKjRo0CoKSkhGnTpjFo0CAGDx7MnDlzMAzjpOewZcsWRo4cSWRkJLqu06lTJy6//HIAxo8fD8Do0aNJTk5m6dKlv3qec+bMYf369TzxxBMkJyfzxBNPsG/fPrp27YrX6632Xrz77rsAZGVlccMNN9CnTx/69evH1KlT6+i3I86aEo3C0KFD1apVq5RSSuXm5qphw4apOXPmqIKCAvXpp5+q8vJyVVJSou6880512223Ve13ww03qCFDhqgdO3Yoj8ej3G63WrZsmcrKylKmaao1a9aoXr16qe+//14ppdTs2bPVjBkzlNvtVm63W61bt06ZpnlCGZ5//nl17733Vivj7bffrmbMmKHKyspUfn6+Gjt2rHr77bdPej7Tpk1Tw4YNUwsXLlR79uw54fmkpCS1d+/eqp9rcp7vvPNO1c85OTkqKSlJeTyek25z9913q3nz5inDMJTL5VLr1q07/S9B1AupKTUif/jDH0hJSeH6668nNTWVyZMnEx0dzWWXXUZISAjh4eHcdtttrFu3rtp+Y8aMoUuXLtjtdhwOBxdddBHt2rVD0zT69u3LwIEDWb9+PQB2u528vDxyc3NxOBykpKSgadppy5afn8+KFSuYNm0aoaGhxMbGctNNN/Hxxx+fdPsZM2YwcuRI3nzzTYYPH84ll1zCihUrTnn8mpxnbdjtdnJzczl8+DBBQUGkpKSc8bFE3bI3dAFEzb3wwgtccMEF1R6rqKhg5syZrFy5kqKiIgDKysowDAObzQZAQkJCtX1WrFjBCy+8wN69ezFNE5fLRVJSEgC//e1vmTt3LrfccgsA11xzTVXn86/Jzc3F6/UyaNCgqsdM0zzhtX8SHBzM5MmTmTx5MqWlpbz88stMnTqVZcuWERUVdcL2NTnP2rj//vv529/+xtVXX01kZCQ333wzV199da2PI+qehFIj949//IM9e/bwzjvvEB8fz48//siVV16JOu6i6vE1HbfbzZQpU3jmmWdIT0/H4XBw++23V20fHh7Ogw8+yIMPPsiOHTu48cYb6dmzJwMGDKj2ur+sPbVs2RKn08nq1aux22v3sQoPD2fSpEnMnz+fffv2nTSUanKexwsNDQXA5XIRHh4OQF5eXtXz8fHxPPXUUwCsX7+em2++mdTU1Bpf7RP+I823Rq6srIygoCCaNWvG0aNHmTt37q9u73a7cbvdxMTEYLfbWbFiBatWr
ap6ftmyZWRlZaGUIiIiApvNdtLmW2xsLPv376/qBG/evDkDBw7k6aefprS0FNM0yc7OZu3atSctxwsvvMDmzZtxu91UVlbyxhtv0KxZMzp06ABAXFwcOTk5NT7PX24fExNDixYtWLx4MYZhsHDhwmrPf/LJJ1Ud5ZGRkWiahq7L18EK5LfQyN14441UVlbSv39/rrnmGgYPHvyr24eHh/Pwww8zdepUUlNTWbJkCWlpaVXPZ2VlcfPNN5OcnMw111zDddddR//+/U84zk9Xyvr168eYMWMAmDVrFh6Ph2HDhpGamsqUKVOq1U6Op2ka06ZNo3///gwePJhvvvmG+fPnV427uuOOO3jwwQdJSUlh6dKlpz3PiRMn8tlnn5GamlpVA3ryySd57bXX6NevHzt37iQ5Oblq+y1btjBu3DiSk5O57bbbmD59Om3btj3d2y3qgQyeFEJYitSUhBCWIqEkhLAUCSUhhKVIKAkhLEVCSQhhKRJKQghLkVASQliKhJIQwlIklIQQliKhJISwFAklIYSlSCgJISxFQkkIYSkSSkIIS5FQEkJYioSSEMJSJJSEEJYioSSEsBQJJSGEpUgoCSEsRUJJCGEpEkpCCEuRUBJCWIqEkhDCUiSUhBCWIqEkhLAUe0MXQDQOhqkwTNA00I/9M0zwmAq3Fyq9CpfXt42ugU3/aTsNXfPtZ9PAYdMIsms4bGAoME3f8e066LrWsCcpLEFCSVTzU/jox+rQpS6TgnLFkXKDogpFaaVJpdcXROosXkcDgh0aoU6N8CCNcKdOVIhGbJhORLCOeezgDpsEVaDRlFJn89kSjZzH8P36FZBXYpBbbHC0XHG0wqTC0zAfDQ2ICtGJDdNpHqETH24jPEjDMH01MJvUqJo0CaUAYyqF1/A1rQ6XGmQXGhwqNihyWftjoGsQE6rTJspGh1g7IU4NFNilJtXkSCgFAFP5mmSGCbvyPeQUGuSVmTTm33yoU6sKqLgwHcMEhw00TUKqsZNQasLchkIHsgsNMvM8HCoxG7pIfmHXISHSRpd4OwnNbCjALk28RktCqYnxmgoNX9NsxyEv+44aGAH0Gw52aHSJt9OthR27rmHXpfbU2EgoNRE/dVj/eNDDj4c8VHobuEAW0CrSRo+WdppH2KT/qRGRUGrkPIavv2hLrpvMPC/eptlCOyshDo3uLex0beFAQ8LJ6iSUGimPoaj0Kjbt97D3iLdqXI84NacNerR00L2lhJOVSSg1Mj+F0bosNzlHjYYuTqPksEHPBAfdWjh8I82lU9xSJJQaCcNUmAo27XOz7bC3UV/Ot4pgh0bv1g46xtmrpsSIhiehZHHq2BijrAIv63Pc0oHtB5HBGoM6BdEsWJdpLRYgoWRhHkNR4jL5Zo+bgnLpwfa3pHg7fdo5j00olnBqKBJKFmSaCkPBmr1udh+RqlF9CnFoDOjgpEWETWpNDURCyWI8hqKgzCRjV2WDTYgV0DbKxgUdgrDpcpWuvkkoWcRPfUcbctxsPyy1Iytw6HBBxyBaRUqtqT5JKFmA11BUeBRfZbooqpBfh9V0bW6nT1un1JjqiYRSA/MaiqxCL6v3ujGkL9uy4sJ00pKCcdikE9zfJJQakNfwDYLMzJfmWmMQZIehXYKJDpWhA/4kodQAlFJ4TVie6eJAsVSPGhMN6N3GQfcWDmnO+YmEUj0zTIXHgM+3VXBU+o8arfbRNgZ2DJJg8gMJpXrkNRVllYrPt7nkcn8TkNDMxtAuEkx1TUKpnngMRV6pwfLMSllepAmJC9e5pGuwLCZXhySU6oHHUOw76uXrXe6zui2RsKaoEI3LuofgsMmk3rogoeRnXkORW2SwYmelBFITFh6kcUX3YJx2TYYMnCW5bbcfeQ3FoRKDFbskkJq60krFkq0uXB6FKSvunRUJJT/xGor8MpNlmZWy9lGAqPAoPv3RhdvwDfsQZ0ZCyQ+8pqKg3OSL7S5ZpjbAlLkVn/1YgUcWBT1jEkp1z
DAVRRUSSIGsyKX473YX3kC6t1UdklCqQ0r51s/+7zaXXPYPcEeOLT8jwVR7Ekp1yGvCF9t9fQpC7Dtq8N0+twRTLUko1RGvoVi1u1Kmjohqth3ysueIV4KpFiSU6oDHUGw77CG7UKpI4kRrstyUuRWmXJGrEQmls2SYiiNlJhtzPA1dFGFRpoKvdrhkvawaklA6Cz91bC/LdMngSPGrSioV3+6Rju+akFA6C4aCr3ZUypgUUSN7CwyyCqV/6XQklM6Qx1B8n+uR+7GJWlm9x02FR/qXfo2E0hkwlaK0UrElV/qRRO0Yx/qXTPlbdkoSSmfANGHFTulHEmemyKX4/oAHjzTjTkpCqZY8hu8DVeySD5Q4c98f8OD2ymfoZCSUakEphcuj2HJAmm3i7JgKVsnVuJOSUKoFw4SMXbIUiagbB4tNcosNDJm5XY2EUg0ZpmJ/kcGRMumhFHVnzV63rCbxCxJKNaQUbMh2N3QxRBNT4VFs3OeWTu/jSCjVgNdU7Mr3UuqWD46oe9sPeXHJLbeqSCjVgFKwab/UkoR/KOC7HKkt/URC6TS8huKHAx4qvQ1dEtGUZRUaVMoQAUBC6bRMBVsPyhAA4X/St+QjofQrPIZi0363LG0r6sXeIwZuCSUJpdPZmSftNlE/FLAxR6afSCidgnHsipvUkkR9kqVzJZROSQE/SF+SqGcK2LQ/sGtLEkonoZQiv9SktDJwPxii4ew54kVr6EI0IAmlk/CavlncQjQErwl7CryYATr/RELpJNyGIrdI1rgVDWfbQW/AzomTUPoFj6HYKrUk0cAKK0zKAnRak4TSL+iar00vREPbcdgTkFfiJJR+obDclCklwhL2HPGiBWCPt4TScTyGb2ySEFbg8kJeaeANlJNQOo6uIbfeFpayO98bcGOWJJSOU+JSVMi6NsJC9hcZ6AHWhDttKKWlpTFixAjM425UlZaWxo4dO+q8MMXFxbzyyivVHps+fTrr16+v89f6Ja803YQFVXgC7w9ljWpK5eXlLF682N9lobi4mFdffbXaY3/6059ISUnx+2ujQVahhJKwnuxCb0DdUddek43uuOMO5s6dy/Dhw3E6nVWPHz58mKeeeorc3FwqKysZPnw4kydPBmD9+vU8/vjjAPTr148vv/yS+fPnk5SUxDPPPMPatWvxeDxER0fz5z//mdatW/PEE09QUlLC6NGjCQkJYcGCBUyYMIFbbrmFrl27Mm7cOJYvX47D4QBgypQpDB06lDFjxrBixQpefPFF3G43DoeDhx56iN69e9f4jXB5VJ1NK/F63Hz6+kz2bF2Dq6yYqOZtGPqbO+l83kAAfljzORnvv0RJwWGaxbTgonF30DVlaNX+az75P779+F94Kl1065vOFTdNw+5wUpR/gPkPXl3ttTyVFaRfdzf9h02ok7IL69l31KBznANnjb6tjV+Nakrnnnsu55xzDm+//Xa1xx944AEmTJjAwoULee+998jIyGDVqlW43W7uueceHn30UT766CP69etHbm5u1X633nor7733Hh9++CEjRoxg9uzZADzyyCNERESwePFiFixYUO21WrVqRZcuXcjIyACgsLCQNWvWcNlll5Gdnc28efN49dVXef/993nqqaeYOnVqjd8EpRQH6nAEt2kYNIttwYTpr3Lf/Awuuvp2Fs19gKN5uRQXHGbxiw9z8fX3ct/LK0m7biofvDidsqICAHZt/oZvl7zO+Adf4o6/fszRw/vJeP8lACLjEvjjq6uq/v3+z++gaTrdUtPrrOzCeg6XmNgCqPe3xtk7depUJk6cyNVX+/5Sm6bJ2rVrKSgoqNqmrKyMXbt2ERsbS3BwcFWz65JLLqFZs2ZV22VkZPDWW29RXl6O11vzJtOYMWNYtGgR6enpLFmyhLS0NEJDQ1m5ciXZ2dmMHz++aluv10t+fj5xcXGnPa7HgAPFdRdKzuAQLrxqctXPXZIvJCq+FQf3/khEdHOCQyOqak1deg/GG
RRM4eEcwiJj2PL1Es4bMpr4Np0AGHTl71j84sOkXTPlhNfZ/PUS2nU7n6j4VnVWdmE9pvINDWjZzNbQRakXNQ6ljh07MmTIEP75z38CoGkamqaxcOHCqubUT7Zt23bK4+zfv5+ZM2eycOFC2rZty3fffcd9991XozJceumlzJw5k8LCQhYtWsS0adOqnhs8eDCzZs2q6elUo+twqMR/40FKi45w5GA2ca07EtOyHXGtOrDjuxV07j2IzO8ysNmdNG+bBEDevl0knT+kat8W7ZIoKzpCeclRQiOiqh5XSrFl1ccMGv07v5VbWEd2oZe4cB17AFyKq1Wl8M477+Stt96irKwMTdPo06cPL7/8ctXzBw4cIC8vj44dO1JRUcGGDRsA+OKLLyguLgagtLQUh8NBfHw8pmlWa6aFh4fjcrlOWXsKCQkhPT2d5557jtLS0qqa2MCBA1m5ciWZmZlV227evLnG5+U18NsVDsPrYfGL0+k1aARxrTqg6zZ6DhrBB/Om8fTN/fngxelccct0nMEhALgrKwgKDa/aPyjE93+3q7zacXN2bKSs6Ajd+17sl3ILaykoMzEDZBxlrbrOWrZsyejRo/nHP/4BwOzZs5k5cyYjR44EICwsjD/96U/Ex8fzl7/8hcceewyAvn37EhsbS0REBAkJCVx++eUMGzaM6OhohgwZUnXJPyoqipEjRzJy5EgiIyNP6FcCXxNu/Pjx3HXXXVWPJSYm8uyzzzJ9+nRcLhcej4fzzz+fXr161ei8jpT5Z8CkMk0+fGkGNpuDyyY+AMCe79fw1X/+xg3TXiEhsRsH9v7IO8/dzbX3/52W7bviDAqhsqKs6hg//d8ZHFrt2JtXLqFbavoJj4umqbDcxB4g/UqaUv651lhaWkp4uO+v/OrVq3nooYf48ssv0XVrvbOGqdi0z1PndyxRSrHklccoyj/ANfc9j8MZDMDqj98gJ/N/jJv6l6pt351zD22TetN/+EQ+mDeNyPhWDB13BwB7t67lgxenM3Xuf6u297hd/O2OS7n6rtkkntO3TsstrGtccgghDmt9f/zBb2f4+eefM2rUKEaOHMmzzz7L7NmzLRdIAIYJ+X6oKX3y+p/Jz93Db+75a1UgASR0PIec7Rs5mLUdgIN7t5GzYyPN23UBoOeg4fxvxWLy9u/GVVbC14tfpdfgkdWOvX39MoLDImjfI7XOyy2s62h5YLTf/FZTaiwMU7FwU3mdrgxQlJ/L3LtHYHM40fWfr5gMu3k65w4cxrr/LmDdp29RVlxAaEQ0fS7+TbVxRms++T++XfI6Hncl3VLTuOLm6dgdP48Pe3vW7SR0PJeLrr697gotLO+81g56tnKgN/GlAySUTMWb68tPv6EQDaxtlI2BHYNw2pt2KFmvPVXPAm1ekWi8CsrNgJicG/ChVOqSUBKNQ5lbSSgFgqOuwOg8FE2DOwCW+wroUPIaimIJJdGIuLxNv2Yf0KFkKuSGk6JRcQXAHU4CZDGEU9CgtNK/NaX8/bv59F9Pc3DvNkIjoki7birdUtLI27+bD+fP4OihfQC07NCdSyf8kfjWHQH4/ptP+OKtOdgdDkbc+hiJx8YkFR7KYfFLM5g447Vqww1EYChzm0DT/r0HdE1J13yLs/uLaXh596/30CX5Qu55aRnDbnmYD198mCMHsoiIimfsnc9yz0vLufvFr0hKHsIHLzxYtd+yd/7Ob596k8smPsDnb/w80fizfz/LJePvlUAKUKVuRVMfxRPQoaQBhh8XZc/P3UtJYR59Lx+PrttIPKcvbZJ6s2XVxwSHRRAV3wpN00ApNF2n4Fitqby0iIjoeCKi4ulwTj8K8/YD8OPaL4iIjqd1555+K7OwtgqPwmji3aAB3XzTNaj3fkOlyNu3q+rH2ZMuxO2qQCmTIVfdBkBYRDQVpUUUFxzi4N5txLfuSGVFGV8vfpXxD75UzwUWVuLyqCZ/O++ADiUAf9aEYxPaE9YshtUf/4u+l48n6
8f1ZG3bQGKPn9ccv29+Bm5XBZu//ojIuAQANF3n8pse4r3n78dudzLstzPIeP8lUi+5lsM5mXz9wSvY7HbSr7uH5m07++8EhOVUBsDVt4CeZlIfU0wOZe/g8zdmkbd/FwkduhMaEY3N7mTErY9W206ZJnP+kM6kp98jLDKm+jGydvDZv5/hhmkvM/fuEdw44zWKjxzii7fncPNjb/i1/MJamofrpCUFN+mpJgFdU6qPe/y1aJfEhId/vkPL64/fRK/BI07YTikTT6WLksLD1UJJKcVnbzzNpRMfoLzkKMo0iIxrRVhkHIdzdvr/BISlBEINIqBDqT5W8juUvYPYlu1RymTDF+9SejSfXoNHsXvLakIjomjergueygqWvzuP4LAI4lp1qLb/puWLaJnYnZbtu2IaXrzuSvL276b4yAGi41v7/wTOUHJrBxHB1v5r7na7mTv7STauW01JSREJrdtyy+SppA4YjMfj4elH/0jmtq0cOpjLrLn/4Lzzf167atOGtbz5zxfZuf1HIiKa8cb7n9dLmYPsOtZ+V89eQIeSUQ8t1+9Xfcym5R9gGF7adk3m+gfnYXc4qSwv4fN/z6K44BAOZxAJHc/l2vvnYncGVe1bXlLIus/e5sZHfeui6zY7l018gDdnTsLucDLi1sf8Xv4z1TFaERbqRBkeOLQDvO6GLtIJyitcdI7WuX/2Q7RqHseKtRu59+G7+fDlWSTExjC4ewK3jR7I1Cf/SgvXHtqX/bzKZxG5jL84FdeF5zH/7Q9oX/Z9/RTaGQqh7YHg027aWAV0n1KFx+TdjRUNXYwmq32MjdTWGiEODbLWoe1eDW5rLxMz8tkl3HFpLy47r13VYxc+/j7Pjr+Afp1bnrD9NzsO8PB/VvPVjDH1U8DoNpB6HTiabigFdE0pEO4M0ZCyCgyyCqBFhM6AdilEJPaF/VvQdn4NFUUNXbwT5JdUsDevmM4tIxu6KKemNf2hhQEdSoF0g7+GdKjE5IOtJtEhGgPan0vskJ5weCfajhVQmtfQxQPAY5jc93+rGJPSkU4trBxKTX8kf0CHkoZvAGVTH4xmFYUViqXbPIQ64YLEziQM6gyF+9B2LIfCfQ1WLtNU/PHNVThsOjPGWvxGDBZc576uBXQoGQocNup0fW5xeuVu+GKHG6cO/RLbkNh3PJQdQdu+DPJ2nf4AdUgpxfT/fEt+iYtXbh2Kw+rVZ2coNPE1ugM6lJQCh65RGRCjP6zHbcLK3W5WASntY0lKHovmLkPbvhwO/uDf4fbHPLpwLbsOFfPP29IJdlb/Ori9RlURPIZJpcfAadfRNA3TVHgME49hooBKj4GmgdPu5+ZVcDPQHaffrhEL6Ktvbq/isx9dFFY08RmOjUivVg7Oba6wmR7IXIG2739g+me5xf0FpaQ99QFOu479uGbR4+P6MapPB9KeXMT+wrJq+3z58JW0iQlnzc6DTJz3RbXn+nZqzr//cKlfylql5whol+zf12hgAR9KX2W6OFwioWQ1SfF2khPAaVOw6xu0rPXgrWzoYjW8/hMgNrGhS+FXAd1804AwZ9NunzdWO/K87MiDdtE6qe0HEtp5EGStPzbWqez0B2iqgps1dAn8LqBDyWaDiCAdCIDV2Bup7EKT7EKT5uE6A9qn0CwxFXK3omWuhIqjDV28+hcU1tAl8LuADiVd04gMsfjVFgHA4VKTxVtNIoPhgsQexA05B/J2+cY6lRxu6OLVD93W5Du5IcBDCSDS4pNGRXVFLvhkm4dQBwxI7ESrCzpBUa5vOEFhTkMXz7+Cm4HpAT3o9Ns2YgEfSuFBUlNqjMo98GWmG7sO/dq3omPf66GswDcQ83BmQxfPP4Ij6mWYREML6Ktv4Fvo7d2N5QFxk7+mTAf6tHOQFKvQPeW+sU4HttboS5z84IJqP7s8BtcPTGLGVakcKCzjrn+tZG9eMVf17cSDo/tUbfe7l7/irivOo2fb2Do+m1Nocx6ccxnYpabUpBkmR
Ibo5JXKsIDGzATWZXtYlw09E8Loec4wbD0uhcwMtJxNYJ562P7Gp6+t+n9ZpYdBj77H5cdWCZj/5VauTO3IyPMTGfPcUoafn0jPtrEs3biXNjHh9RdIANFtm3wgQYDfzQR8I/ajpbO7SdlywMNbm7x8m2unsnMa6uK7UZ0H1+gL/fnmbGLCg0np2ByAfQWl9O/SgogQJz3bxpJzpJRSl5uXv9rKPcN7+/lMfiGmbf2+XgMJ+G+jw6aRENn0Z14Hop15Xt7Z7GXZHkV52wGo9Kmo7pdAUPgp91m0bjdXpnTw3foK6NIykm+2H6S4ws3WfQV0aRnJXz/5Hzde2I1mIc76OhXfkiWh0TXaNC0tjcsvv5xRo0YxYsQIPv7441q/3JYtW7j33nsBKC4u5pVXXqn2/PTp01m/fn2tj1sTAd+nBODymLwji701efFhOgPa24gM0SD3B7SdK6G8sOr5/QWlXPynxXw+bTRtY33BdbSsksfeW8vuQ0Vc1bcTfTu1YObiDfz9pgt58v11HCwq54rz2nPD4K7+LXyzltB/IjhOX9tLS0vjpZdeIikpiR9++IFrr72W5cuXExMTc9p9T2bfvn2MHTuWNWvWnNH+tRXwNSUAu02Tkd0BIK/M5MMfPCze6iYvsjvqwkmolN/4vvDA4g176NMhviqQAKLCgvjrxMF8eP8IJg7uxpOL1jHjqlRe/morXRKi+OfkdBZ8u4Ndh/y8aF10mzNa4K1Hjx6EhYWxb98+brzxRkaOHMmYMWPIyMgAoKKigilTpjBs2DBGjRrFXXfdBcCaNWu46qqrAHjiiScoKSlh9OjRXHutr/9twoQJLFu2jNzcXAYOHIjH46l6zSlTprBo0SIAVqxYwbXXXstVV13FNddcw6ZNm05b5oDv6AbfekrNI3T2HJFLcIGg2AWfbvMQbIcBiR1pM6AjFB9g8TPXcOtFSafc7z+rM+ndPo6khCh2HDjKTUO64bTbSEqIYvuBo/5dHK55Z7DXfuDk6tWrqays5P777+d3v/sd48aNY+fOnYwfP55PPvmEDRs2UFZWxtKlSwEoKjoxXB955BHGjh3L4sWLT3iuVatWdOnShYyMDNLT0yksLGTNmjU8/fTTZGdnM2/ePF577TXCw8PJzMzk1ltvZfny5b9aZgklwGnTaNXMJqEUYFxeWLbTN9YpvOggh4oruOy2x2H/Gt/NDo5zpMTFW1/vYMFdlwH4VgrIPERyYjzf5xRw80U9/FvY6Np1ck+ZMoWgoCDCw8OZPXs2U6ZMYezYsQB07tyZ7t27s2nTJrp168auXbt4/PHH6du3LxdddFGtizZmzBgWLVpEeno6S5YsIS0tjdDQUFauXEl2djbjx4+v2tbr9ZKfn09cXNwpjyehdEzLZtLZHai8Jrz17iI6nz+UbFrRrdeV6N4K3xSW3O9BmTzz0QZuv7QnYUG+2sqk9HOY8q8MFnybyVV9O/p3aEBoTK2bbs8//zxJSb5aX2lp6Sm3a9u2LUuWLGH16tVkZGQwZ84cPvroo1q91qWXXsrMmTMpLCxk0aJFTJs2req5wYMHM2vWrFodT0LpmGCHRrDd99dTBJ5htzwMwIYcDxty4JyWIfTqfjn27hfDzpXMusFebaxTQnQY7069on4KF9v+rHYPDw+ne/fuLFq0iLFjx7Jr1y62bdtG7969OXjwIJGRkVx88cUMHDiQwYMHc/To0RP2d7lceL1e7PYTIyMkJIT09HSee+45SktLSUnx3ZZ+4MCBzJ07l8zMTLp06QLA5s2b6dWr16+WV0LpGMOEhEhpwgmfrQe9bD0IneJs9Ok0lKCkobBnNdqeteB11W9h2vYG+9kNP5g9ezaPPPIIr7/+Ona7nVmzZhETE8OKFSv4y1/+AoBpmvz+97+nRYsW7N27t2rfqKgoRo4cyciRI4mMjGTBggUnHH/MmDGMHz++qqMcIDExkWeffZbp06fjcrnweDycf/75pw0lGRJwnANFXv67X
RYSEydqHanTr61OWJAO2RvRdn0DlSX+f2FnKKTdBbbAqT9IKB3HMBXvbCzHI5UlcQqxYToXtLcRFaLBwW1omRlQVuC/F2zXB7pffNY1pcZEQuk4HkOxJsvN7nzpWBK/LiIILkh00jwcOLLXNwG4+GDdv9CgWyHyxDvzNmUSSr9wuMTg0x/ruc9ANFrBduif6KBtM6D4kG/plCN76+bgQREw9I6AarqBhNIJDFOxcFO53AtO1Ipdh9R2TjpFm2iuYt+ic4e2n91BE/tB16FnNGiyMZNQ+gWPoVif7SYzT1JJnJnkNg56xCl0o9JXc9q/BdQZLI1z4WSIiK/z8lmdhNJJ5JcaLP1BmnDi7HRvYee8luDQDNj5NVr2d2B4Tr8jQEgUDJkMtsCqJYGE0kl5DcXHP1RQVCFvjTh7HWJspLTxDc5lz1q0PWvAc5pVKToP8v2TUBIApqnYW2Dw9W4ZsyTqTkIznf7tdN+68Dn/Q9u1ClzFJ26o6XDxPeAMqf9CWoCE0il4TcX7m8pl2omoc9GhOgPb24gO1eDQdrQdGVB25OcNWp0LPYcFxNK3JyOhdApeQ/HDQQ+b9tewD0CIWgp3woBEBy0jNCjI9l2xKzoAQ26H8Hpc+9tiJJR+hdtQvPtdOYa8Q8KPnHYY0N5Bu0ig4ihaSFRAjeD+JQmlX+ExFOuy3eyU4QGiHug6XNUrhFBnYC8IG9hnfxoOm8Z5rQLv6odoGDEhOk6bLMssoXQaTrtGYowsACf87/y2TmzyjZRQOh2HTSOlnRNd/oAJP4oN04kL06tu7RTIJJRqwGHTSGoeWJMiRf26oEOQ1JKOkbehBhw2jd6tnTikFSf8oFOcnfAgTWpJx0go1ZCuQ3KbwL1MK/zDYfOtLuCQDu4qEko1ZNc1OsfZaRYsHx5Rd3q3dqLLt7AaeTtqQddgQIfAHPov6l6zYI0u8XbschWlGgmlWtB1jZhQnU5x0uktzt4FHYLkqu5JSCjVksOm0be9kzCnfJrEmUuMsREdqqNLKp1AQukM2DS4sLM048SZCXNqDOgQJJ3bpyChdAZ0XSMqRKd7S2nGidrRgKFdgpA8OjUJpTPksGkkt3bK1ThRK+e1dhARLM22XyOhdBZ0HS7qEoyMeRM1ER+u06OlQ5ptpyGhdBZ0TSPcqdGnjawkIH6dwwYXdQnCLoF0WhJKZ8lu00hq7qBjrMxBEac2sGMQDmmy1YiEUh2w2zT6JwYRFyZvpzhR79YOEprZpJZUQ/ItqiN2m0Z612AZvySq6Rhrk36kWpJQqkMOHS7pGoxd3lUBtIjQ6Z8o/Ui1JV+fOqTrGqFBGkNkYGXAiwzWSEsKlkA6AxJKdcyuazSPsNG3nSxzEqiC7XBpN6kxnyl52/zAYdPoFG8npa0MFQg0dh0u6RaC0y6Ltp0pCSU/cdg0ujR3SDAFELsOl3UPJiJYwyaX/8+YTN7yo5+CCWB9jtxptylzHAukZsG6rI90liSU/EyCqelz2uDS7sE0C9KlY7sOSCjVAwmmpivYDpf3CCHMKU22uiK37a5HHkORU+hl1R438q43fmFOjSt6BBNkl0CqSxJK9cxrKArKTb7c4cJjNHRpxJmKD9dJSwrGoSPLkNQxCaUGYJiKcrfiv9tclLrl7W9skuLtpLRzSv+Rn0goNRDTVHgVLM90cbDYbOjiiBrw3c3GSftouwSSH0koNTCvodi0380PB70NXRTxK0IcGuldg+QKWz2QULIAj6HIKzX4elclLskmy/mp/8iuIx3a9UBCySIMU2GYsGp3JTlHpQfcCnTNt6Z295YOGRBZjySULMZrKHKOGqzeU4lHupoaTEyozpDOQYQ4NGmu1TMJJQvyGgqvCSt2ujhUIslUn3QNerdx0K25A5uOTKptABJKFuY1FLvyvXy3zy1jmupBbJivdhRsl9pRQ5JQsjivqTAVbMpxsz3PKyPB/SDIDr1bO+kUZ5fakQVIKDUSHkPh9ipWZ7nZLx3hdcKuQ
4+WDs5JcKBrcmXNKiSUGhmPoThaYbJ6j5vCCulvOhOa5huVndzGiaYhi/pbjIRSI2QqhWnCviKDzfvdHK2QX2FNtY+xkdrOicOmSRhZlIRSI2Ye628qLDf5X66H3CJp1p2MrkGHWDu9WjkIdkgYWZ2EUhPhMRSVXsXm/R52H/Fiym+VYIdG1+Z2urdwSDOtEZFQamI8hkIp+PGQh515XsoCcBWClhE6PRJ8d6VVIKOxGxkJpSbKayjQoLjCZPthL1kFXtxNuHUXG6aTGGOjQ6wdh03DLpf2Gy0JpQDgMRS6BkfKTHble8kp9Db6ib8a0DxCJzHGTvsY3/gimyYLrjUFEkoB5qeAKqlU5BZ5OVBscrjEaBQjxkMcGvHhOu2i7bSNtqGUb6yRBFHTIqEUwEzTN8fOpkO5W5FbZHCg2OBwidHgNSld802KjQ/XSYi0ERdmw24DU/luZyRNs6ZLQklUUUrhMXwhZZhQ6jY5Wq4oLDcpdvn+lVSqOr2yF+LQiAjSCAvSiAjSiQzRiQ7RiAjWMUxfOMk8tMAioSRO66e1ntB8zSW319cM9BgKtwHuY1NgXF6F2+ubr6fha1bpmq+vx2nXCLJpOO2+/4c5NYLtGqby1X40zReGutSAAp6EkqhTplJVk4Y1zdchLU0tURsSSkIIS9EbugBCCHE8CSUhhKVIKAkhLEVCSQhhKRJKQghLkVASQliKhJIQwlIklIQQliKhJISwFAklIYSlSCgJISxFQkkIYSkSSkIIS5FQEkJYioSSEMJSJJSEEJYioSSEsBQJJSGEpUgoCSEsRUJJCGEpEkpCCEuRUBJCWIqEkhDCUiSUhBCWIqEkhLAUCSUhhKVIKAkhLOX/AQIJjoeuD4nCAAAAAElFTkSuQmCC\n" }, "metadata": {} } ] }, { "cell_type": "markdown", "metadata": { "id": "m0ME9LZGYjKB" }, "source": [ "The given problem is a binary classification problem with a dataset that is majority negative. Based on the graph above, 93% of data represent a 0 and 7% represent a 1.\n", "\n", "For a logistic regression model, the dataset may produce more false negatives rather than false positives. The weights calculated by the model will focus on negative results rather than positive.\n", "\n", "On a basic level if the model labeled all the data negative, it would have around a 93% accuracy based on the dataset. This would create a misleading result.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "6hzmRuBNYjKB" }, "source": [ "\n", "### Question 6 - SGD (15 points)\n", "\n", "The interviewer was impressed with your answers and wants to test your programming skills. \n", "\n", "1. Use the dataset to train a logistic regressor that will predict the target variable $y$. \n", "\n", " 2. 
Report the harmonic mean of precision (p) and recall (r), i.e., the [metric called $F_1$ score](https://en.wikipedia.org/wiki/F-score), calculated as shown below, using a test dataset that is 20% of each group. Plot the $F_1$ score vs the iteration number $t$. \n", "\n", "$$F_1 = \\frac{2}{r^{-1} + p^{-1}}$$\n", "\n", "Your code should include hyperparameter optimization of the learning rate and mini-batch size. Please learn about cross-validation, which is a splitting strategy for tuning models, [here](https://scikit-learn.org/stable/modules/cross_validation.html).\n", "\n", "You are allowed to use any library you want to code this problem.\n", "\n" ] }, { "cell_type": "code", "source": [ "from sklearn.metrics import f1_score\n", "from sklearn.model_selection import GridSearchCV\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.linear_model import LogisticRegression, SGDClassifier" ], "metadata": { "id": "d47OQOdkkAJN" }, "execution_count": 539, "outputs": [] }, { "cell_type": "code", "source": [ "df.head()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 270 }, "id": "6lCFhxjSX04P", "outputId": "e41f7d82-64c2-4d99-daaf-b7a01c163645" }, "execution_count": 540, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " ID SEX TYPEAREA SEX.REPRO REPRO.STATUS AGE \\\n", "0 grls5ZUT2BYY Male Suburban IntactMale Intact 9 \n", "1 grls8DCONYUU Female Rural NeuteredFemale Neutered 6 \n", "2 grlsUC5R4PTT Male Suburban IntactMale Intact 14 \n", "3 grlsXUR2PY88 Male Rural IntactMale Intact 6 \n", "4 grlsTBZUF3GG Female Rural IntactFemale Intact 18 \n", "\n", " PARASITE_STATUS RBC HGB WBC EOS.CNT MONO.CNT NUT.CNT PL.CNT \\\n", "0 Negative 6.4 16.6 14.2 142.0 852.0 6390.0 210.0 \n", "1 Negative 4.8 12.5 10.0 400.0 300.0 4800.0 209.0 \n", "2 Negative 6.2 17.3 9.5 190.0 475.0 7315.0 164.0 \n", "3 Negative 5.4 13.8 14.1 1692.0 423.0 7755.0 254.0 \n", "4 Negative 5.9 14.4 6.5 390.0 130.0 2795.0 213.0 \n", "\n", " 
LYMP.CNT \n", "0 6816.0 \n", "1 4500.0 \n", "2 1520.0 \n", "3 4230.0 \n", "4 3185.0 " ], "text/html": [ "\n", "
\n", " " ] }, "metadata": {}, "execution_count": 540 } ] }, { "cell_type": "markdown", "source": [ "**Preprocessing**" ], "metadata": { "id": "Akl94gXapAzB" } }, { "cell_type": "code", "source": [ "# Remove rows with NaNs; copy so the column assignments below act on an independent frame\n", "df = df.dropna().copy()\n", "\n", "# Each column encoded below is either binary or has a natural ordering,\n", "# so it is mapped to integers by hand.\n", "#\n", "# Ex: Rural -> Suburban -> Urban has increasing population density, so mapped\n", "# to [0, 1, 2]\n", "\n", "df['SEX'] = df['SEX'].replace({'Male': 1, 'Female': 0})\n", "df['REPRO.STATUS'] = df['REPRO.STATUS'].replace({'Intact': 1, 'Neutered': 0})\n", "df['PARASITE_STATUS'] = df['PARASITE_STATUS'].replace({'Negative': 0, 'Positive': 1})\n", "df['TYPEAREA'] = df['TYPEAREA'].replace({'Rural': 0, 'Suburban': 1, 'Urban': 2})\n", "\n", "\n", "# Undersampling\n", "# https://www.datasnips.com/63/undersampling-imbalanced-data-for-binary-classification/\n", "positive = df[df['PARASITE_STATUS'] == 1]\n", "negative = df[df['PARASITE_STATUS'] == 0]\n", "negative = negative.sample(n=len(positive), random_state=42)\n", "df = pd.concat([positive, negative], axis=0)\n", "\n", "\n", "# Removing `ID` since it doesn't provide model-relevant information\n", "# Removing `SEX.REPRO` because `SEX` and `REPRO.STATUS` already carry that information\n", "# errors='ignore' keeps the cell safe to re-run once the columns are gone\n", "df = df.drop(columns=['ID', 'SEX.REPRO'], errors='ignore')\n", "\n", "\n", "# Shifting target variable to the front for readability\n", "cols = ['PARASITE_STATUS'] + [col for col in df if col != 'PARASITE_STATUS']\n", "df = df[cols]\n", "\n", "df.head()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 250 }, "id": "ARJrbriigHii", "outputId": "7d1d7b07-9cf4-41dc-e177-1b1d699babb3" }, "execution_count": 541, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " PARASITE_STATUS SEX TYPEAREA REPRO.STATUS AGE RBC HGB WBC \\\n", "7 1 1 1 1 9 5.8 14.7 13.9 \n", "19 1 1 0 1 25 5.8 14.6 11.3 \n", "23 1 1 1 
1 24 5.7 14.4 10.1 \n", "24 1 0 1 1 11 5.0 13.6 10.7 \n", "52 1 1 1 1 7 5.6 14.4 11.8 \n", "\n", " EOS.CNT MONO.CNT NUT.CNT PL.CNT LYMP.CNT \n", "7 139.0 417.0 7089.0 334.0 6255.0 \n", "19 0.0 1017.0 6667.0 183.0 3616.0 \n", "23 3131.0 404.0 3333.0 262.0 3232.0 \n", "24 1177.0 535.0 4922.0 318.0 4066.0 \n", "52 118.0 354.0 5664.0 319.0 5664.0 " ], "text/html": [ "\n", "
\n", " " ] }, "metadata": {}, "execution_count": 541 } ] }, { "cell_type": "markdown", "source": [ "**Modeling**" ], "metadata": { "id": "eJBjumyMpMl-" } }, { "cell_type": "code", "source": [ "X = df.drop(['PARASITE_STATUS'], axis=1)\n", "y = df['PARASITE_STATUS']\n", "# stratify=y so the test set holds 20% of each class\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=42)" ], "metadata": { "id": "NVyvAkQvjiir" }, "execution_count": 542, "outputs": [] }, { "cell_type": "code", "source": [ "iters = 1000\n", "\n", "# loss='log_loss' (scikit-learn >= 1.1) makes SGDClassifier a logistic regressor;\n", "# the default loss='hinge' would fit a linear SVM instead\n", "clf = SGDClassifier(loss='log_loss', max_iter=iters, shuffle=True, random_state=42)\n", "\n", "f1_scores = []\n", "\n", "# Each partial_fit call is one SGD pass over the training data\n", "for _ in range(iters):\n", "    clf.partial_fit(X_train, y_train, classes=[1, 0])\n", "    y_pred = clf.predict(X_test)\n", "    f1_scores.append(f1_score(y_test, y_pred))\n", "\n", "\n", "sns.lineplot(x=range(len(f1_scores)), y=f1_scores)\n", "plt.xlabel('Iteration')\n", "plt.ylabel('F1 score')\n", "plt.show()" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 270 }, "id": "n-_mpDTs6em-", "outputId": "4caba560-a959-4dcc-9cd4-be11264137d8" }, "execution_count": 543, "outputs": [ { "output_type": "display_data", "data": { "text/plain": [ "
" ], "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAD9CAYAAAC2l2x5AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/NK7nSAAAACXBIWXMAAAsTAAALEwEAmpwYAABC6UlEQVR4nO2deXAb5333v8DyvkSRFimQuizKkiBRJ1LnjzZOYzmmJ6EsTdJEffXaSceOMk3SuMn0sJxpdIyTadXMeJr6aBpNG1flNEn1OrET2q/tN1cTubUlURdB8JB4iwRBEiRF4gZ2n/cPEDcW2AV2sQeez4wtYvfZ5/n9nuO3z/lbAyGEgEKhUCi6w6i0ABQKhUKRB2rgKRQKRadQA0+hUCg6hRp4CoVC0SnUwFMoFIpOoQaeQqFQdEqJkECjo6M4efIklpaWUF9fj3PnzmHLli0JYf76r/8ag4OD0d+Dg4N4+eWXcejQIUkFplAoFIowDEL2wX/uc5/Dpz/9aRw5cgRvvPEGXnvtNVy4cIE3/MDAAD7/+c/jd7/7HcrKyrIKwXEc3G43SktLYTAYxGlAoVAoRQohBMFgENXV1TAaUydkshp4p9OJjo4OfPDBB2AYBizL4sMf/jDeffddNDQ0pH3mW9/6FgDgb/7mbwQJubKygqGhIUFhKRQKhZLI9u3bUVtbm3I96xSN3W5Hc3MzGIYBADAMg6amJtjt9rQGPhAI4Oc//zleffVVwcKVlpZGhRTS40/GarWivb1d9HNahupcHFCdi4NcdQ4EAhgaGora0GQEzcGL4Re/+AVaWlpgNpsFPxOZlsmnF2+1WnN+VqtQnYsDqnNxkI/OfFPbWQ28yWSCw+EAy7LRKZrZ2VmYTKa04V977TV8+tOfzknI9vZ2lJeXi36up6cHFoslpzS1CtW5OKA6Fwe56uz3+zO+GLJuk2xsbITZbEZ3dzcAoLu7G2azOe30zMzMDHp6enD48GHRglIoFApFWgTtgz9z5gy6urrQ0dGBrq4unD17FgBw4sQJ9Pb2RsP99Kc/xcc+9jGsWbNGHmkpFAqFIhhBc/BtbW24ePFiyvXz588n/P7Sl74kjVQUCoVCyRt6kpVCoVB0CjXwIpDq2yiReAgh0f+kjFdKCvE9GCnzgEKhxJB8m6SaGJoN4v2xAP73h6rAGBO3EXmDBBeve/CxB8qxcW0J3uzzoqLEgKl7LPa0lOLAhsT9+LaZIK5OBAAAn3uwOnp9wBHE5fFA9Pen91WiutyIC5fd0WuHtpejtT6c1UteDj/v9WL/hlL0TgcR4oDmWiM6zJUp8v/PqB+TSyw+e6Aqeu3dfi9mVjgAO2EB4PJz+MlNL+6rNmLRy6GMMcAbDBvLo3srEeIIuq0+VJQaUFNmwH01Rgw4QlhTacCRPVUpacZz424AffYgPrW/CpWl6bdhvWn1oqLUgEM7KgAAN6cCuDkVTMijTHCE4LUbXqyvM+IjbeE4Inn6xIeqYIwrN44YovmarkyT6bMH0TMZwJO/VwWDwYDXb3nQUGXEQ9sqBMkGJNYTjgD/dcePzx6oQgVPfghhZD6ESyN+/PHBKpSVJMYTCBH86JoHD7WVY0tjSdp7H2krx/2N+my6w/NB9LE7sZclKGXoqfZ80XUPvnc6CABRgxfPgpsFAAw4QgAAp5vD1D024bl4hudCadO4cTeQ8HvOzaWEGYp71u3nQAAMOkIIrQZ1rKQ+AwC350LwJck+kxR2zhX+Pe/mwHKJus4ss5hYCOvkCxLMu7movve82XvMtpkgWIIUGeJxemL5BgA3p1LzLhPcqsyjzlgckTwNJmVLMK4/EmSRlZ7JcDwR6Zd9BGMLAh6MY84VDn97LoT+mbBu93zpy0soffZwPK5AajwLnvC1gdnUfFz0rt5ziMtjLWFdbXvuAB3RSYGuDXzJqnahNO3xl0P+jM/emgok9MIpaqDwPTp/KGxoykvkSft/Rv346U1PS
noVadKTWxaK/tDnOG+V8BCegOXE9wZuiOyJCoUk/asFiIzSqj0fYkYVWJEh/turoztCCAwGQzS95KmbRFmogacIQ3c9eJefw4XLbjhWWDAZevD5MJhm+CyWdNNGfPTPBHHhslv0QuT7Y4GoURDC1YkA/vOasFHLB2N+vHYj1vO8cNnNO+J5b8SPH18L359cTD/VBQATiyFMLIYQ4JlFidfkP697otNs6fj1bV9G+dNxedyP/xOnEwAEVsXNZFR/ctODK+OZR4QROEKiUy3d1piM/37Fg9dveTIa8aHZsDDpevc/venBb0TqHCnvodlw/eLrCP2s14NLw8L0E0q31Yvf3hFfRlIwMh/ChctuBNlUfX1BgguX3ZjIUE8j7XFkPoRFT9jeLHgy1MUhH97u90oiu1h0Z+Dty+GMHp4LRQ08K7GBvz4ZyB5IQiJzyTkMRETNF9tmgvClq9dp0h2cDQmeJx2eD8G/Gm9/hvnjPnsQNnuml2eiYRtf5G9Uk/H3BObbgCMET5JOEaPHZHBj7fIT9Dv4DUI8mcpw2Uei99OtH0cuVZen3lzxE0xkyI90RMo7subB92Jd8hKMOIXpJ5QFDyd6PUQqrPawviv+1LaxtPryjay3pCMS5p6Pi3ZYxjPoMrnEYpZnnU1udD1FU7LaSnKZosmE3NMKd5dC0R07QKxh630nIcch+lIWQpZNNJIQyfJCfaYgUzI6L/6CER4dkWinQ8/o2sDLNUUjd0v7n9FA4hROuD7qvoGzhIDJYOKS9S+EzY28VAtl4DOWsd4rQIGITH+Jmb5MC4ES6/6i0LWBV2UPPoeHla5Dueo7thDCzanU6axrkwF4Ahz+oC1xP7rYEYpQo5tPeRXawEfonQ7CNhNENdeCiI/B6AJ9FoX+r82LtvtK4FhhUV1mxL7WUvys1wtPkGCPqRR7WxPPeKSdlkvDf1734NEdFfh/gz4c2lGOhiom4f7ViQDmXSw6zBW87ms5QvDz3sT5aH+I4Ge9Xjy8PdWT7H/d8WF8gcUndlXgvhom5X6E65MBeIIEv781MY4lL4dfDPrwyV0VuHY3iECIRM8w/PaOH1seLEHPRABjCyE8vD39+QjHCgt/kOCD8QAMADyrna9eexBrKmJ6egMcFjwcWutLMLEYQnMtk3YtZWQ+hPfH/AhxwNE9lairlG+mXHdz8PFE6pjepzbUynvD/rT77a32IEacqXOWada8kkhsLIUwupHBnxIvWZYDllEX/c2XPcmL73MuDu+PBTDqZGG1B+EOEKz4CVguv91hviDB5Qk/vEGSdq3ENhPErIvjncsHwovW93xJZzuWWXiDJO35k8jc9q009+LptQcxPJ/6php0BOEJhM8/DM+HMLnEJkztsRzBsDO8nuTkWbR/p9+H39wJ6+1J2hwRr8vb/T78csgPl5/Db2770y5Me4MEl0b80VkFKTZsZELXBj7SIq5OBNKumAthaimEkTQVJ2OyGd4ouUgRnYPP4Vm1E69TtoFW8m1jnmY3yBJcGfcjlCFhLXQOBmcLP5nsDRL0TATApcmgsYX85YmUTQShL/O7SbtfIj1oV9yCqmMlZsivTgSiB/ni5+QDLHBl3C9q9L/iD4f1BiLxpbrg4JLik3hyIQV9G/hVglzshJxYfjnkx6WRxDdxwRu9ni18HMmVPxv59uD77EH0O0IZT4YWuqwzqsQjS7yrDLmJyGdf5tA3E4T9Xmqv94MxfnmEltnNqWDCziShRf2r24ltNXKeID6upbhRZfzLMX5OftHDod8RSjsqyEb8Vtds1SfHfqdgdD0HH18rxhdDqKs0oO2+9N8uFEPB7SxRKN0Ck2zf7fdY+FmCHU2RMkudohmeC6KsxICNa4VX5ZtTAWxcy0TT4zPihGTfHijmbMK8i8XdpcxbAzPFFn/gbGopBJefYEez+PrcMxnAkMRTA1cnsu+Tn11hE7evIuy6I7JjbGKRRVVZuIynk14cM8ssRp3hbay7TYk630pa5
3H7OXiDBO4AiZ5mF4LVHkzJf6FbrOPD+eIOx2VrtHJ3IPRt4ONY9hG8NxKQxMAXGl0b9jjlkg38b1fnMCMGPmUXjQF4bzTiAE6MgQ+idzqIXesz1wUh/m7ElM1btvwO9sSnFXG1kYuB78t41kA8hBDYZrL3dN/uT9X/0og/4TxF8jmECAEW+N1qfdi2riRh8TJ5XeG1m7FF3I31/Auz6UjOm6DAUaUtbt+8P+5wXNYevMxzNEUxRSM1Ss3LamE+OB/E1vV8ZmiE5KWQ6YSClkmOaRW63pQl2VSWI3CssChNY2tzkU1MPRFqoHmfz+EsVmSKZtbFZT0MSHvw+aCUIZYpPp3bdwENN7ddNHLmWyGNZ7p6EBIwiZvp9LB4UjM9WYLk35fHA7g9F5LuYBop3Ab0XHrYEQPvdHP4rzuZp67knoOnPfgcyGtfdS7PKGzZlU4/QsouGqUPCED6l4dYlXwCDusM5bDLZt7FJuw8yYfFVRfIyT37gpDvWaYcnk9erM0E3UWTDyowAFJCv3qUiCGfAs7j0Xf6fXCs+jySukgyLrImHW4GJDiNycNbNh9+clNaB1lVZfo2NxECIrrl6baZSklx5LjGEXrARYvko4HgbZISZFNyFJFpD7l7YHyUr06uymXg+chna6pUB9PEaJxv7sidu6rowY+OjuLYsWPo6OjAsWPHMDY2ljbcW2+9hcOHD6OzsxOHDx/G/Py8lLKK4s58CMt5fnlHNeh0m6TYFxSR8CQrIcBoHh4SI54blVpjjewi8en3407aRUSl4GQ2UYIWWU+fPo3jx4/jyJEjeOONN3Dq1ClcuHAhIUxvby9eeukl/Nu//RvWrVuHlZUVlJWV8cRYGIS56FS/2YwurqURVa2deCHGO9/eS77DTyk+C1fQUVTcKmtZ1COiCipAtlXWbNeLiGTHh4r34J1OJ2w2Gzo7OwEAnZ2dsNlsWFhYSAj36quv4qmnnsK6desAALW1tSgvT3UepHemltjM3itpJY+Sb+Xm68FPLSX2zCXJct7DUFJELl6E6LcOJBag2+rFxeue7AGLhRyyl++RqaUQ3khytCb3HLyBZOmCWK1WPPvss3jzzTej1z7xiU/gO9/5Dnbv3h29dvToUXz0ox/F1atX4fF48PGPfxxf+tKXeL3KxeP3+2G1WvNQI8YitwbTxLT6K3U71W5mAADQx+4EAFTDhS3M3ehvPsrhgx8xb3ORePrZB8AhcXvAGsM93CNror9rsYJNzBQAYJnUYJLbkBJ/JD4AGGTbEELqAZYHjMO4zbVFf+8yDmCZ1OIuaeWVuxpuuFGd9l58mkAsT5J1u984hiqDL21YPnYZB9DPbQdJ6kPEy7PTOBT+ADm3PXq/Ch54UJVWzuS822ScxAS3MUWXaW49Fkl99LfZOAijgfDK3GSYxTrjQor+LDFiYFU2k2EG90hdgmy7mQEESGm0TJLzM5lseQYA6wxzmCPrEq5F4h1i2xBEKZoNDrhIDdyoRrNhFg7SlBBWSDrp2GG8jUHugZS04+OrgQsu1ER/bzJOohpu9HOxMEawMDO3o79H2M3wohIV8MGHRI+N6co7G9uNd1BqiL3EM+mbS/z5UgU3PGnaXKNhAU7SkHCtHH5sY0bzTrO9vT1th1qyffAsy2JwcBA/+MEPEAgE8IUvfAEtLS04evRo3kJmo6enBxZL2Knq7bkgpldPNzJGQ8pR40i4vtXPy9XVrYFlZ3P0Nx9VlVXwe2ORRdPrcad4zyutWgu4Y2Hr19bD8sB6AMD4QgiTafbGRuIDgJHrHoTSfNKvvb0dt2/FegAHLQcxvsDibobPqdXW1cK9nH5IEZ8mEMuTyPXBq25wHLBz506sS3LVmi2/LJaDGLzqSdnnGy/Pnn37YQAwGNdjrKmpgceVKG9Enl9fHUy4vm3bNkysnuiM14VM+LEYd7Jy/4EDKGUMvDK3trZiT8v9KfoHWIKBnrBsGzdtAusMJchmsViw7OOiZZKcn/EQQ
tB3JXvPuKWlBXNJJzP72J34X5YqjN7yIhgk2LBhI+4uheBe4dDa2grH3Vh4i8WStWz42LtvX0JZpItvzZo1cMW5EXhg2zasr2PQfzX2nNHIJOSFo88Lr5tDVVUVfEnbBtOVd1Y59+5J2JGTSd/a2hp4Cvw1pdra2rRpNjc3wZl04reisgKWPZYEGyaGbJ3jrFM0JpMJDocDLBsuVJZlMTs7C5PJlBCupaUFjz32GMrKylBTU4NDhw7h1q1bogWWEjXskwbCn4+TYq60ULMB/z3iz/iNSamQa/66RKqCFyCeUBUceRoZf0jOT59T9EpWA9/Y2Aiz2Yzu7m4AQHd3N8xmMxoaEocanZ2duHTpEgghCAaDeP/997FzZ25DRalgBLXz3JqN2A8G2zJ841EwBVpkvTMfwm+G8vvIshCx0sk+m6E3l7yLho9SYQUvCULzn29v9APr8htEF6PRL+Q2Sa0jaCPCmTNn0NXVhY6ODnR1deHs2bMAgBMnTqC3txcA8MlPfhKNjY34xCc+gaNHj2Lbtm34oz/6I/kkF4CQ+f9cyfTB4Fwq1fW72V2+FntlFZoBYjwI5ku+ZdKyRvjxzvgRT9HXBb2gBl80bW1tuHjxYsr18+fPR/82Go147rnn8Nxzz0knXZ4Iyzt1zOP0TgdxYEPmbaXp9Cn0p+SkJO+6zRNBSQGPxMu9C4IXDVp4yUTWoO5Koe+TrGrdJJ4jP0vaYiVFRb9w2Y3/HskyHSNTNv7kphcTi8IPGwmdomGS5uBV4WxMwtO0BMrNx08l+Wn/ZZ5TebKjQEapyero2sBLeYhg0St8kczp1tYJ2js5fLVGKsbyOE0qd0PKN/7e6QAurO7wUFOjlxqh55zkSk/LyK2Lbgz83aUQLlx2Y8WXOcv04L+FkgWVFPH1u+IX1jNVT1p1xVPsWaYbAx/5MPZ83JfR0xWu0gUuZfqF0iWndAQ+JG5HRI4LDlJMj+QRh1SdivhYqLFXMSoqG11/8IPPd8uFK7kdBKFoAxW1ryhyyaQKXakvmpyhX3TKAzX24LMJ0HXFrZgLWinRgQqCEKwnT0AxY5KoMSiWzNUoaioe3UzRpENL3hcj6MG4FwrZF/MkTEBqWUnKH1JEmltkwjvwtHIXGn0b+HTXaB3TLLosOhFdeF3qT5EVXRv4tEf7Cy+FqtLPBS3KHI/S8kuVfqRzQqSMlKJrdG3gaQ+eIjtCdwtJXO/kqMZaaRpakVMNaN7AL7hZ2LlmweH1tA9eD5poRQdVb0ml8KKj5p4Tmjfw08ssFsjazF9RiqPIy1vTpExX8xSmUl9ZKmRaRb2hRkdK05OsFMoqmm3XhH8HiaijW5rNgOJCTaMGauC1jIrnDXgfkdBxiWA/XzmmoUg7zeSqQGC4QkPPOakXzRt4g0rc/VIofMix64VvRKDIGhO13KpF8wY+gpqGRYVC1SrzCZf0Pla1DvHwzffn93hOqM0nTYoItAuvGnRj4LWCGhqkWDQoMiUHci3nQp++1lV9lFmZojPwuqocFM0gcECjyIggXxT7qpUA1CtZYdC8gacz8OpEyYYlmc8TFX0DNcGGEqjqpJOK7bvqodskpYZWRooS5Fnvko0oneZWL2oqA0HugkdHR3Hy5EksLS2hvr4e586dw5YtWxLCvPjii/iP//gPNDU1AQAOHjyI06dPSy5wCrQLX0TkVti5NjhBzwl1VcB3Q6BKav4IZMG3SYqJSE3WVgEEGfjTp0/j+PHjOHLkCN544w2cOnUKFy5cSAl39OhRPPvss5ILKQQtzl1ShPPfI37YyfqEa1opSynkTOnBy6B8zlFqpSBUiOJTNE6nEzabDZ2dnQCAzs5O2Gw2LCwsyCxa8aEnPzlSo+SHwTMhm5/3JPgWMmmNyYzaDqsVmqwG3m63o7m5GQzDAAAYhkFTUxPsdntK2DfffBOHDx/GU089hevXr0svrQBmltU8mJX2owcqPsgqPO58Ii/gRHS6chP8Qib5ixSflFxrr
FqhmHUXi2Sf7PvjP/5j/Omf/ilKS0vx3nvv4ctf/jLeeustrF27VnAcVqtVdLpObi2AZizfuwegJmv43l4rgLbo7+Xle+jp6QOwU3TaPT09CLEPAGCyhrXb7eAc81ji6gC08MR3La0cFfDCh8qU6zeu34CLVANo5U13ZXkFQDWv/JH04v8GAH/ADw4MAAZDQ0OYNniSns6cX9ev3wBHtiG5D5Esj9vtBtLoxidnPCMjI4joHg4TJjmPb928iRIDyyvz1NQUAvYFJOYFECQlALaFw9ydgovUAKiKk+saPKgCsClFhjDh+K5fv44lsgbAeiRz584dABujv2dmZgA0poTrHxgAsAUAMD09DT9ZA6BstaN1XzTcNZ46JITe3l5E9I3Al/cJz1kT21TsuTBudjOASvh8PgDlCeFcLhfi81QIfX19KDcE4q7wy+fxeABUiIo/X9w8aTocswAaEq4FAwH09PQDSFd/8iergTeZTHA4HGBZFgzDgGVZzM7OwmQyJYRbt25d9O/f//3fh8lkwu3bt/Hggw8KFqa9vR3l5eXZA8bRPxPEzEQAtXVr4Fpms6expx23b3qjv+vq1sCysxl9l8V/iNtiseB2jxuB7MnCZDLhwIbNGJ4PYmokkDaMxXIQtivJhhSorq6Gz506Mtm/fz+m7rG4O+znTbe2rhZunlGNxWKJ6h3/NwCUl5XDFyLgOGD79u0w1SW+xLLl1/4D+3H7mgdsUncrWZ7q6mp40+jGJ2c8W7dujepusVii1+/MBTE1GsvjvXv3orLMyCtza2sr9rTcn5AXAOAJcBi6Ea4rrRtacXeRhccVk9ViOQjHCoexAV+KDEAsj/YfOIDh+RDs46nl/sC2bZgYipXf+vXr4bQHU8Jt37ETo/3hdFpaWuCZCyEYIDCZTJifjoU/ePAgbFdT65AQ9uzZk9A2IjplK+v29nbcuZX6XARHnxdeN4fy8goE/IkVoqamJiFPhbB7926sqYx1HDLJV1lVBZ+nsKP6Kp40m5ub4JxJnGosLSuDZb8FPT09KfVHCH6/P2PHOOsUTWNjI8xmM7q7uwEA3d3dMJvNaGhIfBM5HI7o3/39/ZiamsL9998vWmDZUfH4TrCDrkKjdPo8qFQsWVD1NsliKggBiMoOmfNO0BTNmTNncPLkSbzyyiuoq6vDuXPnAAAnTpzAM888gz179uCFF15AX18fjEYjSktL8fd///cJvXr50W8t069m6kZNxpLXHYAKKkc6EQghMBjk2cOsJgOqdgQZ+La2Nly8eDHl+vnz56N/R4y+2iny8i4cKspoSURJE4kci528djz+VC1R2TZJFSPlpgYRicoRNCeK7ySr0mQo0W6rN+11pRue0umLRY3y5muQ+Z5Xg65qkEFNKPJS4UHzBl70KFCwb9PCs+QVJwsB9QMiKyrKW3Vv/lUOenYkM5o38BG0cpJV6fTVhlbyI58FcPGv7VSS5+BlMWyy+nRQBhWLVhB0Y+BzRwPObDLUUpnWsSgQYBwkmGsVWn5q7qimXWQVeE2yBKUIq0OKzsBrsbzVNKcnlEJIrAajJ1iEPGVNdlVAkv6lpOeer/A5pKYyKToDT8kFNVVZAahN3IzyCOvCp07R5JCUTiA8f6sGEUKFOIJg8mlACdGPgc95El6VVUQQ+W6XowtU+VHImYJkXzRqQosjTLUQZIEf9uR28lgImjfwkf6PnhdZFZdZVm9j8kepdP6JgS+vU3rwvBHkkXbuj1JUiuYNfP7kvkrpCxJBfmjyhi6yqhhhZjHfXq6aB1sj6Vw5y7rKqm7UpKb2DbyCBu69EX4nX8WOCE+68iNBIvlEIYWOKaMSnlVWJYyLbUadvvopEroL1gpSzhfKuTgSj1ypqKmnoUay5c/QbAjT9wQO4XgiS+6f8LsqEJaMWhicDRt98/rS6DVZVFBhviwrsHOHD80beNEdeB0tsgLaa/h64spEerfP6ZC0mFS84BohkjfxBp5SeLQ/RSMSxRuE4
gLoB6GjsZyzXOR+PDl3JSXrGvllm0n1HZ9HIhSdUXQGPhX1r1Jmsht0kVX9ZLT7SeVHR2TZiX/Z0ezKjOanaCIUY0HrQWc5dMg3To4jeb04OUIEz60LJnlaRkWuaLLGK2NF5fWTTwGgIwMvFKV7SKqqj2ra6aIw8dMrXVc9OLihFJsbcmseXVc82NyQ9J1egYusvPJl+a1WJhZDcK5+vk5qXzQrfg4/vZnexTYljH6maCSZaNUeSr+w+FBULAkSH3bmt/VvfKEQByTUz90l+fLhXoG/tapFNG/gi2EKmm6T5EFmBcT6PMkUplAnrdVWpnJ6NVabrmpE8wY+Qu4NSNuvCLrImooeG74edaLIj/YNfDEYOL2cfklHAXynqCGXBBcVb1lnD5L9ZvakGQ1ZBDWUq9oRVJyjo6M4duwYOjo6cOzYMYyNjfGGHRkZwb59+zTzEe58KFQFU7wiKy6A/KjFn5rSL60SiQ08ETvPJSpyiePTIYKK8/Tp0zh+/DjeeecdHD9+HKdOnUobjmVZnD59Go888oikQkoKrRQUDVKoaqulATFtytnJauCdTidsNhs6OzsBAJ2dnbDZbFhYWEgJ+/3vfx9/+Id/iC1btkguqFRIWSkK1RgEjNrVh8A5iYL4Es81ifhpkTxXWSXXUtWFHw89lKQkWQ283W5Hc3MzGCa8r5dhGDQ1NcFutyeEGxgYwKVLl/Anf/InsgjKR9QfvEZqj5rEVHo6gKJfZ2MRNCq2bpDkoFMwGMQ3v/lN/O3f/m30RZALVqtV9DOL3BoAJrjdbgCVWcMPDA4C2Bz9vbx8Dz09fQB2ik7b5XIBqBIU1m63g3PMY4GrB7BeVDqBQABAqtOmW7d64SGVAFp5n11ZXgFQnfbe9WvXAewAAPT09CA+D/wBPzgwABgMDw9j1uBKejpzft28dQscaUNyHyJZHq/XC6AiY1zJskUYGxsHYIoLE8aZlMdWqxVlhiCvzFNTU/BPLyTc93l9sPYNA9gKIFx+blKFTOXdc+0ajAakpHPz5k0skLUA7kt5ZnAosT7Oz88DqE8JF+5QhZ+fmZkBQQPSjSFv3LgBYDuvjJno6+tDkNuEeLPAl/dCCY/01wAAgsHUeiymDUXo7x9ApcGHe6QWmeq+2nA4ZgE08N6Pr8NSkdXAm0wmOBwOsCwLhmHAsixmZ2dhMpmiYebm5jAxMYEvfvGLAIDl5WUQQuByufD8888LFqa9vR3l5eWiFLgzF8T0aADV1dXwurMffNixYwfG+n3R33V1a2DZ2Yy+y25R6QJAdU0NPC5hhy1MJhMObNiMQUcQ9nHhXggBoLS0DKFgal9o7949cKxwuDvM75e+trYW7pX0Mh44cAD9q58Ls1gsCXlQXlYOX4iA44C2tjZsXJtYVbLl1969ezF804tkj8q1dbVwL8fkqayshN+buZ+XLFuEzVs2Y3o0EA0ToX8miJk4T4/t7e2orTDyytza2ord67fAdjX26bSKygrs3rYbd3rDJyVNJhNmltmM5X3w4EEwRkNKOvv27sXAbAjz06mOwZLr432N92EpzQc0TCYT5lafX9+8Hk4eJ2P79u/HwLXcPgG3e9duTA14wcYlz5f3Qlm7tgH3Vg99pavHNSLaUISdO3fivhoGo85QxrqvNpqbm+DM4Ds/vg4Lxe/3Z+wYZzXwjY2NMJvN6O7uxpEjR9Dd3Q2z2YyGhtibqKWlBR988EH094svvgiPx4Nnn31WtMCyU2RjxiJTN2fUsotGzjgoxYegXTRnzpxBV1cXOjo60NXVhbNnzwIATpw4gd7eXlkFzIbYhU4tLrJS8kNLxjHfE7N5py3jyTktlYNeEDQH39bWhosXL6ZcP3/+fNrwX/3qV/OTSseoqZLTRVYNodFCSBBb4pVirS48FxINnVvjoQi60ZrcJimQvHRQmS8aOUn/uY804ZQWlKIqtG/gV9F1b5S2Wt2Ssk1Sb0WtN300hm4MvGBohdMNaitKuXzESPC4Y
sgxCop5k9RqrhSOojPwWqwSvDKrWBnBosnoTlbONAqO2FO1OSLnjKc/Pxf7lBwoOgNPyQE9GMgsqGebZPrQWi0CWeXWaqYUEM0beEOkz6Fkl1EhCvURipxQUTbnLIrY+QUJdM6wfJp/5FnTVlGhCUBb0iqD5g18BMGLrArXilzSV1xmZZPXPhkyMJcpEU2Vh6aE1R+6MfAUHhTrwitILnqleUZsNGrITjXIEI8s8hAZ49YZ1MBTZEFv21bzkZOIeJ5vtBZ/XbYRnVYKI4LW5FWAojPwWqwTWpRZayiWxyo6qPdbGRx3yTm9SNtFdorOwKei7WqSrQFpW7vM6PmEbzKF0GnZp42c04aU6kDzBt5QvJtoKALRUpEr6WyMoj80b+DFkto4VDRG5iPTLow8xNeKoSC8E9OFlUMIvLLmHa/UAQuDes4XFCdFZ+ClpFAVTM9TESqzR7yoSkxVCaMgNB+yQg18gdFindSazFId2BEbTy4v4pQBmIAktbTzVY4XOEn6l8JP0Rn41EqRezVRfHKHqLcHXAix1Kh6vjKpbbQ24Ej/aUChBJK/2Sghd+aoc5ts6MfAq7G1S0Ree7AzPKzWl4Pk5Norltr9oTqjy8hlkd8PTsYp4DvJueAJcFjwyBO3ntC8gRfdi05pHYr3w7Mj0yJrPMu+DI1FQosysyy+Uf77lewfkb5w2Y0bd/MzRnmT6WWaZ9RDs4XprWrlnc9pRVCF0byBjyB0vlTKeqH0IquUzCyz0kaoQAO8NZ1+OkGQKNkCqelorqxbU6jl1BPaN/Aa6IAnoMH2ozmRJRJYCb2V3gevibLWhJDqQPsGPm+0u8hKQDtcaiPv4lDacyitT7qiREig0dFRnDx5EktLS6ivr8e5c+ewZcuWhDCvvfYaXn31VRiNRnAch8985jP43Oc+J4fMCYg1srT+qgs5ykNtcWbcJplUgQX14Ivcv4sWZFQLggz86dOncfz4cRw5cgRvvPEGTp06hQsXLiSE6ejowKc+9SkYDAa4XC4cPnwYDz74IHbu3CmL4Dmj8CKrRJ5soxTDSVY+pDR0UkSlhvzU+ghCzzhWCr/rJ+sUjdPphM1mQ2dnJwCgs7MTNpsNCwsLCeFqampgWLU2Pp8PwWAw+rsQKLEGVrC2oMFGpzaR83chUBiNlrzKbv1TW7npCbm2jGYiaw/ebrejubkZDMMAABiGQVNTE+x2OxoaGhLC/vKXv8QLL7yAiYkJ/MVf/AV27NghShir1SoqPAAsk1oArfB5fQDKs4YfHRkB0Bp7fvkeenr6AIgfabhdLgBVgsLa7XZwjnnMcw0AmkSlE94hlPqytFqt8JEKxOuTIqPbDaAy7b1bN28CeAAAMD4+DsAUvecP+MGBAcBgdGQEC2MrSU9nzi+r1QqO3I9sfQi/3w+gLGMYPiYnJwE0J1zr6enBXFIe22w2lMPPK/PU1BS800sAtkev+bw+9PePALgfAOBwOOAllchU3jdu3IARBEBivbdarVgk9QAaU57p7++PpgEALn92E+tccAJYk/Ze761eANuyxsEHy3Eo5NKcS0QbijA0NIRSBJCPnmqkp6dH8jgFTdEI5dChQzh06BCmp6fxla98BQ899BC2bt0q+Pn29naUl2c30vGML4QwecePisoK+L3ZG8f9W7fibpzf67q6Olh2WtB32S0qXSA8avG4hL2VTSYTDmzYjD57EI5Jcfu1DTCk7Vm1727HvJvD3RF+P95V1dXw8vQc9u7bh8Hr4T3mmzdvhn0sJld5WTl8IQKOC+fZ/Y2JVSVbfu0078adPl/GMABQXl6OoACjlo6NGzdiZiIxLy0WC3qnA5i9G9syuWvXLtRXGmDj2U/f2tqK7U1bMHAtdr+isgLmrWaMrOrQ3NyMeReXsbzrNrSjJ03ZNm4y485Y+jI3m2NpCKWhoQH3nOm3te7ZswdDN72i4ovHaDBCxsOnKYhpQxG2b9+OmnIDbuehpxqxWCyin/H7/Rk7xllf1
SaTCQ6HAywbrlAsy2J2dhYmk4n3mZaWFuzZswe/+c1vRAucMwIrpRZ3CWhQZKwINdoFUC5X3zRi68qViUDaAzi9PPvzcyXj6eR8487zeYq6yGrgGxsbYTab0d3dDQDo7u6G2WxOmZ4ZHh6O/r2wsIAPPvgA27dvh9zkP8tf6EVWaZtQXsscAi2YFl+KYpFVRw3ln4ZEpQhA0BTNmTNncPLkSbzyyiuoq6vDuXPnAAAnTpzAM888gz179uDHP/4x3nvvPZSUlIAQgieeeAJ/8Ad/IKvwxY6a98ErefAzJU4iQc9WpfksORrQUwMiqgZBBr6trQ0XL15MuX7+/Pno39/4xjekk0oMInuwUvegtQzNicIgyl2wwtA6oS90c5I194qpxJMFRCkhCzK3rlzaSmGArtWjSIxuDLxgiqx1FJm6aVF1HuTShVe1QgWg2PUXQdEZ+NS6kfsgOacnaeWk5IPa5nTyZFbkFkmKODRv4HVW30Wh5neFmrzr5ppQIWQziKzBdIqG6i8GzRv4CEJ3OShdOZROPx7hRlhNUsfBI5bU/n60hF70oEiDbgx87mh7kTWb8c14Vw0KFIgiUpVCiVJ8Bp629Cg0K4Qjl7tgsRiQZbRKC5USh+YNfP7+4Au8yEqRjKK0ZbTSASjSss8BzRv4okYPtbwQ++Vz/YqGVLJJeAyWLrICkhxNLhIk9SapCKs9GlrelAhyuBVwujl4g7SWUbSF9g28SFKbqHYXWYkQIaTwPJjLFkOlM0cC4lWY08h+bR1ke1YI7cALpvimaGjNiKGCvFCBCACUk0PslHrWRVYKJY6iM/BKL7JqsW2qVWbV7s9PQlIps1U6bWQJpUBo3sAX+6YCtWzfUzuq1VXiCqxaPaWmaBTND80b+Ci0wNOilV4uRRh0F00YmgfC0I+BV4CcKhmtmZqBFhVF6+jGwAv2q0JbbQyd5kWyWoJ2G2kIOb/JStEXujHwSqD1+X/VeXwUi2oFS0TKTkX4G7waUVwmCIo9B4RDDbyGqwpR8Qx7ISQTk0IexwVkRfJOglorBEURdGPgNd8bVQA15EUhZODUoKhEZFtk1ZGqmSkaRfND0EnW0dFRnDx5EktLS6ivr8e5c+ewZcuWhDAvv/wy3nrrLRiNRpSWluLrX/86PvKRj8ghcwKGvLtAuUdQzH7HVU1SJr/T78PmtYyoZ3ivUVQBLRphCDLwp0+fxvHjx3HkyBG88cYbOHXqFC5cuJAQZu/evXjqqadQWVmJgYEBPPHEE7h06RIqKipkETxn9FQz9KBLgVa9xxfZgqQjOwaDPsqdUhCyTtE4nU7YbDZ0dnYCADo7O2Gz2bCwsJAQ7iMf+QgqKysBADt27AAhBEtLS9JLnIK4HnixtQ1JPvih0qGKlElopQNP98GDZoAIshp4u92O5uZmMEx4iMswDJqammC323mfef3117Fp0yasX79eOkmzkXOhExCN7p0smLOxHChIjkr4yT45ySSP1Ius6l12pyiB5N4kL1++jO9+97v413/9V9HPWq1W0c+4SBWATQgEAwBKs4a/e/cugKbo7+XlFfT02ADsFJ222+UCUCUorN1uB+eYxyzXCGCd6LTS0d/fDx+pAGDiDePz+QCUp71ns9kA3A8AGB8fT4jHH/CDAwOAwfj4OJYn7yU9nTm/xsYS4+MjGAoh12o4NT2N5Lzs6emBg7sPwH3C45magmt6GcC26DWf14ehoXEAm3KSLR6WZQGkXwOw9lkBtAmOKxgMYCUYBF+9s9n6ESlTvXJneBilCEJvevb09EgeZ9aWZTKZ4HA4wLIsGIYBy7KYnZ2FyZTaeK9fv46/+qu/wiuvvIKtW7eKFqa9vR3l5emNER/T91iMD/pQVlqGkAB/3a0bWuGYDEZ/19XVwbLjIGxXPKLlrampgUegG1mTyYQDGzbjxt0A5qaD2R8QgNlsxoKbw/RYgDdMRUUF/L70+bJr1y4MW70AgM2bN8MeF095WTl8IQKOAzZt3
oztTYkvz77L7oyybd6yGdOj/HJFKC0pARvKGiwtLS0tmJtKzEuLxYJrkwHM24XncWtrK+5v3IzbN73RaxWVFdi+eTvGBny5CRcHwzDgeJYA2ne3406vN/3NNJSVlqGmohyelfT1zmw2Y6Qvf5nVTFtbG6rLDBjWmZ4Wi0X0M36/P2PHOOsUTWNjI8xmM7q7uwEA3d3dMJvNaGhoSAh369YtfP3rX8c//uM/Yvfu3aIFzZXIEFdO3+ZqJj9nYzJmhsCodVYc4snJX7AcglD0iKB98GfOnEFXVxc6OjrQ1dWFs2fPAgBOnDiB3t5eAMDZs2fh8/lw6tQpHDlyBEeOHMHg4KB8kueI3AtzmkJ5+667tAuB3vUTAs0DYQia/Gxra8PFixdTrp8/fz7692uvvSadVGLIe5VK38tSej4Uo5m1ca3ISdEdujnJqgRa90UjJ1qzaXJuk5QyL7J+70NrGZ8DRaCiZGjewIs1slJ+0YmSAdoKZYF+si8MzQNhaN7Ai0ZHFUPNlbwQ7pslVV+hk06iuxe0P0IRgW4MvBK2TsX2VRBqfkHkhV71olBEohsDL5TUtk8KahCkTirrQVYJEszJUwE1srKh54VzirTox8ArYFHoaFlhZHZVoEZjSescCt0n0zT6MfACSX0P0CaTlRxaUyEaYNE2ctqFpwhE8wZeCvOs6TahlkVKVSFeM8XyIocKrN9yE0ax6y8GzRv4CHSRVWVozF2wnBRSTq3kSd4UjaL5oX0DL7IHlHaRtZBImBzJNzoZ/cXQ9hdHhszIZQRK85bmgVC0b+ApOZNrI3ntRnbPm0IdQKi7oapbunRoT2IKIN8ekeIz8AovsuqhAboDKtFCI57jJHVVQL1JAqBZIBTdGHihb8BC+gVRO7I2EgXn4KXqDal1L7++3eNRpETzBj5iZA0KWFstNDOlZCxMuloogQJDs4QSh+YNfITce1sa7g/lK7hmFZcevWSFXvTIBIn+j5IN3Rh4odB6EUPOvBD8ws1nH3+RFmaRqp0AzQNhFJ2BT4UusmZD74ZUS+oZAG0JTFEUzRt40eZZQrewSi+yEqjX+BagA68LlK5DmqTYK40ING/gxSLlQSdN1LMchQxyBGwBFAxx0scpWmyNOa+irmj0h1zlJuibrKpmtQskPIOUOcvqDxLcng0WIKVEcjUG/lCe6erAVYEmvVIWgYXP+wR3ESGoBz86Oopjx46ho6MDx44dw9jYWEqYS5cu4VOf+hTa29tx7tw5qeXUPENzIfzPWAArPhm6rLlCW0lhkDifM708i6ZIdaeoPJN1ggz86dOncfz4cbzzzjs4fvw4Tp06lRJm48aN+Pa3v42nn35aciGlROk5a1ZF9l1OCuIuWHeNXAB00h6ADu27TGQ18E6nEzabDZ2dnQCAzs5O2Gw2LCwsJITbvHkzzGYzSkq0NeuTT0VRuq0JkV3oFA1tMHqBliQlRlYDb7fb0dzcDIZhAAAMw6CpqQl2u1124YQQNbIK1GtdNaUCTiPIjdi0tXZwJuNLW0N6UORHVd1tq9Uq+hkPqQCwBSzHAmCyhnfMzgJoiP5eWV7B9esDALaLTtvtcgGoEvXMvXv3ANSITisdg4OD8JEKAM28YQKBAIDStPdu3x4CsAkAMDExAWB92nCTk5PwTC3GXdmZVbaZmRkAjVnD5cPs3ByAtQnXenp6MMc1p1zPxPT0NFz2FQBbo9d8Xh/u3JkAsDFvOcNnpdOP93p7ewFsExyXz+sDBwOAsrT379y5AylkVjOjIyNgDCwidVcv9PT0SB5nVgNvMpngcDjAsiwYhgHLspidnYXJZJJcmPb2dpSXl4t6Zt7FYtTmg9HIgBMwv93U1IQFR2yLSG1dLQ48sB8DPdld4CZTXVMDj0vcpPqaNWvguseKTisdO3bsgNPNYWYiwBumtLQMoWD6bt22bQ9gfMgPANi4aRPs4+nj2bBxI3atjxm/vsvurLI1r1+Pebu8u4bWrVuHxdnE7T4Wi
wUfjPmxMCt8G1BLSws2rS3BsNUbvVZRWYFtG7Zh4rZfAkn5J/P27NmDoZte3vvJVFZWIMQCQR6Pntu2bYuWqV65f+tWlDHQnZ4Wi0X0M36/P2PHOOsUTWNjI8xmM7q7uwEA3d3dMJvNaGhoyPKkOlF6CEtH0NpBrWWlVrkKhdJtWEsI2kVz5swZdHV1oaOjA11dXTh79iwA4MSJE6tDTODq1at46KGH8IMf/AA/+tGP8NBDD+F3v/udfJJLRR6HXJReZFV1Sy/INpoCpKExaJZQ4hE0B9/W1oaLFy+mXD9//nz07w996EP47W9/K51kFNlJ2EUj9SJrAUyNZAeRNGYVNSYuRUE076pArB/45MZR6B0UUhoTIbJLkhy1KBSVobcqKZc+mjfwSpJboRS4airUPVV0m6RySRcG6oxGhyh4klVXKL3IqqYGqCZZckArvmikxAADte9AESmaH7ox8FpxTyt1+vnEJ/Qkay5pKJ3PYtHwd72KElpawtCNgddKiRe6By/5C0WgAgXRUyNlXkiKIUvoy1g4+jHwApHwex85zZqpqWoS3h8aR89DDoN2RKUoj24MvBKVPic7otI1VqmnuLTUgdecwVTVQo4y0BwQhuYNvCSf7CsgWq+YWrAtknXgNaBrClqUmSIbmjfwYkm7D76Q6Uu8Dz6v+UjBvXtqNdQE3UWDIlI0P7Rv4BX3FyAONfWAE3fRSLt4WpBP9smYhgHUhqgWWjCC0b6BV5DcFlm1XTu1Lb1w1KqnAerqJCiF3rKAnmTlwSDSzKY0DpJ7g1F8kVVAXIKTk9wXjYbQlLAUWlzC0byBj6CVQleTnEIPOiU8o6ZtNDyJqCmPC00x605JRTcGPle0vsiaVQEpVuRI2j9zTlYqinWbZEZ5taZMjhSJmnlThAZe2aqh/ZOsEkeoEvSil07UyI7uFKXOxjIj8YEeIeRSJOI+8CcvcvbG9WIw1YYB0KFxo8iFbgx8sZ5kLfQ0hZpsC29eSiSkmnSNR61yFQoCmgdC0byBF9uLpj1LHrS5yiobqj1eoVrBKGpE8wZeCgppioR6YxQYW14hchFFj+ad1wGdlpRYhXZgKPFo38BrrEfDFXzbjjLRFeQkq8jrekHv+gmBvsiEoX0DLxJaL9KjJsNNyQItA92h6EnW0dFRHDt2DB0dHTh27BjGxsZSwrAsi7Nnz+KRRx7Bxz/+cVy8eFFqWWVBre57k0k3UBESF7UFwuCdplEhapWrYBR9BghHkIE/ffo0jh8/jnfeeQfHjx/HqVOnUsL8/Oc/x8TEBN599138+Mc/xosvvoi7d+9KLnDeUHfBUeL94ijlNz4vZHY2pkayyaV1X0cUaclq4J1OJ2w2Gzo7OwEAnZ2dsNlsWFhYSAj31ltv4TOf+QyMRiMaGhrwyCOP4O2335ZH6jjENkR3ILEBeAIEk4uhnNL2BsU3ppzn4NMoOrvCYdGTeWd9JsM974o9e8/LH8+ih8PYQghjCyHcXWSzigoAbr/8hia5LAFgbCEEl1/caYN7Xg4zy4l6eQIE8y5huhYaoWWqV+bdHOZUWja5I0+XoiRbALvdjubmZjAMAwBgGAZNTU2w2+1oaGhICNfS0hL9bTKZMDMzI0oYq9UqKjwABAkDYBuEZtC8O7EBeIIE748FRKcLAK4cjFiuBr6M+OFHecK1Xnsw63OZkhucjb3YxjMY7rEFFmML4hpUcj7LQbo0fnvHLzqeu0ss7i4l6hdggX5HCABBCViEsjcV0dTAhb7eIQDbUYsVrKA2Y/gShOBz+0BQAwAoRRBBlCaEiS9TJaiCBx5UyZrG7TlldZQDLyrQ09MjebzS19o8aG9vR3l5efaAyVy9iZ279qCsBAiyQIkRKC8xwOUnqCg1IMgScAQoZYBACChhgDLGgBBHEFitKwZD+D7Lhf8rYcI9JY6EXx1GY/gZAiDIErBc+D6zOgYqYQxgOQJCwkbVYACY1XdOJE7GGP63vNSAEEtQYgzHTwBwX
PiZSHjGGE7TCMAfAmorquALhgfgRgNw85YV5l3tAFbjJUA5E46rxBjTORIXYwRCbPhfDqu6xckErOpPYr7QCQHKSgzwJ41UjMbwPaMxHLC81ABvgKCy1ABfkMBgiMvDVX2Mhtgr2GgAjMZwHkTiK2PCMhsMseulq9eAsFy2PisO7tsTLS9vgMC4+o1SoyGmBwBUlRkQYMOycFxMhhIG0TTYJP0j8oLErpWVhOXwBsP5WRZ5PvKmJkBFqQH+EInmb3wZxv9dVmJAIESi8leWVoExNuMAS1BirIInGJapjAnLTgjQ12fFnvZwOZeXGGA01MHlJyhhws+7/QQGY7iuhbhYmUbqBGOI6Rspz7K4cg5y4X9DHAFjDMtXXW5AiA1fi4wWqsvCecAYDSAk/C9HwnoEWALGEM6nEmMV/CHAHyLh9saG5SYkXLalxnD4SN00JvXLWA4YtN3C7x3cB2+IgGWBijID2FU5ORJrswwDlBpjZVFVZkAgFJaJkEidIChdbbcE4fzwBsPXIu3TFyKoKjOA48LtkSPhNs4Yw22pxBjOW4LV9koIykrCFS/Akuh1ZrWOGwyG1fZGEGDD9YPjCPyhWJ1gjOH88ocIDAAGrC5YLBaIxe/3Z+wYZzXwJpMJDocDLMuCYRiwLIvZ2VmYTKaUcNPT09i7dy+A1B69nJQaQqivSp1tqq8Kl2BFaawWVZXF7pfBkPBbKOUlfKMFEcOsUuFhK1dlrC6PPVNmCKbVOXqfV0bxVAqQtazSID7dpHhLIy0u7np82ZUbgom/s6SVURYR+Z8gWxQe2bOQTubIs9VlqXqXG4KorUgs50i9BoC6SoF6JOkbibGEiVwJ34+UdYkxdi1CrP4lXk/WqaI0Jn+sLx8LU5Ilr0oMLIxGQ0J+IE7O1DYbn2/89yLUlCdeq1mVx8jErlUk5VdZQhEI06WUMaAy7pnKNLZGaL3Jlaxz8I2NjTCbzeju7gYAdHd3w2w2J0zPAMBjjz2GixcvguM4LCws4Be/+AU6OjrkkZpCoVAoWRG0i+bMmTPo6upCR0cHurq6cPbsWQDAiRMn0NvbCwA4cuQINmzYgEcffRSf/exn8ZWvfAUbN26UT3IKhUKhZETQHHxbW1vafe3nz5+P/s0wTNTwUygUCkV5iu4kK4VCoRQL1MBTKBSKTqEGnkKhUHSKKvbBR1zoBgK5HTgCwvtBiw2qc3FAdS4OctE5YjP53JAbiLQOynNiZWUFQ0NDSotBoVAommT79u2orU09Ca0KA89xHNxuN0pLS2EwyLvxn0KhUPQCIQTBYBDV1dUwGlNn3FVh4CkUCoUiPXSRlUKhUHQKNfAUCoWiU6iBp1AoFJ1CDTyFQqHoFGrgKRQKRadQA0+hUCg6hRp4CoVC0SmaN/Cjo6M4duwYOjo6cOzYMYyNjSktUt4sLi7ixIkT6OjowOHDh/Fnf/Zn0Y+c37hxA48//jg6Ojrw1FNPwel0Rp/LdE8rvPTSS9ixY0f0ZLOe9fX7/Th9+jQeffRRHD58GN/85jcBZK7TWq/vv/71r3H06FEcOXIEjz/+ON59910A+tL53LlzePjhhxPqMZC7jnnpTzTOk08+SV5//XVCCCGvv/46efLJJxWWKH8WFxfJ+++/H/39d3/3d+S5554jLMuSRx55hFy5coUQQsjLL79MTp48SQghGe9pBavVSp5++mnysY99jAwODupe3+eff558+9vfJhzHEUIImZubI4RkrtNaru8cx5EPfehDZHBwkBBCSH9/P9m/fz9hWVZXOl+5coVMT09H63GEXHXMR39NG/j5+XlisVhIKBQihBASCoWIxWIhTqdTYcmk5e233yaf//znyc2bN8knP/nJ6HWn00n2799PCCEZ72kBv99PPvvZz5LJyclow9Czvi6Xi1gsFuJyuRKuZ6rTWq/vHMeRBx98kFy9epUQQsjly5fJo48+qlud4w18rjrmq78qvEnmit1uR3NzMxgm/LVchmHQ1NQEu
92e8s1YrcJxHH74wx/i4YcfTvmQeUNDAziOw9LSUsZ79fX1Ckguju9+97t4/PHHsWHDhug1Pes7OTmJ+vp6vPTSS/jggw9QXV2NP//zP0dFRQVvnSaEaLq+GwwG/MM//AO+/OUvo6qqCm63G9///vcztmOt6xwhVx3z1V/zc/B65/nnn0dVVRWeeOIJpUWRjevXr8NqteL48eNKi1IwWJbF5OQkdu3ahZ/85Cf4y7/8S3z1q1+Fx+NRWjTZCIVC+Od//me88sor+PWvf41/+qd/wte+9jVd66w0mu7Bm0wmOBwOsCwLhmHAsixmZ2dhMpmUFk0Szp07h/HxcXzve9+D0WiEyWTC9PR09P7CwgKMRiPq6+sz3lM7V65cwfDwMA4dOgQAmJmZwdNPP40nn3xSl/oC4bpbUlKCzs5OAMC+ffuwdu1aVFRU8NZpQoim63t/fz9mZ2dhsVgAABaLBZWVlSgvL9etzhEy2apMOuarv6Z78I2NjTCbzeju7gYAdHd3w2w2a2roxscLL7wAq9WKl19+GWVlZQCA9vZ2+Hw+XL16FQDwox/9CI899ljWe2rni1/8Ii5duoRf/epX+NWvfoX169fjX/7lX/CFL3xBl/oC4SmlD3/4w3jvvfcAhHdKOJ1ObNmyhbdOa72+r1+/HjMzMxgZGQEADA8Pw+l0YvPmzbrVOUImPXK9JwTNuwseHh7GyZMnsby8jLq6Opw7dw5bt25VWqy8uH37Njo7O7FlyxZUVFQAADZs2ICXX34Z165dw+nTp+H3+9Ha2orvfOc7uO+++wAg4z0t8fDDD+N73/setm/frmt9Jycn8Y1vfANLS0soKSnB1772NXz0ox/NWKe1Xt9/9rOf4fz589HvPjzzzDN45JFHdKXzt771Lbz77ruYn5/H2rVrUV9fjzfffDNnHfPRX/MGnkKhUCjp0fQUDYVCoVD4oQaeQqFQdAo18BQKhaJTqIGnUCgUnUINPIVCoegUauApFApFp1ADT6FQKDqFGngKhULRKf8fvo23k283LG8AAAAASUVORK5CYII=\n" }, "metadata": {} } ] }, { "cell_type": "markdown", "source": [ "**Tuning**" ], "metadata": { "id": "QuBneAp4Gck5" } }, { "cell_type": "code", "source": [ "param_grid = {\n", " 'learning_rate': ['constant', 'optimal', 'invscaling', 'adaptive'],\n", " # 'batch_size?': [16, 32, 64, 128]\n", "}\n", "\n", "sgd = SGDClassifier()\n", "\n", "grid_search = GridSearchCV(sgd, param_grid, cv=5, scoring='f1_macro')\n", "grid_search.fit(X_train, y_train)\n", "\n", "print(grid_search.best_score_)" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "2EId9Mr9k11f", "outputId": "c7f2ae0b-8107-4774-e753-c49bb9fe22fd" }, "execution_count": 544, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "0.3760951785106917\n" ] } ] } ], "metadata": { "kernelspec": { "display_name": "ai-course", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.10.8" }, "orig_nbformat": 
4, "vscode": { "interpreter": { "hash": "62556f7a043365a66e0918c892755cfafede529a87e97207556f006a109bade4" } }, "colab": { "provenance": [] } }, "nbformat": 4, "nbformat_minor": 0 }