K-Nearest Neighbors

In this project we try out the most common nearest-neighbors technique, K-Nearest Neighbors (K-NN), to predict the TARGET CLASS column of the dataset described below.

Data Info

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 11 columns):
XVPM            1000 non-null float64
GWYH            1000 non-null float64
TRAT            1000 non-null float64
TLLZ            1000 non-null float64
IGGA            1000 non-null float64
HYKR            1000 non-null float64
EDFS            1000 non-null float64
GUUB            1000 non-null float64
MGJM            1000 non-null float64
JHZC            1000 non-null float64
TARGET CLASS    1000 non-null int64
dtypes: float64(10), int64(1)
memory usage: 86.0 KB

Data Exploration

Since this data is artificial, we'll just do a large pairplot with seaborn.

We call seaborn on the dataframe to create a pairplot, with the hue set by the TARGET CLASS column.

[Figure: seaborn pairplot of the feature columns, colored by TARGET CLASS]

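K-NN classifies a point by its distance to its neighbors, so the features are scaled first to keep any one column from dominating the distance.

**The scaled features converted to a dataframe:**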
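A minimal sketch of the scaling step, assuming scikit-learn's `StandardScaler` (the post does not name the scaler it used). The dataframe `df` is a random stand-in with deliberately different column scales, and `df_feat` is a hypothetical name for the result.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): three columns with very different spreads.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(loc=5.0, scale=[1.0, 10.0, 100.0], size=(200, 3)),
    columns=["XVPM", "GWYH", "TRAT"],
)
df["TARGET CLASS"] = rng.integers(0, 2, size=200)

# Scale only the features; the label column is left untouched.
features = df.drop("TARGET CLASS", axis=1)
scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# fit_transform returns a NumPy array, so rebuild a dataframe with the column names.
df_feat = pd.DataFrame(scaled, columns=features.columns)
print(df_feat.head())
```

After this step every column has mean 0 and unit standard deviation, so each feature contributes comparably to the distance.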

|   | XVPM | GWYH | TRAT | TLLZ | IGGA | HYKR | EDFS | GUUB | MGJM | JHZC |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| 0 | 1.568522 | -0.443435 | 1.619808 | -0.958255 | -1.128481 | 0.138336 | 0.980493 | -0.932794 | 1.008313 | -1.069627 |
| 1 | -0.112376 | -1.056574 | 1.741918 | -1.504220 | 0.640009 | 1.081552 | -1.182663 | -0.461864 | 0.258321 | -1.041546 |
| 2 | 0.660647 | -0.436981 | 0.775793 | 0.213394 | -0.053171 | 2.030872 | -1.240707 | 1.149298 | 2.184784 | 0.342811 |
| 3 | 0.011533 | 0.191324 | -1.433473 | -0.100053 | -1.507223 | -1.753632 | -1.183561 | -0.888557 | 0.162310 | -0.002793 |
| 4 | -0.099059 | 0.820815 | -0.904346 | 1.609015 | -0.282065 | -0.365099 | -1.095644 | 0.391419 | -1.365603 | 0.787762 |

Predictions and Evaluations

**Confusion matrix and classification report.**
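A minimal sketch of how such a matrix and report can be produced with scikit-learn. Several things here are assumptions: the data is a synthetic stand-in from `make_classification`, the starting value `n_neighbors=1` is a guess, and `random_state` is arbitrary. The 30% test split is inferred from the support of 300 out of 1000 rows.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 1000 rows, 10 features, two classes.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X = StandardScaler().fit_transform(X)

# Hold out 30% of the rows, which gives 300 test samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=101
)

# The starting K is an assumption; the post does not state it.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)
pred = knn.predict(X_test)

# Rows of the matrix are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, pred)
print(cm)
print(classification_report(y_test, pred))
```

The numbers printed by this sketch will differ from the ones in this post, since the data is not the same.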

[[109  43]
 [ 41 107]]



             precision    recall  f1-score   support

          0       0.73      0.72      0.72       152
          1       0.71      0.72      0.72       148

avg / total       0.72      0.72      0.72       300
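The figures in the report follow directly from the confusion matrix. As a quick check, the sketch below recomputes them with NumPy, assuming the usual scikit-learn layout where rows are true classes and columns are predictions (the row sums 152 and 148 match the support column).

```python
import numpy as np

# Confusion matrix of the first model, copied from the output above.
cm = np.array([[109, 43],
               [41, 107]])

accuracy = np.trace(cm) / cm.sum()          # correct predictions / all predictions
precision = np.diag(cm) / cm.sum(axis=0)    # per class: correct / predicted as that class
recall = np.diag(cm) / cm.sum(axis=1)       # per class: correct / actually that class
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy:  {accuracy:.2f}")
print("precision:", precision.round(2))
print("recall:   ", recall.round(2))
print("f1-score: ", f1.round(2))
```

Overall, 216 of the 300 test rows are classified correctly, an accuracy of 0.72.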

Choosing a better K Value
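One common way to choose K is to train a model for each candidate value and compare the error rates on the held-out data. The sketch below illustrates that idea under stated assumptions: the data is a synthetic stand-in, and the search range of 1 to 40 is hypothetical, since the post does not say which values were tried.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 1000 rows, 10 features, two classes.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=101
)

k_values = range(1, 41)  # search range is an assumption
error_rate = []
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    pred_k = knn.predict(X_test)
    # Error rate = fraction of test rows that were misclassified.
    error_rate.append(np.mean(pred_k != y_test))

# Pick the K with the lowest error rate (first one on ties).
best_k = k_values[int(np.argmin(error_rate))]
print(f"lowest error rate {min(error_rate):.3f} at K={best_k}")

# Retrain with the chosen K.
knn = KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
```

Plotting `error_rate` against `k_values` gives the kind of error-rate curve this section refers to. Note that picking K on the test set makes the final score somewhat optimistic; a separate validation split avoids that.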

[Figure: error rate versus K value]

Retrained with new K Value

[[124  28]
 [ 24 124]]


             precision    recall  f1-score   support

          0       0.84      0.82      0.83       152
          1       0.82      0.84      0.83       148

avg / total       0.83      0.83      0.83       300
