# K-Nearest Neighbors and the Curse of Dimensionality in Python Scikit-Learn

## What Is K-Nearest Neighbors (KNN)?

KNN is one of the simplest machine learning algorithms. It is a lazy algorithm: it doesn't run any computations on a data set until you give it a new data point to classify.

In this tutorial, I will not only show you how to implement k-nearest neighbors in Python (scikit-learn), but also investigate how higher-dimensional spaces influence the classification.

The implementation is specific to a classification problem and is demonstrated using the digits data set.

## How Does K-Nearest Neighbors Work?

Let's say you have several apples and oranges and one unclassified fruit. If K is 3, the algorithm looks at the 3 nearest neighbors of the unknown fruit and classifies it as an orange (as there are two oranges and one apple among them).

If K is 5, the algorithm looks at the 5 nearest neighbors and classifies the unknown fruit as an apple (3 apples and 2 oranges).

KNN classifies an unknown item by majority vote. Each neighbor can be given an equal weight, or its vote can be weighted by its distance to the unknown item. The similarity measure depends on the type of data: for real-valued data, the Euclidean distance can be used; for other types of data, such as categorical or binary data, the Hamming distance can be used. Since there is minimal training involved, there is a high computational cost associated with classifying each new data point. I recommend reading Saravanan's blog to learn more about KNN.
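The majority vote described above can be sketched in a few lines of NumPy. This is only an illustration, not the scikit-learn implementation: the helper name `knn_predict` and the two-dimensional fruit coordinates are made up for this example, and every neighbor gets an equal vote.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k):
    """Classify x_new by an unweighted majority vote among its k nearest training points."""
    # Euclidean distance from x_new to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Count the labels of those neighbors and return the most common one
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D fruit features, invented for illustration only
X_train = np.array([
    [1.0, 1.0], [1.2, 0.8],              # two oranges close to the unknown fruit
    [2.0, 2.0], [2.2, 1.8], [1.8, 2.2],  # three apples a bit further away
    [6.0, 6.0],                          # one distant orange
])
y_train = np.array(["orange", "orange", "apple", "apple", "apple", "orange"])
unknown = np.array([1.3, 1.3])

pred_k3 = knn_predict(X_train, y_train, unknown, k=3)  # 2 oranges, 1 apple
pred_k5 = knn_predict(X_train, y_train, unknown, k=5)  # 2 oranges, 3 apples
print(pred_k3, pred_k5)
```

With these made-up points, K = 3 yields an orange and K = 5 yields an apple, mirroring the fruit example. In scikit-learn, distance-weighted voting is available through the `weights="distance"` option of `KNeighborsClassifier`.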

### Analyzing Digits Data Set

First, I import all the required Python libraries into my IPython Notebook.

Seaborn is a Python library for making attractive statistical graphs; it is built on top of matplotlib. `sklearn.datasets` is used to import the default data sets that ship with scikit-learn. `sklearn.cross_validation` is used to perform cross-validation on your data set, and `sklearn.grid_search` is used to select the best parameter K. (In scikit-learn 0.20 and later, these two modules are no longer available; their functionality lives in `sklearn.model_selection`.) If you don't know what parameter selection and cross-validation mean, please watch the week 6 videos of Coursera's machine learning course. I will explain `sklearn.decomposition` and `sklearn.metrics` later in this post.

I then load the digits data set and store its data and target values in the variables X and Y. X has 1797 rows and 64 columns, and Y holds 1797 labels, one per row of X. You can print digits.DESCR to learn more about this data set.
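The original notebook cell is not reproduced here, so the following is a minimal sketch of the loading step. It uses `load_digits` from `sklearn.datasets`; the variable names X and Y follow the text.

```python
from sklearn.datasets import load_digits

digits = load_digits()
X = digits.data    # one flattened 8x8 image per row
Y = digits.target  # the digit (0-9) shown in each image

print(X.shape)  # (1797, 64)
print(Y.shape)  # (1797,)
```

Calling `print(digits.DESCR)` afterwards shows the full description of the data set.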

### Train-Test Split And Mean Normalization

I split the data set into a training set and a test set, using 33% of the samples as test data. I then mean-normalize X_train and X_test.
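A sketch of this step follows, under a few assumptions that the text does not confirm: it uses `train_test_split` from `sklearn.model_selection` (its location in recent scikit-learn releases), it fixes `random_state=42` for reproducibility, and it takes "mean normalize" to mean subtracting the per-feature mean of the training set from both sets.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X, Y = digits.data, digits.target

# Hold out 33% of the samples as test data
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42
)

# Assumption: mean normalization = centering each feature on the training mean.
# The same training mean is applied to the test set so that no test
# information leaks into the preprocessing.
train_mean = X_train.mean(axis=0)
X_train = X_train - train_mean
X_test = X_test - train_mean

print(X_train.shape, X_test.shape)
```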

### Projection Of Principal Components

I create a scatter plot of the projections onto the first two principal components.

Here I use the sklearn.decomposition.TruncatedSVD class to reduce the number of components. It performs linear dimensionality reduction very similar to PCA, but operates directly on the sample vectors instead of on a covariance matrix.
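The projection itself could look roughly like the sketch below. The split and centering repeat the assumptions made earlier (a recent scikit-learn, `random_state=42`, centering on the training mean); `random_state=0` on `TruncatedSVD` is added only to make the run reproducible. The plotting call is left out so the snippet depends on scikit-learn alone.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import TruncatedSVD
from sklearn.model_selection import train_test_split

digits = load_digits()
X_train, X_test, Y_train, Y_test = train_test_split(
    digits.data, digits.target, test_size=0.33, random_state=42
)
train_mean = X_train.mean(axis=0)
X_train, X_test = X_train - train_mean, X_test - train_mean

# Project the 64-dimensional training data down to 2 components
svd = TruncatedSVD(n_components=2, random_state=0)
X_2d = svd.fit_transform(X_train)

print(X_2d.shape)
# Fraction of the variance captured by each of the two components
print(svd.explained_variance_ratio_)
```

The two columns of `X_2d` can then be passed to a scatter plot, for example `plt.scatter(X_2d[:, 0], X_2d[:, 1], c=Y_train)` with matplotlib, to color each point by its digit.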

### Cross-Validation To Estimate The Optimal Value For K

I am going to do a ten-fold cross-validation to estimate the best value of K. Apart from that, I am also interested in the influence of the number of dimensions I project the data down to. This means that I am going to optimize K for different dimensional projections of the data.

compute test function

Implementation of K nearest

Don't panic when you see the code; I will explain it line by line. In the implementation of K nearest, I set different values for K (from 1 to 20). Then I put these K values into a dictionary, because GridSearchCV expects its parameter grid as a dictionary that maps parameter names to lists of candidate values.

In the next line of the code, I create my nearest neighbors classifier from scikit-learn: knearest = sklearn.neighbors.KNeighborsClassifier().

Don't be confused that I introduce the Iris data set here. In this section, I am going to explain what GridSearchCV does, using the Iris data set.

First, I load the Iris data set and then perform a train-test split.

    X_train, X_test, Y_train, Y_test = sklearn.cross_validation.train_test_split(X, Y, test_size=0.33, random_state=42)

Then I fit nearest neighbors to my dataset.

In clf = sklearn.grid_search.GridSearchCV(knn, parameters, cv=10), I pass my nearest neighbors classifier, the parameters and the number of cross-validation folds to GridSearchCV. Even if you don't fully understand what cross-validation or GridSearchCV does, don't worry about it; it just selects the best parameter K for you. That is all you have to know about GridSearchCV for now.

You can see that GridSearchCV does all the hard work for you and returns the best parameter K.
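Put together, the Iris walk-through might look like the sketch below. It assumes a recent scikit-learn, where `train_test_split` and `GridSearchCV` are imported from `sklearn.model_selection` instead of the older modules named above, and it reuses the K range of 1 to 20 from the digits example, which the text does not state for Iris.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, Y = iris.data, iris.target
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.33, random_state=42
)

# Candidate values for K, keyed by the classifier's parameter name
parameters = {"n_neighbors": list(range(1, 21))}

knn = KNeighborsClassifier()
# Ten-fold cross-validation over every candidate K
clf = GridSearchCV(knn, parameters, cv=10)
clf.fit(X_train, Y_train)

print(clf.best_params_)           # the K with the best cross-validated accuracy
print(clf.score(X_test, Y_test))  # accuracy of the refitted best model on the test set
```

After fitting, `clf` behaves like a classifier trained with the best K, so `clf.predict` and `clf.score` can be used directly.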

### Explaining The Effect Of Dimensions In K-Nearest Neighbors

OK! Let me continue explaining the code for my digits data set.

First, I create two empty lists and a list containing the numbers from 1 to 10.

    accuracy = []
    params = []
    no_of_dimensions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

I then loop over no_of_dimensions using a for loop (for d in no_of_dimensions).

Then I call TruncatedSVD from Scikit-Learn:

    svd = sklearn.decomposition.TruncatedSVD(n_components=d)

Then I fit svd to my training data (X_train) and apply the same transformation to my test data.

    if d < 64:
        X_fit = svd.fit_transform(X_train)
        X_fit_atest = svd.transform(X_test)

Now I fit my classifier to the truncated X_fit and Y_train. When fitting your classifier, remember to use X_fit instead of X_train.

    clf.fit(X_fit, Y_train)
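Since the full listing is not shown in this post, here is a hedged reconstruction of the loop from the steps described so far. Several details are assumptions: the imports come from `sklearn.model_selection` (recent scikit-learn), `params` is taken to store the best K for each dimension, the `d < 64` guard is omitted because every d here is at most 10, and the accuracy is computed with `clf.score` as a plain stand-in for the `compute_test` helper.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import TruncatedSVD
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X_train, X_test, Y_train, Y_test = train_test_split(
    digits.data, digits.target, test_size=0.33, random_state=42
)
train_mean = X_train.mean(axis=0)
X_train, X_test = X_train - train_mean, X_test - train_mean

parameters = {"n_neighbors": list(range(1, 21))}  # K from 1 to 20
knearest = KNeighborsClassifier()
clf = GridSearchCV(knearest, parameters, cv=10)

accuracy = []
params = []
no_of_dimensions = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for d in no_of_dimensions:
    svd = TruncatedSVD(n_components=d, random_state=0)
    X_fit = svd.fit_transform(X_train)   # learn the projection on the training data
    X_fit_atest = svd.transform(X_test)  # apply the same projection to the test data
    clf.fit(X_fit, Y_train)              # pick the best K for this dimension
    params.append(clf.best_params_["n_neighbors"])    # assumption: store the best K
    accuracy.append(clf.score(X_fit_atest, Y_test))   # stand-in for compute_test

for d, k, acc in zip(no_of_dimensions, params, accuracy):
    print(d, k, round(acc, 3))
```

Each printed row gives the number of dimensions, the K chosen by the grid search and the resulting test accuracy, which makes the effect of adding dimensions easy to read off.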

### Understanding Accuracy Scores

In the line accuracy.append(compute_test(x_test=X_fit_atest, y_test=Y_test, clf=clf, cv=10)), I compute the accuracy score for every dimension using the compute_test function. Inside compute_test, sklearn.cross_validation.KFold gives the indices for a 10-fold split. I then calculate the accuracy score for X_fit_atest and Y_test.
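The body of compute_test is not shown in this post, so the version below is a guess that matches the description: it splits the test set into folds with `KFold`, scores an already fitted classifier on each fold with `accuracy_score`, and (an assumption) returns the mean over the folds. It uses `KFold` from `sklearn.model_selection`, and the demo classifier with a fixed K of 5 on the raw 64-dimensional data is only there to make the snippet runnable.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier


def compute_test(x_test, y_test, clf, cv):
    """Mean accuracy of an already fitted clf over cv folds of the test set."""
    scores = []
    # Only the held-out indices of each split are needed; clf is not refitted
    for _, fold_idx in KFold(n_splits=cv).split(x_test):
        predictions = clf.predict(x_test[fold_idx])
        scores.append(accuracy_score(y_test[fold_idx], predictions))
    return float(np.mean(scores))


# Small demo: a fixed-K classifier on the untransformed digits data
digits = load_digits()
X_train, X_test, Y_train, Y_test = train_test_split(
    digits.data, digits.target, test_size=0.33, random_state=42
)
clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, Y_train)

score = compute_test(x_test=X_test, y_test=Y_test, clf=clf, cv=10)
print(round(score, 3))
```

Scoring fold by fold also makes it possible to look at the spread of the per-fold accuracies, not just their mean, if the helper is changed to return the list of scores.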

### Conclusion

The accuracy gets better as the number of dimensions increases. I have enough data points that the curse of dimensionality does not harm my predictions here, and the additional dimensions add to the class separability.

I hope this post has given you a good idea of how k-nearest neighbors operates and how the dimensionality of the data affects your classification accuracy.
