Semester7
Notes from courses taken/attended in semester 7 of college
Lecture 2
Video: link
Summary so far
High-Dimensional Data
- n-dimensional vectors - covered in the last class
- Any point in n-d space is an n-d vector
- Similarity between points
- Use the distance between two points as the criterion
- The dot product between two unit-normalized vectors gives cosine similarity
- Similar vectors give a high dot product (quick sketch below)
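
A minimal sketch (my own, with NumPy; the helper name `cosine_similarity` is an assumption, not from the lecture) showing how the dot product behaves for similar vs dissimilar vectors:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product normalized by the vector lengths
    # = cosine of the angle between the two vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a
c = np.array([3.0, -1.0, 0.0])  # nearly orthogonal to a

print(cosine_similarity(a, b))  # 1.0: parallel vectors, maximally similar
print(cosine_similarity(a, c))  # ~0.08: nearly orthogonal, not similar
```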

Gamma Function
- Generalization of the factorial to real and complex numbers
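
Python's `math` module has this built in; a quick check (my own) that Γ(n) = (n-1)! for positive integers:

```python
import math

# math.gamma handles real arguments
# (for complex arguments, scipy.special.gamma is one option)
print(math.gamma(5))    # 24.0 == 4!
print(math.gamma(0.5))  # ~1.7724538509 == sqrt(pi)
```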
Nearest Neighbor Search
- Given some points in d-dimensional space
- and a query point, find its nearest neighbor (brute-force sketch below)
- In 2D I can determine it visually as well
- In high-dimensional space, nearest neighbor loses meaning
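
A brute-force 1-NN baseline (my own sketch; `nearest_neighbor` is an assumed name, and the linear scan is the naive method, not a technique from the lecture):

```python
import numpy as np

def nearest_neighbor(points: np.ndarray, query: np.ndarray) -> int:
    # Linear scan: return the index of the point closest to the query
    dists = np.linalg.norm(points - query, axis=1)
    return int(np.argmin(dists))

points = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 2.0]])
query = np.array([0.9, 1.2])
print(nearest_neighbor(points, query))  # 1 -> the point (1, 1)
```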

- The reason is the Concentration of Lp Norm
- Manhattan and Euclidean distances are special cases of the Lp norm: ||x||_p = (Σᵢ |xᵢ|^p)^(1/p)
- p = 1 => Manhattan, p = 2 => Euclidean
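
A quick check (my own; `lp_norm` is an assumed helper name) that p = 1 and p = 2 recover Manhattan and Euclidean:

```python
import numpy as np

def lp_norm(x: np.ndarray, p: float) -> float:
    # Lp norm: (sum_i |x_i|^p)^(1/p)
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

x = np.array([3.0, -4.0])
print(lp_norm(x, 1))  # 7.0 -> Manhattan (|3| + |-4|)
print(lp_norm(x, 2))  # 5.0 -> Euclidean (sqrt(9 + 16))
```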
Problems with High-Dimensional Data
- Collectively, these problems are called the Curse of Dimensionality (CoD)
- Running time
- Overfitting
- Number of samples required

- The given feature is not discriminative
- i.e., I cannot draw a vertical line and say one side is squares and the other is circles
- So I am not able to distinguish the two types of points in 1D space
- So, let's add one more dimension and maintain the same density
- 9 points in 1D => 9x9 = 81 points in 2D?
- but we only have 9 points


- Linearly Separable data
- Data that is not linearly separable in a lower-dimensional space can become linearly separable in a higher-dimensional space (sketch after this list)
- To keep the same density in 3D, I would need 9x9x9 = 729 points, but I only have 9 points
- So the data becomes sparse
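
A sketch of the lifting idea (the feature map x -> (x, x²) and the threshold 1.5 are my own illustrative choices, not from the lecture): 1D points that no single threshold can separate become linearly separable in 2D:

```python
import numpy as np

# Inner class (label 0) sits in [-1, 1], outer class (label 1) outside:
# no single 1D threshold separates them
x = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0])
labels = np.array([1, 1, 0, 0, 0, 1, 1])

# Lift to 2D with the feature map x -> (x, x^2)
lifted = np.stack([x, x ** 2], axis=1)

# In the lifted space, the horizontal line x^2 = 1.5 separates the classes
predictions = (lifted[:, 1] > 1.5).astype(int)
print(np.array_equal(predictions, labels))  # True
```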

- Most techniques like 1-NN do not work when data is sparse, because in high-dimensional spaces the concept of distance becomes meaningless: all points appear roughly equidistant (empirical sketch below)
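
An empirical sketch of that concentration effect (my own setup, uniform random points in the unit cube): the relative gap between the nearest and farthest point shrinks as the dimension grows:

```python
import numpy as np

rng = np.random.default_rng(0)

for d in [2, 10, 100, 1000]:
    points = rng.random((1000, d))  # 1000 random points in the unit d-cube
    query = rng.random(d)
    dists = np.linalg.norm(points - query, axis=1)
    # Relative contrast: how much farther the farthest point is than the nearest
    contrast = (dists.max() - dists.min()) / dists.min()
    print(d, round(float(contrast), 3))  # shrinks toward 0 as d grows
```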

