Learn how to code an (almost) one-liner Python function that manually calculates the cosine similarity or correlation matrices used in many data science algorithms, using the broadcasting feature of the NumPy library.
Do you think we can say that a professional MotoGP rider and the kid in the picture have the same passion for motorsports, even if they will never meet and differ in every other aspect of their lives? If you think so, then you have grasped the idea of cosine similarity and correlation.
Now suppose you work for a pay-TV channel and you have…
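As a rough illustration of the broadcasting idea mentioned in the teaser above, here is a minimal sketch. It is not the article's own code: the function names `cosine_similarity_matrix` and `correlation_matrix` are hypothetical, and it assumes the observations are the rows of `X` and that no row has zero norm.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X (shape n x d)."""
    # Row norms kept as an (n, 1) column so that broadcasting divides
    # every row by its own norm. Assumes no all-zero rows.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_unit = X / norms            # broadcasting: (n, d) / (n, 1)
    return X_unit @ X_unit.T      # (n, n) dot products of unit vectors

def correlation_matrix(X):
    """Pearson correlation between the rows of X.

    Correlation is cosine similarity computed after centering each row.
    """
    X_centered = X - X.mean(axis=1, keepdims=True)  # (n, d) - (n, 1)
    return cosine_similarity_matrix(X_centered)
```

For row-wise data, the second function should agree with `np.corrcoef(X)`, which is a convenient way to check the manual version.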
My previous story was about the math intuition behind PCA. Today my objective is to apply those concepts to a simple use case, and in particular to a part that is often overlooked: the interpretation of the principal components.
As you probably know, PCA is a sophisticated tool for dimensionality reduction. In short, with PCA a dataset with a large number of variables can hopefully be represented by a small number of new variables that are (special) linear combinations of the originals. …
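To make the "linear combinations of the originals" idea concrete, here is a minimal sketch of PCA computed through the singular value decomposition. It is one common way to do it, not necessarily the approach the article takes; the function name `pca` and its return values are hypothetical, and the variables are assumed to be the columns of `X`.

```python
import numpy as np

def pca(X, n_components):
    """Sketch of PCA via SVD. X has shape (n_samples, n_variables)."""
    # Center each variable (column) on its mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing
    # singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Each new variable is a linear combination of the centered originals,
    # with the entries of a component as weights.
    scores = X_centered @ components.T
    # Share of total variance captured by each retained component.
    explained_ratio = (S ** 2 / np.sum(S ** 2))[:n_components]
    return scores, components, explained_ratio
```

Reading the weights in each row of `components` is the usual starting point for interpreting what a principal component represents.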
Hi, everybody, my name is Andrea Grianti in Milan, Italy. I wrote…
If the title looks puzzling, let me say that I believe learning Python (or R) will take you to Machine Learning, but learning Linear Algebra will take you everywhere.
So going from using software libraries to creating your own code means moving with agility from numbers to vectors (and matrices), and often translating summation expressions into equivalent one-shot vector/matrix operations.
Unfortunately, moving between these two worlds has some traps for beginners. They are worth revisiting because they often resurface at a later stage in more complex situations.
When we begin learning Linear Algebra (LA) the first chapters…
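The translation from summations to one-shot vector/matrix operations mentioned in this teaser can be sketched as follows. The arrays `x`, `y` and `A` are made-up illustrative values, not taken from the article.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Summation form of the dot product: sum_i x_i * y_i, as an explicit loop.
dot_loop = 0.0
for i in range(len(x)):
    dot_loop += x[i] * y[i]

# Equivalent one-shot vector operation.
dot_vec = x @ y

# Summation form of a matrix-vector product: (A x)_i = sum_j A_ij * x_j.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])
Ax_loop = np.array([sum(A[i, j] * x[j] for j in range(A.shape[1]))
                    for i in range(A.shape[0])])

# Equivalent one-shot matrix operation.
Ax_vec = A @ x
```

Both pairs produce the same numbers; the vectorized forms simply hand the loop over to NumPy.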
A step-by-step explanation of how the EDM is represented in linear algebra and how to code it as a Python function in just one line.
Hi everybody, in this post I want to share my experience of figuring out how a rather intuitive concept like the Euclidean Distance Matrix (EDM) can become a challenge when you decide to improve your programming skills (in my case, in Python) by crossing the chasm from classic “for…loops” code to the beauty of a single line of code built on linear algebra concepts.
Why? Because if you can solve a problem…
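For readers who want a preview of what such a one-liner can look like, here is a minimal sketch based on the Gram matrix identity ‖a − b‖² = ‖a‖² − 2·a·b + ‖b‖². It is an assumption about the general approach, not the article's own function: the name `edm` is hypothetical, the points are assumed to be the rows of `X`, and the result holds plain (not squared) distances, although some texts define the EDM with squared entries.

```python
import numpy as np

def edm(X):
    """Pairwise Euclidean distances between the rows of X (shape n x d)."""
    # Squared norms of each row, broadcast as a column and as a row,
    # minus twice the Gram matrix. Tiny negative values caused by
    # floating-point rounding are clipped to zero before the square root.
    return np.sqrt(np.maximum(
        (X ** 2).sum(axis=1)[:, None] - 2 * X @ X.T + (X ** 2).sum(axis=1)[None, :],
        0.0))
```

The loop-based equivalent would compute each entry as `np.linalg.norm(X[i] - X[j])`, which is handy for checking the vectorized version on small inputs.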
Preliminary note: This post is the result of personal study on the subject. I studied computer science years ago at the Politecnico di Milano, but data, especially business intelligence, became my profession. Data science has been the next step, even though my background is more in programming and IT. I realise that there’s a lot to learn in this field, so I tried to write with a beginner/student approach. Any suggestions for improvement are appreciated.
The objective of this article is to explain the results of the K-Means algorithm that you get when you run it on your data using…
IT Senior Manager and Consultant. Data Warehouse and Business Intelligence expertise in design and build. Freelance.