## Introduction

Cosine similarity measures how similar two vectors in a multi-dimensional space are by comparing their directions rather than their magnitudes. It is widely used in natural language processing, information retrieval, and recommendation systems. The metric is the cosine of the angle between the two vectors, and it ranges from -1 (pointing in opposite directions) through 0 (orthogonal) to 1 (pointing in the same direction). A value close to 1 indicates high similarity between the vectors.
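To make the range concrete, here is a minimal, dependency-free sketch of the formula cos(θ) = (A · B) / (‖A‖ ‖B‖). The helper name `cosine_similarity_plain` and the sample vectors are illustrative assumptions, not part of any library.

```python
import math

def cosine_similarity_plain(a, b):
    """Cosine of the angle between two equal-length sequences of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Same direction: the second vector is a scaled copy of the first.
same_direction = cosine_similarity_plain([1, 2], [2, 4])

# Orthogonal: the vectors meet at a right angle.
orthogonal = cosine_similarity_plain([1, 0], [0, 1])

# Opposite direction: the second vector is the first one negated.
opposite = cosine_similarity_plain([1, 2], [-1, -2])

print(same_direction, orthogonal, opposite)
```

The three results land at roughly 1, 0, and -1. Floating-point rounding can leave the first and last a hair away from the exact values.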

In this article, we will calculate cosine similarity in Python with three libraries: NumPy, scikit-learn, and SciPy. The worked examples use small dense vectors; scikit-learn's `cosine_similarity` also accepts SciPy sparse matrices.

## 1. Using NumPy

NumPy is a powerful library for numerical computations in Python. To calculate cosine similarity between two vectors using NumPy, follow these steps:

#### Step 1: Import the NumPy library

`import numpy as np`

#### Step 2: Define two vectors as NumPy arrays

`vector1 = np.array([1, 2, 3])`

`vector2 = np.array([4, 5, 6])`

#### Step 3: Compute the dot product of the two vectors

`dot_product = np.dot(vector1, vector2)`

#### Step 4: Calculate the magnitudes (norms) of each vector

`norm_vector1 = np.linalg.norm(vector1)`

`norm_vector2 = np.linalg.norm(vector2)`

#### Step 5: Compute the cosine similarity from the dot product and the vector norms

`cosine_similarity = dot_product / (norm_vector1 * norm_vector2)`

#### Step 6: Print the cosine similarity

`print("Cosine Similarity:", cosine_similarity)`

#### Output

Cosine Similarity: 0.9746318461970762

### Complete Example

`import numpy as np`

`vector1 = np.array([1, 2, 3])`

`vector2 = np.array([4, 5, 6])`

`dot_product = np.dot(vector1, vector2)`

`norm_vector1 = np.linalg.norm(vector1)`

`norm_vector2 = np.linalg.norm(vector2)`

`cosine_similarity = dot_product / (norm_vector1 * norm_vector2)`

`print("Cosine Similarity:", cosine_similarity)`

#### Output

Cosine Similarity: 0.9746318461970762
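The steps above can also be wrapped in a reusable function. The sketch below is one possible way to do it, not a library API: the name `cosine_sim` is hypothetical, and raising `ValueError` for a zero vector is an assumption made here because the formula would otherwise divide by zero.

```python
import numpy as np

def cosine_sim(a, b):
    """Return the cosine similarity of two 1-D array-likes.

    Assumption: a zero vector has no direction, so this sketch raises
    ValueError instead of dividing by zero.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Product of the two Euclidean norms (the denominator of the formula).
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)

score = cosine_sim([1, 2, 3], [4, 5, 6])
print("Cosine Similarity:", score)
```

Whether to raise, return 0, or return NaN for a zero vector is a design choice that depends on the application.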

## 2. Using scikit-learn

Scikit-learn is a popular machine learning library that provides efficient implementations for various similarity metrics, including cosine similarity. To calculate cosine similarity using scikit-learn, follow these steps:

#### Step 1: Import the necessary module from scikit-learn

`from sklearn.metrics.pairwise import cosine_similarity`

#### Step 2: Define two vectors as NumPy arrays (same as before)

`vector1 = np.array([1, 2, 3])`

`vector2 = np.array([4, 5, 6])`

#### Step 3: Reshape the vectors into 2D arrays (required by scikit-learn)

`vector1 = vector1.reshape(1, -1)`

`vector2 = vector2.reshape(1, -1)`

#### Step 4: Calculate the cosine similarity using the 'cosine_similarity' function

`cosine_similarity_score = cosine_similarity(vector1, vector2)`

#### Step 5: Print the cosine similarity

`print("Cosine Similarity:", cosine_similarity_score[0][0])`

#### Output

Cosine Similarity: 0.9746318461970762

### Complete Example

`import numpy as np`

`from sklearn.metrics.pairwise import cosine_similarity`

`vector1 = np.array([1, 2, 3])`

`vector2 = np.array([4, 5, 6])`

`vector1 = vector1.reshape(1, -1)`

`vector2 = vector2.reshape(1, -1)`

`cosine_similarity_score = cosine_similarity(vector1, vector2)`

`print("Cosine Similarity:", cosine_similarity_score[0][0])`

#### Output

Cosine Similarity: 0.9746318461970762
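Because `cosine_similarity` works on 2D inputs, it can compare every row of a matrix with every other row in one call, and it accepts SciPy sparse matrices as well as dense arrays. The following is a small sketch with made-up data; the variable names `dense` and `sparse_rows` are illustrative assumptions.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative data: each row is one vector.
dense = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [0, 0, 1],
])

# Pairwise similarities between all rows; the result has shape (3, 3),
# and entry [i, j] is the similarity between row i and row j.
similarity_matrix = cosine_similarity(dense)

# The same rows stored as a SciPy CSR sparse matrix.
sparse_rows = csr_matrix(dense)
# With the default dense_output=True, the result is a regular NumPy array.
sparse_similarity = cosine_similarity(sparse_rows)

print(similarity_matrix.round(4))
```

Entry `[0, 1]` of the matrix corresponds to the pair of vectors used in the steps above.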

## 3. Using SciPy

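SciPy is another library for scientific and technical computing in Python. Its `scipy.spatial.distance.cosine` function computes the cosine distance between two 1-D arrays, which is 1 minus the cosine similarity, so the similarity is recovered by subtracting the result from 1. To use SciPy for calculating cosine similarity, follow these steps: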

#### Step 1: Import the necessary function from SciPy

`from scipy.spatial.distance import cosine`

#### Step 2: Define two vectors as NumPy arrays (same as before)

`vector1 = np.array([1, 2, 3])`

`vector2 = np.array([4, 5, 6])`

#### Step 3: Calculate the cosine similarity using the 'cosine' function

`cosine_similarity_score = 1 - cosine(vector1, vector2)`

#### Step 4: Print the cosine similarity

`print("Cosine Similarity:", cosine_similarity_score)`

#### Output

Cosine Similarity: 0.9746318461970761

### Complete Example

`import numpy as np`

`from scipy.spatial.distance import cosine`

`vector1 = np.array([1, 2, 3])`

`vector2 = np.array([4, 5, 6])`

`cosine_similarity_score = 1 - cosine(vector1, vector2)`

`print("Cosine Similarity:", cosine_similarity_score)`

#### Output

Cosine Similarity: 0.9746318461970761
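SciPy's `cosine` handles one pair of 1-D vectors at a time. For several vectors at once, `scipy.spatial.distance.cdist` with `metric="cosine"` returns the full matrix of cosine distances, which can be converted to similarities in the same way. The sketch below uses made-up matrices `A` and `B` as an illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist, cosine

vector1 = np.array([1, 2, 3])
vector2 = np.array([4, 5, 6])

# cosine() returns a distance: 0 for the same direction, 2 for opposite.
distance = cosine(vector1, vector2)
similarity = 1 - distance

# cdist computes the distance between every row of A and every row of B.
A = np.array([[1, 2, 3], [0, 0, 1]])
B = np.array([[4, 5, 6], [-1, -2, -3]])
pairwise_similarity = 1 - cdist(A, B, metric="cosine")

print("Distance:", distance)
print("Similarity:", similarity)
print(pairwise_similarity.round(4))
```

Here `pairwise_similarity[0, 0]` is the similarity of the original pair, and `pairwise_similarity[0, 1]` is close to -1 because the second row of `B` is the first row of `A` negated.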

## Conclusion

In this article, we calculated cosine similarity in Python with NumPy, scikit-learn, and SciPy. NumPy applies the formula directly, scikit-learn's `cosine_similarity` works on 2D arrays and also accepts sparse matrices, and SciPy's `cosine` returns a distance that is subtracted from 1 to get the similarity. Cosine similarity is widely used to compare vectors, especially in natural language processing and recommendation systems, and all three libraries give the same result for the same input up to floating-point rounding.