You may exit out of this review and return later without penalty.
Close
Sharing assignments in OpenClass
OpenClass is on a mission to improve & democratize education. Through our platform, educators can share assignments privately or with our entire community to improve accessibility of the best learning resources around the world.
Our assignments guide students through research-backed study sessions to reinforce learning. If this review is relevant to your teachings, feel free to add it to one of your classes!
Our format is simple and engaging: students are asked a retrieval question, then shown corrective feedback and a multiple-choice mastery question to assess their understanding of the material. Assignments adapt to unique student needs and continue to draw questions until a student has demonstrated mastery.
# Cosine similarity
Give the formula for cosine similarity. Describe what it measures.
What's the cosine similarity between [-68, -92, 21, -19, 54] and [28, 32, 51, -78, -10]?
What's the cosine similarity between [86, -36, -34, -51] and [51, 67, 27, 52]?
What's the cosine similarity between [-53, 81, 96, 3] and [3, 49, 93, -1]?
What's the cosine similarity between [-76.81, 5.51] and [44.62, 36.53]?
When does ?
When does ?
What's the difference between PMI and PPMI, and why is PPMI preferred for NLP problems?
Explain the intuition behind TF-IDF.
Calculate TF-IDF for the documents in Figure A.
Complete the code to calculate the Euclidean distance. Do not use NumPy.
Complete the code to calculate **cosine similarity**. Do not use NumPy.
## Euclidean distance
Implement Euclidean distance using `numpy`:
where represents the Euclidean distance of vectors and .
## Dot product
Implement the function `dot_product` using `numpy`. Your function should take two `np.ndarray`s representing and and return a single number.
What is ?
## Dot product
Implement the function `dot_product` using `numpy`. Your function should take two `np.ndarray`s representing and and return a single number.
Given , what vector has the same direction, but 3 times the magnitude of ?
## Find the centroid
Implement the function `find_centroid` using `numpy`. Your function should take one `np.ndarray` representing a multirow matrix **X** and return a vector representing the centroid.
Create a term-term matrix for the sentences in Figure A.
Create a term-term matrix for the sentences in Figure A.
Complete the code to calculate the **2-norm** (aka Euclidean norm) of a vector. Do not use NumPy.
Complete the code to **normalize a vector to unit length** using the **2-norm** (aka Euclidean norm). Do not use NumPy for this.
What does a vector's 2-norm represent?
What does a vector's 2-norm represent?
If , what can you say about ?
If , what can you say about ?
Complete the code to calculate the frequency of words that co-occur within a symmetric window () for the given word.
When , what will be target word and context words in the -window for the sentence "The Spanish galleon disappeared into the night"?
Provide symmetric -windows of size , , , and for *Shakespeare* in the following sentence:
>the greatest Shakespeare play is Macbeth
If , what can you say about ?
(Student response here)
It's normalized. A unit vector has a **magnitude** of 1.
Which of the following vectors has been normalized?