Quiz 1
1
You have a fair coin that you toss eight times. What is the probability that you’ll get no more than seven heads?
- Probability = 1 - P(all 8 tosses are heads)
- \(\left(\frac{1}{2}\right)^8 = \frac{1}{256}\), so the answer is \(1 - \frac{1}{256} = \frac{255}{256}\)
2
You have a fair coin that you toss eight times. What is the probability that you’ll get exactly seven heads?
- \(\binom{8}{7} = 8\) different ways to get exactly seven heads (the single tail can fall on any of the 8 tosses)
- Each specific sequence of 8 tosses has probability \(\left(\frac{1}{2}\right)^8 = \frac{1}{256}\)
- \(8 \cdot \frac{1}{256} = \frac{8}{256} = \frac{1}{32}\) (checked numerically below)
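A minimal sanity check of both answers, using only the Python standard library:

```python
from math import comb

n = 8  # number of tosses of a fair coin
p_all_heads = comb(n, n) / 2**n    # P(exactly 8 heads) = 1/256
p_seven_heads = comb(n, 7) / 2**n  # P(exactly 7 heads) = 8/256

print(1 - p_all_heads)   # 0.99609375 = 255/256, no more than seven heads
print(p_seven_heads)     # 0.03125    = 1/32,    exactly seven heads
```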
3
Let P(X) = 0.2, P(Y) = 0.4, P(X|Y) = 0.5. What is P(Y|X)?
- Bayes' rule is \(P(A|B) = \frac{P(B|A)\cdot P(A)}{P(B)}\)
- \(P(Y|X) = \frac{0.5 \cdot 0.4}{0.2} = 1\)
4
Let P(X) = 0.2, P(Y) = 0.4. If P(X|Y) = 0.2, what can you say about X & Y?
- Bayes rule can be used again, but this can also be reasoned out
- \(P(X) = 0.2 = P(X|Y)\)
- The probability of X is the same as the probability of X given Y
- Y has no relationship to X
- They are independent
5
Which of these numbers cannot be a probability?
0, 1.0, 1.5, 0.5
A probability is the chance of an event occurring and must lie between 0 and 1 inclusive, so it cannot be greater than 1 or less than 0.
- 1.5
6
A die is rolled and a coin is tossed simultaneously. What is the probability of getting an even number on the die and a head on the coin?
- Probability of an even number on a 6-sided die is \(\frac{1}{2}\)
- Probability of a head on a coin flip is \(\frac{1}{2}\)
- The die roll and coin toss are independent, so \(\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}\) (a quick simulation below agrees)
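A quick Monte Carlo sketch (the trial count is chosen arbitrarily) that agrees with the \(\frac{1}{4}\) answer:

```python
import random

trials = 100_000
hits = sum(
    1
    for _ in range(trials)
    if random.randint(1, 6) % 2 == 0  # even number on the die
    and random.random() < 0.5         # head on the coin
)
print(hits / trials)  # close to 0.25 = 1/2 * 1/2
```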
7
Let \(f(x) = x^2\). What are its integral and derivative?
- Power rule of integration: \(\int x^2\,dx = \frac{x^3}{3} + c\), where \(c\) is an arbitrary constant.
- Power rule of differentiation: \(\frac{d}{dx}x^2 = 2x\) (both verified with sympy below).
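A symbolic check, assuming sympy is available:

```python
import sympy as sp

x = sp.symbols("x")
f = x**2

print(sp.integrate(f, x))  # x**3/3 (sympy omits the constant of integration)
print(sp.diff(f, x))       # 2*x
```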
8
Let A be a 4x3 matrix of the following format
\(A = \begin{bmatrix}1 & 0 & 1 \\ 0 & 1 & 0 \\ 2 & 0 & 2 \\ 1 & 1 & 1 \end{bmatrix} \)
What is the rank of A?
- Compute echelon form
- \(A = \begin{bmatrix}1 & 0 & 1 \\ 0 & 1 & 0 \\ 2 & 0 & 2 \\ 1 & 1 & 1 \end{bmatrix} \rightarrow \begin{bmatrix}1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \)
- 2 non-zero rows, so rank(A) = 2 (confirmed with numpy below)
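The same rank can be confirmed with numpy, assuming it is installed:

```python
import numpy as np

A = np.array([[1, 0, 1],
              [0, 1, 0],
              [2, 0, 2],    # = 2 * row 1
              [1, 1, 1]])   # = row 1 + row 2
print(np.linalg.matrix_rank(A))  # 2
```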
9
Quiz 2
1.
A data-point in a dataset can be written as (1, 1, 1). What is the dimensionality of this dataset?
- 3
- 1
- 2
- 0
2.
Vectors are generally represented as:
- column vector
- linked-list
- matrix
- row vector
3.
Which of the following is true about a statistic?
- Parameter of the population estimated from samples
- Parameter of the population estimated by the entire population
- Geometric view in 1D
- Geometric view in 2D
4.
Which of these statements is false?
- In geometric view, each attribute is a random variable
- In geometric 2D, each data-point is a vector
- In probabilistic view, parameters are estimated
- For continuous attributes, the mean of an attribute is expressed as an integral \(\int_{-\infty}^{\infty}x\,p(x)\,dx\)
5.
Which of the following is false?
- Correlation measures linear relationships
- Cos(θ) is a measure of similarity
- Euclidean distance is a good measure for geometric distances
- Covariance is normalized correlation
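For reference, a small numpy sketch on synthetic data (the values are illustrative only) showing that correlation is covariance normalized by the standard deviations, and how cosine similarity is computed:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(scale=0.5, size=100)  # roughly linear relationship

cov = np.cov(x, y)[0, 1]                               # covariance
corr = cov / (np.std(x, ddof=1) * np.std(y, ddof=1))   # correlation = normalized covariance
cos_sim = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))  # cosine similarity

print(corr, np.corrcoef(x, y)[0, 1])  # the two values agree
print(cos_sim)
```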
Quiz 3
1.
A matrix \(\Sigma\) is positive semidefinite if:
- \( x^T\Sigma x\in\mathbb{Z}\)
- \(x^T\Sigma x=0\)
- \(x^T\Sigma x\geq0\)
- \(x^T\Sigma x\leq 0\)
2.
The probability density of the Gaussian/Normal distribution is highest at
- \(\mu + \sigma\)
- \(\mu - \sigma\)
- \(\mu\)
- \(\mu + 2\sigma\)
3.
Similarity between pairs of categorical attributes can be obtained by
- Correlation
- Cosine
- Covariance
- \(\chi^{2}\) test
4.
CLT states that when random samples are drawn from any distribution:
- The samples are uniformly distributed
- The means of the samples are normally distributed
- The means of the samples are uniformly distributed
- The samples are normally distributed
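A minimal illustration of the CLT statement, with an arbitrarily chosen source distribution and sample size: the individual draws are uniform, yet their sample means concentrate around the population mean with roughly normal spread.

```python
import random
import statistics

# Sample means of draws from a (non-normal) uniform distribution.
sample_means = [
    statistics.mean(random.uniform(0, 1) for _ in range(50))
    for _ in range(10_000)
]
print(statistics.mean(sample_means))   # ~0.5, the population mean
print(statistics.stdev(sample_means))  # ~ sqrt(1/12) / sqrt(50) ≈ 0.041
```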
5.
An attribute A takes 2 values {yes,no}, and attribute B takes 3 values {high,medium,low}. Which of the following is not true?
- The confusion matrix is of size 2 x 3
- A p-value of 0.3 in the \(\chi^{2}\) test implies that A & B are independent
- Since A & B are categorical, correlation is NOT the correct metric to measure similarity
- The null hypothesis of \(\chi^2\) test is that variables are independent
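A sketch of the \(\chi^{2}\) independence test for such a 2 x 3 table, assuming scipy is available; the counts below are hypothetical:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows A = {yes, no}, columns B = {high, medium, low}.
table = np.array([[20, 30, 25],
                  [15, 35, 25]])

chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p, dof)
# The null hypothesis is that A and B are independent. A large p-value
# (e.g. 0.3) only means we fail to reject independence; it does not prove it.
```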
Quiz 4
1.
Your dataset has d binary attributes. Which of the following best describe the points?
- The origin in d-dimensions
- The corners of a d-dimensional hypercube
- The surface of a d-dimensional hypersphere
- The shell of a d-dimensional hypersphere
2.
As \(d \rightarrow \infty\), the volume of a unit hypersphere goes to
- \(\infty\)
- 1
- 0
- e
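The closed-form volume of the unit d-ball is \(V_d = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}\); a short script makes the limiting behaviour visible:

```python
import math

def unit_ball_volume(d: int) -> float:
    """Volume of the d-dimensional unit hypersphere: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

for d in (1, 2, 5, 10, 20, 50):
    print(d, unit_ball_volume(d))
# The volume rises until about d = 5 and then shrinks toward 0 as d grows.
```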
3.
As \(d \rightarrow \infty\), which of the following is false?
- The probability of sampling points near the origin is high
- The volume of a unit hypercube is 1
- The volume of a hypercube with sides of length 2 goes to ∞
- The "corners" of a hypercube occupies more space than the inscribed hypercube
4.
In d-dimensional space, how many orthogonal axes do we have in addition to the major axes?
- \(\mathcal{O}(d)\)
- \(\mathcal{O}(d^2)\)
- \(\mathcal{O}(2^d)\)
- \(\mathcal{O}(d^3)\)
5.
A unit hypercube in 2D is best described as:
- a line with length = 1
- a circle with radius = 1
- a square with side = 1
- a circle with diameter = 1
Quiz 5
1.
Let \(x_1,x_2,x_3 \) represent 3 features. Which of the following are NOT linear combinations of these features?
- \(0.4x_1 + 0.3x_2 + 0.6x_3\)
- \(4x_1^2 + 3x_2^2 + x_3^2\)
- \(4^2 x_1 + 3^2 x_2 + 6^2 x_3\)
- \(4x_1 + 3x_2 + 6x_3\)
2.
Which one of the following statements about PCA is false?
- PCA projects the attributes into a space where covariance matrix is diagonal
- The first Principal Component points in the direction of maximum variance
- PCA is a non-linear dimensionality reduction technique
- PCA is useful for exploratory data analysis
3.
Which one of the following statements about PCA is false?
- PCA works well for circular data
- The first PC points to maximum variance
- PCA computes eigen-value eigen-vector decomposition of the covariance matrix
- PCA works well for ellipsoidal data
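A minimal PCA sketch on synthetic ellipsoidal data, assuming numpy is available: center the data, eigendecompose the covariance matrix, and project; in the projected space the covariance is diagonal.

```python
import numpy as np

# Synthetic 2-D data (parameters are arbitrary, just for illustration).
rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.0], [1.0, 1.0]], size=500)

Xc = X - X.mean(axis=0)                 # center the data
cov = np.cov(Xc, rowvar=False)          # d x d covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition, ascending
order = np.argsort(eigvals)[::-1]       # largest variance first
pcs = eigvecs[:, order]                 # columns are the principal components
scores = Xc @ pcs                       # linear projection onto the PCs

# In the projected space the covariance matrix is (numerically) diagonal.
print(np.round(np.cov(scores, rowvar=False), 3))
```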
4.
The magnitude of vector x projected onto a unit vector u is
- \(x \times u\)
- \((x - \mu_x) \cdot (u - \mu_u)\)
- \(x\cdot u\)
- \(||x||||u||\)
5.
Feature selection is:
- selecting a subset of attributes
- selecting principal components with maximum variance
- combining many features into one
- selecting principal components that are not orthogonal to each other
Quiz 6
Covariance matrix, eigenvectors, eigenvalues
1.
If \(u_1, u_2, \dots, u_d\) are eigenvectors (column vectors) of the covariance matrix \(\Sigma\), and \(\lambda_1, \lambda_2, \dots, \lambda_d\) are the eigenvalues, then:
- \(\Sigma = \lambda_1 u_1^T u_1 + \lambda_2 u_2^T u_2 + \dots + \lambda_d u_d^T u_d\)
- \(\Sigma = \lambda_1 u_1^T + \lambda_2 u_2^T + \dots + \lambda_d u_d^T\)
- \(\Sigma = \lambda_1 u_1 u_1^T + \lambda_2 u_2 u_2^T + \dots + \lambda_d u_d u_d^T\)
- \(\Sigma = \lambda_1 u_1 + \lambda_2 u_2 + \dots + \lambda_d u_d\)
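For reference, the spectral decomposition \(\Sigma = \sum_i \lambda_i u_i u_i^T\) can be checked numerically on a small symmetric matrix (values arbitrary):

```python
import numpy as np

sigma = np.array([[2.0, 0.5],   # a small symmetric (covariance-like) matrix
                  [0.5, 1.0]])
eigvals, eigvecs = np.linalg.eigh(sigma)

# Rebuild sigma as sum_i lambda_i * u_i * u_i^T (outer products of eigenvectors).
rebuilt = sum(lam * np.outer(u, u) for lam, u in zip(eigvals, eigvecs.T))
print(np.allclose(rebuilt, sigma))  # True
```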
2.
The power method can determine (select the best answer)
- All eigenvalues and eigenvectors by deflation
- Eigenvalue/eigenvector corresponding to the second-largest variance
- Eigenvalue/eigenvector corresponding to the largest variance
- Eigenvalue/eigenvector corresponding to the smallest variance
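A bare-bones power iteration sketch (starting vector and tolerance chosen arbitrarily), which converges to the dominant eigenpair of a covariance matrix:

```python
import numpy as np

def power_method(sigma: np.ndarray, iters: int = 1000, tol: float = 1e-12):
    """Power iteration: repeatedly multiply and renormalize a vector."""
    x = np.ones(sigma.shape[0])
    for _ in range(iters):
        x_new = sigma @ x
        x_new /= np.linalg.norm(x_new)
        if np.linalg.norm(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    return x @ sigma @ x, x  # Rayleigh quotient gives the eigenvalue

sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
print(power_method(sigma))
print(np.linalg.eigh(sigma))  # the largest eigenvalue/eigenvector should match
```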
3.
If \(X^c \in \mathbb{R}^{n \times d}\) is a centered matrix and \(\Sigma\) its covariance matrix, which of the following is PCA?
- \(\Sigma = V\Delta V^T\)
- \(X = U\Delta V^T\)
- \(\Sigma = U\Delta V^T\)
- \(X = V\Delta V^T\)
4.
If \(X^c \in \mathbb{R}^{n \times d}\) is a centered matrix and \(\Sigma\) its covariance matrix, which of the following is SVD?
- \(X = U\Delta V^T\)
- \(\Sigma = U\Delta V^T\)
- \(X = V\Delta V^T\)
- \(\Sigma = V\Delta V^T\)
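A numerical sketch of how the SVD of a centered data matrix relates to the eigendecomposition of its covariance matrix (synthetic data, assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
Xc = X - X.mean(axis=0)                            # centered data matrix

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # X^c = U @ diag(s) @ V^T
cov = np.cov(Xc, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)

# The right singular vectors (rows of Vt) are eigenvectors of the covariance
# matrix, and the singular values satisfy lambda = s^2 / (n - 1).
print(np.allclose(np.sort(s**2 / (Xc.shape[0] - 1)), np.sort(eigvals)))  # True
```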
5.
In Singular Value Decomposition, what does the matrix V represent?
- Eigenvectors of covariance of attributes
- Eigenvectors of covariance of data-points
- Matrix of eigenvalues on diagonal
- Deflated matrix after removing first Principal Component
Quiz 7
LDA/PCA
1.
Which one of the following is not LDA?
- \(\max \frac{|m_1 - m_2|}{s_1^2 + s_2^2}\)
- \(\min \frac{s_1^2 + s_2^2}{(m_1 - m_2) \cdot (m_1 - m_2)}\)
- \(\min \frac{|m_1 - m_2|}{s_1^2 + s_2^2}\)
- \(\max \frac{|m_2 - m_1|}{s_1^2 + s_2^2}\)
2.
A dataset lies in d dimensions. Which one of the following is true (Choose best option)?
- PCA and LDA project data to 1 < d' <= d dimensions
- PCA projects data to 1 dimension, LDA projects data to 1 < d' <=d dimensions
- PCA projects data to d' <= d and LDA projects data to 1 dimension
- PCA and LDA project data to 1 dimension
3.
Which of the following is true?
- LDA inputs data only. PCA inputs data and labels
- LDA inputs dataset and label. PCA inputs only dataset
- Both PCA & LDA input dataset only
- Both PCA & LDA input dataset and labels
4.
A dataset lies in d dimensions. Which of the following is true of PCA & LDA?
- Both methods project data to higher dimension
- Both methods project data to lower dimension
- Both maximize variance in \(\mathbb{R}^d\)
- Both minimize variance in \(\mathbb{R}^d\)
5.
Which of the following is a generalized eigenvector problem?
- \(Ax = \lambda x\)
- \(Ax = A^{-1}x\)
- \(Ax = \lambda B x\)
- \(Ax = x\)
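A small sketch of the generalized form \(Ax = \lambda B x\), which is the form that arises in LDA, solved here with scipy.linalg.eigh (the matrices are arbitrary examples; B is symmetric positive definite):

```python
import numpy as np
from scipy.linalg import eigh

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
B = np.array([[1.0, 0.5],
              [0.5, 2.0]])

eigvals, eigvecs = eigh(A, B)  # scipy.linalg.eigh solves A x = lambda B x
for lam, x in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ x, lam * (B @ x)))  # True for every eigenpair
```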