Cluster Analysis Quiz 4

Test your knowledge with this Cluster Analysis Quiz featuring MCQs on k-means, k-medoids, k-means++, and k-median algorithms, along with key concepts like Manhattan distance, cosine similarity, CF tree split, and multi-class classification. Perfect for machine learning enthusiasts and data science learners to assess their understanding of unsupervised clustering techniques. Take the Cluster Analysis Quiz now and sharpen your skills!

Online Unsupervised Machine Learning Cluster Analysis Quiz with Answers

1. When will a leaf entry in the CF tree split?

 
 

2. Considering the k-means algorithm, after the current iteration, we have three centroids (0, 1), (2, 1), and (-1, 2). Will points (0.5, 0.5) and (-0.5, 0) be assigned to the same cluster in the next iteration?

 
 

3. Considering the k-median algorithm, if points (-1, 3), (-3, 1), and (-2, -1) are the only points that are assigned to the first cluster now, what is the new centroid for this cluster?

 
 
 
 

4. The k-means++ algorithm is designed for better initialization for k-means, which will take the farthest point from the currently selected centroids. Suppose $k = 2$, and we have selected the first centroid as (0, 0). Among the following points (these are all the remaining points), which one should we take for the second centroid?

 
 
 
 

5. Which of the following statements is true?

 
 

6. Which of the following statements about k-medoids, k-median, and k-modes algorithms is correct?

 
 
 
 

7. Is it possible that the SSE strictly increases after we recompute new centers in the k-means algorithm? Why?

 
 

8. What are some common considerations and requirements for cluster analysis?

 
 
 
 

9. Given three vectors $A$, $B$, and $C$, suppose the cosine similarity between $A$ and $B$ is $\cos(A, B) = 1.0$, and the cosine similarity between $A$ and $C$ is $\cos(A, C) = -1.0$. Can we determine the cosine similarity between $B$ and $C$?

 
 

10. Considering the k-means algorithm, if points (0, 3), (2, 1), and (-2, 2) are the only points that are assigned to the first cluster now, what is the new centroid for this cluster?

 
 
 
 

11. Which of the following statements, if any, is FALSE?

 
 
 
 

12. Which of the following statements is true?

 
 
 
 

13. Suppose $X$ is a random variable with $P(X = -1) = 0.5$ and $P(X = 1) = 0.5$. In addition, we have another random variable $Y=X*X$. What is the covariance between $X$ and $Y$?

 
 
 

14. If you need to choose between clustering and supervised learning, for which of the following applications would you choose clustering over supervised learning?

 
 
 
 

15. Which of the following statements is true?

 
 
 
 

16. Given the two-dimensional points (0, 3) and (4, 0), what is the Manhattan distance between those two points?

 
 
 
 

17. When dealing with multi-class classification problems, which loss function should be used?

 
 
 
 

18. In the k-medoids algorithm, after computing the new center for each cluster, is the center always guaranteed to be one of the data points in that cluster?

 
 

19. Is K-means guaranteed to find K clusters that lead to the global minimum of the SSE?

 
 

20. For k-means, will different initializations always lead to different clustering results?

 
 

Online Cluster Analysis Quiz with Answers

  • Is K-means guaranteed to find K clusters that lead to the global minimum of the SSE?
  • When dealing with multi-class classification problems, which loss function should be used?
  • Is it possible that the SSE strictly increases after we recompute new centers in the k-means algorithm? Why?
  • For k-means, will different initializations always lead to different clustering results?
  • In the k-medoids algorithm, after computing the new center for each cluster, is the center always guaranteed to be one of the data points in that cluster?
  • Which of the following statements is true?
  • What are some common considerations and requirements for cluster analysis?
  • Which of the following statements is true?
  • If you need to choose between clustering and supervised learning, for which of the following applications would you choose clustering over supervised learning?
  • Which of the following statements is true?
  • Given the two-dimensional points (0, 3) and (4, 0), what is the Manhattan distance between those two points?
  • Given three vectors $A$, $B$, and $C$, suppose the cosine similarity between $A$ and $B$ is $\cos(A, B) = 1.0$, and the cosine similarity between $A$ and $C$ is $\cos(A, C) = -1.0$. Can we determine the cosine similarity between $B$ and $C$?
  • Suppose $X$ is a random variable with $P(X = -1) = 0.5$ and $P(X = 1) = 0.5$. In addition, we have another random variable $Y=X*X$. What is the covariance between $X$ and $Y$?
  • Considering the k-means algorithm, after the current iteration, we have three centroids (0, 1), (2, 1), and (-1, 2). Will points (0.5, 0.5) and (-0.5, 0) be assigned to the same cluster in the next iteration?
  • Considering the k-means algorithm, if points (0, 3), (2, 1), and (-2, 2) are the only points that are assigned to the first cluster now, what is the new centroid for this cluster?
  • The k-means++ algorithm is designed for better initialization for k-means, which will take the farthest point from the currently selected centroids. Suppose $k = 2$, and we have selected the first centroid as (0, 0). Among the following points (these are all the remaining points), which one should we take for the second centroid?
  • Considering the k-median algorithm, if points (-1, 3), (-3, 1), and (-2, -1) are the only points that are assigned to the first cluster now, what is the new centroid for this cluster?
  • Which of the following statements about k-medoids, k-median, and k-modes algorithms is correct?
  • Which of the following statements, if any, is FALSE?
  • When will a leaf entry in the CF tree split?
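
Several of the questions above reduce to short calculations. The following is a minimal sketch in plain Python, not a reference implementation: the function names are hypothetical, the k-median update is assumed to be the coordinate-wise median, and the k-means assignment is assumed to use squared Euclidean distance. The vectors used for the cosine example are made up to satisfy the conditions stated in the question.

```python
import math
from statistics import mean, median


def manhattan(p, q):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def mean_centroid(points):
    """k-means update: coordinate-wise mean of the assigned points."""
    return tuple(mean(coords) for coords in zip(*points))


def median_centroid(points):
    """k-median update (assumed): coordinate-wise median of the assigned points."""
    return tuple(median(coords) for coords in zip(*points))


def nearest_centroid(point, centroids):
    """k-means assignment: index of the centroid at the smallest squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(point, c))
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i]))


def cosine_similarity(u, v):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def covariance(outcomes):
    """Cov(X, Y) = E[XY] - E[X]E[Y] for a list of (x, y, probability) triples."""
    ex = sum(p * x for x, y, p in outcomes)
    ey = sum(p * y for x, y, p in outcomes)
    exy = sum(p * x * y for x, y, p in outcomes)
    return exy - ex * ey


print(manhattan((0, 3), (4, 0)))                      # 7
print(mean_centroid([(0, 3), (2, 1), (-2, 2)]))       # (0, 2)
print(median_centroid([(-1, 3), (-3, 1), (-2, -1)]))  # (-2, 1)

centroids = [(0, 1), (2, 1), (-1, 2)]
print(nearest_centroid((0.5, 0.5), centroids))        # 0
print(nearest_centroid((-0.5, 0), centroids))         # 0

# Made-up vectors: B points the same way as A = (1, 2), C points the opposite way.
B, C = (2, 4), (-3, -6)
print(round(cosine_similarity(B, C), 6))              # -1.0

# X is -1 or 1 with probability 0.5 each, and Y = X * X is always 1.
print(covariance([(-1, 1, 0.5), (1, 1, 0.5)]))        # 0.0
```

Both points in the assignment example are closest to the first centroid (index 0), so under these assumptions they fall in the same cluster.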

Machine Learning Interview Questions

Prepare for your next ML interview with these essential machine learning interview questions! Learn key concepts like training vs. test sets, popular algorithms (Linear Regression, SVM, Random Forest), classifiers, and model selection. Understand why data splitting matters and see real-world examples. Perfect for aspiring data scientists and ML engineers—boost your knowledge and ace your interview.

Mastering machine learning interview questions is crucial for landing top AI/ML roles. These questions test your fundamental understanding of key concepts such as algorithms, model evaluation, and real-world problem-solving. By preparing for targeted ML interview questions, candidates demonstrate technical expertise, analytical thinking, and the ability to apply theory to practical scenarios, which is exactly what hiring managers seek in data science and machine learning roles.

What is machine learning?

Machine learning is a branch of computer science concerned with programming systems so that they automatically learn and improve with experience. For example, robots can be programmed to perform tasks based on the data they gather from sensors, so that their behaviour is learned from data rather than written out by hand.

In other words, Machine Learning (ML) is a subset of artificial intelligence (AI) that enables systems to learn from data and improve their performance without explicit programming. Instead of following fixed rules, ML algorithms identify patterns, make predictions, or take actions based on training data.

What are the key points of machine learning?

The key points of machine learning are:

  • Learns from Data: Improves accuracy over time with more input.
  • Automates Decisions: Used in recommendations, fraud detection, speech recognition, etc.
  • Types: Supervised (labeled data), Unsupervised (no labels), Reinforcement (trial & error).

What are “Training Set” and “Test Set”?

In machine learning, the training set and test set are defined as follows:

  • Training Set: The portion of data used to train a machine learning model. The model learns patterns from this data.
  • Test Set: A separate portion of data used to evaluate the model’s performance after training. It checks how well the model generalizes to unseen data.

In machine learning and related areas of information science, the dataset used to discover a potentially predictive relationship is known as the training set. The training set consists of the examples given to the learner, while the test set consists of examples held back from the learner and is used to test the accuracy of the hypotheses the learner generates. The training set and the test set are kept distinct.

For example, suppose you have 1,000 data points; you might use 800 for training and 200 for testing.
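
The 800/200 split described above can be sketched with the standard library alone. This is a minimal illustration, not a prescribed method: `split_train_test` is a hypothetical helper, and the 20% test fraction and fixed seed are assumptions chosen to match the example.

```python
import random


def split_train_test(data, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and return (train, test) lists."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(1000))  # stand-in for 1,000 data points
train, test = split_train_test(data)
print(len(train), len(test))  # 800 200
```

Shuffling before splitting avoids putting, for example, only the most recent records in the test set, and the two lists share no elements, so the test set stays unseen during training.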

Why split data in machine learning?

Data is split into a training set and a test set because the split:

  • Helps detect overfitting (memorizing training data instead of learning useful patterns).
  • Measures how the model is likely to perform on real-world data before deployment.

Some popular machine learning algorithms are:

  • Linear Regression: Used for predicting continuous values; fits a straight line to the data.
  • Logistic Regression: Used for binary classification (such as spam detection); predicts probabilities between 0 and 1.
  • Decision Trees: Used for classification and regression (such as loan approval); split data into branches based on feature values.
  • Random Forest: An ensemble method (multiple decision trees combined) that reduces overfitting and improves accuracy.
  • Support Vector Machine: Effective for classification tasks (such as image recognition); finds the best boundary (hyperplane) between classes.
  • Neural Networks: Used in deep learning to model complex patterns.
  • K-Nearest Neighbour (KNN): A simple, instance-based learning method.
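
To make "fits a straight line to the data" concrete, here is a minimal sketch of simple linear regression using the ordinary least-squares formulas for one feature. The function name `fit_line` and the sample data are hypothetical; real projects would normally use a library rather than this hand-written version.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```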

What is a classifier in machine learning?

A classifier in machine learning is an algorithm that assigns a label or category to input data based on its features. It is used in supervised learning where the model is trained on labeled data to predict discrete outcomes (classes).

What are the key points of a classifier in machine learning?

The key points are:

  • Purpose: Categorizes data (e.g., spam vs. not spam, cat vs. dog).
  • Examples of Classifiers:
    • Logistic Regression
    • Decision Trees
    • Random Forest
    • Support Vector Machines (SVM)
    • Neural Networks
  • Works by: Learning patterns from labeled training data, then predicting labels for new, unseen data.

Give an example that explains the concept of a classifier in machine learning

An email classifier predicts whether an incoming email is “spam” or “not spam.”
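
As a toy illustration of such a classifier, the sketch below uses k-nearest neighbours on two made-up features per email (number of links, number of exclamation marks). The features, the data, and the helper `knn_predict` are all hypothetical; a real spam filter would use far richer features and a tested library.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training examples.

    train is a list of (features, label) pairs; distance is squared Euclidean.
    """
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled training data: (links, exclamation marks) -> label
emails = [
    ((0, 0), "not spam"),
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
    ((5, 4), "spam"),
    ((6, 3), "spam"),
    ((4, 5), "spam"),
]

print(knn_predict(emails, (5, 5)))  # spam
print(knn_predict(emails, (1, 1)))  # not spam
```

The classifier never sees explicit rules; it labels a new email by looking at which labeled examples it most resembles.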

What is Model Selection in Machine Learning?

Model selection is the process of choosing among different mathematical models that are used to describe the same data set. It is applied in statistics, machine learning, and data mining.

Model selection is the process of choosing the best-performing algorithm (or model) for a given dataset and problem. It involves comparing different models, tuning their parameters, and selecting the one that generalizes well to unseen data.

The key aspects of model selection in machine learning are:

  • Performance Comparison – Evaluating models using metrics (e.g., accuracy, precision, F1-score).
  • Cross-Validation – Testing models on different subsets of data to ensure reliability.
  • Bias-Variance Tradeoff – Balancing underfitting (too simple) vs. overfitting (too complex).
  • Hyperparameter Tuning – Optimizing model settings for better performance.

For example, choosing between a Random Forest and an SVM for a classification task based on cross-validation scores.
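
The same idea can be sketched without any libraries. In the example below, two deliberately simple stand-in models (a majority-class baseline and a nearest-class-mean classifier, used here in place of a Random Forest and an SVM) are compared by k-fold cross-validation on made-up one-dimensional data. All names and data are hypothetical.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n)."""
    fold_size = n // k
    for i in range(k):
        start = i * fold_size
        stop = n if i == k - 1 else start + fold_size
        test_idx = list(range(start, stop))
        train_idx = [j for j in range(n) if j < start or j >= stop]
        yield train_idx, test_idx


def majority_model(train):
    """Baseline: always predict the most frequent training label."""
    labels = [y for _, y in train]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority


def class_mean_model(train):
    """Predict the label whose mean feature value is closest to x."""
    means = {}
    for label in sorted(set(y for _, y in train)):
        values = [x for x, y in train if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def cross_val_accuracy(make_model, data, k=5):
    """Mean accuracy over k folds; the model is refit on each training split."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        model = make_model([data[i] for i in train_idx])
        correct = sum(model(data[i][0]) == data[i][1] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Made-up data: label is 1 when x >= 5, ordered so every fold holds both classes
data = [(x, int(x >= 5)) for x in [0, 9, 1, 8, 2, 7, 3, 6, 4, 5]]

scores = {
    "majority": cross_val_accuracy(majority_model, data),
    "class_mean": cross_val_accuracy(class_mean_model, data),
}
best = max(scores, key=scores.get)
print(scores)  # {'majority': 0.5, 'class_mean': 1.0}
print(best)    # class_mean
```

Each candidate is scored only on data it was not trained on, and the one with the higher mean cross-validation accuracy is selected.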

Neural Networks MCQs 3

Challenge yourself with these Neural Networks MCQs covering key concepts like activation functions (ReLU, Tanh), optimizers (Adam), loss functions, GANs, vanishing gradients, and more! Perfect for ML beginners and AI enthusiasts. Evaluate your understanding and boost your neural networks expertise today! Let us start with the Neural Networks MCQs now.

Online Neural Networks MCQs with Answers

  • What are the primary functions of an artificial neuron in a neural network?
  • What does an optimizer do in the context of training a neural network?
  • Which activation function is most likely to suffer from the vanishing gradient problem?
  • Select the characteristics of the ReLU activation function.
  • Which activation function is defined by the equation $f(x) = \frac{1}{1+e^{-x}}$?
  • What is the primary purpose of a loss function in training a neural network?
  • Select all the scenarios where Mean Squared Error (MSE) would be a more suitable loss function than Binary Cross Entropy.
  • Select all characteristics that apply to the Tanh activation function.
  • What is the main advantage of using RMSprop over standard SGD?
  • Which of the following statements accurately describe the Adam optimizer?
  • What is a key characteristic of Generative Adversarial Networks (GANs)?
  • Which neural network architecture is most suitable for tasks involving sequential data, such as text or speech?
  • What function is commonly used as the loss function in a regression model with Keras?
  • Select the optimizers that use momentum to accelerate gradient vectors in the relevant direction.
  • In the context of neural networks, what is the primary role of an optimizer?
  • Which of the following neural network types are designed to handle long-term dependencies in sequential data?
  • What are some common metrics used to evaluate a regression model in Keras?
  • Which type of neural network is best suited for image recognition tasks?
  • Which of the following steps are involved in creating a regression model using a multilayer perceptron neural network?
  • Which of the following are characteristics of an effective loss function in neural network training?
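
Several of the questions above concern activation functions and their gradients. The sketch below, written in plain Python with hypothetical function names, evaluates the sigmoid $f(x) = \frac{1}{1+e^{-x}}$, Tanh, and ReLU together with their derivatives, to show how the gradients of the saturating functions shrink for large inputs while the ReLU gradient does not.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^(-x)), output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)), at most 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, at most 1."""
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    """ReLU: max(0, x)."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise (0 chosen at x = 0)."""
    return 1.0 if x > 0 else 0.0


# Gradients of sigmoid and tanh shrink toward 0 as x grows; ReLU's stays at 1.
for x in (0.0, 5.0, 10.0):
    print(x, sigmoid_grad(x), tanh_grad(x), relu_grad(x))
```

When many layers with small derivatives are chained, the product of those derivatives becomes very small, which is the vanishing gradient problem that the questions refer to.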
