Below is a full set of 100 multiple-choice questions covering the basics of machine learning, including detailed answers and explanations for each one. This set is ideal for learners, educators, and anyone seeking to test or reinforce their foundational knowledge in the field.
A Clustering
B Predicting outcomes using labeled data
C Sampling data points
D Reducing database size
K-means clustering groups unlabeled data based on similarity.
A Assigning data to discrete categories
B Estimating a continuous value
C Grouping unlabeled data
D Reducing features
Answer: A
Classification assigns each data point to a specific class.
100 Machine Learning MCQs with Detailed Answer Explanations
Questions 1-100 with Answers and Explanations
1 What is the main objective of supervised learning?
2 Which of the following is an example of unsupervised learning?
3 Which type of problem is classification?
A Model is too simple
B Model is too flexible and memorizes the data
C Model cannot be trained
5 Which method is mainly used for dimensionality reduction?
6 What do we call data that the model has never seen during training?
7 Which of the following is a supervised algorithm?
8 What does the activation function do in a neural network?
Loss measures the error between predicted outcomes and actual results.
A Mean Squared Error
9 Which is NOT a step in data preprocessing?
10 Which term describes the difference between predicted and actual values?
11 Which evaluation metric is best for classification?
12 What is a hyperparameter?
13 What does K in K-Nearest Neighbors represent?
Preprocessing is not a machine learning type but a data preparation method.
A Using all features
B Increasing model depth
D Decision tree regression
14 Which learning type involves a reward system?
15 What is a confusion matrix used for?
16 Which is NOT a type of machine learning?
17 Which technique helps avoid overfitting?
18 Which of these is NOT a regression technique?
B Reduces variance by combining many trees
C Needs little data
SVM can use kernel tricks to operate in high-dimensional feature spaces.
A Convolutional Neural Network
B Recurrent Neural Network
D Rectified Output Curve
19 What is the output of a clustering algorithm?
20 What is the main advantage of random forests?
21 Which machine learning algorithm uses a “kernel trick”?
22 What kind of neural network is mostly used for sequential data?
23 What does ROC stand for in model evaluation?
Dimensionality reduction decreases the number of input variables.
A Increase dataset size
B Detect outliers
C Evaluate model generalization
D Optimize model parameters
Answer: C
Splitting data helps assess model performance on unseen data.
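The idea behind the train-test split can be sketched in a few lines of plain Python. This is only an illustrative sketch: the helper name `train_test_split`, the 20% test fraction, and the fixed seed are assumptions made for this example, not part of the question set.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled rows are held out; the rest are used for training.
    return shuffled[n_test:], shuffled[:n_test]

train, test = train_test_split(range(10), test_fraction=0.2)
print(len(train), len(test))
```

The model is fitted only on `train`; `test` stays unseen until evaluation, which is what makes the resulting score an estimate of generalization.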
A Mean Absolute Error
B F1 Score
C Mean Squared Error
D R2 Score
Answer: B
F1 Score is used for classification, not regression.
A Makes the model faster
B Prevents underfitting
C Provides robust generalization estimation
D Increases data size
24 Which step reduces features to a lower-dimensional form?
25 What is the purpose of the train-test split?
26 Which measurement is NOT commonly used for regression evaluation?
27 What is cross-validation's main benefit?
28 Which method is used to handle missing values?
Decision trees can model non-linear relationships in data.
A Ending the project early
B Stopping training before overfitting
C Pausing during training
D Training one layer at a time
BeautifulSoup is used for web scraping, not ML.
A Data cleaning only
B Creating new input variables
C Data evaluation
D Hyperparameter tuning
29 Ensemble learning combines:
30 Which algorithm works best for non-linear boundaries?
31 What is early stopping?
32 Which library is NOT typically used for ML in Python?
33 What is feature engineering?
KNN stores training data and processes queries at prediction time (“lazy learner”).
A A basic neural unit
B One backward pass
C One full pass over the training data
D A day of training
34 Which one is a feature scaling method?
35 What is “bagging” short for?
36 Which algorithm is lazy in its training phase?
37 A perceptron is:
38 What is an epoch in deep learning?
39 Which metric is suitable for imbalanced datasets?
40 What is the main goal of clustering?
41 Which ML task is best for time series forecasting?
42 Which optimizes neural network training?
43 Which of the following is a loss function for classification?
Overfitting reduces model generalization to new data.
A Proportion of true positives among all positive predictions
B Proportion of true positives among all actual positives
C Proportion of true positives among all negatives
D Proportion of false positives among all predictions
Answer: A
Precision focuses on the accuracy of positive predictions.
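As a worked example of that definition, precision can be computed directly from true and predicted labels. The function name, the toy labels, and the choice to return 0.0 when there are no positive predictions are assumptions made for this sketch.

```python
def precision(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP): the share of positive predictions that are correct."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    if tp + fp == 0:
        return 0.0  # convention assumed here: no positive predictions gives 0.0
    return tp / (tp + fp)

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
score = precision(y_true, y_pred)  # 2 true positives, 1 false positive
print(score)
```

Note that the missed positive at index 3 does not affect precision; it would lower recall instead.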
A Creates new labels
B Converts categorical values to numbers
C Removes null values
44 Which is NOT a clustering algorithm?
45 Which method does NOT improve model generalization?
46 What is precision?
47 What does “label encoding” do?
48 A ROC curve closer to the top left signifies:
49 Which method reduces the number of features but keeps variance?
50 Which step is NOT part of machine learning workflow?
51 Which machine learning library supports deep learning?
52 Which algorithm is best for text classification?
53 Which model parameter is adjusted during training?
Decision trees are prone to overfitting small or noisy datasets.
A Categorical features into multiple binary features
Decision trees use a greedy approach at split points.
A Model memorizes data
B Model can't capture data structure
C Model is too flexible
D Too many features
54 Which algorithm is likely to overfit small datasets?
55 A one-hot encoding expands:
56 Which is a greedy algorithm?
57 What does “underfitting” mean?
58 Which type of learning requires expert labeling?
Isolation Forest is designed for detecting anomalies or outliers in data.
A Remove irrelevant features
B Add more features
59 What is a “decision boundary”?
60 Which method is used for anomaly detection?
61 What does feature selection do?
62 Which is NOT an ensemble method?
63 What splits data into k subsets, trains and validates k times?
Regression predicts a continuous numeric value.
A Choosing between too simple or too complex models
B Encoding categorical values
C Imputing missing data
D Measuring classification accuracy
The median is less sensitive to outliers compared to the mean.
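A small numeric illustration of this point, using Python's standard `statistics` module. The sample values and the outlier of 1000 are made up for the example.

```python
from statistics import mean, median

values = [10, 11, 12, 13, 14]
with_outlier = values + [1000]  # one extreme value added

# How far each summary statistic moves when the outlier is added.
mean_shift = mean(with_outlier) - mean(values)
median_shift = median(with_outlier) - median(values)

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
```

The single outlier drags the mean far from the bulk of the data, while the median moves only half a unit.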
A Combines weak learners sequentially
B Uses all data at once
C Trains models independently
D Uses only regression trees
64 Which is an output of regression?
65 What is the bias-variance tradeoff?
66 Which is NOT an activation function?
67 Which one improves robustness to outliers?
68 What's a key feature of “boosting”?
Feature importance measures the influence of each input variable on predictions.
A Adds more neurons
B Randomly disables neurons during training
C Removes irrelevant features
D Adds noise to input
Answer: B
Dropout randomly turns off neurons to reduce overfitting.
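A minimal sketch of what "randomly turning off neurons" can look like on a list of activations. It uses the common "inverted dropout" variant, which rescales the surviving activations so their expected value is unchanged; that rescaling, the function name, and the 0.5 rate are assumptions of this example rather than something stated in the question.

```python
import random

def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero each activation with probability `rate` during
    training and scale the survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time every neuron stays active.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

out = dropout([1.0] * 8, rate=0.5, rng=random.Random(0))
print(out)
```

Because a different random subset is dropped on every training step, the network cannot rely on any single neuron, which is how dropout reduces overfitting.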
A Number of data points
B Number of features
C Number of data splits/folds
D Number of hyperparameters
69 Which is a benefit of feature scaling?
70 Which technique handles class imbalance?
71 Which evaluates the importance of each feature?
72 Which best describes dropout in neural networks?
73 What characterizes the k in k-fold cross-validation?
Answer: C
'k' is the number of train/validation splits.
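The role of k can be made concrete with a small index generator. This is a sketch under simplifying assumptions: the function name is hypothetical, folds are contiguous, and no shuffling is done, which real workflows usually add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    print(len(train_idx), val_idx)
```

Each sample lands in the validation set exactly once, and the model is trained and validated k times in total.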
A Makes synthetic data
74 What is a “generator” in machine learning?
75 Which regularization adds all absolute values of weights?
76 Which ML algorithm is based on probability?
77 What is data augmentation?
A Trains one model for each class
B Trains one model for all classes
C Trains only on minority class
D Trains two models for each class
Batch size defines how many samples are used for each parameter update.
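To illustrate, the sketch below cuts a dataset into batches; each yielded batch would correspond to one parameter update. The helper name `iter_batches` and the sizes used are assumptions for the example.

```python
def iter_batches(samples, batch_size):
    """Yield consecutive slices of `samples`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

batches = list(iter_batches(list(range(10)), batch_size=4))
print(batches)
```

With 10 samples and a batch size of 4, one epoch consists of three updates, the last one on a smaller batch of 2.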
A Training from scratch
B Reusing pretrained model knowledge
Word embeddings represent text in numerical vector form.
A Batch gradient descent
B Stochastic gradient descent
C Cross-validation
D Overfitting
Answer: B
SGD updates the parameters after each individual sample, which speeds up training on large datasets.
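A minimal sketch of per-sample updates, fitting a straight line with squared-error loss. The function name, learning rate, epoch count, and the noise-free toy data are all assumptions chosen to keep the example small.

```python
import random

def sgd_linear_fit(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w * x + b by updating (w, b) after every single sample."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b, for one sample.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]  # generated from y = 2x + 1
w, b = sgd_linear_fit(xs, ys)
print(round(w, 3), round(b, 3))
```

Batch gradient descent would compute the gradient over all samples before making a single update; here the parameters move after every sample, so progress starts immediately even when the dataset is large.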
78 What does “one-vs-rest” strategy do?
79 What is batch size in deep learning?
80 What is Transfer Learning?
81 What's the main use of “word embeddings”?
82 Which increases training speed with large data?
A Size of training set
B Step size in parameter updates
Increasing complexity can cause overfitting; the other options help reduce it.
A Data used for model selection
83 What is the “learning rate”?
84 Which technique is NOT used to reduce overfitting?
85 What is a “blind test set”?
86 Which algorithm is NOT tree-based?
87 What is a “confounding variable”?
Answer: B
A confounder affects both independent and dependent variables, biasing model results.
A More than two classes
B Unequal class distribution
Autoencoders are neural networks for learning compressed representations.
A Sequence of processing steps
B Model weights
C Activation function
D Random variable
88 What does “imbalanced data” mean?
89 What is “feature extraction”?
90 Which Python package is primarily for numerical computing?
91 What is an Autoencoder?
92 What is a “pipeline” in ML?
Silhouette score measures how well data points fit within their clusters.
A Not enough data
B Too many features degrade performance
C Low accuracy
D Underfitting
Answer: B
As feature count increases, data sparsity can negatively impact models.
A Recurrent Neural Network
B Convolutional Neural Network
B Reduces model accuracy
C Reduces computation time
D Adds noise
93 Which is a method for text vectorization?
94 Which metric evaluates clustering quality?
95 What does the “curse of dimensionality” refer to?
96 Which network works best for images?
97 Which is a positive effect of feature selection?
Answer: C
Fewer features mean faster computation and simpler models.
A Final model evaluation
B Tuning model parameters
C Imputing missing values
Larger networks need more data and regularization to avoid overfitting.
You now have a complete 100-question machine learning MCQ set, covering core ideas and providing explanations for each answer.
98 What is the “validation set” for?
99 Which concept refers to applying machine learning to itself?
100 Which is a drawback of large neural networks?