Python Basics for Data Science MCQ Questions and Answers Part – 2


Section 6: Advanced Machine Learning (51-70)

  51. What is the purpose of train_test_split() in Scikit-learn?
  • a) Splitting DataFrames
  • b) Splitting a dataset into training and testing sets
  • c) Filtering missing values
  • d) Performing PCA
    Answer: b) Splitting a dataset into training and testing sets
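
To make the idea behind the answer concrete, here is a minimal pure-Python sketch of a shuffled train/test split. The helper name `simple_train_test_split` and its parameters are hypothetical and only illustrate the concept; Scikit-learn's own `train_test_split()` handles arrays, DataFrames, and stratification, which this sketch does not.

```python
import random


def simple_train_test_split(rows, test_size=0.25, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists.

    Illustrative helper only; not Scikit-learn's implementation.
    """
    shuffled = list(rows)                   # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for repeatability
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


data = list(range(20))
train, test = simple_train_test_split(data, test_size=0.25, seed=42)
print(len(train), len(test))
```

The key property is that the two parts are disjoint and together cover the whole dataset, so the model is evaluated on rows it never saw during training.
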
  52. What is the main advantage of using Principal Component Analysis (PCA)?
  • a) Increases data dimensionality
  • b) Reduces dimensionality while preserving most of the variance
  • c) Improves feature importance
  • d) Converts numerical data to categorical
    Answer: b) Reduces dimensionality while preserving most of the variance
  53. Which Scikit-learn class is used to one-hot encode categorical feature columns?
  • a) StandardScaler()
  • b) OneHotEncoder()
  • c) MinMaxScaler()
  • d) LabelBinarizer()
    Answer: b) OneHotEncoder()
  54. What is the purpose of cross_val_score() in Scikit-learn?
  • a) Finds the best hyperparameters
  • b) Performs cross-validation
  • c) Trains a deep learning model
  • d) Computes classification accuracy
    Answer: b) Performs cross-validation
  55. Which ensemble method combines weak learners to form a strong learner?
  • a) Boosting
  • b) PCA
  • c) K-Means
  • d) Cross-validation
    Answer: a) Boosting
  56. What is an imbalanced dataset?
  • a) A dataset with missing values
  • b) A dataset where one class has significantly more samples than another
  • c) A dataset without categorical variables
  • d) A dataset with equal class distribution
    Answer: b) A dataset where one class has significantly more samples than another
  57. Which method is used to handle imbalanced datasets?
  • a) Feature scaling
  • b) SMOTE (Synthetic Minority Over-sampling Technique)
  • c) PCA
  • d) Cross-validation
    Answer: b) SMOTE (Synthetic Minority Over-sampling Technique)
  58. Which boosting algorithm is widely used for structured data?
  • a) KNN
  • b) XGBoost
  • c) Logistic Regression
  • d) Linear Regression
    Answer: b) XGBoost
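  59. What does hyperparameter tuning do?
  • a) Reduces dataset size
  • b) Searches for the hyperparameter settings that give the best model performance
  • c) Increases test accuracy
  • d) Improves data quality
    Answer: b) Searches for the hyperparameter settings that give the best model performance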
  60. Which function is used to find the best hyperparameters in Scikit-learn?
  • a) cross_val_score()
  • b) GridSearchCV()
  • c) train_test_split()
  • d) StandardScaler()
    Answer: b) GridSearchCV()
  61. What is feature engineering?
  • a) A machine learning algorithm
  • b) Creating new features from raw data
  • c) Training a deep learning model
  • d) Optimizing hyperparameters
    Answer: b) Creating new features from raw data
  62. Which technique helps reduce overfitting in decision trees?
  • a) Increasing depth
  • b) Pruning
  • c) Adding more features
  • d) Reducing dataset size
    Answer: b) Pruning
  63. What is bagging in ensemble learning?
  • a) Reducing dimensionality
  • b) Training multiple models on random bootstrap samples of the data and aggregating their predictions
  • c) Applying multiple activation functions
  • d) Combining similar features
    Answer: b) Training multiple models on random bootstrap samples of the data and aggregating their predictions
  64. What is the primary goal of k-means clustering?
  • a) Group similar data points together
  • b) Perform feature scaling
  • c) Increase accuracy of classification models
  • d) Reduce data redundancy
    Answer: a) Group similar data points together
  65. What metric is commonly used to evaluate clustering performance?
  • a) R-squared
  • b) Silhouette Score
  • c) F1-score
  • d) Accuracy
    Answer: b) Silhouette Score
  66. What is the primary function of a confusion matrix?
  • a) Evaluate classification model performance
  • b) Improve model accuracy
  • c) Transform data features
  • d) Optimize neural networks
    Answer: a) Evaluate classification model performance
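
As a rough illustration of what a confusion matrix records, the sketch below counts the four outcomes for a binary classifier in plain Python. The helper `confusion_counts` and the sample labels are made up for this example; in practice `sklearn.metrics.confusion_matrix` would normally be used.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # predicted positive, actually positive
        elif pred == positive:
            fp += 1          # predicted positive, actually negative
        elif truth == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Hypothetical labels, chosen only for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts, accuracy, precision, recall)
```

Metrics such as accuracy, precision, and recall are all derived from these four counts, which is why the matrix is the usual starting point for evaluating a classifier.
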
  67. Which metric is used to evaluate a regression model?
  • a) Accuracy
  • b) Mean Squared Error (MSE)
  • c) F1-score
  • d) Confusion Matrix
    Answer: b) Mean Squared Error (MSE)
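
For reference, MSE is the average of the squared differences between actual and predicted values. The short sketch below computes it by hand; the function name `mse_sketch` and the numbers are illustrative assumptions, and Scikit-learn provides `sklearn.metrics.mean_squared_error` for real use.

```python
def mse_sketch(y_true, y_pred):
    """Mean of squared errors between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical values, chosen only for illustration.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

mse = mse_sketch(actual, predicted)
print(mse)  # squared errors are 1, 0, 4, 1, so the mean is 1.5
```
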
  68. What is the key advantage of using a Random Forest over a single Decision Tree?
  • a) Requires less data
  • b) Reduces overfitting
  • c) Runs faster
  • d) Requires no hyperparameter tuning
    Answer: b) Reduces overfitting
  69. What is the main function of a neural network activation function?
  • a) Introduce non-linearity
  • b) Reduce loss
  • c) Normalize input values
  • d) Increase model size
    Answer: a) Introduce non-linearity
  70. What is dropout used for in deep learning?
  • a) Increase training time
  • b) Improve feature selection
  • c) Prevent overfitting
  • d) Reduce activation function complexity
    Answer: c) Prevent overfitting

Section 7: Deep Learning (71-85)

  71. What is the fundamental unit of a neural network?
  • a) Batch size
  • b) Neuron
  • c) Feature selection
  • d) Gradient descent
    Answer: b) Neuron
  72. What type of activation function is commonly used in hidden layers?
  • a) Sigmoid
  • b) Softmax
  • c) ReLU
  • d) Linear
    Answer: c) ReLU
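
As a small sketch of the activation functions named in the options, here are ReLU and sigmoid written in plain Python. The input values are arbitrary and chosen only for illustration; deep learning frameworks provide their own vectorized versions.

```python
import math


def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


inputs = [-2.0, -0.5, 0.0, 1.5]
relu_out = [relu(x) for x in inputs]
sigmoid_out = [sigmoid(x) for x in inputs]
print(relu_out)
print(sigmoid_out)
```

ReLU is cheap to compute and its gradient does not shrink for positive inputs, which is one commonly cited reason it is preferred in hidden layers over sigmoid.
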
  73. What is the main function of an optimizer in deep learning?
  • a) Improve model visualization
  • b) Minimize the loss function
  • c) Increase learning rate
  • d) Reduce dataset size
    Answer: b) Minimize the loss function
  74. What is the role of a convolutional layer in a CNN?
  • a) Process textual data
  • b) Detect spatial patterns in images
  • c) Normalize feature values
  • d) Remove overfitting
    Answer: b) Detect spatial patterns in images
  75. What is the primary difference between CNN and RNN?
  • a) CNN is for text, RNN is for images
  • b) CNN is for images, RNN is for sequential data
  • c) CNN is supervised, RNN is unsupervised
  • d) CNN uses more parameters
    Answer: b) CNN is for images, RNN is for sequential data
  76. Which of the following is a well-known challenge in training very deep neural networks?
  • a) Lack of training data
  • b) Vanishing gradient problem
  • c) High training speed
  • d) Small dataset sizes
    Answer: b) Vanishing gradient problem
  77. What is the primary function of dropout in neural networks?
  • a) Improve accuracy
  • b) Reduce overfitting
  • c) Increase training speed
  • d) Improve data visualization
    Answer: b) Reduce overfitting
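
The sketch below shows one common formulation, often called inverted dropout, in plain Python: during training each activation is zeroed with probability `rate` and the survivors are scaled up so the expected value stays the same, while at inference time nothing is dropped. The function name `dropout_sketch` and its parameters are assumptions for illustration, not any framework's API.

```python
import random


def dropout_sketch(activations, rate, rng, training=True):
    """Apply inverted dropout to a list of activations.

    rate is the probability of dropping each unit (0 <= rate < 1).
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)      # inference: pass values through unchanged
    keep = 1.0 - rate
    # Keep each unit with probability `keep`, rescaling kept units by 1/keep.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
hidden = [1.0, 2.0, 3.0, 4.0]
train_out = dropout_sketch(hidden, rate=0.5, rng=rng, training=True)
eval_out = dropout_sketch(hidden, rate=0.5, rng=rng, training=False)
print(train_out)
print(eval_out)
```

Because a different random subset of units is silenced on every training step, the network cannot rely too heavily on any single unit, which is the intuition behind its regularizing effect.
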
  78. Which adaptive learning-rate optimizer is commonly used in deep learning?
  • a) SGD
  • b) Adam
  • c) Linear Regression
  • d) XGBoost
    Answer: b) Adam
  79. What is the primary role of a fully connected layer in a neural network?
  • a) Reduce overfitting
  • b) Combine learned features for final classification
  • c) Normalize input data
  • d) Generate new data points
    Answer: b) Combine learned features for final classification
  80. What is transfer learning in deep learning?
  • a) Using different datasets for training
  • b) Training a model from scratch
  • c) Reusing a pre-trained model for a different task
  • d) Reducing training time using feature selection
    Answer: c) Reusing a pre-trained model for a different task
  81. Which deep learning framework is widely used for NLP and vision tasks?
  • a) Scikit-learn
  • b) OpenCV
  • c) TensorFlow
  • d) NumPy
    Answer: c) TensorFlow
  82. What is batch normalization used for in deep learning?
  • a) Improve visualization
  • b) Reduce model complexity
  • c) Stabilize and accelerate training
  • d) Improve data augmentation
    Answer: c) Stabilize and accelerate training
  83. What is an epoch in deep learning?
  • a) A new layer in a neural network
  • b) One complete pass through the entire training dataset
  • c) A type of activation function
  • d) A convolution operation
    Answer: b) One complete pass through the entire training dataset
  84. What is the primary advantage of using an autoencoder?
  • a) Increase model accuracy
  • b) Perform unsupervised feature learning
  • c) Improve dataset size
  • d) Reduce overfitting
    Answer: b) Perform unsupervised feature learning
  85. What is reinforcement learning primarily used for?
  • a) Text processing
  • b) Image classification
  • c) Decision-making in dynamic environments
  • d) Data augmentation
    Answer: c) Decision-making in dynamic environments

Section 8: Natural Language Processing (86-100)

  86. What is Tokenization in NLP?
  • a) Splitting text into words or sentences
  • b) Removing stopwords
  • c) Stemming words
  • d) Applying TF-IDF
    Answer: a) Splitting text into words or sentences
  87. What does TF-IDF stand for?
  • a) Term First-In Document Frequency
  • b) Text Feature In-depth
  • c) Term Frequency – Inverse Document Frequency
  • d) Token Feature Index
    Answer: c) Term Frequency – Inverse Document Frequency
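
The tokenization and TF-IDF ideas from the two questions above can be sketched in a few lines of plain Python. This uses the textbook form, tf = count / document length and idf = log(N / document frequency); the helper names and sample sentences are illustrative assumptions. Library implementations such as Scikit-learn's `TfidfVectorizer` use a smoothed, normalized variant by default, so their numbers will differ.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: score} dict per document (assumes non-empty docs)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical mini-corpus, chosen only for illustration.
docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
print(scores[0])
```

A word like "the" that appears in every document gets a score of zero, while a word unique to one document scores highest, which is the weighting behavior TF-IDF is designed to produce.
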
  88. What is the purpose of Word2Vec in NLP?
  • a) Convert text into images
  • b) Represent words as numerical vectors
  • c) Classify documents
  • d) Remove stopwords
    Answer: b) Represent words as numerical vectors
  89. Which model is commonly used in NLP for text generation?
  • a) CNN
  • b) Transformer
  • c) K-Means
  • d) Decision Tree
    Answer: b) Transformer
  90. What is the difference between stemming and lemmatization?
  • a) No difference
  • b) Stemming strips affixes using heuristic rules; lemmatization returns the dictionary base form (lemma) of a word
  • c) Lemmatization uses AI
  • d) Stemming is better
    Answer: b) Stemming strips affixes using heuristic rules; lemmatization returns the dictionary base form (lemma) of a word
  91. Which deep learning model is widely used in NLP tasks?
  • a) Decision Trees
  • b) BERT (Bidirectional Encoder Representations from Transformers)
  • c) SVM
  • d) Random Forest
    Answer: b) BERT
  92. What is the primary use of Named Entity Recognition (NER)?
  • a) Detecting sentiment
  • b) Identifying names, locations, and entities in text
  • c) Tokenizing sentences
  • d) Predicting next words in a sentence
    Answer: b) Identifying names, locations, and entities in text
  93. Which NLP technique is used to summarize text?
  • a) Named Entity Recognition
  • b) Text Summarization
  • c) Sentiment Analysis
  • d) Topic Modeling
    Answer: b) Text Summarization
  94. What is the main goal of sentiment analysis?
  • a) Classify emails
  • b) Determine the emotional tone of a text
  • c) Translate text
  • d) Convert speech to text
    Answer: b) Determine the emotional tone of a text
  95. Which of the following is a popular NLP dataset?
  • a) CIFAR-10
  • b) IMDB Reviews
  • c) MNIST
  • d) COCO
    Answer: b) IMDB Reviews
  96. What is BLEU score used for in NLP?
  • a) Evaluate machine translation quality
  • b) Measure topic coherence
  • c) Detect entities in text
  • d) Perform text clustering
    Answer: a) Evaluate machine translation quality
  97. Which language model is used in OpenAI’s ChatGPT?
  • a) CNN
  • b) GPT (Generative Pre-trained Transformer)
  • c) RNN
  • d) LSTM
    Answer: b) GPT
  98. What is Zero-shot learning in NLP?
  • a) Making predictions on a task without having been trained on labeled examples of that task
  • b) Removing stopwords
  • c) Sentiment analysis
  • d) Topic modeling
    Answer: a) Making predictions on a task without having been trained on labeled examples of that task
  99. Which of the following Python libraries is designed specifically for NLP?
  • a) NLTK
  • b) OpenCV
  • c) TensorFlow
  • d) PyTorch
    Answer: a) NLTK
  100. What is POS tagging in NLP?
  • a) Identifying parts of speech in text
  • b) Detecting named entities
  • c) Summarizing text
  • d) Translating text
    Answer: a) Identifying parts of speech in text