Data Science, Machine Learning & Artificial Intelligence interview questions and answers for practice
Topic: Machine Learning Concepts
Machine learning is a branch of computer science concerned with programming systems so that they automatically learn and improve with experience. For example, robots can be programmed to perform a task based on the data they gather from sensors. The system learns its program from data automatically.
Machine learning is concerned with the study, design, and development of algorithms that give computers the ability to learn without being explicitly programmed. Data mining, by contrast, is the process of extracting knowledge or previously unknown, interesting patterns from (often unstructured) data. Machine learning algorithms are used during this process.
a) Find clusters of the data
b) Find low-dimensional representations of the data
c) Find interesting directions in data
d) Find interesting coordinates and correlations
e) Find novel observations / clean the database
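Item (a), finding clusters, can be illustrated with a minimal k-means sketch on one-dimensional data. The data points, the initial centroids, and the function name kmeans_1d are all assumptions made for this illustration; real data would normally call for a library implementation.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on 1-D data; `centroids` holds the initial guesses."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up, unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

No labels are supplied at any point; the two groups are recovered from the distances between the points alone.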
In machine learning, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. It is normally observed when a model is excessively complex, that is, when it has too many parameters relative to the amount of training data. A model that has been overfit performs poorly on new data.
Overfitting is possible because the criterion used to train the model is not the same as the criterion used to judge its efficacy: training minimizes error on the training data, whereas efficacy is judged on unseen data.
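The gap between the two criteria can be seen in a small sketch. It compares a model that memorizes the training data (a 1-nearest-neighbor lookup) with a one-parameter threshold rule. The data are invented for the example, two training labels are deliberately noisy, and the cut-off of 5 is assumed rather than learned.

```python
def nearest_neighbor_predict(train, x):
    """Memorizing model: return the label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def threshold_predict(x):
    """Simple model with a single parameter (an assumed cut-off at 5)."""
    return 1 if x >= 5 else 0


def accuracy(predict, data):
    """Fraction of (x, y) pairs for which predict(x) equals y."""
    return sum(predict(x) == y for x, y in data) / len(data)


# Invented data: the underlying rule is y = 1 when x >= 5.
# The labels at x = 3 and x = 7 are noise.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
test = [(2.9, 0), (3.2, 0), (5.5, 1), (6.9, 1), (7.2, 1), (8.8, 1)]


def memorizer(x):
    return nearest_neighbor_predict(train, x)


memorizer_train = accuracy(memorizer, train)
memorizer_test = accuracy(memorizer, test)
simple_train = accuracy(threshold_predict, train)
simple_test = accuracy(threshold_predict, test)

print("memorizer:", memorizer_train, memorizer_test)
print("threshold:", simple_train, simple_test)
```

The memorizer is perfect on the data it was built from because it reproduces the noisy labels, and it does worse than the simpler rule on the held-out points for the same reason.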
Overfitting can be avoided by using a lot of data; it tends to happen when you try to learn from a small dataset. If you only have a small dataset and are forced to build a model from it, you can use a technique known as cross validation. In this method the dataset is split into two sections, a training dataset and a testing dataset: the data points in the training dataset are used to build the model, and the testing dataset is used only to test it.
In this technique, a model is usually given a dataset of known data on which training is run (the training dataset) and a dataset of unknown data against which the model is tested. The idea of cross validation is to define a dataset to “test” the model during the training phase.
Inductive machine learning is the process of learning by example, in which a system tries to induce a general rule from a set of observed instances.
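The text describes a single split. A common variant, k-fold cross validation, repeats that split k times so that every example is used for testing exactly once. The sketch below is one way to write it in plain Python; the helper names k_fold_indices and cross_validate, the toy data, and the mean-predictor "model" are assumptions for illustration only.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Average the score of models fitted on k different training folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model,
                            [xs[i] for i in test_idx],
                            [ys[i] for i in test_idx]))
    return sum(scores) / len(scores)


def fit_mean(xs, ys):
    """Toy 'model': always predict the mean of the training targets."""
    return sum(ys) / len(ys)


def mean_squared_error(model, xs, ys):
    return sum((y - model) ** 2 for y in ys) / len(ys)


xs = [0, 1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5, 6]
folds = list(k_fold_indices(len(xs), 3))
cv_mse = cross_validate(xs, ys, 3, fit_mean, mean_squared_error)
print(folds)
print(cv_mse)
```

In practice the data are usually shuffled before the folds are cut; the contiguous folds here keep the example easy to follow.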
a) Decision Trees
b) Neural Networks (back propagation)
c) Probabilistic networks
d) Nearest Neighbor
e) Support vector machines
The different types of techniques in Machine Learning are:
a) Supervised Learning
b) Unsupervised Learning
c) Semi-supervised Learning
d) Reinforcement Learning
e) Transduction
f) Learning to Learn
a) Model building
b) Model testing
c) Applying the model
The standard approach to supervised learning is to split the set of examples into a training set and a test set.
In machine learning and other areas of information science, the set of data used to discover a potentially predictive relationship is known as the ‘training set’. The training set is the set of examples given to the learner, while the test set is the set of examples held back from the learner and used to test the accuracy of the hypothesis it generates. The training set and the test set are distinct.
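A minimal sketch of such a split, using only the standard library. The function name train_test_split, the 25% test fraction, and the fixed seed are assumptions chosen for the example, not a prescribed procedure.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and hold back a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up examples: (input, target) pairs.
examples = [(x, 2 * x) for x in range(20)]
train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))
```

Every example lands in exactly one of the two sets, so the learner never sees the examples that are later used to measure its accuracy.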
The different approaches in Machine Learning are:
a) Concept Vs Classification Learning
b) Symbolic Vs Statistical Learning
c) Inductive Vs Analytical Learning
a) Artificial Intelligence
b) Rule based inference
a) Classifications
b) Speech recognition
c) Regression
d) Predict time series
e) Annotate strings
Machine learning whose mathematical foundations are independent of any particular classifier or learning algorithm is referred to as algorithm-independent machine learning.
Machine learning is the design and development of algorithms that derive behaviours from empirical data. Artificial intelligence, in addition to machine learning, also covers other areas such as knowledge representation, natural language processing, planning, and robotics.
A classifier in machine learning is a system that takes a vector of discrete or continuous feature values as input and outputs a single discrete value, the class.
Pattern Recognition can be used in:
a) Computer Vision
b) Speech Recognition
c) Data Mining
d) Statistics
e) Information Retrieval
f) Bioinformatics
Genetic programming is one of the two techniques used in machine learning. The model is based on testing candidate solutions and selecting the best choice among a set of results.
Inductive Logic Programming (ILP) is a subfield of machine learning that uses logic programming to represent background knowledge and examples.
Model selection is the process of choosing among different mathematical models that are used to describe the same data set. It is applied in the fields of statistics, machine learning, and data mining.
The two methods used for predicting good probabilities in Supervised Learning are:
a) Platt Calibration
b) Isotonic Regression
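These methods are designed for binary classification, and extending them to multiclass problems is not trivial.
Isotonic regression is used when there is sufficient data, because with only a little data it is prone to overfitting; Platt calibration is the safer choice for small datasets.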
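Isotonic regression can be sketched with the Pool Adjacent Violators algorithm, which finds the least-squares non-decreasing fit to a sequence. For calibration, the sequence is assumed to be the 0/1 labels of a held-out set sorted by the classifier's raw score, and the fitted values are then read as calibrated probabilities. The function name and the labels below are invented for the illustration.

```python
def isotonic_regression(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit."""
    # Each block stores [sum of values, count]; its fitted value is the mean.
    blocks = []
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the
        # mean of the block that was just added or merged.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Made-up labels, already sorted by increasing classifier score.
labels_sorted_by_score = [0, 0, 1, 0, 1, 1]
calibrated = isotonic_regression(labels_sorted_by_score)
print(calibrated)
```

The out-of-order pair (1 followed by 0) is pooled into a single block with value 0.5. With very few examples each block rests on a handful of labels, which is why the method needs sufficient data.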
The difference between the heuristics used for decision trees and those used for rule learning is that decision-tree heuristics evaluate the average quality of several disjoint sets, while rule learners evaluate only the quality of the set of instances covered by the candidate rule.
In machine learning, the perceptron is an algorithm for supervised classification: it assigns an input to a class using a linear function of the input features. The classic perceptron is a binary classifier; multiclass variants assign the input to one of several possible outputs.
A Bayesian logic program consists of two components. The first is a logical component: a set of Bayesian clauses that captures the qualitative structure of the domain. The second is a quantitative component that encodes the quantitative information about the domain.
A Bayesian network is a graphical model that represents the probabilistic relationships among a set of variables.
Instance-based learning algorithms are also referred to as lazy learning algorithms, because they delay the induction or generalization process until classification is performed.
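A minimal sketch of the classic perceptron learning rule, shown for the two-class case with labels 0 and 1. The learning rate, the epoch limit, and the logical-AND training data are assumptions chosen to keep the example small.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Learn weights and a bias from (features, label) pairs."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for features, label in samples:
            activation = bias + sum(w * x for w, x in zip(weights, features))
            prediction = 1 if activation > 0 else 0
            update = lr * (label - prediction)
            if update != 0:
                # Move the decision boundary toward the misclassified point.
                weights = [w + update * x for w, x in zip(weights, features)]
                bias += update
                errors += 1
        if errors == 0:  # every sample classified correctly: stop early
            break
    return weights, bias


def predict(weights, bias, features):
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


# Logical AND is linearly separable, so the rule converges on it.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_data)
predictions = [predict(weights, bias, x) for x, _ in and_data]
print(weights, bias, predictions)
```

The rule only changes the weights when a sample is misclassified, and it is guaranteed to stop only when the classes are linearly separable.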
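As a small illustration, consider a two-node network Rain -> WetGrass. The variables and all probabilities below are invented for the example. The joint distribution factorizes along the edge as P(Rain) * P(WetGrass | Rain), and a query such as P(Rain | WetGrass) follows by enumeration.

```python
# Assumed (made-up) probability tables for the network Rain -> WetGrass.
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {
    True: {True: 0.9, False: 0.1},   # P(wet | rain)
    False: {True: 0.1, False: 0.9},  # P(wet | no rain)
}


def joint(rain, wet):
    """P(Rain=rain, WetGrass=wet) from the network factorization."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]


def posterior_rain_given_wet():
    """P(Rain=True | WetGrass=True) by summing out Rain in the evidence."""
    evidence = joint(True, True) + joint(False, True)
    return joint(True, True) / evidence


total_probability = sum(joint(r, w) for r in (True, False) for w in (True, False))
posterior = posterior_rain_given_wet()
print(total_probability, posterior)
```

Larger networks work the same way: each node stores a table conditioned on its parents, and the joint probability is the product of those tables.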
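k-nearest neighbors is a typical lazy learner, and a minimal sketch makes the delay visible: fit only stores the instances, and all the work happens in predict. The class name, the choice of k = 3, and the data points are assumptions for the illustration.

```python
from collections import Counter


class KNearestNeighbors:
    """Lazy learner: training stores data, prediction does the work."""

    def __init__(self, k=3):
        self.k = k
        self.points = []
        self.labels = []

    def fit(self, points, labels):
        # No generalization happens here; the instances are simply kept.
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        def squared_distance(point):
            return sum((a - b) ** 2 for a, b in zip(point, query))

        # Rank stored instances by distance, then take a majority vote.
        ranked = sorted(zip(self.points, self.labels),
                        key=lambda pair: squared_distance(pair[0]))
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]


# Made-up two-dimensional data with two classes.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]
model = KNearestNeighbors(k=3).fit(points, labels)
print(model.predict((1.5, 1.5)), model.predict((6.5, 6.5)))
```

The trade-off is that training is essentially free, while every prediction has to scan the stored instances.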
The important components of relational evaluation techniques are:
a) Data Acquisition
b) Ground Truth Acquisition
c) Cross Validation Technique
d) Query Type
e) Scoring Metric
f) Significance Test
The different methods to solve Sequential Supervised Learning problems are:
a) Sliding-window methods
b) Recurrent sliding windows
c) Hidden Markov models
d) Maximum entropy Markov models
e) Conditional random fields
f) Graph transformer networks
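The first method in the list, the sliding window, can be sketched as follows: each position in the sequence becomes an ordinary supervised example whose features are the neighboring inputs. The function name, the padding value, and the consonant/vowel tagging task are assumptions made for the illustration.

```python
def sliding_windows(inputs, labels, half_width=1, pad=None):
    """Turn a labeled sequence into (window, label) training examples."""
    # Pad both ends so the first and last items also get a full window.
    padded = [pad] * half_width + list(inputs) + [pad] * half_width
    examples = []
    for i, label in enumerate(labels):
        window = tuple(padded[i:i + 2 * half_width + 1])
        examples.append((window, label))
    return examples


# Made-up task: tag each letter as consonant (C) or vowel (V).
inputs = list("cat")
labels = ["C", "V", "C"]
examples = sliding_windows(inputs, labels)
for window, label in examples:
    print(window, label)
```

Any standard classifier can then be trained on the resulting examples. What the plain sliding window cannot use is the correlation between neighboring output labels, which is what the other methods in the list address.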
The areas in robotics and information processing where sequential prediction problems arise are:
a) Imitation Learning
b) Structured prediction
c) Model based reinforcement learning
PAC (Probably Approximately Correct) learning is a learning framework that has been introduced to analyse learning algorithms and their statistical efficiency.
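One standard result from this framework, for a finite hypothesis class H and a learner that outputs a hypothesis consistent with the training data, is that m >= (1/epsilon) * (ln|H| + ln(1/delta)) examples suffice to obtain error at most epsilon with probability at least 1 - delta. The sketch below evaluates that bound; the function name and the numbers are assumptions for the illustration.

```python
import math


def pac_sample_size(hypothesis_count, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite class.

    epsilon: maximum tolerated error ("approximately correct")
    delta:   maximum tolerated failure probability ("probably")
    """
    bound = (math.log(hypothesis_count) + math.log(1 / delta)) / epsilon
    return math.ceil(bound)


# Assumed setting: 1000 hypotheses, 10% error, 95% confidence.
needed = pac_sample_size(1000, epsilon=0.1, delta=0.05)
print(needed)
```

The bound grows only logarithmically with the number of hypotheses and with 1/delta, but linearly with 1/epsilon.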
a) Sequence prediction
b) Sequence generation
c) Sequence recognition
d) Sequential decision
Sequence learning is a method of teaching and learning in a logical manner.
The two techniques of Machine Learning are:
a) Genetic Programming
b) Inductive Learning
The recommendation engine implemented by major ecommerce websites uses Machine Learning.
In Machine Learning and statistics, dimension reduction is the process of reducing the number of random variables under consideration. It can be divided into feature selection and feature extraction.
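Feature selection keeps a subset of the original variables, whereas feature extraction builds new variables from them. A minimal sketch of the first kind drops the columns whose variance falls below a threshold; the threshold, the function names, and the data are assumptions for the illustration.

```python
def variance(column):
    """Population variance of a sequence of numbers."""
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)


def select_features(rows, min_variance):
    """Keep only the columns whose variance exceeds min_variance."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if variance(col) > min_variance]
    reduced = [[row[i] for i in keep] for row in rows]
    return keep, reduced


# Made-up data: column 1 is constant and column 2 barely changes.
rows = [
    [1.0, 5.0, 0.1],
    [2.0, 5.0, 0.1],
    [3.0, 5.0, 0.2],
    [4.0, 5.0, 0.1],
]
keep, reduced = select_features(rows, min_variance=0.01)
print(keep)
print(reduced)
```

Variance is only one possible selection criterion, and it ignores the target variable; it is used here because it needs nothing beyond the data itself.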