Machine Learning Interview Questions and Answers
Machine Learning is an application of AI that gives systems the capacity to learn and improve from experience without being explicitly programmed. These Machine Learning interview questions and answers by Besant Technologies are prepared for all our students by fully skilled professionals with many years of experience. The questions below are only a sample, picked from among the most frequently asked.
Best Machine Learning Interview Questions and Answers
Students trained by Besant Technologies have been placed in top MNCs with excellent salaries. We have received very good feedback on the machine learning interview questions and answers prepared by us; these questions have been analyzed and prepared in consultation with top MNCs. Pursue Machine Learning at Besant Technologies, the best Machine Learning institute in Chennai, and get placed.
Since in positively skewed data the mean is greater than the median, imputing missing observations with the mean overestimates their value.
Because the harmonic mean gives more weight to lower values. Thus, we only get a high F1 score if both Precision and Recall are high.
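As a minimal sketch (the precision and recall values below are made up for the example), the F1 score is the harmonic mean of precision and recall:
def f1_score(precision, recall):
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

# A single low value drags the harmonic mean down
print(f1_score(0.9, 0.9))   # 0.9
print(f1_score(0.9, 0.1))   # 0.18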
Most machine learning algorithms require numbers as input. On converting categorical values to factors, we get numerical values and we also don't have to deal with dummy variables.
We can use both factor() and as.factor() to convert variables to factors.
No. We can get perfect precision in many ways, but that does not mean our model predicts every value accurately. For example, if we make one single positive prediction and make sure it is correct, our precision reaches 100%. Generally, precision is used together with other metrics (such as recall) to measure performance.
It does not always converge to the same point; in some cases it reaches a local minimum instead of the global optimum.
It implies that the dependent variable should be a linear function of the parameters. For the same reason, polynomial regression is classified as linear even though it fits a non-linear relationship between the dependent and independent variables.
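As a quick sketch (assuming NumPy and scikit-learn are available), a polynomial fit is still linear in its coefficients:
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(10).reshape(-1, 1)
y = 3 * x.ravel() ** 2 + 2              # non-linear in x

# Expand x into [x, x^2]; the model remains linear in its coefficients
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
model = LinearRegression().fit(X_poly, y)
print(model.coef_, model.intercept_)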
A logistic model outputs a value between 0 and 1. To convert these probabilities into classes, we use a decision boundary (threshold). We can set the threshold at 0.5 or at another value, depending on the requirement.
One-dimensional data can be separated by a point, two-dimensional data by a line, and three-dimensional data by a plane; in general, n-dimensional data is separated by a hyperplane.
Both are feature scaling techniques.
Standardization is less affected by outliers as compared to Normalization.
Standardization doesn't bound values to a specific range, which may be a problem for some algorithms that expect inputs within a fixed range.
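A minimal sketch of both techniques, assuming scikit-learn is available (the data is made up, with 100 acting as an outlier):
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [100.0]])

# Normalization: rescales values into the range [0, 1]
print(MinMaxScaler().fit_transform(X).ravel())

# Standardization: zero mean and unit variance, no fixed range
print(StandardScaler().fit_transform(X).ravel())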
When we have many outliers in the data, Mean absolute error is a better choice.
Online learning. Because in Online learning each learning step is fast and cheap, and the system can be trained by feeding data instances sequentially.
Visualization, Dimensionality reduction, association rule learning
There are three stages to building a model in machine learning:
- Model Building: Choose a suitable algorithm for the model and train it according to the requirements of your problem.
- Model Testing: Check the accuracy of the model on the test data.
- Applying the Model: Make the required changes after testing and apply the final model.
In the world we have humans and computers. The main difference between them is that humans learn from past experience, while computers need to be told what to do through a set of instructions. Machine learning is about preparing computers to learn from their past experience the way humans do; for computers, that past experience goes by the name of data. So there is nothing to fear about machine learning: it simply means training a computer or machine with data, much as you were trained by experience.
Many people already use machine learning in their everyday life. For example, when you browse the internet you are expressing your preferences, likes and dislikes through your searches. These signals are picked up by cookies on your computer and can be used to model the behavior of a user and improve that user's experience on the internet. Navigation is another example, where machine learning and optimization techniques are used to find the best route between two places. One of the areas where people will engage most with machine learning in the near future is health.
Example: Watson is already being used in healthcare. It looks at body scans and patient data and tries to recognize symptoms of cancer. These are the kinds of things machine learning is being used for at the moment.
Machine learning and human learning are actually quite similar. Machine learning is about an algorithm, or computer, engaging with its environment through data and adapting its own code based on what it learns. If a program fails to make the right predictions, it adjusts itself in order to make better predictions next time. That is very similar to the way a human learns: a human engages with the environment and learns from it. So machine learning has a kind of evolutionary aspect to it, which is quite new to the area of artificial intelligence.
More people have heard of artificial intelligence (AI) than of machine learning. Artificial intelligence goes back to Alan Turing, whose aim was to make a machine exhibit the sort of intelligence a human might have; in particular, a program should be able to convince you that it is human if you chat with it. AI has since evolved toward a unique sort of intelligence that a machine might have. Machine learning has a slightly different quality: it is a more specific part of artificial intelligence, the idea that a program changes itself through its interactions with data, much as a human does. By the end, we might not know exactly how the program is written, because it has been changing as it has been interacting; when we look at the final program, we may not know why it made its decisions in a particular way. So artificial intelligence and machine learning are closely connected.
I think machine learning has become such a big deal because of big data. We now have access to so much data that machines can interact with, and this is where machine learning is going to make great progress: exploiting that data. One of the big challenges for artificial intelligence is computer vision, something humans do incredibly well.
Example: When humans look at a picture, they interpret it very easily. For computers, this is very difficult.
That is because we used to try to program the task from the bottom up. Now we can expose an algorithm to many pictures and let it learn as it goes. So the ability of a machine to view its environment and interpret it is one area where a lot of progress can be made. Wherever there is data, machine learning can be successful: for example, recommendations on the internet, and navigation, where every drive provides new information that is used to adapt and improve the system. Likewise, health is one of the biggest fields, because a machine can study far more data than any doctor could read and retain.
I think it is very important for a body like the Royal Society to run a project on machine learning, both to understand how much impact machine learning is going to have in the future and to raise awareness: some people have not even heard of machine learning yet, and that is going to change in our society in the near future. When the world is moving forward so quickly with these cutting-edge technologies, it is all about transparency: we need to explain the potential of these techniques and where they can take us. It is about looking into the future to make predictions.
Suppose you perform some task, experiment or test, and the true outcome is negative, but you predicted it as positive. Those cases are called false positives. A false negative is exactly the opposite: the true outcome is positive, but you predicted it as negative.
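A minimal sketch of counting false positives and false negatives with a confusion matrix, assuming scikit-learn is available (the labels are made up for the example):
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]   # actual outcomes
y_pred = [0, 1, 1, 0, 1, 0]   # predicted outcomes

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))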
When you are analyzing a problem, check whether it contains patterns that cannot be extracted with simple mathematical equations. If you find such a problem, you need machine learning to extract those patterns from lots of data. These key features help you decide whether a problem is a machine learning problem or not.
Example: We need to find whether a number is even or odd. This problem seems very simple, and it is, because we know the logic and the mathematics behind it: divide the number by 2; if the remainder is 1, the number is odd, and if the remainder is 0, it is even. The problem has a pattern, but we can solve it with a mathematical rule and we do not need a lot of data, so it is not a machine learning problem. We could turn it into one by feeding a machine lots of individual numbers labelled odd or even and letting it classify new numbers, but since we already know the logic and the mathematics behind the problem, it does not really belong to machine learning.
Example 1: Say you have a lot of photos and you need to find whether a particular photo contains a human face or not. There is a pattern here, finding a human face across all the photos, but it is very difficult to solve with hand-written mathematical equations. Instead, we can feed this data to an algorithm as training data, that is, train the machine on it. After training, the machine forms a rule based on the patterns it found, a rule that humans could not have written by hand. So this is definitely a machine learning problem: the machine automatically learns a rule to detect whether a photo contains a human face or not.
If we are concerned about accuracy, we can test different algorithms and cross-validate them to see which gives good accuracy. As a rule of thumb, when the training dataset is small we should use models with low variance and high bias, and when the training dataset is large we should use models with high variance and low bias. Following these guidelines makes it easy to know which algorithm is better suited to your machine learning problem.
AI is a way to make a computer think, whereas ML is an application of AI that gives a computer the ability to learn from experience.
Statistics is used to find relationships in the data and derives conclusions on the basis of evidence and reasoning about the data, whereas ML learns directly from the data, with less reliance on explicit statistical assumptions, and is used to optimize predictions from the data.
Neural network models are used to process data; they are functions inspired by the biological neurons found in the human brain. Since the job of ML is to find patterns in data, neural network models help to find patterns in complex data.
When we plot the distribution of the data and the shape looks like a bell curve with the mean at the center, it is called a Normal Distribution. It is a widely used distribution in statistics.
This is the same as the normal distribution, but with a mean of 0 and a standard deviation of 1.
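A minimal sketch of converting data to the standard normal scale (z-scores), assuming NumPy is available (the values are made up for the example):
import numpy as np

x = np.array([4.0, 8.0, 6.0, 5.0, 7.0])

# z-score: subtract the mean and divide by the standard deviation
z = (x - x.mean()) / x.std()
print(z.mean(), z.std())   # approximately 0 and 1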
Regression is powerful. It’s versatile because it can be used for all kinds of data, including non-linear relationships. In fact, regression can be really thought of as the first crossover hit from machine learning that has gained wide acceptance in everyday life.
The residuals of a regression are the difference between the actual and the fitted values of the dependent variable. If the regression was a perfect fit, the residuals would all be equal to 0.
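A minimal sketch of computing residuals, assuming NumPy and scikit-learn are available (the data is made up for the example):
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4]])
y = np.array([2.1, 3.9, 6.2, 7.8])

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)   # actual minus fitted values
print(residuals)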
Detecting credit card fraud : Suppose you have some number of credit card customers who are supplying their credit cards to some payment application. The challenge is to work out which of those transactions the application should reject, because they’re likely fraudulent.
Predicting customer churn : For each caller, you want the call center staff to be able to figure out how likely that customer is to churn, that is, switch to a competitor.
Predict imminent failure : Suppose we’ve got a bunch of devices, robots, thermostats, whatever, that generate lots of streaming data that’s being handled by some kind of real time data processing software. That software is looking for anomalies or patterns that predict imminent failure.
Categorical variables, which take on discrete values, may need special treatment and preprocessing before you can feed them into a machine learning model, because most machine learning models accept only numeric data.
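A minimal sketch of one-hot encoding a categorical column, assuming pandas is available (the column name and values are made up for the example):
import pandas as pd

df = pd.DataFrame({"genre": ["rock", "pop", "rock", "soul"]})

# Convert the categorical column into numeric dummy (one-hot) columns
encoded = pd.get_dummies(df, columns=["genre"])
print(encoded)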
Nearest Neighbors Model – Use the ratings of “most similar” users.
Latent Factor Analysis – Solve for underlying factors that drive the ratings.
Logistic Regression helps estimate the probability of a categorical outcome as a function of its influencing factors (causes).
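A minimal sketch, assuming scikit-learn is available (the data is made up for the example):
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1], [2], [3], [4], [5], [6]])   # a single explanatory variable
y = np.array([0, 0, 0, 1, 1, 1])               # a binary (categorical) outcome

model = LogisticRegression().fit(X, y)
print(model.predict_proba([[3.5]]))            # estimated class probabilities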
ML-based:
- Dynamic
- Experts optional
- Corpus required
- Training step
Rule-based:
- Static
- Experts required
- Corpus optional
- No training step
print(100*(1.1 ** 7))   # prints approximately 194.87
a_list = [1, 'hello', [1, 2, 3], True]
a_list[1]   # returns 'hello'
A = [1, 'a']
B = [2, 1, 'd']
A + B   # list concatenation: [1, 'a', 2, 1, 'd']
Ans : 10
A = 1      # an integer
"1"        # a string
A = "1"
B = "2"
C = A + B
"12"       # the value of C (string concatenation)
Create a dictionary "album_sales_dict" where the keys are the album names and the sales in millions are the values.
album_sales_dict = {"The Bodyguard": 50, "Back in Black": 50, "Thriller": 65}
genres_tuple = ('pop', 'rock', 'soul', 'hard rock', 'soft rock', 'R&B', 'progressive rock', 'disco')
len(genres_tuple)   # returns 8
C_tuple = (-5,1,-3)
C_list = sorted(C_tuple)
C_list   # [-5, -3, 1]
if rating > 8:
    print("Amazing !")
rating = 8.5
if rating > 8:
    print("this album is amazing")
else:
    print("this album is ok")
album_year = 1979
if album_year < 1980 or album_year == 1991 or album_year == 1993:
    print("this album came out already")
for i in range(-5,6):
    print(i)
Genres=array('rock', 'R&B', 'Soundtrack', 'R&B', 'soul', 'pop') Make sure you follow Python conventions.
Genres = ['rock', 'R&B', 'Soundtrack', 'R&B', 'soul', 'pop']
for Genre in Genres:
    print(Genre)
PlayListRatings = array(10, 9.5, 10, 8, 7.5, 5, 10, 10)
PlayListRatings = [10, 9.5, 10, 8, 7.5, 5, 10, 10]
i = 0
Rating = 100
while Rating > 6:
    Rating = PlayListRatings[i]
    i = i + 1
    print(Rating)
squares = ['orange', 'orange', 'purple', 'blue', 'orange']
new_squares = []
i = 0
while squares[i] == 'orange':
    new_squares.append(squares[i])
    i = i + 1
1. Reduce the number of features.
- Manually select which features to keep.
- Use a model selection algorithm.
2. Regularization (see the sketch below).
- Keep all the features, but reduce the magnitude/values of the parameters.
- Works well when we have many features, each of which contributes a little to the prediction.
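A minimal sketch of regularization using ridge regression, assuming scikit-learn is available; the data is randomly generated for the example and alpha controls how strongly the coefficients are shrunk:
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.RandomState(0)
X = rng.randn(20, 5)
y = X[:, 0] + 0.1 * rng.randn(20)

# Ridge keeps all the features but shrinks the coefficient values
print(LinearRegression().fit(X, y).coef_)
print(Ridge(alpha=10.0).fit(X, y).coef_)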
print(E[::2])   # prints every second element of E, starting from index 0
NLP:
- NLP stands for Natural Language Processing.
- It is the automated processing, and perhaps understanding, of natural (human) language.
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 0, 1, 0, 1])
a * b   # element-wise product: array([1, 0, 3, 0, 5])
a = np.array([1, 0])
b = np.array([0, 1])
Plotvec2(a, b)   # Plotvec2 is assumed to be a plotting helper defined elsewhere
print("the dot product is", np.dot(a, b))   # the vectors are orthogonal, so the dot product is 0
set(['rap', 'house', 'electronic music', 'rap'])
A = [1, 2, 2, 1]
B = set([1, 2, 2, 1])
print("the sum of A is:", sum(A))   # 6
print("the sum of B is:", sum(B))   # 3, because the set keeps only the unique values {1, 2}
album_set2 = set(['AC/DC', 'Back in Black', 'The Dark Side of the Moon'])
album_set3 = album_set1.union(album_set2)   # album_set1 is assumed to be defined earlier
album_set3
soundtrack_dic = {'The Bodyguard': '1992', 'Saturday Night Fever': '1977'}   # the values are "1992" and "1977"
- Factors identified by experts: the factors are product attributes.
- Factors derived using machine learning techniques: the factors may be related to product attributes or may be abstract.
Python 3 is cleaner and much faster than its predecessor, and is definitely the future.
However, some packages have still not moved to Python 3, so Python 2 offers stable third-party packages, and because it has been around for a long time it also has better community support.
Some special features of Python 3 are backward compatible with Python 2, so you can stay on Python 2 and still get those features.
Jupyter Notebook has become a go-to tool and environment for most data scientists these days.
Jupyter Notebook, formerly known as the IPython Notebook, has become an integral part of data science projects due to its ability to combine code blocks with human-friendly text formatted using Markdown.
We can also view images and videos right in the notebook,
we can work in it using our favorite web browser, and
it not only supports Python code but also lets us run code in other languages, such as R, Julia or Scala, in the same notebook.
- Data from Databases
- Data Through APIs
- Data Using Web Scraping
- Data from files like csv/xls/notepad
and so on.
A centrality measure gives you a single number that represents the entire set of values for a certain feature. This number is central to the data, and that is why we call it a measure of central tendency.
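A minimal sketch using Python's built-in statistics module (the values are made up for the example):
import statistics

values = [2, 3, 3, 5, 9]

print(statistics.mean(values))     # arithmetic mean: 4.4
print(statistics.median(values))   # middle value: 3
print(statistics.mode(values))     # most frequent value: 3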
It involves activities such as looking for potential issues in the data and solving them using appropriate techniques.
In the cross-validation setup, instead of two parts, we split the data into three parts: training, test, and cross-validation. We pass the training set to the model, apply the training process to get the trained model, and then evaluate the performance of the trained model on the cross-validation dataset.
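A minimal sketch of k-fold cross-validation, assuming scikit-learn is available:
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Evaluate the model on 5 different train/validation splits
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean())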
Model persistence is a technique where you take your trained model and write (persist) it to disk. Once you have your model saved on disk, you can reuse it whenever you want.
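A minimal sketch using joblib (pickle from the standard library works similarly); the file name here is arbitrary:
import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

joblib.dump(model, "model.joblib")       # persist the trained model to disk
restored = joblib.load("model.joblib")   # load it back later
print(restored.predict(X[:3]))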
import matplotlib.pyplot as plt
# life_exp is assumed to be a list of life-expectancy values defined earlier
plt.hist(life_exp, bins=10)
# Display histogram
plt.show()
europe = {'spain': 'madrid', 'france': 'paris', 'germany': 'berlin',
          'norway': 'oslo', 'italy': 'rome', 'poland': 'warsaw', 'austria': 'vienna'}
for e in europe.keys():
    print("the capital of " + e + " is " + europe[e])