I totally forgot about writing new stuff here, but I recently found these 160 data science interview questions on Hackernoon and decided to answer each one of them as a way to force myself to study all of these interesting topics. I will post my answers (hopefully right and comprehensible), aiming to write two or three answers every couple of days.
If you spot anything wrong, contact me please!
What is SGD — stochastic gradient descent, what’s the difference with the usual gradient descent
SGD is an optimization algorithm that, like gradient descent, tries to minimize a cost function by iteratively updating the parameters of said function. The difference between the two is that, while gradient descent computes the gradient over the entire training set at each step, SGD computes it on a small subset of the training samples (called a mini-batch) or even on a single sample at a time.
The result is that each update requires much less computation, but the gradient estimate is noisier, so the path to the minimum is less direct and the final result might be slightly less optimal than with full gradient descent.
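To make the idea concrete, here is a minimal NumPy sketch of mini-batch SGD fitting a one-parameter linear model; the toy data, learning rate, and batch size are made up for illustration:

```python
import numpy as np

# Toy data: y = 3x + noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + rng.normal(0, 0.1, size=200)

def sgd_fit(X, y, lr=0.1, epochs=50, batch_size=16):
    """Fit y ≈ w * x by minimizing MSE with mini-batch SGD."""
    w = 0.0
    n = len(X)
    for _ in range(epochs):
        order = rng.permutation(n)  # shuffle the samples each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx]
            # Gradient of the MSE, computed only on the mini-batch
            grad = 2 * np.mean((w * xb - yb) * xb)
            w -= lr * grad
    return w

w = sgd_fit(X, y)  # should end up close to the true slope, 3
```

Setting `batch_size=len(X)` would turn this back into ordinary (full-batch) gradient descent, while `batch_size=1` gives the purest form of SGD.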
Which metrics for evaluating regression models do you know
There are two main error metrics: one is the Mean Squared Error (MSE), the mean of the squared differences between predicted and actual values; the other is the Mean Absolute Error (MAE), the mean of the absolute values of the same differences.
There is also R squared: R squared is a value that represents how much of the total variance of the dataset is explained by the model we are testing. Its value is 1 minus the ratio between the unexplained (residual) variance of the model's predictions and the total variance of the dataset.
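These three metrics are short enough to write by hand; here is a sketch in NumPy, with a small made-up example to show the values they produce:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: mean of the squared residuals."""
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return np.mean(np.abs(y_true - y_pred))

def r2(y_true, y_pred):
    """R squared: 1 minus (residual variance / total variance)."""
    ss_res = np.sum((y_true - y_pred) ** 2)            # unexplained
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)   # total
    return 1 - ss_res / ss_tot

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(mse(y_true, y_pred))  # → 0.375
print(mae(y_true, y_pred))  # → 0.5
print(r2(y_true, y_pred))   # → ~0.949
```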
What are MSE and RMSE
MSE is the Mean Squared Error which, as explained above, is a metric for evaluating regression models. RMSE, or Root Mean Squared Error, is simply the square root of the MSE; it has the advantage of being expressed in the same units as the target variable.
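A one-line sketch in NumPy, reusing the same made-up example values as above:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
print(rmse(y_true, y_pred))  # → sqrt(0.375) ≈ 0.612
```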