Get R^2 value from scipy.linalg.lstsq
I fitted a surface to a 3D dataset using scipy.linalg.lstsq.
I used:
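import numpy as np
import scipy.linalg

# best-fit quadratic surface; design matrix columns: 1, x, y, x*y, x**2, y**2
A = np.c_[np.ones(data.shape[0]), data[:,:2], np.prod(data[:,:2], axis=1), data[:,:2]**2]
C,_,_,_ = scipy.linalg.lstsq(A, data[:,2])
# evaluate the fitted surface on the grid (same column order as A)
Z = np.dot(np.c_[np.ones(XX.shape), XX, YY, XX*YY, XX**2, YY**2], C).reshape(X.shape)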
But how can I get the R^2 value for this surface? Is there a way to check the quality of the fit?
Any ideas would be much appreciated.
Thanks.
Following http://en.wikipedia.org/wiki/Coefficient_of_determination :
B = data[:,2]
SStot = ((B - B.mean())**2).sum()
SSres = ((B - np.dot(A,C))**2).sum()
R2 = 1 - SSres / SStot
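For anyone who wants to try this end to end, here is a minimal self-contained sketch. The synthetic `data` array, the coefficients used to generate it, and the noise level are all made up for illustration; they only stand in for the asker's dataset. The adjusted R^2 at the end is an extra, not part of the snippet above, and it assumes the first column of `A` is the intercept.

```python
import numpy as np
import scipy.linalg

# Synthetic stand-in for `data` (columns: x, y, z); NOT the asker's dataset.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = rng.uniform(-1.0, 1.0, 200)
z = (1.0 + 2.0 * x - 3.0 * y + 0.5 * x * y + 1.5 * x**2 - 0.7 * y**2
     + rng.normal(0.0, 0.05, 200))
data = np.c_[x, y, z]

# Same design matrix as in the question: 1, x, y, x*y, x**2, y**2
A = np.c_[np.ones(data.shape[0]), data[:, :2],
          np.prod(data[:, :2], axis=1), data[:, :2]**2]
B = data[:, 2]
C, _, _, _ = scipy.linalg.lstsq(A, B)

# Coefficient of determination: 1 - SS_res / SS_tot
ss_tot = ((B - B.mean())**2).sum()
ss_res = ((B - A @ C)**2).sum()
r2 = 1.0 - ss_res / ss_tot

# Adjusted R^2: penalises extra regressors. A has n rows and p columns,
# one of which is the intercept, so the residual degrees of freedom are n - p.
n, p = A.shape
adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)

print("R^2:", r2)
print("adjusted R^2:", adj_r2)
```

Because the synthetic surface really is quadratic and the noise is small, R^2 comes out close to 1 here; on real data it only tells you what fraction of the variance in z the fitted surface explains.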
As noted in the Wikipedia article, R^2 has many drawbacks. As far as I know, scipy/numpy compares poorly to a library like statsmodels if you want to run multivariate regressions, since you have to compute the standard errors of the estimated coefficients, t-stats, p-values, etc. yourself after the fit if you want to know what's going on in your data.
There are many posts dedicated to running OLS with Python, so just pick one like: http://www.datarobot.com/blog/ordinary-least-squares-in-python/