Defining standard error in scipy.stats.linregress
I am using the scipy.stats.linregress function to perform simple linear regression on some 2D data, like this:
from scipy import stats
x = [5.05, 6.75, 3.21, 2.66]
y = [1.65, 26.5, -5.93, 7.96]
gradient, intercept, r_value, p_value, std_err = stats.linregress(x, y)
The function documentation states what std_err is:
"Standard error of the estimate"
I'm not sure what that means. This old answer says it represents the "gradient line standard error", but that "was not always the behavior of this library".
Can I get a precise definition of what exactly this parameter represents?
This is a standard measure in statistics. See Wikipedia for a description of how to calculate it. Unfortunately, Stack Overflow doesn't seem to support LaTeX, so it doesn't make sense to write out and explain the equations here.
Essentially, std_err should give a standard error for each coefficient in the fit, which for linregress means the gradient. In simple terms, std_err tells you how well the gradient is determined by your data (higher values mean a less accurate estimate).
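If you want to check this yourself, here is a rough sketch that computes the standard error of the slope by hand from the usual textbook OLS formula, sqrt(s^2 / sum((x - mean(x))^2)) with s^2 the residual variance on n - 2 degrees of freedom, and prints it next to what linregress reports. It uses the data from the question; the names residuals, s2 and slope_se are just mine for illustration.

```python
import numpy as np
from scipy import stats

# Data from the question.
x = np.array([5.05, 6.75, 3.21, 2.66])
y = np.array([1.65, 26.5, -5.93, 7.96])

result = stats.linregress(x, y)

n = len(x)
# Residuals of the fitted line.
residuals = y - (result.intercept + result.slope * x)
# Residual variance with n - 2 degrees of freedom (two fitted parameters).
s2 = np.sum(residuals**2) / (n - 2)
# Standard error of the slope.
slope_se = np.sqrt(s2 / np.sum((x - x.mean())**2))

# Print the hand-computed value next to the one reported by linregress.
print(slope_se, result.stderr)
```

If the two printed numbers agree, std_err is the standard error of the slope on your installed version.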
There are other helpful answers on stats.stackexchange here and here.
As of Dec 2016, I think it still reports the standard error of the slope of the OLS regression line. I calculated regressions for some datasets using orthogonal distance regression from the scipy package (scipy.odr), and the output sd_beta[1] (representing the standard error of the slope of the regression line) was very similar to the std_err calculated by scipy.stats.linregress.
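A comparison along those lines might look roughly like the sketch below. The data here is synthetic and made up for illustration (not the datasets I used), and the model function linear is my own helper with beta[0] as the intercept and beta[1] as the slope, so that sd_beta[1] refers to the slope. ODR and OLS minimise different things, so expect the two values to be similar rather than identical.

```python
import numpy as np
from scipy import odr, stats

# Synthetic, hypothetical data: a straight line plus a little noise in y.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)

def linear(beta, x):
    # beta[0] is the intercept, beta[1] is the slope.
    return beta[0] + beta[1] * x

# Orthogonal distance regression via scipy.odr.
odr_fit = odr.ODR(odr.Data(x, y), odr.Model(linear), beta0=[0.0, 1.0]).run()
odr_slope_se = odr_fit.sd_beta[1]

# Ordinary least squares via linregress.
ols_slope_se = stats.linregress(x, y).stderr

print(odr_slope_se, ols_slope_se)
```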