Ordinary least squares regression giving incorrect prediction

I am using statsmodels OLS to fit a line to a series of points:

import statsmodels.api as sm
Y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15]
X = [[73.759999999999991], [73.844999999999999], [73.560000000000002], 
    [73.209999999999994], [72.944999999999993], [73.430000000000007], 
    [72.950000000000003], [73.219999999999999], [72.609999999999999], 
    [74.840000000000003], [73.079999999999998], [74.125], [74.75],
    [74.760000000000005]]

ols = sm.OLS(Y, X)
r = ols.fit()
preds = r.predict()
print(preds)

      

And I am getting the following results:

[ 7.88819844  7.89728869  7.86680961  7.82937917  7.80103898  7.85290687
  7.8015737   7.83044861  7.76521269  8.00369809  7.81547643  7.92723304
  7.99407312  7.99514256]

      

These are off by about a factor of 10. What am I doing wrong? I tried adding a constant, but that just makes the values 1000 times bigger. I'm not very good at statistics, so maybe I need to do something with the data?



1 answer


I think you switched your response and your predictor, as Michael Mayer suggested in his comment. If you plot the predictions from your model (with a constant column added), you get something like this:

import statsmodels.api as sm
import numpy as np
import matplotlib.pyplot as plt

Y = np.array([1,2,3,4,5,6,7,8,9,11,12,13,14,15])
X = np.array([ 73.76 ,  73.845,  73.56 ,  73.21 ,  72.945,  73.43 ,  72.95 ,
    73.22 ,  72.61 ,  74.84 ,  73.08 ,  74.125,  74.75 ,  74.76 ])
Design = np.column_stack((np.ones(14), X))
ols = sm.OLS(Y, Design).fit()
preds = ols.predict()

plt.plot(X, Y, 'ko')
plt.plot(X, preds, 'k-')
plt.show()

      

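For reference, the call in the question passes X with no constant column, so OLS fits a line through the origin, Y ≈ b·X, where b = Σxy / Σx². Because X only varies between roughly 72.6 and 74.8, every prediction b·X lands close to the same value, which matches the nearly flat output of about 7.8 to 8.0 shown in the question. A minimal NumPy sketch of that calculation, assuming the data from the question (the names `b` and `preds_no_const` are only illustrative):

```python
import numpy as np

# Data from the question.
Y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15], dtype=float)
X = np.array([73.76, 73.845, 73.56, 73.21, 72.945, 73.43, 72.95,
              73.22, 72.61, 74.84, 73.08, 74.125, 74.75, 74.76])

# With no constant column, least squares reduces to b = sum(x*y) / sum(x*x).
b = np.sum(X * Y) / np.sum(X * X)

# Predictions of the through-the-origin model; they barely move because X barely moves.
preds_no_const = b * X

print(b)
print(preds_no_const.min(), preds_no_const.max())
```

The first prediction should agree with the first value printed in the question, which suggests the odd result comes from the missing intercept rather than from the data.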



If you switch X and Y, which I think is what you want, you get:

Design2 = np.column_stack((np.ones(14), Y))
ols2 = sm.OLS(X, Design2).fit()
preds2 = ols2.predict()
print(preds2)
# Output:
# [ 73.1386399   73.21305699  73.28747409  73.36189119  73.43630829
#   73.51072539  73.58514249  73.65955959  73.73397668  73.88281088
#   73.95722798  74.03164508  74.10606218  74.18047927]

plt.plot(Y, X, 'ko')
plt.plot(Y, preds2, 'k-')
plt.show()

      

