Kolmogorov-Smirnov test for normality in MATLAB - data normalization?

I am using the Kolmogorov-Smirnov test in MATLAB to determine the normality of each column of a data matrix prior to performing generalized linear regression. Approximate data vector:

data = [8126,3163,9129,5399,8682,1126,1053,7805,2989,2758,3277,1152,6994,6833];

      

The test runs and returns a result. However, when I plot the empirical cumulative distribution function (CDF, blue) against the standard normal CDF (red) for visual comparison, the scale of the data makes the plot useless:

[Figure: exampleCDF]

The code used to build this figure:

[h,p,ksstat,cv] = kstest(data);
[f,x_values] = ecdf(data);
figure()
F = plot(x_values,f);
set(F,'LineWidth',2);
hold on
G = plot(x_values,normcdf(x_values,0,1),'r-');
set(G,'LineWidth',2);
legend([F G],...
    'Empirical CDF','Standard Normal CDF',...
    'Location','SE');

      

Does this mean my test result is invalid? If so, can I just normalize the data, e.g.

dataN=(data-min(data))./(max(data)-min(data)); 

      

while maintaining the validity of the test?

Thank you for your time,

Laura

1 answer


Thanks to Luis Mendo, I solved this problem. normcdf takes the mean and standard deviation of the reference distribution as inputs, and I had left them at 0 and 1 from the example code I was working from. They should be the mean and standard deviation of the data vector. Modified code:



[h,p,ksstat,cv] = kstest(data);
[f,x_values] = ecdf(data);
figure()
F = plot(x_values,f);
set(F,'LineWidth',2);
hold on
% Use the sample mean and standard deviation for the reference normal CDF
variableMean = mean(data);
variableSD = std(data);
G = plot(x_values,normcdf(x_values,variableMean,variableSD),'r-');
set(G,'LineWidth',2);
legend([F G],...
    'Empirical CDF','Fitted Normal CDF',...
    'Location','SE');
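
One further point, offered as a sketch rather than a definitive fix: kstest called with a single argument compares the sample against the standard normal distribution, so the scaling issue affects the test itself and not only the plot. Min-max scaling, as proposed in the question, maps the data to [0,1] and does not give zero mean and unit standard deviation. A common approach is to standardize to z-scores before testing. The variable names dataZ, hRaw, hZ and hL below are my own, and since the mean and standard deviation are estimated from the same sample, the Lilliefors test (lillietest) may be the more appropriate choice.

```matlab
% Example data from the question
data = [8126,3163,9129,5399,8682,1126,1053,7805,2989,2758,3277,1152,6994,6833];

% Standardize to zero mean and unit standard deviation (z-scores)
dataZ = (data - mean(data)) ./ std(data);

% Raw data: compared against the standard normal N(0,1) by default
[hRaw,pRaw] = kstest(data);

% Z-scores: now on the same scale as the standard normal
[hZ,pZ] = kstest(dataZ);

% Lilliefors test: a normality test intended for the case where
% the mean and standard deviation are estimated from the sample
[hL,pL] = lillietest(data);
```

With the raw data, hRaw is 1 (normality rejected) because every value lies far out in the upper tail of the standard normal. The results for dataZ and lillietest are the ones that say something about the shape of the distribution.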

      
