Gaussian fit score

I would like to know how to determine how well a Gaussian function fits my data.

Here are some of the plots I've tested methods on. Currently I am just using the RMSE of the fit against the sample (red is the fit, blue is the sample).

For example, here are two good fits:

[Figures: two examples of a good fit]

And here are two terrible fits that should be flagged as bad data:

[Figures: two examples of a bad fit]

In general, I'm looking for suggestions for additional metrics to measure goodness of fit. Also, as you can see in the second "good" fit, there can sometimes be other, outlying peaks in the data. These are currently penalized by the RMSE, although they should not be.



3 answers


I'm looking for suggestions for additional metrics to gauge the goodness of the fit.

The one-sample Kolmogorov-Smirnov (KS) test would be a good starting point.

I suggest the Wikipedia article as an introduction.

The test is available in SciPy as scipy.stats.kstest. The function computes and returns both the KS statistic and the p-value.
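As a rough illustration of the call above, here is a minimal sketch. It assumes `data` is a 1-D array of raw observations (the synthetic sample below is hypothetical and only stands in for real data) and that the Gaussian parameters are estimated from that same sample.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: replace with your own 1-D array of raw observations.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=500)

# Estimate the Gaussian parameters from the sample itself (an assumption
# of this sketch; use known parameters instead if you have them).
mu, sigma = data.mean(), data.std(ddof=1)

# One-sample KS test of the sample against the normal distribution N(mu, sigma).
result = stats.kstest(data, "norm", args=(mu, sigma))
ks_statistic, p_value = result.statistic, result.pvalue

print(f"KS statistic = {ks_statistic:.4f}, p-value = {p_value:.4f}")
```

A smaller KS statistic means the empirical distribution is closer to the Gaussian. Because the parameters here are estimated from the data being tested, the p-value tends to be optimistic, so the statistic is probably better used as a relative score than as a strict test.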



You could try a quantile-quantile (Q-Q) plot with probplot from scipy.stats:

import pylab
from scipy.stats import probplot

# data is assumed to be a 1-D array of sample values
result = probplot(data, dist='norm', plot=pylab)  # draws the Q-Q plot
pylab.show()

      



From the probplot documentation: it calculates quantiles for a probability plot and, optionally, shows the plot.

It generates a probability plot of the sample data against the quantiles of a specified theoretical distribution (the normal distribution by default). probplot optionally calculates a best-fit line for the data and plots the results using Matplotlib or a given plot function.
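If a single number is more useful than a plot, the best-fit line mentioned above can be turned into a score. The sketch below assumes `data` is a 1-D array of raw observations (the synthetic sample is hypothetical) and uses the correlation coefficient `r` that probplot returns alongside the slope and intercept.

```python
import numpy as np
from scipy.stats import probplot

# Hypothetical sample: replace with your own 1-D array of raw observations.
rng = np.random.default_rng(0)
data = rng.normal(size=500)

# Without a plot argument, probplot only computes the values.
# The second tuple describes the least-squares line through the Q-Q points.
(theoretical_q, ordered_values), (slope, intercept, r) = probplot(data, dist="norm")

# r**2 close to 1 means the Q-Q points lie near a straight line,
# i.e. the sample looks approximately Gaussian.
r_squared = r ** 2
print(f"Q-Q plot R^2 = {r_squared:.4f}")
```

This is only one possible way to score the fit; what threshold on `r_squared` counts as "bad data" would have to be chosen from your own examples.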



There are other ways of assessing goodness of fit, but most of them are not robust to outliers.

There is MSE, the mean squared error, and RMSE, the root mean squared error, which you already know and which is simply its square root.

But you can also measure it using MAE, the mean absolute error, and MAPE, the mean absolute percentage error.

Also, there is the Kolmogorov-Smirnov test, which is much more complicated, and you will probably need a library for it, whereas MAE, MAPE and MSE are easy to implement yourself.
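A minimal sketch of those hand-rolled metrics, assuming the sample and the fitted Gaussian are evaluated at the same x positions. The function name `fit_errors` and its arguments are hypothetical, chosen for this example only.

```python
import numpy as np

def fit_errors(y_sample, y_fit):
    """Return MSE, RMSE, MAE and MAPE between a sample and its fitted curve."""
    y_sample = np.asarray(y_sample, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    residuals = y_sample - y_fit

    mse = np.mean(residuals ** 2)
    mae = np.mean(np.abs(residuals))

    # MAPE is undefined where the sample is zero, so those points are skipped.
    nonzero = y_sample != 0
    mape = 100.0 * np.mean(np.abs(residuals[nonzero] / y_sample[nonzero]))

    return {"mse": mse, "rmse": np.sqrt(mse), "mae": mae, "mape": mape}

# Tiny made-up example: residuals are 0, -1 and 2.
errors = fit_errors([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
print(errors)
```

MAE grows linearly with each residual rather than quadratically, so a stray peak inflates it less than it inflates RMSE, although it is still affected.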

(If you are dealing with unsupervised data and/or classification, which is apparently not your case, ROC curves and the confusion matrix are also accuracy metrics.)
