Missing values in MS Excel LINEST, TREND, LOGEST and GROWTH functions

I am using the GROWTH function (or LINEST, TREND or LOGEST; they all have the same problem) in Excel 2003. The problem is that if some data is missing, the function refuses to give a result.

You can download the file here.

Is there a workaround? Looking for an easy and elegant solution.

  • I don't want the obvious workaround of deleting the missing value: that would mean dropping the whole column, which would corrupt the chart and cause problems in my other tables, where I have more rows and missing data in different columns. Another obvious workaround is to use one set of data for the regression and another for the chart, but that is annoying and only clutters the worksheet!

  • Is there a way to tell Excel that a cell is an NA value?

  • Another idea is to skip the missing values in the formula. Can a non-contiguous set of cells be addressed? For example, instead of =GROWTH($B2:$AH2; $B1:$AH1; B1)

    as in my example, use something like:

    =GROWTH({$B2:$I2,$K2:$AH2}; {$B1:$I1,$K1:$AH1}; B1)

  • I would much rather not write my own formulas. I need to explain all of this to my colleagues, and custom formulas would make that much harder. I want a simple and elegant solution.
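For reference, here is a minimal Python sketch of what "skip the missing values" would mean for a GROWTH-style fit. It assumes GROWTH fits y = b * m^x by least squares on ln(y); the function name `growth_skip_missing` and the sample data are hypothetical and are not taken from the worksheet.

```python
import math

def growth_skip_missing(known_y, known_x, new_x):
    """Exponential least-squares fit y = b * m**x, ignoring missing pairs."""
    # Keep only the (x, ln y) pairs where both values are present.
    pairs = [(x, math.log(y)) for x, y in zip(known_x, known_y)
             if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_ly = sum(ly for _, ly in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in pairs)
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_x
    # Transform the straight-line prediction back from log space.
    return math.exp(intercept + slope * new_x)

xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.0, None, 16.0, 32.0]   # hypothetical data; None marks the gap
estimate = growth_skip_missing(ys, xs, 3)
```

Because the made-up data follows y = 2^x exactly, the estimate for the missing position comes out at 8 (up to rounding).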

+3




3 answers


I know this is old, but in case you or someone else is still looking for an answer: have you tried the FORECAST function? It will compute a trend with missing values (as long as there are no #N/A cells).

In my case, I needed to create an unbroken chart from data with missing values, but I also needed to calculate the trend from that data. So first I pointed the chart at a copy of the dataset in which each missing value is replaced by #N/A, for example: =IF(ISBLANK(B2),NA(),B2)

Then I calculated the predicted values from the original data: =FORECAST(B1,$B2:$AH2,$B1:$AH1)

Unless I am missing something, this should take care of it. Basically you end up with two rows of the same numbers: one keeps the blanks, for the FORECAST calculation, and the other replaces each blank with NA(), for the chart.
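The two-row idea can be sketched in Python as follows. This is only an illustration under assumptions: `chart_series` and `forecast` are hypothetical helpers, NaN stands in for NA(), None stands in for a blank cell, and the data is made up.

```python
import math

def chart_series(values):
    """Chart copy: blanks become NaN, the analogue of NA() in the worksheet."""
    return [math.nan if v is None else v for v in values]

def forecast(x, known_y, known_x):
    """Linear least-squares prediction at x, skipping pairs with a blank."""
    pairs = [(kx, ky) for kx, ky in zip(known_x, known_y)
             if kx is not None and ky is not None]
    n = len(pairs)
    mean_x = sum(kx for kx, _ in pairs) / n
    mean_y = sum(ky for _, ky in pairs) / n
    slope = (sum((kx - mean_x) * (ky - mean_y) for kx, ky in pairs)
             / sum((kx - mean_x) ** 2 for kx, _ in pairs))
    return mean_y + slope * (x - mean_x)

years = [1, 2, 3, 4, 5]
values = [10.0, 12.0, None, 16.0, 18.0]   # hypothetical data with one blank

for_chart = chart_series(values)                          # row used by the chart
predicted = [forecast(x, values, years) for x in years]   # row used for the trend
```

The calculation row keeps the blank, so the fit uses only the four real points, while the chart row carries a placeholder at the gap.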

+3




It turns out to be "trivial" if you know the trick.

To use LINEST with missing values, create an X-matrix (r rows by c columns) and a Y-vector (r rows by one column) as usual. Then create an additional column in the X-matrix to serve as an indicator variable. Place this column immediately to the left of the X-matrix; if the X-matrix starts in column B, put the additional column in column A. Set the indicator to zero for each row you want to skip and to one for each row you want to include. Multiply each column of the X-matrix and the Y-vector by this indicator variable, and place the new expanded X-matrix and the new Y-vector elsewhere in the spreadsheet. You should now have a new X-matrix (r rows by c + 1 columns) and a Y-vector, with zeros straight across every row that is to be omitted. THIS IS CRITICAL!

Now use the LINEST function as usual, specifying the entire Y-vector and the extended r × (c + 1) X-matrix (with the indicator column included) as the first two function parameters, "FALSE" (that is, zero) as the third parameter, and "TRUE" (that is, one) or "FALSE" (that is, zero) as the fourth parameter. The correct parameter estimates are displayed in the first row of the LINEST output. If you specified "TRUE" to get statistics, the other LINEST output values are incorrect, except for the value in the fifth row and second column (the residual sum of squares).
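Why the first row comes out right can be checked with a small Python sketch for the single-X case. The data and the helper `fit_no_constant` are hypothetical; the helper solves the 2 × 2 normal equations of a regression without a constant, which is what the FALSE third argument asks for.

```python
def fit_no_constant(col1, col2, y):
    """Least squares for y = a*col1 + b*col2 with no constant term."""
    s11 = sum(u * u for u in col1)
    s12 = sum(u * v for u, v in zip(col1, col2))
    s22 = sum(v * v for v in col2)
    s1y = sum(u * t for u, t in zip(col1, y))
    s2y = sum(v * t for v, t in zip(col2, y))
    det = s11 * s22 - s12 * s12
    # Cramer's rule on the 2 x 2 normal equations.
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b

x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, None, 9.2, 10.8]      # hypothetical data; row 3 is missing
indicator = [0 if v is None else 1 for v in y]

# Multiply the X column and the Y-vector by the indicator (blank * 0 -> 0).
x_masked = [d * xi for d, xi in zip(indicator, x)]
y_masked = [0.0 if v is None else d * v for d, v in zip(indicator, y)]

# The coefficient of the indicator column plays the role of the intercept.
intercept, slope = fit_no_constant(indicator, x_masked, y_masked)
```

For this data the result is the same intercept (1.09) and slope (1.97) that an ordinary regression with a constant gives on the four complete rows, because the all-zero row adds nothing to any of the sums.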

If you specified the fourth parameter of the function as "TRUE" to get statistics, you need to correct the output for incorrect values. The values in rows 2, 3 and 4 of the extended output are incorrect; the value in column 1 of row 5 is also incorrect. You need to fix them.

Make a copy of the first row of the LINEST output elsewhere on the sheet. If you specified "TRUE" for statistics, reserve four blank rows below this copy. Copy row 5, column 2 of the original LINEST output to row 5, column 2 of the new output space.

Step one: calculate the correct number of degrees of freedom to replace the value in column 2 of row 4 of the LINEST output. Find the number of parameters in the model; this is c + 1. You can use the COUNT function on one row of the extended X-matrix to count its columns. Then add up all the values in the indicator column with the SUM function. Suppose four rows are all zeros: the sum is then r - 4, the number of rows with "1" in the indicator column. The correct degrees of freedom are the difference: SUM(indicator column) - COUNT(extended X-matrix columns). This is the value to be placed in column 2 of row 4 of the new output space.



Step two: fix row 2 and column 2 of row 3. Divide the wrong df (row 4, column 2 of the original LINEST output) by the correct df (row 4, column 2 of the new output space). Take the square root of this quotient. Multiply the values in row 2 and the value in column 2 of row 3 of the original LINEST output by this correction factor to get the correct standard errors of the parameters and the correct standard error of Y.
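Steps one and two can be sketched with a few lines of Python. All numbers are hypothetical (five rows, one of them zeroed out, one original X column), and the sketch assumes that LINEST without a constant reports df as the row count minus the number of X columns.

```python
import math

indicator = [1, 1, 0, 1, 1]      # hypothetical indicator column, one row skipped
n_columns_extended = 2           # indicator column + one X column (c + 1)

# Step one: degrees of freedom.
df_wrong = len(indicator) - n_columns_extended     # counts the zero row too
df_correct = sum(indicator) - n_columns_extended   # counts included rows only

# Step two: correction factor for the standard errors.
factor = math.sqrt(df_wrong / df_correct)

ss_resid = 0.091                 # hypothetical residual sum of squares
se_y_wrong = math.sqrt(ss_resid / df_wrong)        # uses the wrong df
se_y_correct = se_y_wrong * factor                 # rescaled to the correct df
```

The rescaled value equals sqrt(ss_resid / df_correct), which is the standard error of Y that a regression on the included rows alone would report.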

Step three: correct the regression sum of squares. The original LINEST output has, in column 1 of row 5, a regression sum of squares that is not adjusted for the mean; we want the regression sum of squares adjusted for the mean. So we need to calculate the correction for the mean. It is the square of the sum of the Y-vector values, divided by the sum of the values in the indicator column. Subtract that from the value in column 1 of row 5 of the original LINEST output, and place the answer in column 1 of row 5 of the new output space.

Step four: correct the F ratio in column 1 of row 4. We need to compute the mean squares due to regression and due to residuals. The mean square due to regression (the numerator of the F ratio) is the value in column 1 of row 5 of the new output space divided by c, the number of columns in the original X-matrix before it was extended. The mean square due to residuals (the denominator of the F ratio) is column 2 of row 5 of the new output space divided by column 2 of row 4 of the new output space. Calculate the F ratio from these two intermediate values and place the result in column 1 of row 4 of the new output space.

Step five: correct the R-squared value in column 1 of row 3. That's 1 - (column 2 of row 5 divided by the sum of column 1 of row 5 and column 2 of row 5), using the values from the new output space.
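Steps three to five amount to the arithmetic below, shown as a Python sketch. The numbers are hypothetical stand-ins for the cells of the LINEST output (one original X column, one skipped row), and the variable names are made up for illustration.

```python
y_masked = [3.1, 4.9, 0.0, 9.2, 10.8]   # Y-vector after multiplying by the indicator
indicator = [1, 1, 0, 1, 1]
c = 1                                    # columns in the original X-matrix

ss_reg_uncorrected = 234.809   # row 5, column 1 of the original output (hypothetical)
ss_resid = 0.091               # row 5, column 2 (already correct)
df_correct = sum(indicator) - (c + 1)

# Step three: subtract the correction for the mean, (sum of Y)^2 / n.
correction = sum(y_masked) ** 2 / sum(indicator)
ss_reg = ss_reg_uncorrected - correction

# Step four: F ratio from the two mean squares.
f_ratio = (ss_reg / c) / (ss_resid / df_correct)

# Step five: R-squared from the corrected sums of squares.
r_squared = 1 - ss_resid / (ss_reg + ss_resid)
```

With these numbers the correction is 196, the adjusted regression sum of squares is about 38.809, and R-squared is about 0.9977.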

Test your work: make a copy of the expanded X-matrix and Y-vector elsewhere in the spreadsheet. Find the rows that have zero in the indicator variable and delete all cells in those rows, shifting the cells up. You should now have an X-matrix and a Y-vector with fewer rows but no missing values. Delete the indicator column. Now use LINEST to fit a regression model to this reduced dataset, but this time set the third parameter to TRUE (include a constant). The results should be identical to those in the new output space.

+2




My solution has two parts:

  • To avoid a break in the chart, put =NA() in the cells where data is missing. It generates an #N/A error, and charts handle this type of error exactly the way you want: the line is interpolated between the available points that surround the missing one. More details here: http://www.j-walk.com/ss/excel/usertips/tip024.htm
  • If you want a trendline, why not use the built-in one? I added an exponential trendline to your data and it matches your calculated GROWTH values 100%. It also handles #N/A correctly. To check that the trendline matches your data, temporarily replace #N/A with the average of the two adjacent cells (297 for your sample); GROWTH will then compute the series, and you will see that it exactly matches the added trendline. Read about trendlines here: http://office.microsoft.com/en-001/excel-help/add-a-trendline-to-a-chart-HP005198462.aspx and http://www.computergaga.com/excel/2003/intermediate/charts/add_a_trendline.html

Your file with the solutions applied is shared here: https://www.dropbox.com/s/j7htrk9ih2jtcq6/TrendlineNA.xls

Hope this was helpful!

0

