How do I delete observations with no data in Stata?

I have data with IDs that may or may not have values for every variable. I want to delete ONLY the observations with no data at all; if an observation has even one value, I want to keep it. For example, if my dataset is:

ID val1 val2 val3 val4
1 23 . 24 75
2 . . . .
3 45 45 70 9

      

I only want to remove ID 2 as it is the only one that has no data - only an ID.

I tried Statalist and Google but couldn't find anything relevant.



3 answers


This also works with string variables, as long as the missing strings are empty:

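ds ID, not                                          // every variable except the identifier (names are case-sensitive)
egen num_nonmiss = rownonmiss(`r(varlist)'), strok  // count non-missing values per row; strok allows strings
drop if num_nonmiss == 0                            // drop rows that hold nothing but an ID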

      



This gets a list of the variables that are not identifiers, then drops any observation that has only an identifier.
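To try this on the data from the question, a minimal self-contained sketch follows. It assumes the identifier is named ID exactly as shown in the question and that all value variables are numeric; the closing assert is only an illustrative check, not part of the original answer.

```stata
* Recreate the example data from the question
clear
input ID val1 val2 val3 val4
1 23 . 24 75
2 .  .  .  .
3 45 45 70 9
end

* List every variable except the identifier (assumed here to be named ID)
ds ID, not

* Count the non-missing values in each row across those variables
egen num_nonmiss = rownonmiss(`r(varlist)'), strok

* Drop rows with no data at all, then remove the helper variable
drop if num_nonmiss == 0
drop num_nonmiss

list
assert _N == 2   // only IDs 1 and 3 should remain
```

ID 1 is kept even though val2 is missing, because it still has three non-missing values.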



Brian Albert Monroe is quite correct that anyone using dropmiss (SJ) would have to install it first. Since there is interest in different ways to solve this problem, I will add one more.

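 // Collect the variables whose values are all missing (note: this drops variables, not observations)
 foreach v of var val* { 
     qui count if missing(`v') 
     if r(N) == _N local todrop `todrop' `v' 
 }
 // Drop them, if any were found
 if "`todrop'" != "" drop `todrop' 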

      

Although this could have been a comment on Brian's answer, I will add it here because (a) this format is more suitable for showing code and (b) the comment follows from my code above. I agree that unab is a useful command and have often praised it in public. It is unnecessary here, however, as Brian's loop could easily start with something like



 foreach v of var * { 

      

UPDATE September 2015: see http://www.statalist.org/forums/forum/general-stata-discussion/general/1308777-missings-now-available-from-ssc-new-program-for-managing-missings for information on missings, considered by the author to be an improvement on dropmiss. The syntax to drop observations if and only if all values are missing is missings dropobs.
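As a hedged sketch of what that might look like on the data from the question: missings is community-contributed and has to be installed first, the varlist val* is an assumption chosen so that ID is left out of the check, and the force option is used because the dataset here has not been saved since it was last changed.

```stata
* One-time install from SSC (not part of official Stata)
ssc install missings

* Recreate the example data from the question
clear
input ID val1 val2 val3 val4
1 23 . 24 75
2 .  .  .  .
3 45 45 70 9
end

* Drop observations that are missing on all of val1-val4;
* ID is excluded from the varlist because it is never missing
missings dropobs val*, force

list
```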



Here is another way to do it, which also shows how flexible local macros are, without installing anything extra in Stata. I rarely see code that uses locals to store commands or logical conditions, although this is often very useful.

    // Loop through all variables to build a useful local
    foreach vname of varlist _all {    

            // We don't want to include ID in our drop condition, so don't execute the remaining code if our loop is currently on ID
            if "`vname'" == "ID" continue  

            // This local stores all the variable names except 'ID' and a logical condition that checks if it is missing
            local dropper "`dropper' `vname' ==. &"     
    }

    // Let's see all the observations which have missing data for all variables except for ID
    // The '1==1' bit is a condition to deal with the last '&' in the `dropper' local, it is of course true.

    list if `dropper' 1==1

    // Now let's drop those observations
    drop if `dropper' 1==1

    // Now check they're all gone
    list if `dropper' 1==1

    // They are.

      

Now, dropmiss can be handy once you have downloaded and installed it, but if you are writing a do-file for someone else to use, your code won't work on their machine unless they also have dropmiss installed.

With this approach, if you remove the comment lines and the two unnecessary list commands, it is a fairly sparse five lines of code that run with out-of-the-box Stata.
