Stata tabstat: change order / sort?
I am using tabstat in Stata, with estpost and esttab to export its output to LaTeX. I use tabstat to display statistics by group. For example,
tabstat assets, by(industry) missing statistics(count mean sd p25 p50 p75)
My question: is there a way for tabstat (or another Stata command) to display the output ordered by the mean, so that categories with higher means appear at the top? By default, tabstat displays the categories of industry in alphabetical order.
tabstat does not offer such a hook, but there is an approach to such problems that is general and quite easy to understand.
You did not give a reproducible example, so we need one:
. sysuse auto, clear
(1978 Automobile Data)
. gen Make = word(make, 1)
. tab Make if foreign
Make | Freq. Percent Cum.
------------+-----------------------------------
Audi | 2 9.09 9.09
BMW | 1 4.55 13.64
Datsun | 4 18.18 31.82
Fiat | 1 4.55 36.36
Honda | 2 9.09 45.45
Mazda | 1 4.55 50.00
Peugeot | 1 4.55 54.55
Renault | 1 4.55 59.09
Subaru | 1 4.55 63.64
Toyota | 3 13.64 77.27
VW | 4 18.18 95.45
Volvo | 1 4.55 100.00
------------+-----------------------------------
Total | 22 100.00
Make here is like your variable industry: it is a string variable, so in tables Stata will show it in alphabetical (alphanumeric) order.
The work-around has several easy steps, some of them optional.
Calculate a variable on which you want to sort. egen is often useful here.
. egen mean_mpg = mean(mpg), by(Make)
Map those values to a variable with distinct integer values. As two groups could have the same mean (or other summary statistic), make sure that ties are broken on the original string variable.
. egen group = group(mean_mpg Make)
This variable is created to be 1 for the group with the lowest mean (or other summary statistic), 2 for the next lowest, and so on. If the opposite order is desired, as in this question, negate the grouping variable.
. replace group = -group
(74 real changes made)
There is a problem with this new variable: the values of the original string variable Make are nowhere to be seen. labmask (to be installed from the Stata Journal website after search labmask) comes to the rescue here. We use the values of the original string variable as the value labels of the new variable. (The idea is that the value labels become the "mask" that the integer variable wears.)
. labmask group, values(Make)
Optionally, work on the variable label of the new integer variable.
. label var group "Make"
Now we can tabulate by the categories of the new variable.
. tabstat mpg if foreign, s(mean) by(group) format(%2.1f)
Summary for variables: mpg
by categories of: group (Make)
group | mean
--------+----------
Subaru | 35.0
Mazda | 30.0
VW | 28.5
Honda | 26.5
Renault | 26.0
Datsun | 25.8
BMW | 25.0
Toyota | 22.3
Fiat | 21.0
Audi | 20.0
Volvo | 17.0
Peugeot | 14.0
--------+----------
Total | 24.8
-------------------
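Applied to the variables in the question, the steps above might be put together as follows. This is only a sketch: it assumes a dataset in memory with a numeric variable assets and a string variable industry, the names mean_assets and ind_order are made up for illustration, and labmask is assumed to be installed already (it is also distributed in the labutil package on SSC).

```stata
* Sketch only: assumes numeric assets and string industry, as in the question.
* labmask is user-written; assumed installed, e.g. via: ssc install labutil

* 1. the statistic to sort on
egen mean_assets = mean(assets), by(industry)

* 2. distinct integers in ascending order of the mean, ties broken on industry
egen ind_order = group(mean_assets industry)

* 3. negate so that the highest mean comes first
replace ind_order = -ind_order

* 4. show the industry names as value labels of the integer variable
labmask ind_order, values(industry)
label var ind_order "Industry"

* 5. same statistics as in the question, now ordered by mean
tabstat assets, by(ind_order) statistics(count mean sd p25 p50 p75)
```

Note that egen's group() function, used without its missing option, leaves ind_order missing for observations with an empty industry, so this sketch omits those observations from the table.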
Note: other strategies are sometimes as good or better here.
- If you collapse your data to a new dataset, you can then sort it as you please.
- graph bar and graph dot are good at displaying summary statistics by group, and the sort order can be set directly.
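Both alternatives can be sketched with the same auto data. The variable names n, mean_mpg and sd_mpg are illustrative choices, and preserve/restore is used only so that the original data come back after collapse.

```stata
sysuse auto, clear
gen Make = word(make, 1)

* graph route: dot chart of mean mpg, highest mean at the top
graph dot (mean) mpg if foreign, over(Make, sort(1) descending)

* collapse route: one observation per Make, then sort as you please
preserve
keep if foreign
collapse (count) n = mpg (mean) mean_mpg = mpg (sd) sd_mpg = mpg, by(Make)
gsort -mean_mpg Make
list, noobs
restore
```

The collapsed dataset can also be exported or processed further, at the cost of leaving tabstat behind.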
I would look at the egenmore package on SSC. You can get it by typing ssc install egenmore in Stata. In particular, look at the entry for axis() in the egenmore help file. It includes an example that does exactly what you want.
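For the auto data, that approach might look like the sketch below. It assumes egenmore is installed and that axis() accepts a label() option as described in its help file. The names neg_mean and axis are illustrative, and the mean is negated beforehand so that higher means sort first.

```stata
* Sketch only: requires egenmore (ssc install egenmore)
sysuse auto, clear
gen Make = word(make, 1)

* negated mean, so that ascending order puts the highest mean first
egen neg_mean = mean(-mpg), by(Make)

* integers 1, 2, ... ordered on neg_mean then Make, labelled with Make
egen axis = axis(neg_mean Make), label(Make)

tabstat mpg if foreign, s(mean) by(axis) format(%2.1f)
```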