Sorting a data frame in R by multiple columns at the same time
So I have a large dataframe (7000 rows) that looks like this:
head(mx)
                        Stem Progenitor   Astrocyte   Neuron genename
ENSRNOG00000000007 0.0517698   0.700234  0.11753300 4.591050     Gad1
ENSRNOG00000000010 0.0536043   0.471518  0.00741803 2.280760    Cbln1
ENSRNOG00000000012 0.0163017   0.285178  1.89533000 0.268405    Tcf15
ENSRNOG00000000024 2.7904200   0.703727 13.96940000 4.944650    HEBP1
ENSRNOG00000000028 2.5059900   2.563040  4.83952000 0.840013     Nde1
ENSRNOG00000000029 1.6204500   2.928300 15.58360000 1.750350    Myh11
I need to sort this data frame so that the rows are ordered from maximum to minimum by the largest value found in any of the first four columns. So, for example, the sorted result for these six rows would be:
                        Stem Progenitor   Astrocyte   Neuron genename
ENSRNOG00000000029 1.6204500   2.928300 15.58360000 1.750350    Myh11
ENSRNOG00000000024 2.7904200   0.703727 13.96940000 4.944650    HEBP1
ENSRNOG00000000028 2.5059900   2.563040  4.83952000 0.840013     Nde1
ENSRNOG00000000007 0.0517698   0.700234  0.11753300 4.591050     Gad1
ENSRNOG00000000010 0.0536043   0.471518  0.00741803 2.280760    Cbln1
ENSRNOG00000000012 0.0163017   0.285178  1.89533000 0.268405    Tcf15
I know that I can sort a data frame by several columns in sequence using a command like:
mx <- mx[with(mx, order(-Stem, -Progenitor, -Astrocyte, -Neuron)),]
But this sorts by Stem first and uses the remaining columns only to break ties, so in the above example it puts HEBP1 and Nde1 above Myh11. Is there a way to sort by the highest value in any of the four columns? I could write a script that iterates over the data frame manually and builds a new sorted data frame with rbind, but that is terribly inefficient and I'm sure there is a better way to do it.
Order by the row-wise maximum of the four columns, which you can get with pmax:
mx <- mx[with(mx, order(-pmax(Stem, Progenitor, Astrocyte, Neuron))),]
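To check this against the six rows in the question, here is a minimal sketch. The values are copied from the question; building `mx` with `data.frame()` and Ensembl IDs as row names is my assumption about how the data is stored, and `row_max` and `sorted` are names introduced only for illustration.

```r
# Rebuild the six sample rows from the question; Ensembl IDs become row names.
mx <- data.frame(
  Stem       = c(0.0517698, 0.0536043, 0.0163017, 2.7904200, 2.5059900, 1.6204500),
  Progenitor = c(0.700234, 0.471518, 0.285178, 0.703727, 2.563040, 2.928300),
  Astrocyte  = c(0.11753300, 0.00741803, 1.89533000, 13.96940000, 4.83952000, 15.58360000),
  Neuron     = c(4.591050, 2.280760, 0.268405, 4.944650, 0.840013, 1.750350),
  genename   = c("Gad1", "Cbln1", "Tcf15", "HEBP1", "Nde1", "Myh11"),
  row.names  = c("ENSRNOG00000000007", "ENSRNOG00000000010", "ENSRNOG00000000012",
                 "ENSRNOG00000000024", "ENSRNOG00000000028", "ENSRNOG00000000029"),
  stringsAsFactors = FALSE
)

# pmax() returns, for each row, the largest of the four values.
row_max <- with(mx, pmax(Stem, Progenitor, Astrocyte, Neuron))

# Negating the maxima makes order() sort from largest to smallest.
sorted <- mx[order(-row_max), ]
print(sorted$genename)
```

The printed order is Myh11, HEBP1, Nde1, Gad1, Cbln1, Tcf15, which matches the order asked for in the question.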
With dplyr, this is:
library(dplyr)
arrange(mx, desc(pmax(Stem, Progenitor, Astrocyte, Neuron)))
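If the columns are held in a character vector rather than typed out, or if some values may be missing, the same idea can be sketched in base R with `do.call()`. The toy data frame `df` and the names `value_cols`, `row_max` and `sorted` below are hypothetical, not from the question.

```r
# Hypothetical toy data; column 'b' has a missing value to show na.rm.
df <- data.frame(
  a  = c(1, 9, 3),
  b  = c(5, NA, 2),
  c  = c(4, 2, 8),
  id = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Columns to take the row-wise maximum over, given by name.
value_cols <- c("a", "b", "c")

# do.call() passes each selected column to pmax() as a separate argument;
# na.rm = TRUE ignores missing values within a row.
row_max <- do.call(pmax, c(df[value_cols], na.rm = TRUE))

# decreasing = TRUE sorts from the largest row maximum to the smallest.
sorted <- df[order(row_max, decreasing = TRUE), ]
print(sorted$id)
```

Without `na.rm = TRUE`, a row containing an `NA` gets `NA` as its maximum, and `order()` places such rows last by default.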