Area Plot: Cannot Get Stacking in Correct Order - Legend is out of sync with data

I am very new to R and am not an experienced programmer. I have a problem with ggplot using geom_area to generate a stacked area chart for wind direction. I want it to stack from bottom to top as N, NE, E, SE, S, SW, W, NW.

I got it to order the labels, but the problem is that the colors no longer relate to the data in the chart. Below are the various things I've tried and the resulting graphs.

The data comes from another program; a small subset covering 3 days is shown below. The last column is for the workaround I found, which is VERY clumsy. However, I'm more concerned that the labels are no longer associated with the data in ggplot, and I'm wondering where I went wrong.

My data.frame is called knime.in and looks like this:

         Day of year WD Binned Count(Time) WD Binned Number
    Row0          119         E         324                3
    Row1          119         N          32                1
    Row2          119        NE         240                2
    Row3          119        NW         149                8
    Row4          119         S          65                5
    Row5          119        SE          94                4
    Row6          119        SW         209                6
    Row7          119         W         279                7
    Row8          120         E         435                3
    Row9          120         N          68                1
    Row10         120        NE         112                2
    Row11         120        NW          46                8
    Row12         120         S          15                5
    Row13         120        SE         130                4
    Row14         120        SW          52                6
    Row15         120         W         588                7
    Row16         121         E         114                3
    Row17         121         N          34                1
    Row18         121        NE           6                2
    Row19         121        NW         282                8
    Row20         121         S          55                5
    Row21         121        SE         101                4
    Row22         121        SW         194                6
    Row23         121         W         594                7

      

First attempt, using factor:

require (ggplot2)

knime.in$"WD Binned" <- factor(knime.in$"WD Binned", levels = c("N","NE","E","SE","S","SW","W","NW"))

ggplot(knime.in, aes(x = knime.in$"Day of year", y = (knime.in$"Count(Time)"-1), fill = knime.in$"WD Binned")) +  geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")

      

The second attempt was to use levels:

require (ggplot2)

levels(knime.in$"WD Binned") <- c("N","NE","E","SE","S","SW","W","NW")

ggplot(knime.in, aes(x = knime.in$"Day of year", y = (knime.in$"Count(Time)"-1), fill = knime.in$"WD Binned")) +  geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")

      

For reference, without any reordering:

require (ggplot2)

ggplot(knime.in, aes(x = knime.in$"Day of year", y = (knime.in$"Count(Time)"-1), fill = knime.in$"WD Binned")) +  geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")

      

And finally, the kludge that worked: ordering on a numeric column that I had to create elsewhere (since I couldn't work out how to sort in the custom order).

require (ggplot2)

dt <- knime.in[order(knime.in$"WD Binned Number"),] #order the data so that it will be stacked correctly

dt$"WD Binned" <- factor(dt$"WD Binned", levels = c("N","NE","E","SE","S","SW","W","NW"))

ggplot(dt, aes(x = dt$"Day of year", y = (dt$"Count(Time)"-1)/1440, fill = dt$"WD Binned")) + geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")

      

Let's take day 120 as an example. From the data we should have:

N  = 68
NE = 112
E  = 435
SE = 130
S  = 15
SW = 52
W  = 588
NW = 46

      

If we look at the diagrams:

[image] Attempt 1: legend labels in the correct order, stacking in alphabetical order, colors match the labels (so the only problem is that the stacking is not in the order I want).

[image] Attempt 2: legend labels in the correct order, stacking in "alphabetical" order of the REAL data. The colors are stacked in the correct order, but the data is wrong with respect to color: for example, N is dark brown according to the legend, but the dark brown area on the graph is actually the data for East.

[image] Attempt 3 (above): data, colors and labels are all in sync, but not in the order I want.

[image] Final working version (above): what I wanted all along, stacking with N at the bottom, and the legend colors and legend labels refer to the correct data items in the chart.

Many thanks

Peter


1 answer


As @Henrik said, you must name your variables correctly. You can solve this problem as follows:

# reading the data (with appropriately named variables)
knime.in <- structure(list(Day.of.year = c(119L, 119L, 119L, 119L, 119L, 119L, 119L, 119L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 121L, 121L, 121L, 121L, 121L, 121L, 121L, 121L),
                           WD.Binned = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("E", "N", "NE", "NW", "S", "SE", "SW", "W"), class = "factor"),
                           Count = c(324L, 32L, 240L, 149L, 65L, 94L, 209L, 279L, 435L, 68L, 112L, 46L, 15L, 130L, 52L, 588L, 114L, 34L, 6L, 282L, 55L, 101L, 194L, 594L)), .Names = c("Day.of.year", "WD.Binned", "Count"),
                      class = "data.frame", row.names = c(NA, -24L))

# rearranging the factor levels
knime.in$WD.Binned <- factor(knime.in$WD.Binned, levels = c("N","NE","E","SE","S","SW","W","NW"))

# loading required packages
library(ggplot2)
library(dplyr)

# rearranging the data with dplyr
knime.in <- knime.in %>% group_by(Day.of.year) %>% arrange(WD.Binned)

# or, equivalently, rearranging the data in base R (either of the two is enough)
knime.in <- knime.in[order(knime.in$WD.Binned),]

# creating the area plot    
ggplot(knime.in, aes(x = Day.of.year, y = (Count-1), fill = WD.Binned)) +
  geom_area(stat="identity") + 
  scale_x_continuous("\nDay of the year", expand=c(0,0), breaks=c(119,120,121)) +
  scale_y_continuous("Count", expand=c(0,0), breaks=c(250,500,750,1000,1250)) +
  scale_fill_brewer(palette="BrBG") +
  theme_classic()

      

which gives: [image]
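If the data arrives with the original column names from the question, one way to get syntactically valid names without retyping the data might be base R's make.names(). The sketch below uses a hypothetical one-row stand-in called raw, not the real knime.in:

```r
# Hypothetical stand-in for the incoming data, keeping the original column names
raw <- data.frame(
  "Day of year"      = 119L,
  "WD Binned"        = "E",
  "Count(Time)"      = 324L,
  "WD Binned Number" = 3L,
  check.names = FALSE  # keep the spaces and parentheses for now
)

# make.names() replaces characters that are invalid in R names with dots
names(raw) <- make.names(names(raw))
names(raw)
# "Day.of.year" "WD.Binned" "Count.Time." "WD.Binned.Number"
```

After this, the columns can be referred to unquoted inside aes(), for example aes(x = Day.of.year, fill = WD.Binned). Note that the count column would then be called Count.Time. rather than Count as in the code above.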


Reply to comment:

When you read the data using knime.in <- structure(...code...) and plot it, you get the following output: [image]

Now look at the levels of WD.Binned with levels(knime.in$WD.Binned). As you can see, they are in the same order as the legend. Now also look at your dataframe (with View(knime.in)) and you will see that the row order within each day is also the same as the legend. This shouldn't surprise you: the levels are in alphabetical order, and so are the rows within each day of your dataset.

When you change the order of the levels with knime.in$WD.Binned <- factor(knime.in$WD.Binned, levels=c("N","NE","E","SE","S","SW","W","NW")), you only change the order of the levels; you do not change the order of the rows. When you create the graph, you see that the data is still stacked in the order in which it is stored in your dataframe: [image]
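A small sketch of that point, using a made-up three-row data frame (the name toy and its values are illustrative only): re-creating the factor with new levels changes levels() but leaves the rows where they were, and it is order() on the releveled factor that moves the rows.

```r
# Toy data (hypothetical): one day, three wind directions
toy <- data.frame(
  WD.Binned = factor(c("E", "N", "NE")),  # default levels: "E" "N" "NE"
  Count     = c(324L, 32L, 240L)
)

# Releveling changes the order of the levels only
toy$WD.Binned <- factor(toy$WD.Binned, levels = c("N", "NE", "E"))
levels(toy$WD.Binned)          # "N" "NE" "E"
as.character(toy$WD.Binned)    # still "E" "N" "NE": the rows have not moved

# Sorting by the factor reorders the rows to follow the levels
toy_sorted <- toy[order(toy$WD.Binned), ]
as.character(toy_sorted$WD.Binned)  # "N" "NE" "E"
toy_sorted$Count                    # 32 240 324
```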

Therefore, you must also change the order of your data. This is done with knime.in <- knime.in[order(knime.in$WD.Binned),] (or the dplyr equivalent). You now get a graph showing the levels in the correct order, as in the first graph of this answer.
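On Attempt 2 in the question: as far as I can tell, the mismatch between legend and data comes from assigning to levels(), which renames the existing levels position by position instead of reordering them. A minimal sketch with a hypothetical vector wd:

```r
# Hypothetical vector holding one observation per direction
wd <- factor(c("E", "N", "NE", "NW", "S", "SE", "SW", "W"))
wanted <- c("N", "NE", "E", "SE", "S", "SW", "W", "NW")

# factor(..., levels = ) reorders the levels; every value keeps its own label
reordered <- factor(wd, levels = wanted)
as.character(reordered)[1]   # "E": the first observation is still East

# levels<- overwrites the labels by position:
# the old first level "E" is renamed to "N", "N" to "NE", and so on
relabelled <- wd
levels(relabelled) <- wanted
as.character(relabelled)[1]  # "N": the East observation is now labelled North
```

That would match the symptom described for Attempt 2, where the area labelled N in the legend actually holds the data for East.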
