Area Plot: Cannot Get Stacking in Correct Order - Legend is out of sync with data
I am very new to R and am not an experienced programmer. I have a problem with ggplot using geom_area to generate a stacked area chart for wind direction. I want the areas to stack, from bottom to top, in the order N, NE, E, SE, S, SW, W, NW.
I got it to order the labels, but the problem is that the colors no longer relate to the data in the chart. Below are the various things I've tried and the resulting graphs.
The data is taken from another program; a small subset covering 3 days is shown below. The last column exists only for the workaround I found, which is VERY clumsy. However, I'm more concerned that the labels are no longer associated with the data in ggplot, and I'm wondering where I went wrong.
My data.frame is called knime.in and looks like this:
        Day of year  WD Binned  Count(Time)  WD Binned Number
Row0            119  E                  324                 3
Row1            119  N                   32                 1
Row2            119  NE                 240                 2
Row3            119  NW                 149                 8
Row4            119  S                   65                 5
Row5            119  SE                  94                 4
Row6            119  SW                 209                 6
Row7            119  W                  279                 7
Row8            120  E                  435                 3
Row9            120  N                   68                 1
Row10           120  NE                 112                 2
Row11           120  NW                  46                 8
Row12           120  S                   15                 5
Row13           120  SE                 130                 4
Row14           120  SW                  52                 6
Row15           120  W                  588                 7
Row16           121  E                  114                 3
Row17           121  N                   34                 1
Row18           121  NE                   6                 2
Row19           121  NW                 282                 8
Row20           121  S                   55                 5
Row21           121  SE                 101                 4
Row22           121  SW                 194                 6
Row23           121  W                  594                 7
First attempt, using factor():
require (ggplot2)
knime.in$"WD Binned" <- factor(knime.in$"WD Binned", levels = c("N","NE","E","SE","S","SW","W","NW"))
ggplot(knime.in, aes(x = knime.in$"Day of year", y = (knime.in$"Count(Time)"-1), fill = knime.in$"WD Binned")) + geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")
The second attempt was to use levels:
require (ggplot2)
levels(knime.in$"WD Binned") <- c("N","NE","E","SE","S","SW","W","NW")
ggplot(knime.in, aes(x = knime.in$"Day of year", y = (knime.in$"Count(Time)"-1), fill = knime.in$"WD Binned")) + geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")
For reference, the plot without any reordering:
require (ggplot2)
ggplot(knime.in, aes(x = knime.in$"Day of year", y = (knime.in$"Count(Time)"-1), fill = knime.in$"WD Binned")) + geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")
And finally, the kludge that worked: ordering on a numeric column, which I had to create elsewhere (since I couldn't work out how to sort in the custom order).
require (ggplot2)
dt <- knime.in[order(knime.in$"WD Binned Number"),] #order the data so that it will be stacked correctly
dt$"WD Binned" <- factor(dt$"WD Binned", levels = c("N","NE","E","SE","S","SW","W","NW"))
ggplot(dt, aes(x = dt$"Day of year", y = (dt$"Count(Time)"-1)/1440, fill = dt$"WD Binned")) + geom_area(stat="identity")+ scale_fill_brewer(palette="BrBG")
Let's take day 120 as an example. From the data we should have:
N = 68
NE = 112
E = 435
SE = 130
S = 15
SW = 52
W = 588
NW = 46
If we look at the diagrams:
Attempt 1 = Legend labels are in the correct order, the stacking is alphabetical, and the colors match the labels (so the only problem is that the stacking is not in the order I want).
Attempt 2 = Legend labels are in the correct order and the colors are stacked in the correct order, but the data underneath is still stacked alphabetically, so the data no longer matches the colors. For example, N is dark brown according to the legend, but the dark brown area on the graph is actually the data for East.
Attempt 3 (the reference plot above) = Data, colors and labels are all in sync, but not in the order I want.
Final workaround (above) = What I wanted all along: stacking with N at the bottom, and the legend colors and labels refer to the correct data in the chart.
Many thanks
Peter
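As @Henrik said, you should name your variables properly. The code below uses column names without spaces or parentheses, so they can be used directly inside aes(). You can solve this problem as follows: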
# reading the data (with appropriately named variables)
knime.in <- structure(list(Day.of.year = c(119L, 119L, 119L, 119L, 119L, 119L, 119L, 119L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 121L, 121L, 121L, 121L, 121L, 121L, 121L, 121L),
WD.Binned = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("E", "N", "NE", "NW", "S", "SE", "SW", "W"), class = "factor"),
Count = c(324L, 32L, 240L, 149L, 65L, 94L, 209L, 279L, 435L, 68L, 112L, 46L, 15L, 130L, 52L, 588L, 114L, 34L, 6L, 282L, 55L, 101L, 194L, 594L)), .Names = c("Day.of.year", "WD.Binned", "Count"),
class = "data.frame", row.names = c(NA, -24L))
# rearranging the factor levels
knime.in$WD.Binned <- factor(knime.in$WD.Binned, levels = c("N","NE","E","SE","S","SW","W","NW"))
# loading required packages
library(ggplot2)
library(dplyr)
# rearranging the data with dplyr
knime.in <- knime.in %>% group_by(Day.of.year) %>% arrange(WD.Binned)
# or, alternatively, rearranging the data in base R (only one of the two is needed)
knime.in <- knime.in[order(knime.in$WD.Binned),]
# creating the area plot
ggplot(knime.in, aes(x = Day.of.year, y = (Count-1), fill = WD.Binned)) +
  geom_area(stat="identity") +
  scale_x_continuous("\nDay of the year", expand=c(0,0), breaks=c(119,120,121)) +
  scale_y_continuous("Count", expand=c(0,0), breaks=c(250,500,750,1000,1250)) +
  scale_fill_brewer(palette="BrBG") +
  theme_classic()
which gives:
Reply to comment:
When you read the data using knime.in <- structure(...code...) and plot it, you get the following output:
Now let's look at the levels of WD.Binned with levels(knime.in$WD.Binned). As you can see, they are in the same order as the legend. Now also look at your dataframe (with View(knime.in)) and you will see that, within each day, the row order is also the same as the legend. This shouldn't surprise you: by default R sorts factor levels alphabetically, and the rows of this dataset are sorted alphabetically by direction within each day as well.
When you change the order of the levels with knime.in$WD.Binned <- factor(knime.in$WD.Binned, levels=c("N","NE","E","SE","S","SW","W","NW")), you change only the order of the levels, not the order of the rows. When you then create the graph, the data is still drawn in the order in which it is stored in your dataframe.
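The difference between level order and row order can be checked on a small toy vector. This is only a sketch: the names wd, compass, reordered and relabelled are made up for illustration and are not part of the original data. It also suggests why assigning to levels(), as in the second attempt in the question, puts the wrong labels on the data: that form renames the existing levels position by position instead of reordering them.

```r
# Toy stand-in for one day of the WD.Binned column (hypothetical vector)
wd <- factor(c("E", "N", "NE", "NW", "S", "SE", "SW", "W"))
compass <- c("N", "NE", "E", "SE", "S", "SW", "W", "NW")

# factor(..., levels = ) reorders the levels only; every value keeps its
# label and its position in the vector
reordered <- factor(wd, levels = compass)
levels(reordered)        # "N" "NE" "E" "SE" "S" "SW" "W" "NW"
as.character(reordered)  # "E" "N" "NE" "NW" "S" "SE" "SW" "W" (unchanged)

# levels(x) <- ... overwrites the labels position by position, so the
# first alphabetical level "E" is renamed to "N", "N" to "NE", and so on
relabelled <- wd
levels(relabelled) <- compass
as.character(relabelled) # "N" "NE" "E" "SE" "S" "SW" "W" "NW"
```

In the relabelled vector the first element, which was East, is now called "N", which matches the symptom described for Attempt 2.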
Therefore, you must also change the order of your data. This is done with knime.in <- knime.in[order(knime.in$WD.Binned),] (or the dplyr equivalent). You can now get a graph showing the levels in the correct order, as I showed in the first graph of this answer.
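As a quick check of the reordering step, here is a base R sketch using the day-120 counts from the question. The data frame name day120 is an assumption for illustration, and the raw counts are used rather than the Count-1 plotted above.

```r
# Day-120 rows as they are stored in the original data (alphabetical order)
day120 <- data.frame(
  WD.Binned = c("E", "N", "NE", "NW", "S", "SE", "SW", "W"),
  Count     = c(435, 68, 112, 46, 15, 130, 52, 588)
)

# Put the levels in compass order, as in the answer
day120$WD.Binned <- factor(day120$WD.Binned,
                           levels = c("N", "NE", "E", "SE", "S", "SW", "W", "NW"))

# order() on a factor sorts by the integer level codes, i.e. by level
# order, so the rows now follow the compass order as well
day120 <- day120[order(day120$WD.Binned), ]
as.character(day120$WD.Binned)  # "N" "NE" "E" "SE" "S" "SW" "W" "NW"
day120$Count                    # 68 112 435 130 15 52 588 46

# Running totals: the band boundaries if the counts are stacked in this order
cumsum(day120$Count)            # 68 180 615 745 760 812 1400 1446
```

The counts after sorting agree with the day-120 values listed in the question, so levels, rows and labels are in the same order.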