dplyr: cross-tabulation with pipes
I have two questions about dplyr, both related to the problem I am trying to solve:
- How can I cross-tabulate a data_frame with pipes, when I want to pipe the result of a series of operations to xtabs?
- The pipe argument is usually denoted by . in dplyr and magrittr, but . is also the token that means "everything else" in the formula interface. I know there is an open issue in dplyr (can't find it right now) that talks about replacing . with _.
Here's an example:
    library(dplyr)      # provides %>%
    library(wakefield)  # provides r_sample_factor() and r_sample_logical()

    wakefield::r_data_frame(
      n = 100,
      cat1 = r_sample_factor(x = LETTERS[1:3]),
      cat2 = r_sample_factor(x = LETTERS[1:3]),
      cat3 = r_sample_factor(x = LETTERS[1:3]),
      bin1 = r_sample_logical()
    ) %>%
      dplyr::filter(bin1) %>%
      xtabs(. ~ cat1 + cat2 + cat3, data = .)
which fails with output:
Error in model.frame.default(formula = . ~ cat1 + cat2 + cat3, data = .) :
invalid type (list) for variable '.'
because magrittr replaces the first . with the data_frame produced by the preceding steps. One workaround is to omit the first . completely, for example:
    wakefield::r_data_frame(
      n = 100,
      cat1 = r_sample_factor(x = LETTERS[1:3]),
      cat2 = r_sample_factor(x = LETTERS[1:3]),
      cat3 = r_sample_factor(x = LETTERS[1:3]),
      bin1 = r_sample_logical()
    ) %>%
      dplyr::filter(bin1) %>%
      xtabs(~ cat1 + cat2 + cat3, data = .)
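To make the difference between the two calls easier to poke at, here is a minimal base R sketch. It rests on one assumption: that the pipe behaves roughly like calling a function whose argument is named ., which is only an approximation of what magrittr does. The data frame df and the helpers one_sided and two_sided are made-up names for illustration, not part of the original example.

```r
# Base R only. `df` is a small made-up stand-in for the filtered data.
df <- data.frame(
  cat1 = c("A", "B", "A", "C"),
  cat2 = c("B", "B", "C", "A")
)

# Assumption: the pipe acts roughly like a function whose argument is `.`
one_sided <- function(.) xtabs(~ cat1 + cat2, data = .)
two_sided <- function(.) xtabs(. ~ cat1 + cat2, data = .)

# One-sided formula: counts the rows in each cat1 x cat2 cell
tab <- one_sided(df)

# Two-sided formula: capture the error message instead of stopping
msg <- tryCatch(two_sided(df), error = function(e) conditionMessage(e))

print(tab)
print(msg)
```

If the assumption holds, the second call should fail in the same way as the piped version, because the . on the left-hand side is looked up like any other variable and resolves to the whole data frame.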
But what if the . had to go on the other side of the formula?
Edit:
As @MrFlick pointed out, xtabs doesn't accept . on the LHS anyway. I thought the problem could also be illustrated by the conflict with . on the RHS, so I was expecting an error from this code:
    wakefield::r_data_frame(
      n = 100,
      cat1 = r_sample_factor(x = LETTERS[1:3]),
      cat2 = r_sample_factor(x = LETTERS[1:3]),
      cat3 = r_sample_factor(x = LETTERS[1:3]),
      bin1 = r_sample_logical()
    ) %>%
      dplyr::filter(bin1) %>%
      dplyr::select(-bin1) %>%
      xtabs(~ ., data = .)
but this works exactly as expected. Can someone explain why magrittr does not try to replace the first . with the data_frame?
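One way to investigate, offered as a sketch rather than a definitive answer: in base R, terms() expands a . on the right-hand side of a formula into the column names of data before any variable lookup happens, so the RHS . may never be treated as a variable at all. The snippet below checks this without magrittr, again assuming that a function with an argument named . is a fair stand-in for the pipe. The names df, labels, via_dot and explicit are hypothetical.

```r
# Base R only. `df` is a small made-up data frame with two categorical columns.
df <- data.frame(
  cat1 = c("A", "B", "A", "C"),
  cat2 = c("B", "B", "C", "A")
)

# With `data` supplied, terms() expands a `.` on the RHS to the column names
labels <- attr(terms(~ ., data = df), "term.labels")

# Assumption: a function with argument `.` stands in for the pipe
via_dot  <- (function(.) xtabs(~ ., data = .))(df)
explicit <- xtabs(~ cat1 + cat2, data = df)

print(labels)
print(all(via_dot == explicit))
```

If the two tables agree, that would suggest the formula's . and the pipe's . are resolved by separate mechanisms: the first by the formula machinery against the columns of data, the second as an ordinary variable bound by the pipe.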