R: Split data in ggplot by another factor

I'm just starting out with R, so I have little experience. I ran into a problem when trying to split a scatterplot into groups based on infection status. In this example, the dataset contains log-transformed antibody levels (e.g. logapfhap2). Infection status (any_Pf_inf) is coded as "Yes" or "No" and indicates whether someone became infected during the subsequent period. I am plotting time points (x) against antibody levels (y). For time points 1 and 14, I would like to split the points into two groups by infection status.

This is the bulk of the code I use to plot the data without splitting into groups:

ggplot() + 
    geom_jitter(data=data2, aes(x='1', y=logapfhap2, colour='PfHAP2A')) + 
    geom_jitter(data=data2,aes(x='14', y=logbpfhap2, colour='PfHAP2B')) + 
    geom_jitter(data=TRC, aes(x='C', y=PfHAP2, colour='PfHAP2C'))

      

which results in this graph:

[image: resulting graph]

Then I tried to split it by infection status (I'm only showing the first time point here), which returns an error.

ggplot() + 
    geom_jitter(data=data2[data2$any_Pf_inf=='Yes'], 
                aes(x='1inf', y=logapfhap2[data2$any_Pf_inf=='Yes'], 
                colour='PfHAP2A')) + 
    geom_jitter(data=data2[data2$any_Pf_inf=='No'], 
                aes(x='1un', y=logapfhap2[data2$any_Pf_inf=='No'], 
                colour='PfHAP2B')) 

      

I wanted to create this graph: [image: desired graph]. Instead, I get this error:

Error: boolean index vector length must be 1 or 55, received: 482

I hope this is clear! Can anyone help me with this problem? Thanks!

EDIT: Not sure if this makes it clearer, but this is what my data looks like: [image: data preview] [image: data preview]



1 answer


I tried a few other things and have now solved it!

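ggplot() + 
    # Keep only rows where any_Pf_inf is 'Yes' (the trailing comma selects rows)
    geom_jitter(data=data2[data2$any_Pf_inf=='Yes',], 
                aes(x='1inf', y=logapfhap2, 
                colour='PfHAP2A')) + 
    # Keep only rows where any_Pf_inf is 'No'
    geom_jitter(data=data2[data2$any_Pf_inf=='No',],
                aes(x='1un', y=logbpfhap2, 
                colour='PfHAP2B'))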

      



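The fix is to add a comma after the condition, as in data2[data2$any_Pf_inf == 'Yes', ], so that the logical vector selects rows instead of columns. Without the comma, the vector is matched against the columns of data2, which is what the error reports: it has 482 elements (one per row), while the column index must have length 1 or 55. Since the data passed to each geom_jitter is already filtered, the y aesthetic no longer needs to be subset inside aes().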
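To make the row-versus-column difference concrete, here is a minimal sketch in base R. The data frame toy and its values are made up for illustration; only the column names are borrowed from the question. It assumes a plain data.frame, so the wording of the error will differ from the one quoted in the question.

```r
# Hypothetical stand-in for data2: 6 rows, 3 columns (made-up values)
toy <- data.frame(
  any_Pf_inf = c("Yes", "No", "Yes", "No", "No", "Yes"),
  logapfhap2 = c(1.2, 0.8, 1.5, 0.4, 0.9, 1.1),
  logbpfhap2 = c(1.0, 0.7, 1.6, 0.5, 0.6, 1.3)
)

# Logical vector with one element per row
is_inf <- toy$any_Pf_inf == "Yes"

# With the comma: the vector selects rows, and all columns are kept
infected   <- toy[is_inf, ]
uninfected <- toy[!is_inf, ]

# Without the comma: the same vector is matched against the columns.
# It has 6 elements but there are only 3 columns, so the call fails.
# try() captures the error so the script keeps running.
res <- try(toy[is_inf], silent = TRUE)
failed <- inherits(res, "try-error")
```

Here infected and uninfected each hold the matching rows with all three columns, while failed is TRUE because the comma-less form errors. The filtered data frames can then be passed to the data argument of geom_jitter as in the answer above.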
