R- How to build correct pie charts in haploNet haplotyp Networks {pegas} {ape} {adegenet}

While using the haploNet package to generate some plots on the haplotype network, I used a script available on the internet for this. However, I think something is wrong. The script is available as a wood muscle example. The code I use is:

x <- read.dna(file="Masto.fasta",format="fasta")
h <- haplotype(x)
net <- haploNet(h)
plot(net)

plot(net, size = attr(net, "freq"), fast = TRUE)
plot(net, size = attr(net, "freq"))
plot(net, size=attr(net, "freq"), scale.ratio = 2, cex = 0.8

table(rownames(x))

ind.hap<-with(
    stack(setNames(attr(h, "index"), rownames(h))), 
    table(hap=ind, pop=rownames(x)[values])
)
ind.hap 

plot(net, size=attr(net, "freq"), scale.ratio = 2, cex = 0.8, pie=ind.hap)
legend(50,50, colnames(ind.hap), col=rainbow(ncol(ind.hap)), pch=20)

legend(x=7,y=10,c("Baeti ero","Felege weyni","Golgole naele","Hagare selam","Ruba feleg","Ziway"),c("red","yellow","green","turquoise","blue","magenta"))

      

However, as you build ind.hap, you may notice that some of the lines are in the wrong place. You can see it here:

      pop
hap    Baetiero ETH022 ETH742 Felegeweyni Golgolenaele Rubafeleg
  I           0      0      1           0            0         0
  II          0      1      0           0            0         0
  III         1      0      0           1            0         1
  IV          2      0      0           0            0         3
  IX          0      0      0           1            0         0
  V           4      0      0           0            2         0
  VI          4      0      0           1            0         4
  VII         2      0      0           1            0         0
  VIII        0      0      0           1            0         1
  X           3      0      0           0            1         0
  XI          0      0      0           0            1         1
  XII         0      0      0           1            0         0
  XIII        0      0      0           0            0         1

      

You can see that line IX is not in the correct place. It wouldn't be a big problem, but the program takes up line 9 to make a pie graph for IX, which is VIII data. This is the result: (I couldn't insert the image as my reputation is below 10 ... you still get the image by executing the whole file)

You can see that for V through IX this is not how it should be (these are the swapped lines). For example: IX only has 1 haplotype, but there is a pie chart for 2 haplotypes (which both have 50% of the chart) that is generated using VIII data. Since the strings are sorted alphabetically, not ascending, but this is inherent in the package, I don't know what to do. I'm far from a master in R, so try not to be too abstract, but provide some code instead.

If someone knows this package very well, please also explain why there are these strange extra lines behind the real diagrams (they are numbered on them), since they were not visible in the woodmouse example (maybe because what is also wrong?)

Thanx in advance

0


source to share


1 answer


I was struggling with the same problem but believe I came up with a solution.

The problem is that the step of doing table

haplotype counts per "population" is ordering the haplotypes alphabetically. So, for example, haplotype "IX" precedes "V". On the other hand, the function haplotype()

sorts the haplotypes according to their "numerical" order. And this is what creates the inconsistency in the plotting.

This can be solved by sorting the haplotype object by "label" as described in the help ?haplotype

.

I'll use sample data woodmouse

for illustration:

# Sample 9 distinct haplotypes
library(pegas)
data(woodmouse)
x <- woodmouse[sample(9, 100, replace = T), ]

      

To keep things simple, I am creating a function to create a table of haplotype counters (based on this post ):

countHap <- function(hap = h, dna = x){
    with(
        stack(setNames(attr(hap, "index"), rownames(hap))),
        table(hap = ind, pop = attr(dna, "dimnames")[[1]][values])
    )
}

      

Now let's look at the result without sorting haplotypes:

h <- haplotype(x) # create haplotype object
net <- haploNet(h) # create haploNet object

plot(net, pie = countHap(), size = attr(net, "freq"), legend = T)

      

enter image description here

Now let's look at our count table to check these results:



countHap(h, x)

      pop
hap    No0906S No0908S No0909S No0910S No0912S No0913S No304 No305 No306
  I          0       0       0       0       0       0     0     8     0
  II         0       0       0       0       0       0     9     0     0
  III        0       0       0       0       0       0     0     0    10
  IV        16       0       0       0       0       0     0     0     0
  IX         0       0       0       0       0       8     0     0     0
  V          0      12       0       0       0       0     0     0     0
  VI         0       0      10       0       0       0     0     0     0
  VII        0       0       0      13       0       0     0     0     0
  VIII       0       0       0       0      14       0     0     0     0

      

Things don't match: for example, haplotype "V" should appear in a separate "# 0908S", but instead it is colored as a separate "# 0913S" (which should be a label for haplotype "IX").

Now release the haplotypes:

h <- haplotype(x)
h <- sort(h, what = "labels") # This is the extra step!!
net <- haploNet(h)

plot(net, pie = countHap(), size = attr(net, "freq"), legend = T)

      

enter image description here

And now everything is fine!

Additionally :

While not requested by the OP, I thought to leave it here in case it is of interest to anyone else. Sometimes it is convenient for me to indicate haplotypes by their frequency. This can be done by changing the labels of the haplotypes to their equal frequencies:

attr(h, "labels") <- attr(h, "freq")
plot(net, pie = countHap(), size = attr(net, "freq"), legend = T)

      

enter image description here

+3


source







All Articles