“relevel” factors when using them to create dummy variables

June 29, 2007 at 10:11 am | In Data manipulation, GNU-R | 2 Comments

?relevel

The levels of a factor are re-ordered so that the level specified
by ‘ref’ is first and the others are moved down. This is useful
for ‘contr.treatment’ contrasts which take the first level as the
reference.
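As a small illustration of what this means for dummy variables, here is a minimal sketch using base R only. The `region` factor and its values are made up for the example; they are not from any real dataset.

```r
# Hypothetical data: a three-level factor (levels are sorted alphabetically)
region <- factor(c("north", "south", "west", "south", "north"))
levels(region)
# "north" "south" "west"

# Make "south" the first level, and therefore the reference category
region2 <- relevel(region, ref = "south")
levels(region2)
# "south" "north" "west"

# With the default treatment contrasts, dummy columns are created for every
# level except the reference, so "south" no longer gets a column of its own
dummy_names <- colnames(model.matrix(~ region2))
dummy_names
# "(Intercept)" "region2north" "region2west"
```

Without the call to `relevel`, "north" would be the reference level simply because it comes first alphabetically.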

Eben Moglen in Delhi on June 9

June 6, 2007 at 3:43 pm | In Linux | Leave a Comment

Working with ggplot, version 2

June 6, 2007 at 3:05 pm | In GNU-R, Graphics | Leave a Comment

In my previous post on this issue, I presented code that drew weighted boxplots and annotated them with boxplot statistics and mean values. The problem with that code was that it printed the annotations right on the vertical axis of each boxplot. A relatively minor problem was that when the values of two statistics were too close to each other, they were printed one on top of the other.

Because the horizontal axis was discrete, ggplot could not position the annotations in between two boxplots.

With Hadley’s help again, I have managed to fix both of these problems. In the output of the following code, annotations are printed at a fixed distance away from the vertical axis. This is achieved by using grid rather than ggplot to print annotations. With a minor tweak, the distance between annotations is increased if they are too close to each other.
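Before the full listing, here is a rough sketch of the grid idea on its own, independent of ggplot. It assumes a viewport whose "native" x scale runs over the positions of three discrete categories; the viewport name `panel`, the scales and the label value are all made up for illustration. The point is only that grid accepts a fractional position such as 1.4, which lies between the first and second category.

```r
library(grid)

pdf(NULL)        # null device, so the sketch also runs without a display
grid.newpage()

# Hypothetical panel: categories sit at x = 1, 2, 3 and y runs from 0 to 100
pushViewport(viewport(name = "panel",
                      xscale = c(0.5, 3.5), yscale = c(0, 100)))
upViewport()

# Print a label 0.4 units to the right of the first category, at y = 42,
# addressing the panel by its viewport name
label <- grid.text("42",
                   x = unit(1 + 0.4, "native"),
                   y = unit(42, "native"),
                   gp = gpar(col = "blue"),
                   vp = vpPath("panel"))
dev.off()
```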

The revised code is as follows.

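library(ggplot)  # the original ggplot package, not ggplot2
library(grid)    # grid.text(), unit() and gpar() for the annotations

# Weighted boxplots annotated with the median (blue) and the weighted
# mean (magenta). `dataset` must contain the columns x (the discrete
# grouping variable), y and Multiplier (the weights). xvar and yvar are
# used only as axis labels; v1 and v2 are the limits of the y axis.
vplot2 <- function(dataset, xvar, yvar, v1, v2) {
  ggopt(axis.colour = "black")
  p <- ggplot(dataset, aesthetics = list(x = x, y = y, weight = Multiplier),
              colour = "black")
  p$xlabel <- xvar
  p$ylabel <- yvar
  p <- ggpoint(ggboxplot(p, colour = "black", orientation = "vertical"))

  # One row per level of x: the third boxplot statistic
  # (the median, following the boxplot.stats() layout)
  cl <- split(dataset, dataset$x)
  dots <- do.call(rbind, lapply(cl, function(df) {
    data.frame(x = df[1, ]$x,
               dots = boxplot_stats_weighted(df$y,
                                             weights = df$Multiplier)$stats[3])
  }))

  # One row per level of x: the weighted mean
  means <- do.call(rbind, lapply(cl, function(df) {
    data.frame(x = df[1, ]$x,
               mean = weighted.mean(df$y, df$Multiplier))
  }))

  p <- ggpoint(p, data = means, aes = list(x = x, y = mean), colour = "magenta")
  p <- pscontinuous(p, variable = "y", range = c(v1, v2))
  print(p, pretty = FALSE)

  # Print the medians with grid, 0.4 units to the right of each boxplot
  grid.text(format(dots$dots, digits = 2),
            x = unit(as.numeric(dots$x) + 0.4, "native"),
            y = unit(dots$dots, "native"), gp = gpar(col = "blue"),
            vp = "layout::panel_1_1")

  # Gap between mean and median as a fraction of the axis range; if the
  # two labels would be closer than 3% of the range, push them 3% apart
  means$dots <- dots$dots
  means$diff <- (means$mean - means$dots) / (v2 - v1)
  means$diff <- ifelse(means$diff >= 0 & means$diff < 0.03, 0.03, means$diff)
  means$diff <- ifelse(means$diff < 0 & means$diff > -0.03, -0.03, means$diff)
  means$y <- means$dots + (means$diff * (v2 - v1))

  # Print the means at the (possibly nudged) positions
  grid.text(format(means$mean, digits = 2),
            x = unit(as.numeric(means$x) + 0.4, "native"),
            y = unit(means$y, "native"), gp = gpar(col = "magenta"),
            vp = "layout::panel_1_1")
}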
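
The spacing tweak near the end of the function can be tried out on its own, without ggplot. The following is a minimal sketch in base R; the helper name `nudge_label` and its arguments are hypothetical and do not appear in the code above, but the arithmetic mirrors the `means$diff` lines, with the 0.03 threshold exposed as `min_gap`.

```r
# Hypothetical helper: return the y position at which to print a label for
# `value` so that it sits at least `min_gap` (as a fraction of the axis
# range) away from a label already printed at `anchor`.
nudge_label <- function(value, anchor, lower, upper, min_gap = 0.03) {
  range_width <- upper - lower
  gap <- (value - anchor) / range_width
  # Too close above (or exactly equal): push up to the minimum gap
  gap <- ifelse(gap >= 0 & gap < min_gap, min_gap, gap)
  # Too close below: push down to the minimum gap
  gap <- ifelse(gap < 0 & gap > -min_gap, -min_gap, gap)
  anchor + gap * range_width
}

# Three made-up means against a median of 10 on an axis from 0 to 100:
# the first two are within 3 units of the median and get moved,
# the third is far enough away and stays where it is
positions <- nudge_label(value = c(10.5, 9.8, 30),
                         anchor = c(10, 10, 10),
                         lower = 0, upper = 100)
positions
```

Note that only the printed position of the label moves; the text of the label still shows the true mean.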

