r - Reshape and cast to compare against one level
I have data where I want to compare the value at one level of a variable with the values at the other levels of that variable. Each time I write the code for this, I wish it were easier. Here's an example of the problem:
Suppose I want to compare the average cost of diamonds of each cut with the average cost of the best-cut (Ideal) diamonds. To make things fair, I want to do this for each clarity separately.
Let's check that we have enough data:
> with(diamonds, table(cut, clarity))
           clarity
cut           I1  SI2  SI1  VS2  VS1 VVS2 VVS1   IF
  Fair       210  466  408  261  170   69   17    9
  Good        96 1081 1560  978  648  286  186   71
  Very Good   84 2100 3240 2591 1775 1235  789  268
  Premium    205 2949 3575 3357 1989  870  616  230
  Ideal      146 2598 4282 5071 3589 2606 2047 1212
OK, there are no zeroes for Ideal, so let's calculate the means.
> claritycut <- ddply(diamonds, .(clarity, cut), summarize, price = mean(price))
> claritycut
   clarity       cut    price
1       I1      Fair 3703.533
2       I1      Good 3596.635
3       I1 Very Good 4078.226
4       I1   Premium 3947.332
5       I1     Ideal 4335.726
6      SI2      Fair 5173.916
7      SI2      Good 4580.261
8      SI2 Very Good 4988.688
9      SI2   Premium 5545.937
10     SI2     Ideal 4755.953
...
The end result I want is:
   clarity  variable     ratio
1       I1      Fair 0.8541899
2       I1      Good 0.8295348
3       I1 Very Good 0.9406098
4       I1   Premium 0.9104200
5       I1     Ideal 1.0000000
6      SI2      Fair 1.0878822
7      SI2      Good 0.9630586
8      SI2 Very Good 1.0489356
9      SI2   Premium 1.1661043
10     SI2     Ideal 1.0000000
...
But I'm not sure how to do it neatly. Most of the rest of this question concerns the intermediate step in the calculation, the one before the divide.
Now I want to calculate the relative price of the other cuts versus Ideal. Here's the data frame I'd expect to see partway through the calculation, after extracting one level of cut:
> claritycutideal <- join(subset(claritycut, cut != "Ideal"),
+                         summarize(subset(claritycut, cut == "Ideal"), ideal = price, clarity))
> print(claritycutideal)
Joining by: clarity
  clarity       cut    price    ideal
1      I1      Fair 3703.533 4335.726
2      I1      Good 3596.635 4335.726
3      I1 Very Good 4078.226 4335.726
4      I1   Premium 3947.332 4335.726
5     SI2      Fair 5173.916 4755.953
6     SI2      Good 4580.261 4755.953
7     SI2 Very Good 4988.688 4755.953
8     SI2   Premium 5545.937 4755.953
...
This works, but the statement above is fiddly to write, and I still need to finish the calculation off, mentioning the ideal name yet again.
> mutate(claritycutideal,ratio=price/ideal)
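For comparison, the same lookup-then-divide idea can be sketched in base R without a join. This is only a minimal sketch: it rebuilds the first ten means printed above by hand instead of loading the diamonds data, and the name `ideal_price` is a hypothetical one chosen for illustration.

```r
# The ten means shown above, typed in by hand so the sketch is self-contained
claritycut <- data.frame(
  clarity = rep(c("I1", "SI2"), each = 5),
  cut     = rep(c("Fair", "Good", "Very Good", "Premium", "Ideal"), times = 2),
  price   = c(3703.533, 3596.635, 4078.226, 3947.332, 4335.726,
              5173.916, 4580.261, 4988.688, 5545.937, 4755.953),
  stringsAsFactors = FALSE
)

# Named lookup vector (hypothetical name): one Ideal price per clarity
ideal_price <- with(subset(claritycut, cut == "Ideal"), setNames(price, clarity))

# Index the lookup by each row's clarity, then divide
claritycut$ratio <- claritycut$price / unname(ideal_price[claritycut$clarity])
print(claritycut)
```

The Ideal rows are kept and come out with a ratio of exactly 1, which matches the desired end result rather than the intermediate table.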
It feels like I want something like this:
> cast(claritycut, clarity ~ cut)
Using clarity, cut as id variables
  clarity     Fair     Good Very Good  Premium    Ideal
1      I1 3703.533 3596.635  4078.226 3947.332 4335.726
2     SI2 5173.916 4580.261  4988.688 5545.937 4755.953
3     SI1 4208.279 3689.533  3932.391 4455.269 3752.118
4     VS2 4174.724 4262.236  4215.760 4550.331 3284.550
...
This is totally unsuitable for the calculation I have in mind, since I would need to know the names of the recast levels in the calculation.
I'd like to recast in a way that filters out only the levels to be extracted and leaves the rest untouched, for instance:
> cast(claritycut, clarity ~ cut, subset = cut == "Ideal")
That option exists, but it doesn't retain the unfiltered levels.
I would then need to melt again, and while there's a recast, there's no remelt.
Does anyone have a neat trick for this?
Or perhaps I'm looking at this the wrong way: could marginal calculations do this for me?
The following works exactly right, but it is fiddly:
> valuevars = function(x) x[!names(x) %in% attr(x, "idvars")]
> melt(ddply(cast(claritycut, clarity ~ cut), .(clarity), function(x) valuevars(x) / x$Ideal))
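The cast, divide, melt round trip can also be sketched in base R with a matrix standing in for the cast data frame. This is a rough sketch under assumptions, not the reshape package's own mechanism: it uses `tapply` for the wide step and `as.data.frame(as.table(...))` for the long step, on the ten means shown earlier typed in by hand.

```r
# The ten means shown earlier, typed in by hand so the sketch is self-contained
claritycut <- data.frame(
  clarity = rep(c("I1", "SI2"), each = 5),
  cut     = rep(c("Fair", "Good", "Very Good", "Premium", "Ideal"), times = 2),
  price   = c(3703.533, 3596.635, 4078.226, 3947.332, 4335.726,
              5173.916, 4580.261, 4988.688, 5545.937, 4755.953),
  stringsAsFactors = FALSE
)

# "Cast": a clarity-by-cut matrix of prices (one value per cell, so mean is a no-op)
wide <- with(claritycut, tapply(price, list(clarity, cut), mean))

# Divide every row by its own Ideal entry; the vector has one element per row,
# so R's column-wise recycling lines it up with the rows of the matrix
ratios <- wide / wide[, "Ideal"]

# "Melt": back to one row per clarity/cut pair
long <- as.data.frame(as.table(ratios), responseName = "ratio")
names(long)[1:2] <- c("clarity", "cut")
print(long)
```

The only level name mentioned is "Ideal", and it appears once. The rows come back ordered by cut and then clarity, with the cuts in alphabetical order, so they would need sorting to match the layout of the desired result.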
I'm not sure whether this is neat enough, but here is a two-liner:
# code
library(plyr)     # ddply, summarize
library(ggplot2)  # provides the diamonds data
claritycut <- ddply(diamonds, .(clarity, cut), summarize, price = mean(price))

# 1: this works
transform(merge(claritycut, subset(claritycut, cut == "Ideal"), by = "clarity"),
          ratio = price.x / price.y)

# 2: another way
ddply(claritycut, .(clarity), function(x)
  data.frame(cut = x$cut, rate = x$price / subset(x, cut == "Ideal")$price))

# 3: a third way
ddply(claritycut, .(clarity), summarize, cut = cut, rate = price / price[cut == "Ideal"])
And 4) here is a one-liner version:
ddply(diamonds, .(clarity), function(x)
  transform(ddply(x, .(cut), summarize, rate = mean(price)),
            rate = rate / mean(subset(x, cut == "Ideal")$price)))
But it is complicated.
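Since the question is about not rewriting this each time, the per-group division can be wrapped in a small helper. The sketch below is base R only; the function name `ratio_to_level` and its arguments are hypothetical, and it is tried on the first ten means from the question, typed in by hand.

```r
# Hypothetical helper: within each group, divide a value column by the value
# found at one reference level of another column.
ratio_to_level <- function(df, group, level_col, value_col, ref) {
  pieces <- lapply(split(df, df[[group]]), function(x) {
    # Assumes each group has exactly one row at the reference level;
    # a group without it makes this assignment fail with an error.
    x$ratio <- x[[value_col]] / x[[value_col]][x[[level_col]] == ref]
    x
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# The first ten means from the question, typed in by hand
claritycut <- data.frame(
  clarity = rep(c("I1", "SI2"), each = 5),
  cut     = rep(c("Fair", "Good", "Very Good", "Premium", "Ideal"), times = 2),
  price   = c(3703.533, 3596.635, 4078.226, 3947.332, 4335.726,
              5173.916, 4580.261, 4988.688, 5545.937, 4755.953),
  stringsAsFactors = FALSE
)

out <- ratio_to_level(claritycut, "clarity", "cut", "price", ref = "Ideal")
print(out)
```

This is the same idea as way 3 above, with the reference level passed in as an argument so that it is named only once per call.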