r - Reshape cast compare to one level


I have some data where I want to compare the value of one level of a variable to the other levels of that variable. Each time I write the code, I wish it were easier. Here's an example of the problem:

Suppose I want to compare the average cost of diamonds of each cut to the average cost of the best cut diamonds. To make things fair, I want to do this for each clarity separately.

Let's check we have enough data:

> with(diamonds,table(cut,clarity))
           clarity
cut           I1  SI2  SI1  VS2  VS1 VVS2 VVS1   IF
  Fair       210  466  408  261  170   69   17    9
  Good        96 1081 1560  978  648  286  186   71
  Very Good   84 2100 3240 2591 1775 1235  789  268
  Premium    205 2949 3575 3357 1989  870  616  230
  Ideal      146 2598 4282 5071 3589 2606 2047 1212

Okay, no zeroes in there, so let's calculate the mean.

> claritycut<-ddply(diamonds,.(clarity,cut),summarize,price=mean(price))
> claritycut
   clarity       cut    price
1       I1      Fair 3703.533
2       I1      Good 3596.635
3       I1 Very Good 4078.226
4       I1   Premium 3947.332
5       I1     Ideal 4335.726
6      SI2      Fair 5173.916
7      SI2      Good 4580.261
8      SI2 Very Good 4988.688
9      SI2   Premium 5545.937
10     SI2     Ideal 4755.953
...

The end result I want is:

   clarity  variable     ratio
1       I1      Fair 0.8541899
2       I1      Good 0.8295348
3       I1 Very Good 0.9406098
4       I1   Premium 0.9104200
5       I1     Ideal 1.0000000
6      SI2      Fair 1.0878822
7      SI2      Good 0.9630586
8      SI2 Very Good 1.0489356
9      SI2   Premium 1.1661043
10     SI2     Ideal 1.0000000
...

But I'm not sure how to do this neatly. Most of the rest of the question concerns an intermediate step in the calculation, the divide.

Now I want to calculate the relative price of the cuts versus the Ideals. Here's the data frame I'd expect to see partway through the calculation, extracting one level of cut:

> claritycutideal <- join(subset(claritycut,cut!="Ideal"),summarize(subset(claritycut,cut=="Ideal"),ideal=price,clarity))
> print(claritycutideal)
Joining by: clarity
   clarity       cut    price    ideal
1       I1      Fair 3703.533 4335.726
2       I1      Good 3596.635 4335.726
3       I1 Very Good 4078.226 4335.726
4       I1   Premium 3947.332 4335.726
5      SI2      Fair 5173.916 4755.953
6      SI2      Good 4580.261 4755.953
7      SI2 Very Good 4988.688 4755.953
8      SI2   Premium 5545.937 4755.953
...

Which works, but it's fiddly to write the above statement, and I still need to finish the calculation off, mentioning the ideal name again.

> mutate(claritycutideal,ratio=price/ideal) 

It feels like I want something like:

> cast(claritycut,clarity~cut)
Using clarity, cut as id variables
  clarity     Fair     Good Very Good  Premium    Ideal
1      I1 3703.533 3596.635  4078.226 3947.332 4335.726
2     SI2 5173.916 4580.261  4988.688 5545.937 4755.953
3     SI1 4208.279 3689.533  3932.391 4455.269 3752.118
4     VS2 4174.724 4262.236  4215.760 4550.331 3284.550
...

This is unsuitable for the calculation I mean to do, since I would need to know the names of the recast levels in the calculation.

I'd like to recast, but with a way to filter which levels are extracted and leave the rest untouched, for instance:

> cast(claritycut,clarity~cut,subset=cut=="Ideal")

Which exists, but doesn't retain the unfiltered levels.

I then need to melt again, and while there's a recast, there's no remelt.
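One way to sketch the cast, divide, remelt round trip without naming any level except the reference one is to go through a matrix in base R. This is only an illustration, not the reshape package's own mechanism: `tapply` plays the role of the cast, `as.data.frame(as.table(...))` plays the role of the melt, and the small `claritycut` below is rebuilt by hand from the I1 and SI2 means printed in the question (labels follow the `diamonds` level names).

```r
# Mean prices for two clarities, copied from the printed means in the question
claritycut <- data.frame(
  clarity = rep(c("I1", "SI2"), each = 5),
  cut     = rep(c("Fair", "Good", "Very Good", "Premium", "Ideal"), times = 2),
  price   = c(3703.533, 3596.635, 4078.226, 3947.332, 4335.726,
              5173.916, 4580.261, 4988.688, 5545.937, 4755.953),
  stringsAsFactors = FALSE
)

# "Cast": one row per clarity, one column per cut
wide <- with(claritycut, tapply(price, list(clarity = clarity, cut = cut), mean))

# Divide every column by the Ideal column; the vector is recycled down the
# rows, so each row is divided by its own Ideal price
ratio_wide <- wide / wide[, "Ideal"]

# "Remelt": back to one row per clarity/cut pair
ratios <- as.data.frame(as.table(ratio_wide), responseName = "ratio")
print(ratios)
```

Only `"Ideal"` is mentioned by name, so the other levels can change without touching the code.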

Does anyone have a neat trick for this?

Or perhaps I'm looking at this the wrong way: could marginal calculations help me?


The following works exactly right but is fiddly:

> valuevars=function(x)x[!names(x)%in%attr(x,"idvars")]
> melt(ddply(cast(claritycut,clarity~cut),.(clarity),
             function(x)valuevars(x)/x$Ideal))

I'm not sure if this is neat enough, but here is a two-liner:

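library(plyr)     # ddply, summarize
library(ggplot2)  # provides the diamonds data

# Setup: mean price for each clarity/cut combination
claritycut <- ddply(diamonds,.(clarity,cut),summarize,price=mean(price))

# 1) Merge each row with the Ideal row of the same clarity, then divide
transform(merge(claritycut, subset(claritycut, cut=="Ideal"), by="clarity"),
  ratio = price.x / price.y)

# 2) Split by clarity and divide by that clarity's Ideal price
ddply(claritycut, .(clarity),
      function(x) data.frame(cut=x$cut,
                             rate=x$price / subset(x, cut == "Ideal")$price))

# 3) The same idea, written with summarize
ddply(claritycut, .(clarity),
      summarize, cut=cut, rate=price / price[cut == "Ideal"])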

And 4) here is a one-liner version:

ddply(diamonds, .(clarity),
      function(x) transform(ddply(x, .(cut),
                                  summarize, rate=mean(price)),
                            rate=rate/mean(subset(x, cut=="Ideal")$price)))

But it is complicated.
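For comparison, the same ratio can be sketched in base R with `ave()`, which spreads each clarity's Ideal price across all rows of that clarity. This is an alternative I am adding, not one of the plyr approaches above, and it assumes exactly one Ideal row per clarity. The small `claritycut` is typed in from the I1 and SI2 means printed in the question so that the example runs without plyr or ggplot2.

```r
# Mean prices for two clarities, copied from the printed means in the question
claritycut <- data.frame(
  clarity = rep(c("I1", "SI2"), each = 5),
  cut     = rep(c("Fair", "Good", "Very Good", "Premium", "Ideal"), times = 2),
  price   = c(3703.533, 3596.635, 4078.226, 3947.332, 4335.726,
              5173.916, 4580.261, 4988.688, 5545.937, 4755.953),
  stringsAsFactors = FALSE
)

# Keep the price only on Ideal rows, then let ave() copy that single value
# to every row of the same clarity
ideal_price <- with(claritycut,
                    ave(ifelse(cut == "Ideal", price, NA), clarity,
                        FUN = function(p) p[!is.na(p)][1]))

# Ratio of each cut's mean price to the Ideal mean price for its clarity
claritycut$rate <- claritycut$price / ideal_price
print(claritycut)
```

The row order of `claritycut` is left untouched, which can be handy when the ratio is just one more column on an existing summary.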

