Assignment via `:=` in a `for` loop (R data.table)




I'm trying to assign new variables within a for loop (I'm trying to create variables that share a common structure but are subsample-dependent).

I've tried for the life of me to reproduce the error on sample data and can't. Here's some code that works and gets the gist of what I want to do:

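    library(data.table)

    dt <- data.table(id = rep(1:100, each = 20),
                     period = rep(-9:10, 100),
                     grp = rep(sample(4, size = 100, replace = TRUE), each = 20),
                     y = runif(2000, min = 0, max = 5),
                     key = c("id", "period"))[, x := cumsum(y), by = id]
    dt2 <- dt[id %in% seq(1, 100, by = 2), ]
    dt3 <- dt[id %in% seq(1, 100, by = 3), ]

    for (dd in list(dt, dt2, dt3)) {
      # key by grp, join the per-group sum at period 0, then restore the original key
      setkey(setkey(dd, grp)[dd[period == 0, sum(x), by = grp], x_at_0_by_grp := V1], id, period)
    }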

This works fine. However, when I run my own code, it generates the invalid .internal.selfref warning (and doesn't create the variable I want):

    In `[.data.table`(setkey(dt, treatment), dt[posting_rel == 0, sum(current_balance), :
      Invalid .internal.selfref detected and fixed by taking a copy of the whole table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or been created manually using structure() or similar). Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. Also, in R<=v3.0.2, list(DT1,DT2) copied the entire DT1 and DT2 (R's list() used to copy named objects); please upgrade to R>v3.0.2 if that is biting. If this message doesn't help, please report to datatable-help so the root cause can be fixed.

In fact, when I subset my data to only the columns needed for the merge, it works fine on my data (though it doesn't save to the original data sets).

This suggests to me that it's a problem with keying, but I'm explicitly setting the keys every step of the way. I'm lost on how to debug this because I can't get the error to repeat except on my full data set.

If I break the operation out into steps, the error arises at the merge step:

    for (dd in list(dt, dt2, dt3)) {
      dummy <- dd[period == 0, sum(x), by = grp]
      setkey(dd, grp)
      dd[dummy, x_at_0_by_grp := V1]  # ***error here***
      setkey(dd, id, period)
    }

Quick update: this also produces the error if I cast it as an lapply instead of a for loop.

Any ideas what on earth is going on here?

Update: I've come up with a workaround by doing:

    nnames <- c("dt", "dt2", "dt3")
    dt_list <- list(dt, dt2, dt3)
    for (ii in 1:3) {
      dummy <- copy(dt_list[[ii]])  # work on an explicit copy
      dummy[, x_at_0_by_grp := sum(x[period == 0]), by = grp]
      assign(nnames[ii], dummy)     # overwrite the original object by name
    }

I would still like to understand what's going on, and perhaps find a better way of assigning variables iteratively in situations like this.
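One possible way to do the iterative assignment, sketched here under some assumptions: the tables are held in a named list (dt_list is an assumed name), the toy data below are made up and only reuse the sample's column names, and R is newer than v3.0.2 so that list() does not copy the tables. The grouped statistic is added to each element by reference, without a join and without copy() or assign().

```r
library(data.table)

set.seed(1)
# Toy data with the same column names as the sample above (values are made up)
dt <- data.table(id = rep(1:10, each = 4),
                 period = rep(-1:2, 10),
                 grp = rep(sample(2, size = 10, replace = TRUE), each = 4),
                 y = runif(40))
dt[, x := cumsum(y), by = id]

# Named list of tables; dt_list is an assumed name
dt_list <- list(dt = dt, dt2 = dt[id %% 2 == 1], dt3 = dt[id %% 3 == 1])

# Add the grouped statistic to every table by reference, with no join
for (nm in names(dt_list)) {
  dt_list[[nm]][, x_at_0_by_grp := sum(x[period == 0]), by = grp]
}

names(dt_list$dt2)
```

Because the column is computed with by = grp inside each table, no keys need to be set or restored in this version.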

With 20-30 criteria, keeping them outside of a list (with manual names like dt2, etc.) is too clunky, so I'll assume you have them all in dt_list.

I suggest making tables with just the stat you're computing, and then rbinding them:

    xxt <- rbindlist(lapply(1:length(dt_list), function(i)
      dt_list[[i]][, list(cond = i, xx = sum(x[period == 0])), by = grp]))

which creates

        grp cond       xx
     1:   1    1 623.3448
     2:   2    1 784.8438
     3:   4    1 699.2362
     4:   3    1 367.7196
     5:   1    2 323.6268
     6:   4    2 307.0374
     7:   2    2 447.0753
     8:   3    2 185.7377
     9:   1    3 275.4897
    10:   4    3 243.0214
    11:   2    3 149.6041
    12:   3    3 166.3626
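As a variation on the same idea, the condition index can be left to rbindlist() through its idcol argument instead of being written inside the lapply(). This is only a sketch on made-up toy data (the column names follow the sample, the values do not), and it assumes a data.table version recent enough to support idcol.

```r
library(data.table)

set.seed(1)
# Toy tables; only the column names match the sample data
dt <- data.table(id = rep(1:10, each = 4),
                 period = rep(-1:2, 10),
                 grp = rep(sample(2, size = 10, replace = TRUE), each = 4),
                 x = runif(40))
dt_list <- list(dt, dt[id %% 2 == 1], dt[id %% 3 == 1])

# One row per (table, grp); idcol records which list element each row came from
xxt <- rbindlist(
  lapply(dt_list, function(d) d[, list(xx = sum(x[period == 0])), by = grp]),
  idcol = "cond"
)
print(xxt)
```

With an unnamed list, cond holds the integer position of each table; with a named list it would hold the names instead.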

You can then merge if you really want the vars. For example, for dt2:

    myi <- 2
    setkey(dt_list[[myi]], grp)[xxt[cond == myi, list(grp, xx)]]
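If the goal is to attach the statistic as a column rather than to return a merged result, an update join is one option. The sketch below assumes a data.table version that supports the on= argument, uses made-up toy data, and reuses the column name x_at_0_by_grp from the question.

```r
library(data.table)

set.seed(1)
# Toy tables; only the column names match the sample data
dt <- data.table(id = rep(1:10, each = 4),
                 period = rep(-1:2, 10),
                 grp = rep(sample(2, size = 10, replace = TRUE), each = 4),
                 x = runif(40))
dt_list <- list(dt, dt[id %% 2 == 1], dt[id %% 3 == 1])

xxt <- rbindlist(lapply(seq_along(dt_list), function(i)
  dt_list[[i]][, list(cond = i, xx = sum(x[period == 0])), by = grp]))

myi <- 2
# Update join: look up xx by grp and add it as a new column by reference
dt_list[[myi]][xxt[cond == myi], on = "grp", x_at_0_by_grp := i.xx]

head(dt_list[[myi]])
```

Since on= does the matching, the table does not have to be re-keyed by grp and then keyed back afterwards.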

This doesn't address the bug you're running into, but I think it is a better approach.
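For looking into the warning itself, one thing that might be worth inspecting is how many column slots each table has allocated, since the warning text says the table was copied by R at some earlier point. This is only a diagnostic sketch on toy data: alloc_info and dt_list are assumed names, and my understanding (not something confirmed in this thread) is that a table copied by R or loaded from disk tends to report an allocated length of 0, or one no larger than its column count.

```r
library(data.table)

# Toy tables standing in for the real ones
dt <- data.table(id = 1:6, grp = rep(1:2, 3), x = runif(6))
dt_list <- list(dt = dt, dt2 = dt[id %% 2 == 1])

# Compare the number of columns with the number of allocated column slots
alloc_info <- rbindlist(
  lapply(dt_list, function(d)
    data.table(ncol = length(d), allocated = truelength(d))),
  idcol = "table"
)
print(alloc_info)
```

Running the same check on the full data set just before the failing `:=` might show which table, if any, lost its over-allocation.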
