python - Concise way of updating values based on column values
Background: I have a DataFrame whose values need to be updated using specific conditions. The original implementation I inherited used a lot of nested if statements wrapped in a loop, obfuscating what was going on. With readability in mind, I rewrote it as this:
    # Note: 'product' is also the name of a DataFrame method, so the column
    # is read as df['product'] rather than df.product.

    # other widgets
    df.loc[(
        (df['product'] == 0) &
        (df.prod_type == 'otherwidget') &
        (df.region == 'us')
    ), 'product'] = 5

    # supplier x - all clients
    df.loc[(
        (df['product'] == 0) &
        (df.region.isin(['uk', 'us'])) &
        (df.supplier == 'x')
    ), 'product'] = 6

    # supplier y - client a
    df.loc[(
        (df['product'] == 0) &
        (df.region.isin(['uk', 'us'])) &
        (df.supplier == 'y') &
        (df.client == 'a')
    ), 'product'] = 1

    # supplier y - client b
    df.loc[(
        (df['product'] == 0) &
        (df.region.isin(['uk', 'us'])) &
        (df.supplier == 'y') &
        (df.client == 'b')
    ), 'product'] = 3

    # supplier y - client c
    df.loc[(
        (df['product'] == 0) &
        (df.region.isin(['uk', 'us'])) &
        (df.supplier == 'y') &
        (df.client == 'c')
    ), 'product'] = 4
Problem: This works well and makes the conditions clear (in my opinion), but I'm not entirely happy because it takes up a lot of space. Is there any way to improve this from a readability/conciseness perspective?
Per EdChum's recommendation, I created a mask for each condition. The code below goes a bit overboard in terms of masking, but it gives the general sense.
    # Reusable boolean masks, one per condition.
    # 'product' is also a DataFrame method name, hence df['product'].
    prod_0   = (df['product'] == 0)
    ptype_ow = (df.prod_type == 'otherwidget')
    rgn_ukus = (df.region.isin(['uk', 'us']))
    rgn_us   = (df.region == 'us')
    supp_x   = (df.supplier == 'x')
    supp_y   = (df.supplier == 'y')
    clnt_a   = (df.client == 'a')
    clnt_b   = (df.client == 'b')
    clnt_c   = (df.client == 'c')

    # Combine the masks and assign the new product code to matching rows.
    df.loc[(prod_0 & ptype_ow & rgn_us), 'product'] = 5
    df.loc[(prod_0 & rgn_ukus & supp_x), 'product'] = 6
    df.loc[(prod_0 & rgn_ukus & supp_y & clnt_a), 'product'] = 1
    df.loc[(prod_0 & rgn_ukus & supp_y & clnt_b), 'product'] = 3
    df.loc[(prod_0 & rgn_ukus & supp_y & clnt_c), 'product'] = 4
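One possible way to shorten this further is to collect the conditions and their replacement values in two parallel lists and let numpy.select pick the value for each row. This is only a sketch, not something from the original post: the sample rows are made up for illustration, only the column names and the five rules come from the code above, and using numpy.select here is my own assumption about what "concise" could look like.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names follow the question,
# the rows themselves are invented for illustration.
df = pd.DataFrame({
    "product":   [0, 0, 0, 0, 0, 7, 0],
    "prod_type": ["otherwidget", "widget", "widget", "widget",
                  "widget", "otherwidget", "widget"],
    "region":    ["us", "uk", "us", "uk", "us", "us", "fr"],
    "supplier":  ["z", "x", "y", "y", "y", "x", "x"],
    "client":    ["a", "a", "a", "b", "c", "a", "a"],
})

# Masks shared by several rules.
prod_0 = df["product"] == 0
rgn_ukus = df["region"].isin(["uk", "us"])
supp_y = df["supplier"] == "y"

# One condition per rule, in priority order (the first match wins).
conditions = [
    prod_0 & (df["prod_type"] == "otherwidget") & (df["region"] == "us"),
    prod_0 & rgn_ukus & (df["supplier"] == "x"),
    prod_0 & rgn_ukus & supp_y & (df["client"] == "a"),
    prod_0 & rgn_ukus & supp_y & (df["client"] == "b"),
    prod_0 & rgn_ukus & supp_y & (df["client"] == "c"),
]
# Replacement value for each condition, in the same order.
choices = [5, 6, 1, 3, 4]

# Rows that match no condition keep their current product value.
df["product"] = np.select(conditions, choices,
                          default=df["product"].to_numpy())
```

Note that numpy.select takes the first matching condition for each row. That mirrors the unmasked version, where `product == 0` is re-evaluated before every assignment so an earlier rule cannot be overwritten by a later one. With a precomputed `prod_0` mask and sequential `df.loc` assignments, a row matching two rules would instead end up with the value from the later rule.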
python pandas