Python dictionary: remove duplicate key-value pairs
I have a file from which I need to remove the duplicated pairs (marked in bold).
The input file:
at1g01010 = 0005634
**at1g01010 = 0006355**
at1g01010 = 0003677
at1g01010 = 0007275
**at1g01010 = 0006355**
**at1g01010 = 0006355**
at1g01010 = 0006888
**at1g01020 = 0016125**
at1g01020 = 0016020
**at1g01020 = 0005739**
**at1g01020 = 0016125**
at1g01020 = 0003674
at1g01020 = 0005783
**at1g01020 = 0005739**
**at1g01020 = 0006665**
**at1g01020 = 0006665**
Expected output:
at1g01010 = 0005634
at1g01010 = 0006355
at1g01010 = 0003677
at1g01010 = 0007275
at1g01010 = 0006888
at1g01020 = 0016125
at1g01020 = 0016020
at1g01020 = 0005739
at1g01020 = 0003674
at1g01020 = 0005783
at1g01020 = 0006665
To remove the duplicates, I first made a dictionary. After creating the dictionary, I tried this code:
    import sys

    ara_go_file = open(sys.argv[1]).readlines()
    ara_id_list = []
    ara_go_list = []
    for lines in ara_go_file:
        split_lines = lines.split(' ')
        ara_id = split_lines[0]
        ara_id_list.append(ara_id)
        go_id_split = split_lines[-1]
        go_id = go_id_split.split('\n')[0]
        ara_go_list.append(go_id)

    # ara_id_go_dic is the name of the dict I have created
    ara_id_go_dic = dict(zip(ara_id_list, ara_go_list))

    # new dict made to re-create the info and remove duplicate pairs
    new_dict = {}
    for k in ara_id_go_dic.items():
        if k[0] in new_dict:
            if k[1] not in new_dict[k[0]]:
                new_dict[k[0]].append(k[1])
        else:
            new_dict[k[0]] = [k[1]]
    print new_dict
I don't know where I am making a mistake.
Please let me know what the error is, or whether there is another way to remove the duplicate pairs.
You can use a set to remove the duplicated elements. Note that a set does not keep the original line order:
    >>> s = """at1g01010 = 0005634
    ... at1g01010 = 0006355
    ... at1g01010 = 0003677
    ... at1g01010 = 0007275
    ... at1g01010 = 0006355
    ... at1g01010 = 0006355
    ... at1g01010 = 0006888
    ... at1g01020 = 0016125
    ... at1g01020 = 0016020
    ... at1g01020 = 0005739
    ... at1g01020 = 0016125
    ... at1g01020 = 0003674
    ... at1g01020 = 0005783
    ... at1g01020 = 0005739
    ... at1g01020 = 0006665
    ... at1g01020 = 0006665"""
    >>> for j in set([i for i in s.split('\n')]):
    ...     print j
    ...
    at1g01010 = 0005634
    at1g01020 = 0016020
    at1g01010 = 0007275
    at1g01010 = 0006355
    at1g01020 = 0006665
    at1g01010 = 0003677
    at1g01020 = 0005783
    at1g01020 = 0016125
    at1g01020 = 0005739
    at1g01020 = 0003674
    at1g01010 = 0006888
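As for the mistake in the question's code: dictionary keys are unique, so dict(zip(ara_id_list, ara_go_list)) keeps only the last GO id seen for each gene id, and every other pair is already gone before the second loop runs. If the output order matters, as it seems to in the expected output, one possible approach is to de-duplicate whole lines while remembering the order of first appearance. The following is only a sketch under some assumptions: the helper names unique_pairs and group_by_id are made up for illustration, the sample string stands in for the real file, and the separator is assumed to be exactly ' = '.

```python
from collections import OrderedDict

# Sample data standing in for the input file from the question.
sample = """at1g01010 = 0005634
at1g01010 = 0006355
at1g01010 = 0003677
at1g01010 = 0007275
at1g01010 = 0006355
at1g01010 = 0006355
at1g01010 = 0006888
at1g01020 = 0016125
at1g01020 = 0016020
at1g01020 = 0005739
at1g01020 = 0016125
at1g01020 = 0003674
at1g01020 = 0005783
at1g01020 = 0005739
at1g01020 = 0006665
at1g01020 = 0006665"""


def unique_pairs(lines):
    """Return the non-empty lines without duplicates, first occurrence kept."""
    seen = OrderedDict()  # keys keep insertion order on Python 2.7 and 3.x
    for line in lines:
        pair = line.strip()
        if pair:
            seen[pair] = None
    return list(seen)


def group_by_id(pairs):
    """Map each gene id to the list of its distinct GO ids (hypothetical helper)."""
    grouped = OrderedDict()
    for pair in pairs:
        # Assumes every line looks like '<id> = <go>'.
        key, _, value = pair.partition(' = ')
        grouped.setdefault(key, []).append(value)
    return grouped


pairs = unique_pairs(sample.splitlines())
grouped = group_by_id(pairs)

for pair in pairs:
    print(pair)
```

To run this against a real file, the call could be changed to something like unique_pairs(open(sys.argv[1])), since iterating over a file object yields its lines. The grouped mapping is the "one key, list of values" structure that new_dict in the question was apparently meant to build.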
python python-2.7 python-3.x