How to limit duplicates to 5 in a pandas DataFrame?
6
import pandas as pd

col1 = ['A','B','A','C','A','B','A','C','A','C','A','A','A']
col2 = [1,1,4,2,4,5,6,3,1,5,2,1,1]
df = pd.DataFrame({'col1': col1, 'col2': col2})
For A we have [1, 4, 4, 6, 1, 2, 1, 1], which is 8 items, but I want to limit each group to at most 5 items while converting the DataFrame to a dict/list.
Output:
Dict = {'A':[1,4,4,6,1],'B':[1,5],'C':[2,3,5]}
python pandas
2 Answers
6
Use pandas.DataFrame.groupby with apply:
df.groupby('col1')['col2'].apply(lambda x:list(x.head(5))).to_dict()
Output:
{'A': [1, 4, 4, 6, 1], 'B': [1, 5], 'C': [2, 3, 5]}
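For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby/apply approach. The helper name limit_per_group and its parameters are hypothetical (not part of pandas); they only wrap the one-liner above so the limit can be changed.

```python
import pandas as pd

col1 = ['A','B','A','C','A','B','A','C','A','C','A','A','A']
col2 = [1,1,4,2,4,5,6,3,1,5,2,1,1]
df = pd.DataFrame({'col1': col1, 'col2': col2})

def limit_per_group(frame, key, value, n=5):
    """Hypothetical helper: map each key to its first n values, in row order."""
    # head(n) keeps at most n rows of each group; list() turns them into a plain list
    return frame.groupby(key)[value].apply(lambda s: list(s.head(n))).to_dict()

result = limit_per_group(df, 'col1', 'col2', n=5)
print(result)
```

Groups with fewer than n rows (B and C here) are returned whole, since head(n) never pads.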
3
Use DataFrame.groupby with a lambda function: convert each group to a list, keep the first 5 values by slicing, and finally convert the result to a dictionary with Series.to_dict:
d = df.groupby('col1')['col2'].apply(lambda x: x.tolist()[:5]).to_dict()
print(d)
{'A': [1, 4, 4, 6, 1], 'B': [1, 5], 'C': [2, 3, 5]}
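A possible variation, not taken from either answer: trim the rows first with GroupBy.head, then aggregate each group to a list. This is only a sketch under the assumption that the first 5 rows per key, in original row order, are the ones wanted; the variable name trimmed is just for illustration.

```python
import pandas as pd

col1 = ['A','B','A','C','A','B','A','C','A','C','A','A','A']
col2 = [1,1,4,2,4,5,6,3,1,5,2,1,1]
df = pd.DataFrame({'col1': col1, 'col2': col2})

# Keep at most the first 5 rows of every col1 group (original row order is preserved)
trimmed = df.groupby('col1').head(5)

# Collect the remaining col2 values of each group into a list, then build the dict
d = trimmed.groupby('col1')['col2'].agg(list).to_dict()
print(d)
```

This avoids the lambda and leaves the intermediate trimmed frame available if the limited rows are also needed as a DataFrame.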