Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
218 views
in Technique[技术] by (71.8m points)

python - Pandas: How to Find a % of a Group?

*** Disclaimer: I am a total noob. I am trying to learn Pandas by solving a problem at work. This is a subset of my total problem but I am trying to solve the pieces before I tackle the project. I appreciate your patience! ***

I am trying to find out what percentage each Fund is of its State's total.

Concept: We have funds (departments) that are based in states. The funds have different levels of compensation for different projects. I first need to total (group) the compensation by fund so I know the total compensation per fund.

I also need to total (group) the compensation by state so I can later figure out each fund's % of its state.

I have converted my data to sample code here:

import pandas as pd

#sample data

data = {'Fund':['1000','1000','2000','2000','3000','3000','4000','4000'], 
    'State':['AL','AL','FL','FL','AL','AL','NC','NC'],
    'Compensation':[2000,2500,1500,1750,4000,3200,1450,3000]}

# Create DataFrame (employees)
employees = pd.DataFrame(data)

If the pic doesn't come over, here is what I did:

print(employees)
employees.groupby('Fund').Compensation.sum()
employees.groupby('State').Compensation.sum()
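
For anyone who wants to run this, here is a self-contained sketch of the same two groupings with the sample data above. The names fund_totals and state_totals are only illustrative and are not part of the original snippet:

```python
import pandas as pd

# Sample data from the question
data = {'Fund': ['1000', '1000', '2000', '2000', '3000', '3000', '4000', '4000'],
        'State': ['AL', 'AL', 'FL', 'FL', 'AL', 'AL', 'NC', 'NC'],
        'Compensation': [2000, 2500, 1500, 1750, 4000, 3200, 1450, 3000]}
employees = pd.DataFrame(data)

# Total compensation per fund (illustrative variable name)
fund_totals = employees.groupby('Fund')['Compensation'].sum()

# Total compensation per state (illustrative variable name)
state_totals = employees.groupby('State')['Compensation'].sum()

print(fund_totals)   # 1000: 4500, 2000: 3250, 3000: 7200, 4000: 4450
print(state_totals)  # AL: 11700, FL: 3250, NC: 4450
```

Each result is a Series indexed by the grouping column, so a single total can be looked up with, for example, state_totals['AL'].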

I've spent a good portion of the day on my actual data trying to figure out how to get:

Fund's compensation is __% of the total compensation for its State, or:

Fund_1000 is 38% of AL total compensation.
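
To make the target number concrete, the arithmetic behind that 38% can be checked by hand from the sample rows. This is only a sanity check with hard-coded values, and the variable names are made up for illustration:

```python
# Fund 1000 has two rows, both in AL
fund_1000_total = 2000 + 2500            # 4500

# AL contains the rows of fund 1000 and fund 3000
al_total = 2000 + 2500 + 4000 + 3200     # 11700

share = fund_1000_total / al_total
print(f"Fund_1000 is {share:.1%} of AL total compensation")  # 38.5%
```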

Thanks for your patience and your help!

John



1 Answer

0 votes
by (71.8m points)

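This should do the job (df here is the employees DataFrame from the question):

# Use the question's DataFrame under the shorter name df
df = employees
# Per-row state total, then per-row (state, fund) total; transform keeps the original row count
df['total_state_compensation'] = df.groupby('State')['Compensation'].transform('sum')
df['total_state_fund_compensation'] = df.groupby(['State', 'Fund'])['Compensation'].transform('sum')
# Share of the state's total that belongs to the row's fund
df['ratio'] = df['total_state_fund_compensation'].div(df['total_state_compensation'])
>>> df.groupby(['State', 'Fund'])['ratio'].mean().to_dict()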

out[1] {('AL', '1000'): 0.38461538461538464,
 ('AL', '3000'): 0.6153846153846154,
 ('FL', '2000'): 1.0,
 ('NC', '4000'): 1.0}
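
If only one row per (State, Fund) pair is needed, the helper columns can be skipped. The following is a sketch of an alternative, not the approach used above: it aggregates first and then divides by the state-level sum. The names pair_totals, state_totals and pct_of_state are illustrative:

```python
import pandas as pd

# Sample data from the question
data = {'Fund': ['1000', '1000', '2000', '2000', '3000', '3000', '4000', '4000'],
        'State': ['AL', 'AL', 'FL', 'FL', 'AL', 'AL', 'NC', 'NC'],
        'Compensation': [2000, 2500, 1500, 1750, 4000, 3200, 1450, 3000]}
employees = pd.DataFrame(data)

# One total per (State, Fund) pair; the result is a Series with a two-level index
pair_totals = employees.groupby(['State', 'Fund'])['Compensation'].sum()

# State total repeated for every pair in that state, aligned to pair_totals
state_totals = pair_totals.groupby(level='State').transform('sum')

# Each fund's share of its state's total, as a percentage rounded to one decimal
pct_of_state = (pair_totals / state_totals * 100).round(1)

print(pct_of_state)
# AL/1000: 38.5, AL/3000: 61.5, FL/2000: 100.0, NC/4000: 100.0
```

Calling pct_of_state.reset_index(name='PctOfState') would turn the result into a regular DataFrame if that is easier to work with downstream.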
