Here’s how to group your data by specific columns and apply functions to other columns in a Pandas DataFrame in Python.
Create the DataFrame with some example data
    import pandas as pd

    # Make up some data.
    data = [
        {'unit': 'archer', 'building': 'archery_range', 'number_units': 1, 'civ': 'spanish'},
        {'unit': 'militia', 'building': 'barracks', 'number_units': 2, 'civ': 'spanish'},
        {'unit': 'pikemen', 'building': 'barracks', 'number_units': 3, 'civ': 'spanish'},
        {'unit': 'pikemen', 'building': 'barracks', 'number_units': 4, 'civ': 'huns'},
    ]

    # Create the DataFrame.
    df = pd.DataFrame(data)

    # View the DataFrame.
    df
You should see a DataFrame that looks like this:
          unit       building  number_units      civ
    0   archer  archery_range             1  spanish
    1  militia       barracks             2  spanish
    2  pikemen       barracks             3  spanish
    3  pikemen       barracks             4     huns
Example 1: Groupby and sum specific columns
Let’s say you want to total the number of units, with a separate total for each type of building.
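One way to do this, sketched here rather than taken from the rest of the original post, is to group by the building column and sum number_units within each group. The variable names units_per_building and totals, and the output column name total_units, are made up for illustration. The example data is repeated so the snippet runs on its own.

```python
import pandas as pd

# Same example data as above, repeated so this snippet is self-contained.
data = [
    {'unit': 'archer', 'building': 'archery_range', 'number_units': 1, 'civ': 'spanish'},
    {'unit': 'militia', 'building': 'barracks', 'number_units': 2, 'civ': 'spanish'},
    {'unit': 'pikemen', 'building': 'barracks', 'number_units': 3, 'civ': 'spanish'},
    {'unit': 'pikemen', 'building': 'barracks', 'number_units': 4, 'civ': 'huns'},
]
df = pd.DataFrame(data)

# Group the rows by building, then sum number_units within each group.
# The result is a Series indexed by building:
#   archery_range -> 1, barracks -> 9
units_per_building = df.groupby('building')['number_units'].sum()
print(units_per_building)

# Alternative: keep building as a regular column and name the result
# column explicitly (total_units is a hypothetical name).
totals = df.groupby('building', as_index=False).agg(
    total_units=('number_units', 'sum')
)
print(totals)
```

The first form is handy when you only need the totals themselves. The second returns a DataFrame, which is usually easier to merge or export.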