narwhals.GroupBy
agg(*aggs, **named_aggs)
Compute aggregations for each group of a group by operation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
aggs | IntoExpr \| Iterable[IntoExpr] | Aggregations to compute for each group of the group by operation, specified as positional arguments. | () |
named_aggs | IntoExpr | Additional aggregations, specified as keyword arguments. | {} |
Returns:
Type | Description |
---|---|
DataFrameT | A new DataFrame. |
Examples:
Group by one column or by multiple columns and call `agg` to compute the grouped sum of another column.
>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame(
...     {
...         "a": ["a", "b", "a", "b", "c"],
...         "b": [1, 2, 1, 3, 3],
...         "c": [5, 4, 3, 2, 1],
...     }
... )
>>> df_pl = pl.DataFrame(
...     {
...         "a": ["a", "b", "a", "b", "c"],
...         "b": [1, 2, 1, 3, 3],
...         "c": [5, 4, 3, 2, 1],
...     }
... )
We define library-agnostic functions:
>>> @nw.narwhalify
... def func(df):
...     return df.group_by("a").agg(nw.col("b").sum()).sort("a")
>>> @nw.narwhalify
... def func_mult_col(df):
...     return df.group_by("a", "b").agg(nw.sum("c")).sort("a", "b")
We can then pass either pandas or Polars to `func` and `func_mult_col`:
>>> func(df_pd)
   a  b
0  a  2
1  b  5
2  c  3
>>> func(df_pl)
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ a   ┆ 2   │
│ b   ┆ 5   │
│ c   ┆ 3   │
└─────┴─────┘
>>> func_mult_col(df_pd)
   a  b  c
0  a  1  8
1  b  2  4
2  b  3  2
3  c  3  1
>>> func_mult_col(df_pl)
shape: (4, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ a   ┆ 1   ┆ 8   │
│ b   ┆ 2   ┆ 4   │
│ b   ┆ 3   ┆ 2   │
│ c   ┆ 3   ┆ 1   │
└─────┴─────┴─────┘