narwhals.Series
Narwhals Series, backed by a native series.
Warning
This class is not meant to be instantiated directly - instead:
- If the native object is a series from one of the supported backends (e.g. `pandas.Series`, `polars.Series`, `pyarrow.ChunkedArray`), you can use `narwhals.from_native`:

      narwhals.from_native(native_series, allow_series=True)
      narwhals.from_native(native_series, series_only=True)

- If the object is a generic sequence (e.g. a list or a tuple of values), you can create a series via `narwhals.new_series`:

      narwhals.new_series(
          name=name,
          values=values,
          native_namespace=narwhals.get_native_namespace(another_object),
      )
dtype
property
Get the data type of the Series.
Returns:
Type | Description
---|---
`DType` | The data type of the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_dtype(s_native: IntoSeriesT) -> nw.dtypes.DType:
... s = nw.from_native(s_native, series_only=True)
... return s.dtype
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_dtype`:
>>> agnostic_dtype(s_pd)
Int64
>>> agnostic_dtype(s_pl)
Int64
>>> agnostic_dtype(s_pa)
Int64
implementation
property
Return implementation of native Series.
This can be useful when you need to use special-casing for features outside of Narwhals' scope - for example, when dealing with pandas' Period Dtype.
Returns:
Type | Description
---|---
`Implementation` | Implementation.
Examples:
>>> import narwhals as nw
>>> import pandas as pd
>>> s_native = pd.Series([1, 2, 3])
>>> s = nw.from_native(s_native, series_only=True)
>>> s.implementation
<Implementation.PANDAS: 1>
>>> s.implementation.is_pandas()
True
>>> s.implementation.is_pandas_like()
True
>>> s.implementation.is_polars()
False
name
property
Get the name of the Series.
Returns:
Type | Description
---|---
`str` | The name of the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data, name="foo")
>>> s_pl = pl.Series("foo", data)
We define a library agnostic function:
>>> def agnostic_name(s_native: IntoSeries) -> str:
... s = nw.from_native(s_native, series_only=True)
... return s.name
We can then pass any supported library such as pandas or Polars to `agnostic_name`:
>>> agnostic_name(s_pd)
'foo'
>>> agnostic_name(s_pl)
'foo'
shape
property
Get the shape of the Series.
Returns:
Type | Description
---|---
`tuple[int]` | A tuple containing the length of the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_shape(s_native: IntoSeries) -> tuple[int]:
... s = nw.from_native(s_native, series_only=True)
... return s.shape
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_shape`:
>>> agnostic_shape(s_pd)
(3,)
>>> agnostic_shape(s_pl)
(3,)
>>> agnostic_shape(s_pa)
(3,)
__arrow_c_stream__(requested_schema=None)
Export a Series via the Arrow PyCapsule Interface.
Narwhals doesn't implement anything itself here:
- if the underlying series implements the interface, it'll return that
- else, it'll call `to_arrow` and then defer to PyArrow's implementation
See PyCapsule Interface for more.
__getitem__(idx)
Retrieve elements from the object using integer indexing or slicing.
Parameters:
Name | Type | Description | Default
---|---|---|---
`idx` | `int \| slice \| Sequence[int]` | The index, slice, or sequence of indices to retrieve. | required
Returns:
Type | Description
---|---
`Any \| Self` | A single element if `idx` is an integer, else a subset of the Series.
Examples:
>>> from typing import Any
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_get_first_item(s_native: IntoSeriesT) -> Any:
... s = nw.from_native(s_native, series_only=True)
... return s[0]
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_get_first_item`:
>>> agnostic_get_first_item(s_pd)
np.int64(1)
>>> agnostic_get_first_item(s_pl)
1
>>> agnostic_get_first_item(s_pa)
1
We can also make a function to slice the Series:
>>> def agnostic_slice(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s[:2].to_native()
>>> agnostic_slice(s_pd)
0 1
1 2
dtype: int64
>>> agnostic_slice(s_pl)
shape: (2,)
Series: '' [i64]
[
1
2
]
>>> agnostic_slice(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
2
]
]
__iter__()
abs()
Calculate the absolute value of each element.
Returns:
Type | Description
---|---
`Self` | A new Series with the absolute values of the original elements.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [2, -4, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a dataframe-agnostic function:
>>> def agnostic_abs(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.abs().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_abs`:
>>> agnostic_abs(s_pd)
0 2
1 4
2 3
dtype: int64
>>> agnostic_abs(s_pl)
shape: (3,)
Series: '' [i64]
[
2
4
3
]
>>> agnostic_abs(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
2,
4,
3
]
]
alias(name)
Rename the Series.
Notes
This method is very cheap, but does not guarantee that data will be copied. For example:

    s1: nw.Series
    s2 = s1.alias("foo")
    arr = s2.to_numpy()
    arr[0] = 999

may (depending on the backend, and on the version) result in `s1`'s data being modified. We recommend:
- if you need to alias an object and don't need the original one around any more, just use `alias` without worrying about it.
- if you were expecting `alias` to copy data, then explicitly call `.clone` before calling `alias`.
Parameters:
Name | Type | Description | Default
---|---|---|---
`name` | `str` | The new name. | required
Returns:
Type | Description
---|---
`Self` | A new Series with the updated name.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data, name="foo")
>>> s_pl = pl.Series("foo", data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_alias(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.alias("bar").to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_alias`:
>>> agnostic_alias(s_pd)
0 1
1 2
2 3
Name: bar, dtype: int64
>>> agnostic_alias(s_pl)
shape: (3,)
Series: 'bar' [i64]
[
1
2
3
]
>>> agnostic_alias(s_pa)
<pyarrow.lib.ChunkedArray object at 0x...>
[
[
1,
2,
3
]
]
all()
Return whether all values in the Series are True.
Returns:
Type | Description
---|---
`Any` | A boolean indicating if all values in the Series are True.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [False, True, False]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_all(s_native: IntoSeries) -> bool:
... s = nw.from_native(s_native, series_only=True)
... return s.all()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_all`:
>>> agnostic_all(s_pd)
np.False_
>>> agnostic_all(s_pl)
False
>>> agnostic_all(s_pa)
False
any()
Return whether any of the values in the Series are True.
Notes
Only works on Series of data type Boolean.
Returns:
Type | Description
---|---
`Any` | A boolean indicating if any values in the Series are True.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [False, True, False]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_any(s_native: IntoSeries) -> bool:
... s = nw.from_native(s_native, series_only=True)
... return s.any()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_any`:
>>> agnostic_any(s_pd)
np.True_
>>> agnostic_any(s_pl)
True
>>> agnostic_any(s_pa)
True
arg_max()
Returns the index of the maximum value.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_arg_max(s_native: IntoSeries):
... s = nw.from_native(s_native, series_only=True)
... return s.arg_max()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_arg_max`:
>>> agnostic_arg_max(s_pd)
np.int64(2)
>>> agnostic_arg_max(s_pl)
2
>>> agnostic_arg_max(s_pa)
2
arg_min()
Returns the index of the minimum value.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_arg_min(s_native: IntoSeries):
... s = nw.from_native(s_native, series_only=True)
... return s.arg_min()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_arg_min`:
>>> agnostic_arg_min(s_pd)
np.int64(0)
>>> agnostic_arg_min(s_pl)
0
>>> agnostic_arg_min(s_pa)
0
arg_true()
Find elements where boolean Series is True.
Returns:
Type | Description
---|---
`Self` | A new Series with the indices of elements that are True.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, None, None, 2]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_arg_true(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_null().arg_true().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_arg_true`:
>>> agnostic_arg_true(s_pd)
1 1
2 2
dtype: int64
>>> agnostic_arg_true(s_pl)
shape: (2,)
Series: '' [u32]
[
1
2
]
>>> agnostic_arg_true(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
2
]
]
cast(dtype)
Cast between data types.
Parameters:
Name | Type | Description | Default
---|---|---|---
`dtype` | `DType \| type[DType]` | Data type that the object will be cast into. | required
Returns:
Type | Description
---|---
`Self` | A new Series with the specified data type.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [True, False, True]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a dataframe-agnostic function:
>>> def agnostic_cast(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.cast(nw.Int64).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_cast`:
>>> agnostic_cast(s_pd)
0 1
1 0
2 1
dtype: int64
>>> agnostic_cast(s_pl)
shape: (3,)
Series: '' [i64]
[
1
0
1
]
>>> agnostic_cast(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
0,
1
]
]
clip(lower_bound=None, upper_bound=None)
Clip values in the Series.
Parameters:
Name | Type | Description | Default
---|---|---|---
`lower_bound` | `Self \| Any \| None` | Lower bound value. | `None`
`upper_bound` | `Self \| Any \| None` | Upper bound value. | `None`
Returns:
Type | Description
---|---
`Self` | A new Series with values clipped to the specified bounds.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_clip_lower(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.clip(2).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_clip_lower`:
>>> agnostic_clip_lower(s_pd)
0 2
1 2
2 3
dtype: int64
>>> agnostic_clip_lower(s_pl)
shape: (3,)
Series: '' [i64]
[
2
2
3
]
>>> agnostic_clip_lower(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
2,
2,
3
]
]
We define another library agnostic function:
>>> def agnostic_clip_upper(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.clip(upper_bound=2).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_clip_upper`:
>>> agnostic_clip_upper(s_pd)
0 1
1 2
2 2
dtype: int64
>>> agnostic_clip_upper(s_pl) # doctest: +NORMALIZE_WHITESPACE
shape: (3,)
Series: '' [i64]
[
1
2
2
]
>>> agnostic_clip_upper(s_pa) # doctest: +ELLIPSIS
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
2,
2
]
]
We can have both at the same time:
>>> data = [-1, 1, -3, 3, -5, 5]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_clip(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.clip(-1, 3).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_clip`:
>>> agnostic_clip(s_pd)
0 -1
1 1
2 -1
3 3
4 -1
5 3
dtype: int64
>>> agnostic_clip(s_pl) # doctest: +NORMALIZE_WHITESPACE
shape: (6,)
Series: '' [i64]
[
-1
1
-1
3
-1
3
]
>>> agnostic_clip(s_pa)  # doctest: +ELLIPSIS
<pyarrow.lib.ChunkedArray object at ...>
[
[
-1,
1,
-1,
3,
-1,
3
]
]
count()
Returns the number of non-null elements in the Series.
Returns:
Type | Description
---|---
`Any` | The number of non-null elements in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_count(s_native: IntoSeries) -> int:
... s = nw.from_native(s_native, series_only=True)
... return s.count()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_count`:
>>> agnostic_count(s_pd)
np.int64(3)
>>> agnostic_count(s_pl)
3
>>> agnostic_count(s_pa)
3
cum_count(*, reverse=False)
Return the cumulative count of the non-null values in the series.
Parameters:
Name | Type | Description | Default
---|---|---|---
`reverse` | `bool` | reverse the operation | `False`
Returns:
Type | Description
---|---
`Self` | A new Series with the cumulative count of non-null values.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = ["x", "k", None, "d"]
We define a library agnostic function:
>>> def agnostic_cum_count(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.cum_count(reverse=True).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_cum_count`:
>>> agnostic_cum_count(pd.Series(data))
0 3
1 2
2 1
3 1
dtype: int64
>>> agnostic_cum_count(pl.Series(data))
shape: (4,)
Series: '' [u32]
[
3
2
1
1
]
>>> agnostic_cum_count(pa.chunked_array([data]))
<pyarrow.lib.ChunkedArray object at ...>
[
[
3,
2,
1,
1
]
]
cum_max(*, reverse=False)
Return the cumulative max of the non-null values in the series.
Parameters:
Name | Type | Description | Default
---|---|---|---
`reverse` | `bool` | reverse the operation | `False`
Returns:
Type | Description
---|---
`Self` | A new Series with the cumulative max of non-null values.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 3, None, 2]
We define a library agnostic function:
>>> def agnostic_cum_max(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.cum_max().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_cum_max`:
>>> agnostic_cum_max(pd.Series(data))
0 1.0
1 3.0
2 NaN
3 3.0
dtype: float64
>>> agnostic_cum_max(pl.Series(data))
shape: (4,)
Series: '' [i64]
[
1
3
null
3
]
>>> agnostic_cum_max(pa.chunked_array([data]))
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
3,
null,
3
]
]
cum_min(*, reverse=False)
Return the cumulative min of the non-null values in the series.
Parameters:
Name | Type | Description | Default
---|---|---|---
`reverse` | `bool` | reverse the operation | `False`
Returns:
Type | Description
---|---
`Self` | A new Series with the cumulative min of non-null values.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [3, 1, None, 2]
We define a library agnostic function:
>>> def agnostic_cum_min(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.cum_min().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_cum_min`:
>>> agnostic_cum_min(pd.Series(data))
0 3.0
1 1.0
2 NaN
3 1.0
dtype: float64
>>> agnostic_cum_min(pl.Series(data))
shape: (4,)
Series: '' [i64]
[
3
1
null
1
]
>>> agnostic_cum_min(pa.chunked_array([data]))
<pyarrow.lib.ChunkedArray object at ...>
[
[
3,
1,
null,
1
]
]
cum_prod(*, reverse=False)
Return the cumulative product of the non-null values in the series.
Parameters:
Name | Type | Description | Default
---|---|---|---
`reverse` | `bool` | reverse the operation | `False`
Returns:
Type | Description
---|---
`Self` | A new Series with the cumulative product of non-null values.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 3, None, 2]
We define a library agnostic function:
>>> def agnostic_cum_prod(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.cum_prod().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_cum_prod`:
>>> agnostic_cum_prod(pd.Series(data))
0 1.0
1 3.0
2 NaN
3 6.0
dtype: float64
>>> agnostic_cum_prod(pl.Series(data))
shape: (4,)
Series: '' [i64]
[
1
3
null
6
]
>>> agnostic_cum_prod(pa.chunked_array([data]))
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
3,
null,
6
]
]
cum_sum(*, reverse=False)
Calculate the cumulative sum.
Parameters:
Name | Type | Description | Default
---|---|---|---
`reverse` | `bool` | reverse the operation | `False`
Returns:
Type | Description
---|---
`Self` | A new Series with the cumulative sum of non-null values.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [2, 4, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a dataframe-agnostic function:
>>> def agnostic_cum_sum(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.cum_sum().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_cum_sum`:
>>> agnostic_cum_sum(s_pd)
0 2
1 6
2 9
dtype: int64
>>> agnostic_cum_sum(s_pl)
shape: (3,)
Series: '' [i64]
[
2
6
9
]
>>> agnostic_cum_sum(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
2,
6,
9
]
]
diff()
Calculate the difference with the previous element, for each element.
Notes
pandas may change the dtype here, for example when introducing missing values in an integer column. To ensure that the dtype doesn't change, you may want to use `fill_null` and `cast`. For example, to calculate the diff and fill missing values with 0 in an Int64 column, you could do:

    s.diff().fill_null(0).cast(nw.Int64)
Returns:
Type | Description
---|---
`Self` | A new Series with the difference between each element and its predecessor.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [2, 4, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a dataframe-agnostic function:
>>> def agnostic_diff(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.diff().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_diff`:
>>> agnostic_diff(s_pd)
0 NaN
1 2.0
2 -1.0
dtype: float64
>>> agnostic_diff(s_pl)
shape: (3,)
Series: '' [i64]
[
null
2
-1
]
>>> agnostic_diff(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
null,
2,
-1
]
]
drop_nulls()
Drop null values.
Notes
pandas handles null values differently from Polars and PyArrow. See null_handling for reference.
Returns:
Type | Description
---|---
`Self` | A new Series with null values removed.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [2, 4, None, 3, 5]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_drop_nulls(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.drop_nulls().to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_drop_nulls`:
>>> agnostic_drop_nulls(s_pd)
0 2.0
1 4.0
3 3.0
4 5.0
dtype: float64
>>> agnostic_drop_nulls(s_pl)
shape: (4,)
Series: '' [i64]
[
2
4
3
5
]
>>> agnostic_drop_nulls(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
2,
4,
3,
5
]
]
ewm_mean(*, com=None, span=None, half_life=None, alpha=None, adjust=True, min_periods=1, ignore_nulls=False)
Compute exponentially-weighted moving average.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
Parameters:
Name | Type | Description | Default
---|---|---|---
`com` | `float \| None` | Specify decay in terms of center of mass, \(\gamma\), with \(\alpha = \frac{1}{1+\gamma} \; \forall \; \gamma \geq 0\). | `None`
`span` | `float \| None` | Specify decay in terms of span, \(\theta\), with \(\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\). | `None`
`half_life` | `float \| None` | Specify decay in terms of half-life, \(\tau\), with \(\alpha = 1 - \exp \left\{ \frac{-\ln(2)}{\tau} \right\} \; \forall \; \tau > 0\). | `None`
`alpha` | `float \| None` | Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\). | `None`
`adjust` | `bool` | Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings. | `True`
`min_periods` | `int` | Minimum number of observations in window required to have a value (otherwise result is null). | `1`
`ignore_nulls` | `bool` | Ignore missing values when calculating weights. | `False`
Returns:
Type | Description
---|---
`Self` | Series with the exponentially-weighted moving average.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(name="a", data=data)
>>> s_pl = pl.Series(name="a", values=data)
We define a library agnostic function:
>>> def agnostic_ewm_mean(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.ewm_mean(com=1, ignore_nulls=False).to_native()
We can then pass any supported library such as pandas or Polars to `agnostic_ewm_mean`:
>>> agnostic_ewm_mean(s_pd)
0 1.000000
1 1.666667
2 2.428571
Name: a, dtype: float64
>>> agnostic_ewm_mean(s_pl)
shape: (3,)
Series: 'a' [f64]
[
1.0
1.666667
2.428571
]
fill_null(value=None, strategy=None, limit=None)
Fill null values using the specified value.
Parameters:
Name | Type | Description | Default
---|---|---|---
`value` | `Any \| None` | Value used to fill null values. | `None`
`strategy` | `Literal['forward', 'backward'] \| None` | Strategy used to fill null values. | `None`
`limit` | `int \| None` | Number of consecutive null values to fill when using the 'forward' or 'backward' strategy. | `None`
Notes
pandas handles null values differently from Polars and PyArrow. See null_handling for reference.
Returns:
Type | Description
---|---
`Self` | A new Series with null values filled according to the specified value or strategy.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, None]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_fill_null(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.fill_null(5).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_fill_null`:
>>> agnostic_fill_null(s_pd)
0 1.0
1 2.0
2 5.0
dtype: float64
>>> agnostic_fill_null(s_pl)
shape: (3,)
Series: '' [i64]
[
1
2
5
]
>>> agnostic_fill_null(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
2,
5
]
]
Using a strategy:
>>> def agnostic_fill_null_with_strategy(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.fill_null(strategy="forward", limit=1).to_native()
>>> agnostic_fill_null_with_strategy(s_pd)
0 1.0
1 2.0
2 2.0
dtype: float64
>>> agnostic_fill_null_with_strategy(s_pl)
shape: (3,)
Series: '' [i64]
[
1
2
2
]
>>> agnostic_fill_null_with_strategy(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
2,
2
]
]
filter(other)
Filter elements in the Series based on a condition.
Returns:
Type | Description
---|---
`Self` | A new Series with elements that satisfy the condition.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [4, 10, 15, 34, 50]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_filter(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.filter(s > 10).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_filter`:
>>> agnostic_filter(s_pd)
2 15
3 34
4 50
dtype: int64
>>> agnostic_filter(s_pl)
shape: (3,)
Series: '' [i64]
[
15
34
50
]
>>> agnostic_filter(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
15,
34,
50
]
]
gather_every(n, offset=0)
Take every nth value in the Series and return as new Series.
Parameters:
Name | Type | Description | Default
---|---|---|---
`n` | `int` | Gather every n-th row. | required
`offset` | `int` | Starting index. | `0`
Returns:
Type | Description
---|---
`Self` | A new Series with every nth value starting from the offset.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3, 4]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function that gathers every 2nd row, starting from an offset of 1:
>>> def agnostic_gather_every(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.gather_every(n=2, offset=1).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_gather_every`:
>>> agnostic_gather_every(s_pd)
1 2
3 4
dtype: int64
>>> agnostic_gather_every(s_pl)
shape: (2,)
Series: '' [i64]
[
2
4
]
>>> agnostic_gather_every(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
2,
4
]
]
head(n=10)
Get the first `n` rows.
Parameters:
Name | Type | Description | Default
---|---|---|---
`n` | `int` | Number of rows to return. | `10`
Returns:
Type | Description
---|---
`Self` | A new Series containing the first `n` rows.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = list(range(10))
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function that returns the first 3 rows:
>>> def agnostic_head(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.head(3).to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_head`:
>>> agnostic_head(s_pd)
0 0
1 1
2 2
dtype: int64
>>> agnostic_head(s_pl)
shape: (3,)
Series: '' [i64]
[
0
1
2
]
>>> agnostic_head(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
0,
1,
2
]
]
is_between(lower_bound, upper_bound, closed='both')
Get a boolean mask of the values that are between the given lower/upper bounds.
Parameters:
Name | Type | Description | Default
---|---|---|---
`lower_bound` | `Any \| Self` | Lower bound value. | required
`upper_bound` | `Any \| Self` | Upper bound value. | required
`closed` | `Literal['left', 'right', 'none', 'both']` | Define which sides of the interval are closed (inclusive). | `'both'`
Notes
If the value of `lower_bound` is greater than that of `upper_bound`, then the values will be False, as no value can satisfy the condition.
Returns:
Type | Description
---|---
`Self` | A boolean Series indicating which values are between the given bounds.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3, 4, 5]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_is_between(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_between(2, 4, "right").to_native()
We can then pass any supported library such as pandas, Polars, or PyArrow to `agnostic_is_between`:
>>> agnostic_is_between(s_pd)
0 False
1 False
2 True
3 True
4 False
dtype: bool
>>> agnostic_is_between(s_pl)
shape: (5,)
Series: '' [bool]
[
false
false
true
true
false
]
>>> agnostic_is_between(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
false,
false,
true,
true,
false
]
]
is_duplicated()
Get a mask of all duplicated rows in the Series.
Returns:
Type | Description
---|---
`Self` | A new Series with boolean values indicating duplicated rows.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3, 1]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_is_duplicated(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_duplicated().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_duplicated`:
>>> agnostic_is_duplicated(s_pd)
0 True
1 False
2 False
3 True
dtype: bool
>>> agnostic_is_duplicated(s_pl)
shape: (4,)
Series: '' [bool]
[
true
false
false
true
]
>>> agnostic_is_duplicated(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
true,
false,
false,
true
]
]
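The duplicated mask can be sketched with a single counting pass. This is an illustration of the semantics, not the library code:

```python
from collections import Counter

# Sketch of the duplicated-row mask: flag every position whose value
# occurs more than once overall (illustration only).
def is_duplicated(values):
    counts = Counter(values)
    return [counts[v] > 1 for v in values]

print(is_duplicated([1, 2, 3, 1]))
```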
is_empty()
Check if the series is empty.
Returns:

Type | Description
---|---
`bool` | A boolean indicating if the series is empty.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
Let's define a dataframe-agnostic function that filters rows in which "foo" values are greater than 10, and then checks if the result is empty or not:
>>> def agnostic_is_empty(s_native: IntoSeries) -> bool:
... s = nw.from_native(s_native, series_only=True)
... return s.filter(s > 10).is_empty()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_empty`:
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
>>> agnostic_is_empty(s_pd), agnostic_is_empty(s_pl), agnostic_is_empty(s_pa)
(True, True, True)
>>> data = [100, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
>>> agnostic_is_empty(s_pd), agnostic_is_empty(s_pl), agnostic_is_empty(s_pa)
(False, False, False)
is_finite()
Returns a boolean Series indicating which values are finite.
Warning
Different backends handle null values differently. `is_finite` returns False for NaN and null values in the Dask and pandas non-nullable backends, while the Polars, PyArrow, and pandas nullable backends keep null values as null.
Returns:

Type | Description
---|---
`Self` | A boolean Series indicating which values are finite.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [float("nan"), float("inf"), 2.0, None]
We define a library agnostic function:
>>> def agnostic_is_finite(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_finite().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_finite`:
>>> agnostic_is_finite(pd.Series(data))
0 False
1 False
2 True
3 False
dtype: bool
>>> agnostic_is_finite(pl.Series(data))
shape: (4,)
Series: '' [bool]
[
false
false
true
null
]
>>> agnostic_is_finite(pa.chunked_array([data]))
<pyarrow.lib.ChunkedArray object at ...>
[
[
false,
false,
true,
null
]
]
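The nullable-backend behavior described in the warning can be sketched in pure Python: NaN and infinity map to False, while nulls propagate. This is an illustration only:

```python
import math

# Sketch of the nullable-backend behavior: NaN and infinity map to False,
# while nulls (None) propagate unchanged (illustration only).
def is_finite(values):
    return [None if v is None else math.isfinite(v) for v in values]

print(is_finite([float("nan"), float("inf"), 2.0, None]))
```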
is_first_distinct()
Return a boolean mask indicating the first occurrence of each distinct value.
Returns:

Type | Description
---|---
`Self` | A new Series with boolean values indicating the first occurrence of each distinct value.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 1, 2, 3, 2]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_is_first_distinct(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_first_distinct().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_first_distinct`:
>>> agnostic_is_first_distinct(s_pd)
0 True
1 False
2 True
3 True
4 False
dtype: bool
>>> agnostic_is_first_distinct(s_pl)
shape: (5,)
Series: '' [bool]
[
true
false
true
true
false
]
>>> agnostic_is_first_distinct(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
true,
false,
true,
true,
false
]
]
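First-occurrence detection can be sketched with a seen-set in one pass. This is an illustration of the semantics, not the library implementation:

```python
# Sketch of first-occurrence detection using a seen-set (illustration only).
def is_first_distinct(values):
    seen = set()
    out = []
    for v in values:
        out.append(v not in seen)
        seen.add(v)
    return out

print(is_first_distinct([1, 1, 2, 3, 2]))
```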
is_in(other)
Check if the elements of this Series are in the other sequence.
Parameters:

Name | Type | Description | Default
---|---|---|---
`other` | `Any` | Sequence of primitive type. | required
Returns:

Type | Description
---|---
`Self` | A new Series with boolean values indicating if the elements are in the other sequence.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_is_in(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_in([3, 2, 8]).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_in`:
>>> agnostic_is_in(s_pd)
0 False
1 True
2 True
dtype: bool
>>> agnostic_is_in(s_pl)
shape: (3,)
Series: '' [bool]
[
false
true
true
]
>>> agnostic_is_in(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
false,
true,
true
]
]
is_last_distinct()
Return a boolean mask indicating the last occurrence of each distinct value.
Returns:

Type | Description
---|---
`Self` | A new Series with boolean values indicating the last occurrence of each distinct value.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 1, 2, 3, 2]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_is_last_distinct(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_last_distinct().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_last_distinct`:
>>> agnostic_is_last_distinct(s_pd)
0 False
1 True
2 False
3 True
4 True
dtype: bool
>>> agnostic_is_last_distinct(s_pl)
shape: (5,)
Series: '' [bool]
[
false
true
false
true
true
]
>>> agnostic_is_last_distinct(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
false,
true,
false,
true,
true
]
]
is_nan()
Returns a boolean Series indicating which values are NaN.
Returns:

Type | Description
---|---
`Self` | A boolean Series indicating which values are NaN.
Notes
pandas handles null values differently from Polars and PyArrow. See null_handling for reference.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [0.0, None, 2.0]
>>> s_pd = pd.Series(data, dtype="Float64")
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data], type=pa.float64())
>>> def agnostic_self_div_is_nan(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_nan().to_native()
>>> print(agnostic_self_div_is_nan(s_pd))
0 False
1 <NA>
2 False
dtype: boolean
>>> print(agnostic_self_div_is_nan(s_pl))
shape: (3,)
Series: '' [bool]
[
false
null
false
]
>>> print(agnostic_self_div_is_nan(s_pa))
[
[
false,
null,
false
]
]
is_null()
Returns a boolean Series indicating which values are null.
Notes
pandas handles null values differently from Polars and PyArrow. See null_handling for reference.
Returns:

Type | Description
---|---
`Self` | A boolean Series indicating which values are null.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, None]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_is_null(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_null().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_null`:
>>> agnostic_is_null(s_pd)
0 False
1 False
2 True
dtype: bool
>>> agnostic_is_null(s_pl)
shape: (3,)
Series: '' [bool]
[
false
false
true
]
>>> agnostic_is_null(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
false,
false,
true
]
]
is_sorted(*, descending=False)
Check if the Series is sorted.
Parameters:

Name | Type | Description | Default
---|---|---|---
`descending` | `bool` | Check if the Series is sorted in descending order. | `False`
Returns:

Type | Description
---|---
`bool` | A boolean indicating if the Series is sorted.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> unsorted_data = [1, 3, 2]
>>> sorted_data = [3, 2, 1]
Let's define a dataframe-agnostic function:
>>> def agnostic_is_sorted(s_native: IntoSeries, descending: bool = False):
... s = nw.from_native(s_native, series_only=True)
... return s.is_sorted(descending=descending)
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_sorted`:
>>> agnostic_is_sorted(pd.Series(unsorted_data))
False
>>> agnostic_is_sorted(pd.Series(sorted_data), descending=True)
True
>>> agnostic_is_sorted(pl.Series(unsorted_data))
False
>>> agnostic_is_sorted(pl.Series(sorted_data), descending=True)
True
>>> agnostic_is_sorted(pa.chunked_array([unsorted_data]))
False
>>> agnostic_is_sorted(pa.chunked_array([sorted_data]), descending=True)
True
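The sortedness check amounts to a pairwise comparison over adjacent elements. A pure-Python sketch of the semantics (illustration only):

```python
# Sketch of the sortedness check as a pairwise comparison (illustration only).
def is_sorted(values, descending=False):
    pairs = list(zip(values, values[1:]))
    if descending:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)

print(is_sorted([1, 3, 2]), is_sorted([3, 2, 1], descending=True))
```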
is_unique()
Get a mask of all unique rows in the Series.
Returns:

Type | Description
---|---
`Self` | A new Series with boolean values indicating unique rows.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3, 1]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_is_unique(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.is_unique().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_is_unique`:
>>> agnostic_is_unique(s_pd)
0 False
1 True
2 True
3 False
dtype: bool
>>> agnostic_is_unique(s_pl)
shape: (4,)
Series: '' [bool]
[
false
true
true
false
]
>>> agnostic_is_unique(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
false,
true,
true,
false
]
]
item(index=None)
Return the Series as a scalar, or return the element at the given index.
If no index is provided, this is equivalent to `s[0]`, with a check that the shape is (1,). With an index, this is equivalent to `s[index]`.
Returns:

Type | Description
---|---
`Any` | The scalar value of the Series or the element at the given index.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
Let's define a dataframe-agnostic function that returns the item at a given index:
>>> def agnostic_item(s_native: IntoSeries, index=None):
... s = nw.from_native(s_native, series_only=True)
... return s.item(index)
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_item`:
>>> (
... agnostic_item(pl.Series("a", [1]), None),
... agnostic_item(pd.Series([1]), None),
... agnostic_item(pa.chunked_array([[1]]), None),
... )
(1, np.int64(1), 1)
>>> (
... agnostic_item(pl.Series("a", [9, 8, 7]), -1),
... agnostic_item(pl.Series([9, 8, 7]), -2),
... agnostic_item(pa.chunked_array([[9, 8, 7]]), -3),
... )
(7, 8, 9)
len()
Return the number of elements in the Series.
Null values count towards the total.
Returns:

Type | Description
---|---
`int` | The number of elements in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, None]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function that computes the length of the series:
>>> def agnostic_len(s_native: IntoSeries) -> int:
... s = nw.from_native(s_native, series_only=True)
... return s.len()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_len`:
>>> agnostic_len(s_pd)
3
>>> agnostic_len(s_pl)
3
>>> agnostic_len(s_pa)
3
max()
Get the maximum value in this Series.
Returns:

Type | Description
---|---
`Any` | The maximum value in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_max(s_native: IntoSeries):
... s = nw.from_native(s_native, series_only=True)
... return s.max()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_max`:
>>> agnostic_max(s_pd)
np.int64(3)
>>> agnostic_max(s_pl)
3
>>> agnostic_max(s_pa)
3
mean()
Reduce this Series to the mean value.
Returns:

Type | Description
---|---
`Any` | The average of all elements in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_mean(s_native: IntoSeries) -> float:
... s = nw.from_native(s_native, series_only=True)
... return s.mean()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_mean`:
>>> agnostic_mean(s_pd)
np.float64(2.0)
>>> agnostic_mean(s_pl)
2.0
>>> agnostic_mean(s_pa)
2.0
median()
Reduce this Series to the median value.
Notes
Results might slightly differ across backends due to differences in the underlying algorithms used to compute the median.
Returns:

Type | Description
---|---
`Any` | The median value of all elements in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [5, 3, 8]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a library agnostic function:
>>> def agnostic_median(s_native: IntoSeries) -> float:
... s = nw.from_native(s_native, series_only=True)
... return s.median()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_median`:
>>> agnostic_median(s_pd)
np.float64(5.0)
>>> agnostic_median(s_pl)
5.0
>>> agnostic_median(s_pa)
5.0
min()
Get the minimal value in this Series.
Returns:

Type | Description
---|---
`Any` | The minimum value in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_min(s_native: IntoSeries):
... s = nw.from_native(s_native, series_only=True)
... return s.min()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_min`:
>>> agnostic_min(s_pd)
np.int64(1)
>>> agnostic_min(s_pl)
1
>>> agnostic_min(s_pa)
1
mode()
Compute the most occurring value(s).
Can return multiple values.
Returns:

Type | Description
---|---
`Self` | A new Series containing the mode(s) (values that appear most frequently).
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 1, 2, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_mode(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.mode().sort().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_mode`:
>>> agnostic_mode(s_pd)
0 1
1 2
dtype: int64
>>> agnostic_mode(s_pl)
shape: (2,)
Series: '' [i64]
[
1
2
]
>>> agnostic_mode(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
2
]
]
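Mode with ties can be sketched with a counter: every value that reaches the maximum count is returned. This is an illustration of the semantics, not the library implementation:

```python
from collections import Counter

# Sketch of mode with ties: every value reaching the maximum count is
# returned, sorted here for a deterministic order (illustration only).
def mode(values):
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

print(mode([1, 1, 2, 2, 3]))
```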
n_unique()
Count the number of unique values.
Returns:

Type | Description
---|---
`int` | Number of unique values in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_n_unique(s_native: IntoSeries) -> int:
... s = nw.from_native(s_native, series_only=True)
... return s.n_unique()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_n_unique`:
>>> agnostic_n_unique(s_pd)
3
>>> agnostic_n_unique(s_pl)
3
>>> agnostic_n_unique(s_pa)
3
null_count()
Return the number of null values in the Series.
Notes
pandas handles null values differently from Polars and PyArrow. See null_handling for reference.
Returns:

Type | Description
---|---
`int` | The number of null values in the Series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, None, None]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function that returns the null count of the series:
>>> def agnostic_null_count(s_native: IntoSeries) -> int:
... s = nw.from_native(s_native, series_only=True)
... return s.null_count()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_null_count`:
>>> agnostic_null_count(s_pd)
np.int64(2)
>>> agnostic_null_count(s_pl)
2
>>> agnostic_null_count(s_pa)
2
pipe(function, *args, **kwargs)
Pipe function call.
Returns:

Type | Description
---|---
`Self` | A new Series with the results of the piped function applied.
Examples:
>>> import polars as pl
>>> import pandas as pd
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a function to pipe into:
>>> def agnostic_pipe(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.pipe(lambda x: x + 2).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_pipe`:
>>> agnostic_pipe(s_pd)
0 3
1 4
2 5
dtype: int64
>>> agnostic_pipe(s_pl)
shape: (3,)
Series: '' [i64]
[
3
4
5
]
>>> agnostic_pipe(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
3,
4,
5
]
]
quantile(quantile, interpolation)
Get quantile value of the series.
Note
pandas and Polars may have implementation differences for a given interpolation method.
Parameters:

Name | Type | Description | Default
---|---|---|---
`quantile` | `float` | Quantile between 0.0 and 1.0. | required
`interpolation` | `Literal['nearest', 'higher', 'lower', 'midpoint', 'linear']` | Interpolation method. | required
Returns:

Type | Description
---|---
`Any` | The quantile value.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = list(range(50))
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_quantile(s_native: IntoSeries) -> list[float]:
... s = nw.from_native(s_native, series_only=True)
... return [
... s.quantile(quantile=q, interpolation="nearest")
... for q in (0.1, 0.25, 0.5, 0.75, 0.9)
... ]
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_quantile`:
>>> agnostic_quantile(s_pd)
[np.int64(5), np.int64(12), np.int64(24), np.int64(37), np.int64(44)]
>>> agnostic_quantile(s_pl)
[5.0, 12.0, 25.0, 37.0, 44.0]
>>> agnostic_quantile(s_pa)
[5, 12, 24, 37, 44]
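The "nearest" interpolation method can be sketched in pure Python: pick the element whose index is closest to `q * (n - 1)` in the sorted data. Backends may break the halfway tie differently, as the pandas and Polars outputs above show. This sketch uses Python's round-half-to-even and is an illustration only:

```python
# Sketch of "nearest" quantile interpolation: index the sorted data at the
# position closest to q * (n - 1) (illustration only; uses round-half-to-even).
def quantile_nearest(values, q):
    data = sorted(values)
    return data[round(q * (len(data) - 1))]

print([quantile_nearest(range(50), q) for q in (0.1, 0.25, 0.5, 0.75, 0.9)])
```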
rank(method='average', *, descending=False)
Assign ranks to data, dealing with ties appropriately.
Notes
The resulting dtype may differ between backends.
Parameters:

Name | Type | Description | Default
---|---|---|---
`method` | `Literal['average', 'min', 'max', 'dense', 'ordinal']` | The method used to assign ranks to tied elements. | `'average'`
`descending` | `bool` | Rank in descending order. | `False`
Returns:

Type | Description
---|---
`Self` | A new series with rank data as values.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>>
>>> data = [3, 6, 1, 1, 6]
We define a dataframe-agnostic function that computes the dense rank for the data:
>>> def agnostic_dense_rank(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.rank(method="dense").to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_dense_rank`:
>>> agnostic_dense_rank(pd.Series(data))
0 2.0
1 3.0
2 1.0
3 1.0
4 3.0
dtype: float64
>>> agnostic_dense_rank(pl.Series(data))
shape: (5,)
Series: '' [u32]
[
2
3
1
1
3
]
>>> agnostic_dense_rank(pa.chunked_array([data]))
<pyarrow.lib.ChunkedArray object at ...>
[
[
2,
3,
1,
1,
3
]
]
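Dense ranking can be sketched by numbering the distinct sorted values consecutively from 1 so that ties share a rank. This is an illustration of the semantics, not the library implementation:

```python
# Sketch of dense ranking: distinct values get consecutive ranks starting
# at 1, and ties share a rank (illustration only).
def dense_rank(values):
    order = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]

print(dense_rank([3, 6, 1, 1, 6]))
```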
rename(name)
Rename the Series.
Alias for `Series.alias()`.
Notes
This method is very cheap, but does not guarantee that data will be copied. For example:

    s1: nw.Series
    s2 = s1.rename("foo")
    arr = s2.to_numpy()
    arr[0] = 999

may (depending on the backend, and on the version) result in `s1`'s data being modified. We recommend:

- if you need to rename an object and don't need the original one around any more, just use `rename` without worrying about it.
- if you were expecting `rename` to copy data, then explicitly call `.clone` before calling `rename`.
Parameters:

Name | Type | Description | Default
---|---|---|---
`name` | `str` | The new name. | required
Returns:

Type | Description
---|---
`Self` | A new Series with the updated name.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data, name="foo")
>>> s_pl = pl.Series("foo", data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_rename(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.rename("bar").to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_rename`:
>>> agnostic_rename(s_pd)
0 1
1 2
2 3
Name: bar, dtype: int64
>>> agnostic_rename(s_pl)
shape: (3,)
Series: 'bar' [i64]
[
1
2
3
]
>>> agnostic_rename(s_pa)
<pyarrow.lib.ChunkedArray object at 0x...>
[
[
1,
2,
3
]
]
replace_strict(old, new=None, *, return_dtype=None)
Replace all values by different values.
This function must replace all non-null input values; otherwise it raises an error.
Parameters:

Name | Type | Description | Default
---|---|---|---
`old` | `Sequence[Any] \| Mapping[Any, Any]` | Sequence of values to replace. It also accepts a mapping of values to their replacement as syntactic sugar for passing the keys as `old` and the values as `new`. | required
`new` | `Sequence[Any] \| None` | Sequence of values to replace by. Length must match the length of `old`. | `None`
`return_dtype` | `DType \| type[DType] \| None` | The data type of the resulting expression. If set to `None`, the data type is determined automatically based on the other inputs. | `None`
Returns:

Type | Description
---|---
`Self` | A new Series with values replaced according to the mapping.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = {"a": [3, 0, 1, 2]}
>>> df_pd = pd.DataFrame(data)
>>> df_pl = pl.DataFrame(data)
>>> df_pa = pa.table(data)
Let's define dataframe-agnostic functions:
>>> def agnostic_replace_strict(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.replace_strict(
... [0, 1, 2, 3], ["zero", "one", "two", "three"], return_dtype=nw.String
... ).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_replace_strict`:
>>> agnostic_replace_strict(df_pd["a"])
0 three
1 zero
2 one
3 two
Name: a, dtype: object
>>> agnostic_replace_strict(df_pl["a"])
shape: (4,)
Series: 'a' [str]
[
"three"
"zero"
"one"
"two"
]
>>> agnostic_replace_strict(df_pa["a"])
<pyarrow.lib.ChunkedArray object at ...>
[
[
"three",
"zero",
"one",
"two"
]
]
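The strict-replacement contract can be sketched in pure Python: every non-null input value must have a replacement, otherwise an error is raised. Illustration only, not the library code:

```python
# Sketch of strict replacement: nulls pass through, mapped values are
# replaced, and any unmapped non-null value raises (illustration only).
def replace_strict(values, old, new):
    mapping = dict(zip(old, new))
    out = []
    for v in values:
        if v is None:
            out.append(None)
        elif v in mapping:
            out.append(mapping[v])
        else:
            raise ValueError(f"no replacement given for value {v!r}")
    return out

print(replace_strict([3, 0, 1, 2], [0, 1, 2, 3], ["zero", "one", "two", "three"]))
```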
rolling_mean(window_size, *, min_periods=None, center=False)
Apply a rolling mean (moving mean) over the values.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
A window of length `window_size` will traverse the values. The resulting values will be aggregated to their mean.

The window at a given row will include the row itself and the `window_size - 1` elements before it.
Parameters:

Name | Type | Description | Default
---|---|---|---
`window_size` | `int` | The length of the window in number of elements. It must be a strictly positive integer. | required
`min_periods` | `int \| None` | The number of values in the window that should be non-null before computing a result. If set to `None`, it will be set equal to `window_size`. | `None`
`center` | `bool` | Set the labels at the center of the window. | `False`
Returns:

Type | Description
---|---
`Self` | A new series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1.0, 2.0, 3.0, 4.0]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_rolling_mean(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.rolling_mean(window_size=2).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_rolling_mean`:
>>> agnostic_rolling_mean(s_pd)
0 NaN
1 1.5
2 2.5
3 3.5
dtype: float64
>>> agnostic_rolling_mean(s_pl)
shape: (4,)
Series: '' [f64]
[
null
1.5
2.5
3.5
]
>>> agnostic_rolling_mean(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
null,
1.5,
2.5,
3.5
]
]
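The trailing-window behavior described above can be sketched in pure Python: each window covers the row itself plus the `window_size - 1` rows before it, and incomplete windows yield a missing value. Illustration only; `min_periods` and `center` are not modeled here:

```python
# Sketch of a trailing rolling mean: incomplete leading windows yield None
# (illustration only; min_periods and center are not modeled).
def rolling_mean(values, window_size):
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(None)
        else:
            window = values[i + 1 - window_size : i + 1]
            out.append(sum(window) / window_size)
    return out

print(rolling_mean([1.0, 2.0, 3.0, 4.0], 2))
```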
rolling_std(window_size, *, min_periods=None, center=False, ddof=1)
Apply a rolling standard deviation (moving standard deviation) over the values.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
A window of length `window_size` will traverse the values. The resulting values will be aggregated to their standard deviation.

The window at a given row will include the row itself and the `window_size - 1` elements before it.
Parameters:

Name | Type | Description | Default
---|---|---|---
`window_size` | `int` | The length of the window in number of elements. It must be a strictly positive integer. | required
`min_periods` | `int \| None` | The number of values in the window that should be non-null before computing a result. If set to `None`, it will be set equal to `window_size`. | `None`
`center` | `bool` | Set the labels at the center of the window. | `False`
`ddof` | `int` | Delta Degrees of Freedom; the divisor for a length N window is N - ddof. | `1`
Returns:

Type | Description
---|---
`Self` | A new series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1.0, 3.0, 1.0, 4.0]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_rolling_std(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.rolling_std(window_size=2, min_periods=1).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_rolling_std`:
>>> agnostic_rolling_std(s_pd)
0 NaN
1 1.414214
2 1.414214
3 2.121320
dtype: float64
>>> agnostic_rolling_std(s_pl)
shape: (4,)
Series: '' [f64]
[
null
1.414214
1.414214
2.12132
]
>>> agnostic_rolling_std(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
nan,
1.4142135623730951,
1.4142135623730951,
2.1213203435596424
]
]
rolling_sum(window_size, *, min_periods=None, center=False)
Apply a rolling sum (moving sum) over the values.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
A window of length `window_size` will traverse the values. The resulting values will be aggregated to their sum.

The window at a given row will include the row itself and the `window_size - 1` elements before it.
Parameters:

Name | Type | Description | Default
---|---|---|---
`window_size` | `int` | The length of the window in number of elements. It must be a strictly positive integer. | required
`min_periods` | `int \| None` | The number of values in the window that should be non-null before computing a result. If set to `None`, it will be set equal to `window_size`. | `None`
`center` | `bool` | Set the labels at the center of the window. | `False`
Returns:

Type | Description
---|---
`Self` | A new series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1.0, 2.0, 3.0, 4.0]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_rolling_sum(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.rolling_sum(window_size=2).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_rolling_sum`:
>>> agnostic_rolling_sum(s_pd)
0 NaN
1 3.0
2 5.0
3 7.0
dtype: float64
>>> agnostic_rolling_sum(s_pl)
shape: (4,)
Series: '' [f64]
[
null
3.0
5.0
7.0
]
>>> agnostic_rolling_sum(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
null,
3,
5,
7
]
]
rolling_var(window_size, *, min_periods=None, center=False, ddof=1)
Apply a rolling variance (moving variance) over the values.
Warning
This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.
A window of length `window_size` will traverse the values. The resulting values will be aggregated to their variance.

The window at a given row will include the row itself and the `window_size - 1` elements before it.
Parameters:

Name | Type | Description | Default
---|---|---|---
`window_size` | `int` | The length of the window in number of elements. It must be a strictly positive integer. | required
`min_periods` | `int \| None` | The number of values in the window that should be non-null before computing a result. If set to `None`, it will be set equal to `window_size`. | `None`
`center` | `bool` | Set the labels at the center of the window. | `False`
`ddof` | `int` | Delta Degrees of Freedom; the divisor for a length N window is N - ddof. | `1`
Returns:

Type | Description
---|---
`Self` | A new series.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1.0, 3.0, 1.0, 4.0]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_rolling_var(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.rolling_var(window_size=2, min_periods=1).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_rolling_var`:
>>> agnostic_rolling_var(s_pd)
0 NaN
1 2.0
2 2.0
3 4.5
dtype: float64
>>> agnostic_rolling_var(s_pl)
shape: (4,)
Series: '' [f64]
[
null
2.0
2.0
4.5
]
>>> agnostic_rolling_var(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
nan,
2,
2,
4.5
]
]
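The window semantics above (including how `min_periods` and the `N - ddof` divisor interact) can be sketched in plain Python. This is a minimal illustration of the behaviour shown in the examples, not the Narwhals implementation:

```python
def rolling_var(values, window_size, min_periods=None, ddof=1):
    """Plain-Python sketch of a rolling variance over a list."""
    if min_periods is None:
        min_periods = window_size
    out = []
    for i in range(len(values)):
        # the window includes the row itself and window_size - 1 rows before it
        window = [v for v in values[max(0, i - window_size + 1) : i + 1] if v is not None]
        # a result needs at least min_periods non-null values,
        # and the divisor len(window) - ddof must be positive
        if len(window) < min_periods or len(window) - ddof <= 0:
            out.append(None)
            continue
        mean = sum(window) / len(window)
        out.append(sum((v - mean) ** 2 for v in window) / (len(window) - ddof))
    return out

print(rolling_var([1.0, 3.0, 1.0, 4.0], window_size=2, min_periods=1))
# [None, 2.0, 2.0, 4.5]
```

With `min_periods=1`, the first window still yields null because a single value with `ddof=1` leaves a zero divisor, which matches the output above.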
round(decimals=0)
Round underlying floating point data by `decimals` digits.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
decimals
|
int
|
Number of decimals to round by. |
0
|
Returns:
Type | Description |
---|---|
Self
|
A new Series with rounded values. |
Notes
For values exactly halfway between two rounded decimal values, pandas behaves differently from Polars and Arrow.
pandas rounds to the nearest even value (e.g. -0.5 and 0.5 round to 0.0, 1.5 and 2.5 round to 2.0, 3.5 and 4.5 round to 4.0, etc.).
Polars and Arrow round away from 0 (e.g. -0.5 rounds to -1.0, 0.5 to 1.0, 1.5 to 2.0, 2.5 to 3.0, etc.).
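The two rounding conventions can be reproduced with just the standard library's `decimal` module, which makes the difference easy to verify without any dataframe backend (an illustration only, not how the backends implement it):

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

values = [-0.5, 0.5, 1.5, 2.5, 3.5]

# pandas-style: round half to even ("banker's rounding")
half_even = [
    float(Decimal(str(v)).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))
    for v in values
]

# Polars/Arrow-style: round half away from zero
half_away = [
    float(Decimal(str(v)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
    for v in values
]

print(half_even)  # [-0.0, 0.0, 2.0, 2.0, 4.0]
print(half_away)  # [-1.0, 1.0, 2.0, 3.0, 4.0]
```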
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1.12345, 2.56789, 3.901234]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function that rounds to the first decimal:
>>> def agnostic_round(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.round(1).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_round`:
>>> agnostic_round(s_pd)
0 1.1
1 2.6
2 3.9
dtype: float64
>>> agnostic_round(s_pl)
shape: (3,)
Series: '' [f64]
[
1.1
2.6
3.9
]
>>> agnostic_round(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1.1,
2.6,
3.9
]
]
sample(n=None, *, fraction=None, with_replacement=False, seed=None)
Sample randomly from this Series.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n
|
int | None
|
Number of items to return. Cannot be used with fraction. |
None
|
fraction
|
float | None
|
Fraction of items to return. Cannot be used with n. |
None
|
with_replacement
|
bool
|
Allow values to be sampled more than once. |
False
|
seed
|
int | None
|
Seed for the random number generator. If set to None (default), a random seed is generated for each sample operation. |
None
|
Returns:
Type | Description |
---|---|
Self
|
A new Series containing randomly sampled values from the original Series. |
Notes
The `sample` method returns a Series with a specified number of
randomly selected items chosen from this Series.
The results are not consistent across libraries.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3, 4]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_sample(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.sample(fraction=1.0, with_replacement=True).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_sample`:
>>> agnostic_sample(s_pd)
2 3
1 2
3 4
3 4
dtype: int64
>>> agnostic_sample(s_pl)
shape: (4,)
Series: '' [i64]
[
1
4
3
4
]
>>> agnostic_sample(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
4,
3,
4
]
]
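The role of `seed` can be illustrated with the standard library's `random` module. This is a plain-Python sketch of sampling with replacement; the names and the seed value here are illustrative, not part of the Narwhals API:

```python
import random

data = [1, 2, 3, 4]

# fraction=1.0 with with_replacement=True draws len(data) values,
# allowing repeats; fixing the seed makes the draw reproducible
rng = random.Random(42)  # seed value chosen arbitrarily
sampled = rng.choices(data, k=len(data))

# the same seed reproduces the same sample
assert random.Random(42).choices(data, k=len(data)) == sampled
```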
scatter(indices, values)
Set value(s) at given position(s).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
indices
|
int | Sequence[int]
|
Position(s) to set items at. |
required |
values
|
Any
|
Values to set. |
required |
Returns:
Type | Description |
---|---|
Self
|
A new Series with values set at given positions. |
Note
This method always returns a new Series, without modifying the original one. Using this function in a for-loop is an anti-pattern; we recommend building up your positions and values beforehand and doing the update in one go.
For example, instead of
for i in [1, 3, 2]:
value = some_function(i)
s = s.scatter(i, value)
prefer
positions = [1, 3, 2]
values = [some_function(x) for x in positions]
s = s.scatter(positions, values)
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>> data = {"a": [1, 2, 3], "b": [4, 5, 6]}
>>> df_pd = pd.DataFrame(data)
>>> df_pl = pl.DataFrame(data)
>>> df_pa = pa.table(data)
We define a library agnostic function:
>>> def agnostic_scatter(df_native: IntoFrameT) -> IntoFrameT:
... df = nw.from_native(df_native)
... return df.with_columns(df["a"].scatter([0, 1], [999, 888])).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_scatter`:
>>> agnostic_scatter(df_pd)
a b
0 999 4
1 888 5
2 3 6
>>> agnostic_scatter(df_pl)
shape: (3, 2)
┌─────┬─────┐
│ a ┆ b │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 999 ┆ 4 │
│ 888 ┆ 5 │
│ 3 ┆ 6 │
└─────┴─────┘
>>> agnostic_scatter(df_pa)
pyarrow.Table
a: int64
b: int64
----
a: [[999,888,3]]
b: [[4,5,6]]
shift(n)
Shift values by `n` positions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n
|
int
|
Number of indices to shift forward. If a negative value is passed, values are shifted in the opposite direction instead. |
required |
Returns:
Type | Description |
---|---|
Self
|
A new Series with values shifted by n positions. |
Notes
pandas may change the dtype here, for example when introducing missing
values in an integer column. To ensure that the dtype doesn't change,
you may want to use `fill_null` and `cast`. For example, to shift
and fill missing values with `0` in an Int64 column, you could
do:
s.shift(1).fill_null(0).cast(nw.Int64)
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [2, 4, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a dataframe-agnostic function:
>>> def agnostic_shift(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.shift(1).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_shift`:
>>> agnostic_shift(s_pd)
0 NaN
1 2.0
2 4.0
dtype: float64
>>> agnostic_shift(s_pl)
shape: (3,)
Series: '' [i64]
[
null
2
4
]
>>> agnostic_shift(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
null,
2,
4
]
]
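The shifting semantics can be written out in plain Python, which makes the null-filling direction explicit (a sketch over lists, not the Narwhals implementation):

```python
def shift(values, n):
    # positive n pushes values forward, filling the start with nulls;
    # negative n pulls values backward, filling the end with nulls
    if n >= 0:
        return [None] * n + values[: len(values) - n]
    return values[-n:] + [None] * (-n)

print(shift([2, 4, 3], 1))   # [None, 2, 4]
print(shift([2, 4, 3], -1))  # [4, 3, None]
```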
sort(*, descending=False, nulls_last=False)
Sort this Series. Place null values first.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
descending
|
bool
|
Sort in descending order. |
False
|
nulls_last
|
bool
|
Place null values last instead of first. |
False
|
Returns:
Type | Description |
---|---|
Self
|
A new sorted Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [5, None, 1, 2]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define library agnostic functions:
>>> def agnostic_sort(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.sort().to_native()
>>> def agnostic_sort_descending(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.sort(descending=True).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_sort` and `agnostic_sort_descending`:
>>> agnostic_sort(s_pd)
1 NaN
2 1.0
3 2.0
0 5.0
dtype: float64
>>> agnostic_sort(s_pl)
shape: (4,)
Series: '' [i64]
[
null
1
2
5
]
>>> agnostic_sort(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
null,
1,
2,
5
]
]
>>> agnostic_sort_descending(s_pd)
1 NaN
0 5.0
3 2.0
2 1.0
dtype: float64
>>> agnostic_sort_descending(s_pl)
shape: (4,)
Series: '' [i64]
[
null
5
2
1
]
>>> agnostic_sort_descending(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
null,
5,
2,
1
]
]
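The null-placement rules above can be sketched in plain Python: separate the nulls, sort the rest, then place the nulls first (the default) or last. This is an illustration of the semantics, not the implementation:

```python
def sort_series(values, *, descending=False, nulls_last=False):
    # nulls are kept aside and never compared with real values
    nulls = [v for v in values if v is None]
    rest = sorted((v for v in values if v is not None), reverse=descending)
    return rest + nulls if nulls_last else nulls + rest

print(sort_series([5, None, 1, 2]))                   # [None, 1, 2, 5]
print(sort_series([5, None, 1, 2], descending=True))  # [None, 5, 2, 1]
print(sort_series([5, None, 1, 2], nulls_last=True))  # [1, 2, 5, None]
```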
skew()
Calculate the sample skewness of the Series.
Returns:
Type | Description |
---|---|
Any
|
The sample skewness of the Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 1, 2, 10, 100]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_skew(s_native: IntoSeries) -> float:
... s = nw.from_native(s_native, series_only=True)
... return s.skew()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_skew`:
>>> agnostic_skew(s_pd)
np.float64(1.4724267269058975)
>>> agnostic_skew(s_pl)
1.4724267269058975
>>> agnostic_skew(s_pa)
1.4724267269058975
Notes
The skewness is a measure of the asymmetry of the probability distribution. A perfectly symmetric distribution has a skewness of 0.
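The value in the examples above is the Fisher-Pearson coefficient of skewness, g1 = m3 / m2^(3/2), computed from biased (population) central moments. A plain-Python sketch reproduces it:

```python
def sample_skewness(values):
    """Fisher-Pearson skewness g1 using biased central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

print(sample_skewness([1, 1, 2, 10, 100]))  # 1.4724267269058975
```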
std(*, ddof=1)
Get the standard deviation of this Series.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ddof
|
int
|
"Delta Degrees of Freedom": the divisor used in the calculation is N - ddof, where N represents the number of elements. |
1
|
Returns:
Type | Description |
---|---|
Any
|
The standard deviation of all elements in the Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_std(s_native: IntoSeries) -> float:
... s = nw.from_native(s_native, series_only=True)
... return s.std()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_std`:
>>> agnostic_std(s_pd)
np.float64(1.0)
>>> agnostic_std(s_pl)
1.0
>>> agnostic_std(s_pa)
1.0
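The `N - ddof` divisor is easy to see in a plain-Python sketch of the computation (an illustration, not the backend implementation):

```python
import math

def std(values, ddof=1):
    # the divisor is N - ddof: ddof=1 gives the sample standard
    # deviation, ddof=0 the population standard deviation
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - ddof))

print(std([1, 2, 3], ddof=1))  # 1.0
print(std([1, 2, 3], ddof=0))  # 0.816496...
```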
sum()
Reduce this Series to the sum value.
Returns:
Type | Description |
---|---|
Any
|
The sum of all elements in the Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_sum(s_native: IntoSeries):
... s = nw.from_native(s_native, series_only=True)
... return s.sum()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_sum`:
>>> agnostic_sum(s_pd)
np.int64(6)
>>> agnostic_sum(s_pl)
6
>>> agnostic_sum(s_pa)
6
tail(n=10)
Get the last `n` rows.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n
|
int
|
Number of rows to return. |
10
|
Returns:
Type | Description |
---|---|
Self
|
A new Series with the last n rows. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = list(range(10))
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function that returns the last 3 rows:
>>> def agnostic_tail(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.tail(3).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_tail`:
>>> agnostic_tail(s_pd)
7 7
8 8
9 9
dtype: int64
>>> agnostic_tail(s_pl)
shape: (3,)
Series: '' [i64]
[
7
8
9
]
>>> agnostic_tail(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
7,
8,
9
]
]
to_arrow()
Convert to arrow.
Returns:
Type | Description |
---|---|
Array
|
A PyArrow Array containing the data from the Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3, 4]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function that converts to arrow:
>>> def agnostic_to_arrow(s_native: IntoSeries) -> pa.Array:
... s = nw.from_native(s_native, series_only=True)
... return s.to_arrow()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_arrow`:
>>> agnostic_to_arrow(s_pd)
<pyarrow.lib.Int64Array object at ...>
[
1,
2,
3,
4
]
>>> agnostic_to_arrow(s_pl)
<pyarrow.lib.Int64Array object at ...>
[
1,
2,
3,
4
]
>>> agnostic_to_arrow(s_pa)
<pyarrow.lib.Int64Array object at ...>
[
1,
2,
3,
4
]
to_dummies(*, separator='_', drop_first=False)
Get dummy/indicator variables.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
separator
|
str
|
Separator/delimiter used when generating column names. |
'_'
|
drop_first
|
bool
|
Remove the first category from the variable being encoded. |
False
|
Returns:
Type | Description |
---|---|
DataFrame[Any]
|
A new DataFrame containing the dummy/indicator variables. |
Notes
pandas and Polars handle null values differently. Polars distinguishes between NaN and Null, whereas pandas doesn't.
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoDataFrame
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data, name="a")
>>> s_pl = pl.Series("a", data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_to_dummies(
... s_native: IntoSeries, drop_first: bool = False
... ) -> IntoDataFrame:
... s = nw.from_native(s_native, series_only=True)
... return s.to_dummies(drop_first=drop_first).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_dummies`:
>>> agnostic_to_dummies(s_pd)
a_1 a_2 a_3
0 1 0 0
1 0 1 0
2 0 0 1
>>> agnostic_to_dummies(s_pd, drop_first=True)
a_2 a_3
0 0 0
1 1 0
2 0 1
>>> agnostic_to_dummies(s_pl)
shape: (3, 3)
┌─────┬─────┬─────┐
│ a_1 ┆ a_2 ┆ a_3 │
│ --- ┆ --- ┆ --- │
│ i8 ┆ i8 ┆ i8 │
╞═════╪═════╪═════╡
│ 1 ┆ 0 ┆ 0 │
│ 0 ┆ 1 ┆ 0 │
│ 0 ┆ 0 ┆ 1 │
└─────┴─────┴─────┘
>>> agnostic_to_dummies(s_pl, drop_first=True)
shape: (3, 2)
┌─────┬─────┐
│ a_2 ┆ a_3 │
│ --- ┆ --- │
│ i8 ┆ i8 │
╞═════╪═════╡
│ 0 ┆ 0 │
│ 1 ┆ 0 │
│ 0 ┆ 1 │
└─────┴─────┘
>>> agnostic_to_dummies(s_pa)
pyarrow.Table
_1: int8
_2: int8
_3: int8
----
_1: [[1,0,0]]
_2: [[0,1,0]]
_3: [[0,0,1]]
>>> agnostic_to_dummies(s_pa, drop_first=True)
pyarrow.Table
_2: int8
_3: int8
----
_2: [[0,1,0]]
_3: [[0,0,1]]
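The column-naming scheme (`<name><separator><category>`) can be sketched in plain Python. This is a minimal illustration of the encoding, not the Narwhals implementation:

```python
data = [1, 2, 3]
name, separator = "a", "_"

# one indicator column per category, named "<name><separator><category>"
categories = sorted(set(data))
dummies = {
    f"{name}{separator}{category}": [1 if x == category else 0 for x in data]
    for category in categories
}
print(dummies)
# {'a_1': [1, 0, 0], 'a_2': [0, 1, 0], 'a_3': [0, 0, 1]}

# drop_first=True would simply skip categories[0]
```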
to_frame()
Convert to dataframe.
Returns:
Type | Description |
---|---|
DataFrame[Any]
|
A DataFrame containing this Series as a single column. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoDataFrame
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2]
>>> s_pd = pd.Series(data, name="a")
>>> s_pl = pl.Series("a", data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_to_frame(s_native: IntoSeries) -> IntoDataFrame:
... s = nw.from_native(s_native, series_only=True)
... return s.to_frame().to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_frame`:
>>> agnostic_to_frame(s_pd)
a
0 1
1 2
>>> agnostic_to_frame(s_pl)
shape: (2, 1)
┌─────┐
│ a │
│ --- │
│ i64 │
╞═════╡
│ 1 │
│ 2 │
└─────┘
>>> agnostic_to_frame(s_pa)
pyarrow.Table
: int64
----
: [[1,2]]
to_list()
Convert to list.
Notes
This function converts to Python scalars. It's typically more efficient to keep your data in the format native to your original dataframe, so we recommend only calling this when you absolutely need to.
Returns:
Type | Description |
---|---|
list[Any]
|
A list of Python objects. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_to_list(s_native: IntoSeries):
... s = nw.from_native(s_native, series_only=True)
... return s.to_list()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_list`:
>>> agnostic_to_list(s_pd)
[1, 2, 3]
>>> agnostic_to_list(s_pl)
[1, 2, 3]
>>> agnostic_to_list(s_pa)
[1, 2, 3]
to_numpy()
Convert to numpy.
Returns:
Type | Description |
---|---|
ndarray
|
NumPy ndarray representation of the Series. |
Examples:
>>> import numpy as np
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data, name="a")
>>> s_pl = pl.Series("a", data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_to_numpy(s_native: IntoSeries) -> np.ndarray:
... s = nw.from_native(s_native, series_only=True)
... return s.to_numpy()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_numpy`:
>>> agnostic_to_numpy(s_pd)
array([1, 2, 3]...)
>>> agnostic_to_numpy(s_pl)
array([1, 2, 3]...)
>>> agnostic_to_numpy(s_pa)
array([1, 2, 3]...)
to_pandas()
Convert to pandas Series.
Returns:
Type | Description |
---|---|
Series
|
A pandas Series containing the data from this Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data, name="a")
>>> s_pl = pl.Series("a", data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_to_pandas(s_native: IntoSeries) -> pd.Series:
... s = nw.from_native(s_native, series_only=True)
... return s.to_pandas()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_pandas`:
>>> agnostic_to_pandas(s_pd)
0 1
1 2
2 3
Name: a, dtype: int64
>>> agnostic_to_pandas(s_pl)
0 1
1 2
2 3
Name: a, dtype: int64
>>> agnostic_to_pandas(s_pa)
0 1
1 2
2 3
Name: , dtype: int64
to_polars()
Convert to polars Series.
Returns:
Type | Description |
---|---|
Series
|
A polars Series containing the data from this Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data, name="a")
>>> s_pl = pl.Series("a", data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_to_polars(s_native: IntoSeries) -> pl.Series:
... s = nw.from_native(s_native, series_only=True)
... return s.to_polars()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_polars`:
>>> agnostic_to_polars(s_pd)
shape: (3,)
Series: 'a' [i64]
[
1
2
3
]
>>> agnostic_to_polars(s_pl)
shape: (3,)
Series: 'a' [i64]
[
1
2
3
]
>>> agnostic_to_polars(s_pa)
shape: (3,)
Series: '' [i64]
[
1
2
3
]
to_native()
Convert Narwhals series to native series.
Returns:
Type | Description |
---|---|
IntoSeriesT
|
Series of class that user started with. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_to_native(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_to_native`:
>>> agnostic_to_native(s_pd)
0 1
1 2
2 3
dtype: int64
>>> agnostic_to_native(s_pl)
shape: (3,)
Series: '' [i64]
[
1
2
3
]
>>> agnostic_to_native(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
2,
3
]
]
unique(*, maintain_order=False)
Returns unique values of the series.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
maintain_order
|
bool
|
Keep the same order as the original series. This may be more
expensive to compute. Setting this to True blocks the possibility to run on the streaming engine for polars. |
False
|
Returns:
Type | Description |
---|---|
Self
|
A new Series with duplicate values removed. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [2, 4, 4, 6]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_unique(s_native: IntoSeriesT) -> IntoSeriesT:
... s = nw.from_native(s_native, series_only=True)
... return s.unique(maintain_order=True).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_unique`:
>>> agnostic_unique(s_pd)
0 2
1 4
2 6
dtype: int64
>>> agnostic_unique(s_pl)
shape: (3,)
Series: '' [i64]
[
2
4
6
]
>>> agnostic_unique(s_pa)
<pyarrow.lib.ChunkedArray object at ...>
[
[
2,
4,
6
]
]
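The effect of `maintain_order=True` (deduplicate while keeping first-occurrence order) can be sketched in plain Python, where insertion-ordered dict keys do the work:

```python
data = [2, 4, 4, 6]

# dict keys preserve insertion order (Python 3.7+), so this
# deduplicates while keeping first-occurrence order
ordered_unique = list(dict.fromkeys(data))
print(ordered_unique)  # [2, 4, 6]
```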
value_counts(*, sort=False, parallel=False, name=None, normalize=False)
Count the occurrences of unique values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sort
|
bool
|
Sort the output by count in descending order. If set to False (default), the order of the output is random. |
False
|
parallel
|
bool
|
Execute the computation in parallel. Used for Polars only. |
False
|
name
|
str | None
|
Give the resulting count column a specific name; if None (default), the column is named "proportion" when normalize is True, and "count" otherwise. |
None
|
normalize
|
bool
|
If true, gives relative frequencies of the unique values. |
False
|
Returns:
Type | Description |
---|---|
DataFrame[Any]
 |
A DataFrame with two columns: the unique values as the first column, and either their count or their relative frequency (depending on normalize) as the second column. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoDataFrame
>>> from narwhals.typing import IntoSeries
>>> data = [1, 1, 2, 3, 2]
>>> s_pd = pd.Series(data, name="s")
>>> s_pl = pl.Series(values=data, name="s")
>>> s_pa = pa.chunked_array([data])
Let's define a dataframe-agnostic function:
>>> def agnostic_value_counts(s_native: IntoSeries) -> IntoDataFrame:
... s = nw.from_native(s_native, series_only=True)
... return s.value_counts(sort=True).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_value_counts`:
>>> agnostic_value_counts(s_pd)
s count
0 1 2
1 2 2
2 3 1
>>> agnostic_value_counts(s_pl)
shape: (3, 2)
┌─────┬───────┐
│ s ┆ count │
│ --- ┆ --- │
│ i64 ┆ u32 │
╞═════╪═══════╡
│ 1 ┆ 2 │
│ 2 ┆ 2 │
│ 3 ┆ 1 │
└─────┴───────┘
>>> agnostic_value_counts(s_pa)
pyarrow.Table
: int64
count: int64
----
: [[1,2,3]]
count: [[2,2,1]]
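The relationship between `sort=True` and `normalize=True` can be sketched with the standard library's `collections.Counter` (an illustration of the semantics, not the Narwhals implementation):

```python
from collections import Counter

data = [1, 1, 2, 3, 2]
counts = Counter(data)

# sort=True orders by count, descending (a stable sort keeps
# ties in first-seen order)
rows = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
print(rows)  # [(1, 2), (2, 2), (3, 1)]

# normalize=True reports relative frequencies instead of counts
total = sum(counts.values())
proportions = {value: count / total for value, count in counts.items()}
print(proportions)  # {1: 0.4, 2: 0.4, 3: 0.2}
```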
var(*, ddof=1)
Get the variance of this Series.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
ddof
|
int
|
"Delta Degrees of Freedom": the divisor used in the calculation is N - ddof, where N represents the number of elements. |
1
|
Returns:
Type | Description |
---|---|
Any
 |
The variance of all elements in the Series. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeries
>>> data = [1, 2, 3]
>>> s_pd = pd.Series(data)
>>> s_pl = pl.Series(data)
>>> s_pa = pa.chunked_array([data])
We define a library agnostic function:
>>> def agnostic_var(s_native: IntoSeries) -> float:
... s = nw.from_native(s_native, series_only=True)
... return s.var()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_var`:
>>> agnostic_var(s_pd)
np.float64(1.0)
>>> agnostic_var(s_pl)
1.0
>>> agnostic_var(s_pa)
1.0
zip_with(mask, other)
Take values from self or other based on the given mask.
Where mask evaluates true, take values from self. Where mask evaluates false, take values from other.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mask
|
Self
|
Boolean Series |
required |
other
|
Self
|
Series of same type. |
required |
Returns:
Type | Description |
---|---|
Self
|
A new Series with values selected from self or other based on the mask. |
Examples:
>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoSeriesT
>>> data = [1, 2, 3, 4, 5]
>>> other = [5, 4, 3, 2, 1]
>>> mask = [True, False, True, False, True]
Let's define a dataframe-agnostic function:
>>> def agnostic_zip_with(
... s1_native: IntoSeriesT, mask_native: IntoSeriesT, s2_native: IntoSeriesT
... ) -> IntoSeriesT:
... s1 = nw.from_native(s1_native, series_only=True)
... mask = nw.from_native(mask_native, series_only=True)
... s2 = nw.from_native(s2_native, series_only=True)
... return s1.zip_with(mask, s2).to_native()
We can then pass any supported library such as pandas, Polars, or
PyArrow to `agnostic_zip_with`:
>>> agnostic_zip_with(
... s1_native=pl.Series(data),
... mask_native=pl.Series(mask),
... s2_native=pl.Series(other),
... )
shape: (5,)
Series: '' [i64]
[
1
4
3
2
5
]
>>> agnostic_zip_with(
... s1_native=pd.Series(data),
... mask_native=pd.Series(mask),
... s2_native=pd.Series(other),
... )
0 1
1 4
2 3
3 2
4 5
dtype: int64
>>> agnostic_zip_with(
... s1_native=pa.chunked_array([data]),
... mask_native=pa.chunked_array([mask]),
... s2_native=pa.chunked_array([other]),
... )
<pyarrow.lib.ChunkedArray object at ...>
[
[
1,
4,
3,
2,
5
]
]
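The selection rule can be expressed in one line of plain Python, which makes the semantics explicit (a sketch over lists, not the implementation):

```python
data = [1, 2, 3, 4, 5]
other = [5, 4, 3, 2, 1]
mask = [True, False, True, False, True]

# where the mask is True take from `data`, otherwise from `other`
result = [a if keep else b for a, b, keep in zip(data, other, mask)]
print(result)  # [1, 4, 3, 2, 5]
```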