Top-level functions

Here are the top-level functions available in Narwhals.

all() -> Expr

Instantiate an expression representing all columns.

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import pandas as pd
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2, 3], "b": [4, 5, 6]}
>>> df_pd = pd.DataFrame(data)
>>> df_pl = pl.DataFrame(data)
>>> df_pa = pa.table(data)

Let's define a dataframe-agnostic function:

>>> def agnostic_all(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.all() * 2).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_all:

>>> agnostic_all(df_pd)
   a   b
0  2   8
1  4  10
2  6  12
>>> agnostic_all(df_pl)
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 2   ┆ 8   │
│ 4   ┆ 10  │
│ 6   ┆ 12  │
└─────┴─────┘
>>> agnostic_all(df_pa)
pyarrow.Table
a: int64
b: int64
----
a: [[2,4,6]]
b: [[8,10,12]]

all_horizontal(*exprs: IntoExpr | Iterable[IntoExpr]) -> Expr

Compute the bitwise AND horizontally across columns.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {
...     "a": [False, False, True, True, False, None],
...     "b": [False, True, True, None, None, None],
... }
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data).convert_dtypes(dtype_backend="pyarrow")
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_all_horizontal(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select("a", "b", all=nw.all_horizontal("a", "b")).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_all_horizontal:

>>> agnostic_all_horizontal(df_pd)
       a      b    all
0  False  False  False
1  False   True  False
2   True   True   True
3   True   <NA>   <NA>
4  False   <NA>  False
5   <NA>   <NA>   <NA>
>>> agnostic_all_horizontal(df_pl)
shape: (6, 3)
┌───────┬───────┬───────┐
│ a     ┆ b     ┆ all   │
│ ---   ┆ ---   ┆ ---   │
│ bool  ┆ bool  ┆ bool  │
╞═══════╪═══════╪═══════╡
│ false ┆ false ┆ false │
│ false ┆ true  ┆ false │
│ true  ┆ true  ┆ true  │
│ true  ┆ null  ┆ null  │
│ false ┆ null  ┆ false │
│ null  ┆ null  ┆ null  │
└───────┴───────┴───────┘
>>> agnostic_all_horizontal(df_pa)
pyarrow.Table
a: bool
b: bool
all: bool
----
a: [[false,false,true,true,false,null]]
b: [[false,true,true,null,null,null]]
all: [[false,false,true,null,false,null]]

any_horizontal(*exprs: IntoExpr | Iterable[IntoExpr]) -> Expr

Compute the bitwise OR horizontally across columns.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {
...     "a": [False, False, True, True, False, None],
...     "b": [False, True, True, None, None, None],
... }
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data).convert_dtypes(dtype_backend="pyarrow")
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_any_horizontal(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select("a", "b", any=nw.any_horizontal("a", "b")).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_any_horizontal:

>>> agnostic_any_horizontal(df_pd)
       a      b    any
0  False  False  False
1  False   True   True
2   True   True   True
3   True   <NA>   True
4  False   <NA>   <NA>
5   <NA>   <NA>   <NA>
>>> agnostic_any_horizontal(df_pl)
shape: (6, 3)
┌───────┬───────┬───────┐
│ a     ┆ b     ┆ any   │
│ ---   ┆ ---   ┆ ---   │
│ bool  ┆ bool  ┆ bool  │
╞═══════╪═══════╪═══════╡
│ false ┆ false ┆ false │
│ false ┆ true  ┆ true  │
│ true  ┆ true  ┆ true  │
│ true  ┆ null  ┆ true  │
│ false ┆ null  ┆ null  │
│ null  ┆ null  ┆ null  │
└───────┴───────┴───────┘
>>> agnostic_any_horizontal(df_pa)
pyarrow.Table
a: bool
b: bool
any: bool
----
a: [[false,false,true,true,false,null]]
b: [[false,true,true,null,null,null]]
any: [[false,true,true,true,null,null]]

col(*names: str | Iterable[str]) -> Expr

Creates an expression that references one or more columns by their name(s).

Parameters:

Name Type Description Default
names str | Iterable[str]

Name(s) of the columns to use.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2], "b": [3, 4]}
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_col(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.col("a") * nw.col("b")).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_col:

>>> agnostic_col(df_pd)
   a
0  3
1  8
>>> agnostic_col(df_pl)
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 3   │
│ 8   │
└─────┘
>>> agnostic_col(df_pa)
pyarrow.Table
a: int64
----
a: [[3,8]]

concat(items: Iterable[DataFrame[IntoDataFrameT] | LazyFrame[IntoFrameT]], *, how: Literal['horizontal', 'vertical', 'diagonal'] = 'vertical') -> DataFrame[IntoDataFrameT] | LazyFrame[IntoFrameT]

Concatenate multiple DataFrames or LazyFrames into a single entity.

Parameters:

Name Type Description Default
items Iterable[DataFrame[IntoDataFrameT] | LazyFrame[IntoFrameT]]

DataFrames or LazyFrames to concatenate.

required
how Literal['horizontal', 'vertical', 'diagonal']

concatenation strategy:

  • vertical: Concatenate vertically. Column names must match.
  • horizontal: Concatenate horizontally. If lengths don't match, then missing rows are filled with null values.
  • diagonal: Finds a union between the column schemas and fills missing column values with null.
'vertical'

Returns:

Type Description
DataFrame[IntoDataFrameT] | LazyFrame[IntoFrameT]

A new DataFrame or LazyFrame resulting from the concatenation.

Raises:

Type Description
TypeError

The items to concatenate should either all be eager, or all lazy

Examples:

Let's take an example of vertical concatenation:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> data_1 = {"a": [1, 2, 3], "b": [4, 5, 6]}
>>> data_2 = {"a": [5, 2], "b": [1, 4]}
>>> df_pd_1 = pd.DataFrame(data_1)
>>> df_pd_2 = pd.DataFrame(data_2)
>>> df_pl_1 = pl.DataFrame(data_1)
>>> df_pl_2 = pl.DataFrame(data_2)

Let's define a dataframe-agnostic function:

>>> @nw.narwhalify
... def agnostic_vertical_concat(df1, df2):
...     return nw.concat([df1, df2], how="vertical")
>>> agnostic_vertical_concat(df_pd_1, df_pd_2)
   a  b
0  1  4
1  2  5
2  3  6
0  5  1
1  2  4
>>> agnostic_vertical_concat(df_pl_1, df_pl_2)
shape: (5, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 4   │
│ 2   ┆ 5   │
│ 3   ┆ 6   │
│ 5   ┆ 1   │
│ 2   ┆ 4   │
└─────┴─────┘

Let's look at the case of horizontal concatenation:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> data_1 = {"a": [1, 2, 3], "b": [4, 5, 6]}
>>> data_2 = {"c": [5, 2], "d": [1, 4]}
>>> df_pd_1 = pd.DataFrame(data_1)
>>> df_pd_2 = pd.DataFrame(data_2)
>>> df_pl_1 = pl.DataFrame(data_1)
>>> df_pl_2 = pl.DataFrame(data_2)

Defining a dataframe-agnostic function:

>>> @nw.narwhalify
... def agnostic_horizontal_concat(df1, df2):
...     return nw.concat([df1, df2], how="horizontal")
>>> agnostic_horizontal_concat(df_pd_1, df_pd_2)
   a  b    c    d
0  1  4  5.0  1.0
1  2  5  2.0  4.0
2  3  6  NaN  NaN
>>> agnostic_horizontal_concat(df_pl_1, df_pl_2)
shape: (3, 4)
┌─────┬─────┬──────┬──────┐
│ a   ┆ b   ┆ c    ┆ d    │
│ --- ┆ --- ┆ ---  ┆ ---  │
│ i64 ┆ i64 ┆ i64  ┆ i64  │
╞═════╪═════╪══════╪══════╡
│ 1   ┆ 4   ┆ 5    ┆ 1    │
│ 2   ┆ 5   ┆ 2    ┆ 4    │
│ 3   ┆ 6   ┆ null ┆ null │
└─────┴─────┴──────┴──────┘

Let's look at the case of diagonal concatenation:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> data_1 = {"a": [1, 2], "b": [3.5, 4.5]}
>>> data_2 = {"a": [3, 4], "z": ["x", "y"]}
>>> df_pd_1 = pd.DataFrame(data_1)
>>> df_pd_2 = pd.DataFrame(data_2)
>>> df_pl_1 = pl.DataFrame(data_1)
>>> df_pl_2 = pl.DataFrame(data_2)

Defining a dataframe-agnostic function:

>>> @nw.narwhalify
... def agnostic_diagonal_concat(df1, df2):
...     return nw.concat([df1, df2], how="diagonal")
>>> agnostic_diagonal_concat(df_pd_1, df_pd_2)
   a    b    z
0  1  3.5  NaN
1  2  4.5  NaN
0  3  NaN    x
1  4  NaN    y
>>> agnostic_diagonal_concat(df_pl_1, df_pl_2)
shape: (4, 3)
┌─────┬──────┬──────┐
│ a   ┆ b    ┆ z    │
│ --- ┆ ---  ┆ ---  │
│ i64 ┆ f64  ┆ str  │
╞═════╪══════╪══════╡
│ 1   ┆ 3.5  ┆ null │
│ 2   ┆ 4.5  ┆ null │
│ 3   ┆ null ┆ x    │
│ 4   ┆ null ┆ y    │
└─────┴──────┴──────┘

concat_str(exprs: IntoExpr | Iterable[IntoExpr], *more_exprs: IntoExpr, separator: str = '', ignore_nulls: bool = False) -> Expr

Horizontally concatenate columns into a single string column.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Columns to concatenate into a single string column. Accepts expression input. Strings are parsed as column names, other non-expression inputs are parsed as literals. Non-String columns are cast to String.

required
*more_exprs IntoExpr

Additional columns to concatenate into a single string column, specified as positional arguments.

()
separator str

String that will be used to separate the values of each column.

''
ignore_nulls bool

Ignore null values (default is False). If set to False, null values are propagated: if a row contains any null values, the output for that row is null.

False

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {
...     "a": [1, 2, 3],
...     "b": ["dogs", "cats", None],
...     "c": ["play", "swim", "walk"],
... }

We define a dataframe-agnostic function that computes the horizontal string concatenation of different columns

>>> def agnostic_concat_str(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(
...         nw.concat_str(
...             [
...                 nw.col("a") * 2,
...                 nw.col("b"),
...                 nw.col("c"),
...             ],
...             separator=" ",
...         ).alias("full_sentence")
...     ).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_concat_str:

>>> agnostic_concat_str(pd.DataFrame(data))
  full_sentence
0   2 dogs play
1   4 cats swim
2          None
>>> agnostic_concat_str(pl.DataFrame(data))
shape: (3, 1)
┌───────────────┐
│ full_sentence │
│ ---           │
│ str           │
╞═══════════════╡
│ 2 dogs play   │
│ 4 cats swim   │
│ null          │
└───────────────┘
>>> agnostic_concat_str(pa.table(data))
pyarrow.Table
full_sentence: string
----
full_sentence: [["2 dogs play","4 cats swim",null]]

from_arrow(native_frame: ArrowStreamExportable, *, native_namespace: ModuleType) -> DataFrame[Any]

Construct a DataFrame from an object which supports the PyCapsule Interface.

Parameters:

Name Type Description Default
native_frame ArrowStreamExportable

Object which implements __arrow_c_stream__.

required
native_namespace ModuleType

The native library to use for DataFrame creation.

required

Returns:

Type Description
DataFrame[Any]

A new DataFrame.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>> data = {"a": [1, 2, 3], "b": [4, 5, 6]}

Let's define a dataframe-agnostic function which creates a PyArrow Table.

>>> def agnostic_to_arrow(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return nw.from_arrow(df, native_namespace=pa).to_native()

Let's see what happens when passing pandas / Polars input:

>>> agnostic_to_arrow(pd.DataFrame(data))
pyarrow.Table
a: int64
b: int64
----
a: [[1,2,3]]
b: [[4,5,6]]
>>> agnostic_to_arrow(pl.DataFrame(data))
pyarrow.Table
a: int64
b: int64
----
a: [[1,2,3]]
b: [[4,5,6]]

from_dict(data: dict[str, Any], schema: dict[str, DType] | Schema | None = None, *, backend: ModuleType | Implementation | str | None = None, native_namespace: ModuleType | None = None) -> DataFrame[Any]

Instantiate DataFrame from dictionary.

Indexes (if present, for pandas-like backends) are aligned following the left-hand-rule.

Notes

For pandas-like dataframes, conversion to schema is applied after dataframe creation.

Parameters:

Name Type Description Default
data dict[str, Any]

Dictionary to create DataFrame from.

required
schema dict[str, DType] | Schema | None

The DataFrame schema as Schema or dict of {name: type}.

None
backend ModuleType | Implementation | str | None

Specifies which eager backend to instantiate. Only necessary if inputs are not Narwhals Series.

`backend` can be specified in various ways:

- As `Implementation.<BACKEND>` with `BACKEND` being `PANDAS`, `PYARROW`,
    `POLARS`, `MODIN` or `CUDF`.
- As a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`.
- Directly as a module `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.26.0): Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None

Returns:

Type Description
DataFrame[Any]

A new DataFrame.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT

Let's create a new dataframe and specify the backend argument.

>>> def agnostic_from_dict(backend: str) -> IntoFrameT:
...     data = {"c": [5, 2], "d": [1, 4]}
...     return nw.from_dict(data, backend=backend).to_native()

Let's see what happens when passing pandas, Polars or PyArrow input:

>>> agnostic_from_dict(backend="pandas")
   c  d
0  5  1
1  2  4
>>> agnostic_from_dict(backend="polars")
shape: (2, 2)
┌─────┬─────┐
│ c   ┆ d   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 5   ┆ 1   │
│ 2   ┆ 4   │
└─────┴─────┘
>>> agnostic_from_dict(backend="pyarrow")
pyarrow.Table
c: int64
d: int64
----
c: [[5,2]]
d: [[1,4]]

from_native(native_object: IntoFrameT | IntoSeriesT | IntoFrame | IntoSeries | T, *, strict: bool | None = None, pass_through: bool | None = None, eager_only: bool = False, series_only: bool = False, allow_series: bool | None = None) -> LazyFrame[IntoFrameT] | DataFrame[IntoFrameT] | Series[IntoSeriesT] | T

Convert native_object to a Narwhals DataFrame, LazyFrame, or Series.

Parameters:

Name Type Description Default
native_object IntoFrameT | IntoSeriesT | IntoFrame | IntoSeries | T

Raw object from user. Depending on the other arguments, input object can be:

  • a Dataframe / Lazyframe / Series supported by Narwhals (pandas, Polars, PyArrow, ...)
  • an object which implements __narwhals_dataframe__, __narwhals_lazyframe__, or __narwhals_series__
required
strict bool | None

Determine what happens if the object can't be converted to Narwhals:

  • True or None (default): raise an error
  • False: pass object through as-is

Deprecated (v1.13.0): Please use pass_through instead. Note that strict is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
pass_through bool | None

Determine what happens if the object can't be converted to Narwhals:

  • False or None (default): raise an error
  • True: pass object through as-is
None
eager_only bool

Whether to only allow eager objects:

  • False (default): don't require native_object to be eager
  • True: only convert to Narwhals if native_object is eager
False
series_only bool

Whether to only allow Series:

  • False (default): don't require native_object to be a Series
  • True: only convert to Narwhals if native_object is a Series
False
allow_series bool | None

Whether to allow Series (default is only Dataframe / Lazyframe):

  • False or None (default): don't convert to Narwhals if native_object is a Series
  • True: allow native_object to be a Series
None

Returns:

Type Description
LazyFrame[IntoFrameT] | DataFrame[IntoFrameT] | Series[IntoSeriesT] | T

DataFrame, LazyFrame, Series, or original object, depending on which combination of parameters was passed.

from_numpy(data: np.ndarray, schema: dict[str, DType] | Schema | list[str] | None = None, *, native_namespace: ModuleType) -> DataFrame[Any]

Construct a DataFrame from a NumPy ndarray.

Notes

Only row orientation is currently supported.

For pandas-like dataframes, conversion to schema is applied after dataframe creation.

Parameters:

Name Type Description Default
data ndarray

Two-dimensional data represented as a NumPy ndarray.

required
schema dict[str, DType] | Schema | list[str] | None

The DataFrame schema as Schema, dict of {name: type}, or a list of str.

None
native_namespace ModuleType

The native library to use for DataFrame creation.

required

Returns:

Type Description
DataFrame[Any]

A new DataFrame.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> import numpy as np
>>> from narwhals.typing import IntoFrameT
>>> data = {"a": [1, 2], "b": [3, 4]}

Let's create a new dataframe of the same class as the dataframe we started with, from a NumPy ndarray of new data:

>>> def agnostic_from_numpy(df_native: IntoFrameT) -> IntoFrameT:
...     new_data = np.array([[5, 2, 1], [1, 4, 3]])
...     df = nw.from_native(df_native)
...     native_namespace = nw.get_native_namespace(df)
...     return nw.from_numpy(
...         new_data, native_namespace=native_namespace
...     ).to_native()

Let's see what happens when passing pandas, Polars or PyArrow input:

>>> agnostic_from_numpy(pd.DataFrame(data))
   column_0  column_1  column_2
0         5         2         1
1         1         4         3
>>> agnostic_from_numpy(pl.DataFrame(data))
shape: (2, 3)
┌──────────┬──────────┬──────────┐
│ column_0 ┆ column_1 ┆ column_2 │
│ ---      ┆ ---      ┆ ---      │
│ i64      ┆ i64      ┆ i64      │
╞══════════╪══════════╪══════════╡
│ 5        ┆ 2        ┆ 1        │
│ 1        ┆ 4        ┆ 3        │
└──────────┴──────────┴──────────┘
>>> agnostic_from_numpy(pa.table(data))
pyarrow.Table
column_0: int64
column_1: int64
column_2: int64
----
column_0: [[5,1]]
column_1: [[2,4]]
column_2: [[1,3]]

Let's specify the column names:

>>> def agnostic_from_numpy(df_native: IntoFrameT) -> IntoFrameT:
...     new_data = np.array([[5, 2, 1], [1, 4, 3]])
...     schema = ["c", "d", "e"]
...     df = nw.from_native(df_native)
...     native_namespace = nw.get_native_namespace(df)
...     return nw.from_numpy(
...         new_data, native_namespace=native_namespace, schema=schema
...     ).to_native()

Let's see the modified outputs:

>>> agnostic_from_numpy(pd.DataFrame(data))
   c  d  e
0  5  2  1
1  1  4  3
>>> agnostic_from_numpy(pl.DataFrame(data))
shape: (2, 3)
┌─────┬─────┬─────┐
│ c   ┆ d   ┆ e   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 5   ┆ 2   ┆ 1   │
│ 1   ┆ 4   ┆ 3   │
└─────┴─────┴─────┘
>>> agnostic_from_numpy(pa.table(data))
pyarrow.Table
c: int64
d: int64
e: int64
----
c: [[5,1]]
d: [[2,4]]
e: [[1,3]]

Let's modify the function so that it specifies the schema:

>>> def agnostic_from_numpy(df_native: IntoFrameT) -> IntoFrameT:
...     new_data = np.array([[5, 2, 1], [1, 4, 3]])
...     schema = {"c": nw.Int16(), "d": nw.Float32(), "e": nw.Int8()}
...     df = nw.from_native(df_native)
...     native_namespace = nw.get_native_namespace(df)
...     return nw.from_numpy(
...         new_data, native_namespace=native_namespace, schema=schema
...     ).to_native()

Let's see the outputs:

>>> agnostic_from_numpy(pd.DataFrame(data))
   c    d  e
0  5  2.0  1
1  1  4.0  3
>>> agnostic_from_numpy(pl.DataFrame(data))
shape: (2, 3)
┌─────┬─────┬─────┐
│ c   ┆ d   ┆ e   │
│ --- ┆ --- ┆ --- │
│ i16 ┆ f32 ┆ i8  │
╞═════╪═════╪═════╡
│ 5   ┆ 2.0 ┆ 1   │
│ 1   ┆ 4.0 ┆ 3   │
└─────┴─────┴─────┘
>>> agnostic_from_numpy(pa.table(data))
pyarrow.Table
c: int16
d: float
e: int8
----
c: [[5,1]]
d: [[2,4]]
e: [[1,3]]

generate_temporary_column_name(n_bytes: int, columns: list[str]) -> str

Generates a unique column name that is not present in the given list of columns.

It relies on Python's secrets.token_hex function to generate a random hexadecimal string from n_bytes random bytes.

Parameters:

Name Type Description Default
n_bytes int

The number of bytes to generate for the token.

required
columns list[str]

The list of columns to check for uniqueness.

required

Returns:

Type Description
str

A unique token that is not present in the given list of columns.

Raises:

Type Description
AssertionError

If a unique token cannot be generated after 100 attempts.

Examples:

>>> import narwhals as nw
>>> columns = ["abc", "xyz"]
>>> nw.generate_temporary_column_name(n_bytes=8, columns=columns) not in columns
True
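The retry behaviour described above can be sketched in plain Python. This is a simplified illustration of the idea, not Narwhals' actual implementation:

```python
from secrets import token_hex

def temporary_column_name(n_bytes: int, columns: list[str]) -> str:
    # Try up to 100 random tokens; token_hex(n) returns a string of
    # 2*n hexadecimal characters, so collisions are vanishingly rare.
    for _ in range(100):
        token = token_hex(n_bytes)
        if token not in columns:
            return token
    raise AssertionError("Could not generate a unique column name")

columns = ["abc", "xyz"]
name = temporary_column_name(8, columns)
print(name not in columns)  # True
```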

get_level(obj: DataFrame[Any] | LazyFrame[Any] | Series[IntoSeriesT]) -> Literal['full', 'lazy', 'interchange']

Level of support Narwhals has for the current object.

Parameters:

Name Type Description Default
obj DataFrame[Any] | LazyFrame[Any] | Series[IntoSeriesT]

Dataframe or Series.

required

Returns:

Type Description
Literal['full', 'lazy', 'interchange']

This can be one of:

  • 'full': full Narwhals API support
  • 'lazy': only lazy operations are supported. This excludes anything which involves iterating over rows in Python.
  • 'interchange': only metadata operations are supported (df.schema)

get_native_namespace(obj: DataFrame[Any] | LazyFrame[Any] | Series[Any] | pd.DataFrame | pd.Series | pl.DataFrame | pl.LazyFrame | pl.Series | pa.Table | pa.ChunkedArray) -> Any

Get native namespace from object.

Parameters:

Name Type Description Default
obj DataFrame[Any] | LazyFrame[Any] | Series[Any] | DataFrame | Series | DataFrame | LazyFrame | Series | Table | ChunkedArray

Dataframe, Lazyframe, or Series.

required

Returns:

Type Description
Any

Native module.

Examples:

>>> import polars as pl
>>> import pandas as pd
>>> import narwhals as nw
>>> df = nw.from_native(pd.DataFrame({"a": [1, 2, 3]}))
>>> nw.get_native_namespace(df)
<module 'pandas'...>
>>> df = nw.from_native(pl.DataFrame({"a": [1, 2, 3]}))
>>> nw.get_native_namespace(df)
<module 'polars'...>

is_ordered_categorical(series: Series[Any]) -> bool

Return whether indices of categories are semantically meaningful.

This is a convenience function for accessing what would otherwise be the is_ordered property from the DataFrame Interchange Protocol, see https://data-apis.org/dataframe-protocol/latest/API.html.

  • For Polars:
      • Enums are always ordered.
      • Categoricals are ordered if dtype.ordering == "physical".
  • For pandas-like APIs:
      • Categoricals are ordered if dtype.cat.ordered == True.
  • For PyArrow tables:
      • Categoricals are ordered if dtype.type.ordered == True.

Parameters:

Name Type Description Default
series Series[Any]

Input Series.

required

Returns:

Type Description
bool

Whether the Series is an ordered categorical.

Examples:

>>> import narwhals as nw
>>> import pandas as pd
>>> import polars as pl
>>> data = ["x", "y"]
>>> s_pd = pd.Series(data, dtype=pd.CategoricalDtype(ordered=True))
>>> s_pl = pl.Series(data, dtype=pl.Categorical(ordering="physical"))

Let's define a library-agnostic function:

>>> @nw.narwhalify
... def func(s):
...     return nw.is_ordered_categorical(s)

Then, we can pass any supported library to func:

>>> func(s_pd)
True
>>> func(s_pl)
True

len() -> Expr

Return the number of rows.

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import pandas as pd
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2], "b": [5, 10]}
>>> df_pd = pd.DataFrame(data)
>>> df_pl = pl.DataFrame(data)
>>> df_pa = pa.table(data)

Let's define a dataframe-agnostic function:

>>> def agnostic_len(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.len()).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_len:

>>> agnostic_len(df_pd)
   len
0    2
>>> agnostic_len(df_pl)
shape: (1, 1)
┌─────┐
│ len │
│ --- │
│ u32 │
╞═════╡
│ 2   │
└─────┘
>>> agnostic_len(df_pa)
pyarrow.Table
len: int64
----
len: [[2]]

lit(value: Any, dtype: DType | type[DType] | None = None) -> Expr

Return an expression representing a literal value.

Parameters:

Name Type Description Default
value Any

The value to use as literal.

required
dtype DType | type[DType] | None

The data type of the literal value. If not provided, the data type will be inferred.

None

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2]}
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_lit(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.with_columns(nw.lit(3)).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_lit:

>>> agnostic_lit(df_pd)
   a  literal
0  1        3
1  2        3
>>> agnostic_lit(df_pl)
shape: (2, 2)
┌─────┬─────────┐
│ a   ┆ literal │
│ --- ┆ ---     │
│ i64 ┆ i32     │
╞═════╪═════════╡
│ 1   ┆ 3       │
│ 2   ┆ 3       │
└─────┴─────────┘
>>> agnostic_lit(df_pa)
pyarrow.Table
a: int64
literal: int64
----
a: [[1,2]]
literal: [[3,3]]

max(*columns: str) -> Expr

Return the maximum value.

Note

Syntactic sugar for nw.col(columns).max().

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import pandas as pd
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2], "b": [5, 10]}
>>> df_pd = pd.DataFrame(data)
>>> df_pl = pl.DataFrame(data)
>>> df_pa = pa.table(data)

Let's define a dataframe-agnostic function:

>>> def agnostic_max(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.max("a")).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_max:

>>> agnostic_max(df_pd)
   a
0  2
>>> agnostic_max(df_pl)
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
└─────┘
>>> agnostic_max(df_pa)
pyarrow.Table
a: int64
----
a: [[2]]

max_horizontal(*exprs: IntoExpr | Iterable[IntoExpr]) -> Expr

Get the maximum value horizontally across columns.

Notes

We support max_horizontal over numeric columns only.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {
...     "a": [1, 8, 3],
...     "b": [4, 5, None],
...     "c": ["x", "y", "z"],
... }

We define a dataframe-agnostic function that computes the horizontal max of "a" and "b" columns:

>>> def agnostic_max_horizontal(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.max_horizontal("a", "b")).to_native()

We can pass any supported library such as pandas, Polars, or PyArrow to agnostic_max_horizontal:

>>> agnostic_max_horizontal(pd.DataFrame(data))
     a
0  4.0
1  8.0
2  3.0
>>> agnostic_max_horizontal(pl.DataFrame(data))
shape: (3, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 4   │
│ 8   │
│ 3   │
└─────┘
>>> agnostic_max_horizontal(pa.table(data))
pyarrow.Table
a: int64
----
a: [[4,8,3]]

maybe_align_index(lhs: FrameOrSeriesT, rhs: Series[Any] | DataFrame[Any] | LazyFrame[Any]) -> FrameOrSeriesT

Align lhs to the Index of rhs, if they're both pandas-like.

Parameters:

Name Type Description Default
lhs FrameOrSeriesT

Dataframe or Series.

required
rhs Series[Any] | DataFrame[Any] | LazyFrame[Any]

Dataframe or Series to align with.

required

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this only checks that lhs and rhs are the same length.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2]}, index=[3, 4])
>>> s_pd = pd.Series([6, 7], index=[4, 3])
>>> df = nw.from_native(df_pd)
>>> s = nw.from_native(s_pd, series_only=True)
>>> nw.to_native(nw.maybe_align_index(df, s))
   a
4  2
3  1

maybe_convert_dtypes(obj: FrameOrSeriesT, *args: bool, **kwargs: bool | str) -> FrameOrSeriesT

Convert columns or series to the best possible dtypes using dtypes supporting pd.NA, if obj is pandas-like.

Parameters:

Name Type Description Default
obj FrameOrSeriesT

DataFrame or Series.

required
*args bool

Additional arguments which get passed through.

()
**kwargs bool | str

Additional arguments which get passed through.

{}

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Notes

For non-pandas-like inputs, this is a no-op. Also, args and kwargs just get passed down to the underlying library as-is.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> import numpy as np
>>> df_pd = pd.DataFrame(
...     {
...         "a": pd.Series([1, 2, 3], dtype=np.dtype("int32")),
...         "b": pd.Series([True, False, np.nan], dtype=np.dtype("O")),
...     }
... )
>>> df = nw.from_native(df_pd)
>>> nw.to_native(
...     nw.maybe_convert_dtypes(df)
... ).dtypes
a             Int32
b           boolean
dtype: object

maybe_get_index(obj: DataFrame[Any] | LazyFrame[Any] | Series[Any]) -> Any | None

Get the index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name Type Description Default
obj DataFrame[Any] | LazyFrame[Any] | Series[Any]

Dataframe or Series.

required

Returns:

Type Description
Any | None

Same type as input.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this returns None.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]})
>>> df = nw.from_native(df_pd)
>>> nw.maybe_get_index(df)
RangeIndex(start=0, stop=2, step=1)
>>> series_pd = pd.Series([1, 2])
>>> series = nw.from_native(series_pd, series_only=True)
>>> nw.maybe_get_index(series)
RangeIndex(start=0, stop=2, step=1)

maybe_reset_index(obj: FrameOrSeriesT) -> FrameOrSeriesT

Reset the index to the default integer index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name Type Description Default
obj FrameOrSeriesT

Dataframe or Series.

required

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already resets the index for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this is a no-op.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]}, index=([6, 7]))
>>> df = nw.from_native(df_pd)
>>> nw.to_native(nw.maybe_reset_index(df))
   a  b
0  1  4
1  2  5
>>> series_pd = pd.Series([1, 2], index=[6, 7])
>>> series = nw.from_native(series_pd, series_only=True)
>>> nw.to_native(nw.maybe_reset_index(series))
0    1
1    2
dtype: int64

maybe_set_index(obj: FrameOrSeriesT, column_names: str | list[str] | None = None, *, index: Series[IntoSeriesT] | list[Series[IntoSeriesT]] | None = None) -> FrameOrSeriesT

Set the index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name Type Description Default
obj FrameOrSeriesT

Object for which to maybe set the index (can be either a Narwhals DataFrame or Series).

required
column_names str | list[str] | None

Name or list of names of the columns to set as index. For dataframes, only one of column_names and index can be specified, but not both. If column_names is passed and obj is a Series, then a ValueError is raised.

None
index Series[IntoSeriesT] | list[Series[IntoSeriesT]] | None

series or list of series to set as index.

None

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Raises:

Type Description
ValueError

If one of the following conditions occurs:

  • neither column_names nor index is provided
  • both column_names and index are provided
  • column_names is provided and obj is a Series
Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index.

For non-pandas-like inputs, this is a no-op.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]})
>>> df = nw.from_native(df_pd)
>>> nw.to_native(nw.maybe_set_index(df, "b"))
   a
b
4  1
5  2

mean(*columns: str) -> Expr

Get the mean value.

Note

Syntactic sugar for nw.col(columns).mean()

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 8, 3]}
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_mean(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.mean("a")).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_mean:

>>> agnostic_mean(df_pd)
     a
0  4.0
>>> agnostic_mean(df_pl)
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ f64 │
╞═════╡
│ 4.0 │
└─────┘
>>> agnostic_mean(df_pa)
pyarrow.Table
a: double
----
a: [[4]]

mean_horizontal(*exprs: IntoExpr | Iterable[IntoExpr]) -> Expr

Compute the mean of all values horizontally across columns.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {
...     "a": [1, 8, 3],
...     "b": [4, 5, None],
...     "c": ["x", "y", "z"],
... }
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function that computes the horizontal mean of "a" and "b" columns:

>>> def agnostic_mean_horizontal(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.mean_horizontal("a", "b")).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_mean_horizontal:

>>> agnostic_mean_horizontal(df_pd)
     a
0  2.5
1  6.5
2  3.0
>>> agnostic_mean_horizontal(df_pl)
shape: (3, 1)
┌─────┐
│ a   │
│ --- │
│ f64 │
╞═════╡
│ 2.5 │
│ 6.5 │
│ 3.0 │
└─────┘
>>> agnostic_mean_horizontal(df_pa)
pyarrow.Table
a: double
----
a: [[2.5,6.5,3]]

median(*columns: str) -> Expr

Get the median value.

Notes
  • Syntactic sugar for nw.col(columns).median()
  • Results might slightly differ across backends due to differences in the underlying algorithms used to compute the median.

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [4, 5, 2]}
>>> df_pd = pd.DataFrame(data)
>>> df_pl = pl.DataFrame(data)
>>> df_pa = pa.table(data)

Let's define a dataframe-agnostic function:

>>> def agnostic_median(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.median("a")).to_native()

We can then pass any supported library such as pandas, Polars, or PyArrow to agnostic_median:

>>> agnostic_median(df_pd)
     a
0  4.0
>>> agnostic_median(df_pl)
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ f64 │
╞═════╡
│ 4.0 │
└─────┘
>>> agnostic_median(df_pa)
pyarrow.Table
a: double
----
a: [[4]]

min(*columns: str) -> Expr

Return the minimum value.

Note

Syntactic sugar for nw.col(columns).min().

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import pandas as pd
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2], "b": [5, 10]}
>>> df_pd = pd.DataFrame(data)
>>> df_pl = pl.DataFrame(data)
>>> df_pa = pa.table(data)

Let's define a dataframe-agnostic function:

>>> def agnostic_min(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.min("b")).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_min:

>>> agnostic_min(df_pd)
   b
0  5
>>> agnostic_min(df_pl)
shape: (1, 1)
┌─────┐
│ b   │
│ --- │
│ i64 │
╞═════╡
│ 5   │
└─────┘
>>> agnostic_min(df_pa)
pyarrow.Table
b: int64
----
b: [[5]]

min_horizontal(*exprs: IntoExpr | Iterable[IntoExpr]) -> Expr

Get the minimum value horizontally across columns.

Notes

We support min_horizontal over numeric columns only.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {
...     "a": [1, 8, 3],
...     "b": [4, 5, None],
...     "c": ["x", "y", "z"],
... }

We define a dataframe-agnostic function that computes the horizontal min of "a" and "b" columns:

>>> def agnostic_min_horizontal(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.min_horizontal("a", "b")).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_min_horizontal:

>>> agnostic_min_horizontal(pd.DataFrame(data))
     a
0  1.0
1  5.0
2  3.0
>>> agnostic_min_horizontal(pl.DataFrame(data))
shape: (3, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 1   │
│ 5   │
│ 3   │
└─────┘
>>> agnostic_min_horizontal(pa.table(data))
pyarrow.Table
a: int64
----
a: [[1,5,3]]

narwhalify(func: Callable[..., Any] | None = None, *, strict: bool | None = None, pass_through: bool | None = None, eager_only: bool = False, series_only: bool = False, allow_series: bool | None = True) -> Callable[..., Any]

Decorate function so it becomes dataframe-agnostic.

This will try to convert any dataframe/series-like object into the respective Narwhals DataFrame/Series, while leaving the other parameters as they are. Similarly, if the output of the function is a Narwhals DataFrame or Series, it will be converted back to the original dataframe/series type, while any other output type is left as is. If you set pass_through=False, every input and every output is required to be a dataframe/series-like object.

Parameters:

Name Type Description Default
func Callable[..., Any] | None

Function to wrap in a from_native-to_native block.

None
strict bool | None

Deprecated (v1.13.0): Please use pass_through instead. Note that strict is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

Determine what happens if the object can't be converted to Narwhals:

  • True or None (default): raise an error
  • False: pass object through as-is
None
pass_through bool | None

Determine what happens if the object can't be converted to Narwhals:

  • False or None (default): raise an error
  • True: pass object through as-is
None
eager_only bool

Whether to only allow eager objects:

  • False (default): don't require native_object to be eager
  • True: only convert to Narwhals if native_object is eager
False
series_only bool

Whether to only allow Series:

  • False (default): don't require native_object to be a Series
  • True: only convert to Narwhals if native_object is a Series
False
allow_series bool | None

Whether to allow Series (default is only Dataframe / Lazyframe):

  • False or None: don't convert to Narwhals if native_object is a Series
  • True (default): allow native_object to be a Series
True

Returns:

Type Description
Callable[..., Any]

Decorated function.

Examples:

Instead of writing

>>> import narwhals as nw
>>> def agnostic_group_by_sum(df):
...     df = nw.from_native(df, pass_through=True)
...     df = df.group_by("a").agg(nw.col("b").sum())
...     return nw.to_native(df)

you can just write

>>> @nw.narwhalify
... def agnostic_group_by_sum(df):
...     return df.group_by("a").agg(nw.col("b").sum())

new_series(name: str, values: Any, dtype: DType | type[DType] | None = None, *, native_namespace: ModuleType) -> Series[Any]

Instantiate a Narwhals Series from an iterable (e.g. list or array).

Parameters:

Name Type Description Default
name str

Name of resulting Series.

required
values Any

Values to make the Series from.

required
dtype DType | type[DType] | None

(Narwhals) dtype. If not provided, the native library may auto-infer it from values.

None
native_namespace ModuleType

The native library to use for DataFrame creation.

required

Returns:

Type Description
Series[Any]

A new Series.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT, IntoSeriesT
>>> data = {"a": [1, 2, 3], "b": [4, 5, 6]}

Let's define a dataframe-agnostic function:

>>> def agnostic_new_series(df_native: IntoFrameT) -> IntoSeriesT:
...     values = [4, 1, 2, 3]
...     native_namespace = nw.get_native_namespace(df_native)
...     return nw.new_series(
...         name="a",
...         values=values,
...         dtype=nw.Int32,
...         native_namespace=native_namespace,
...     ).to_native()

We can then pass any supported eager library, such as pandas / Polars / PyArrow:

>>> agnostic_new_series(pd.DataFrame(data))
0    4
1    1
2    2
3    3
Name: a, dtype: int32
>>> agnostic_new_series(pl.DataFrame(data))
shape: (4,)
Series: 'a' [i32]
[
   4
   1
   2
   3
]
>>> agnostic_new_series(pa.table(data))
<pyarrow.lib.ChunkedArray object at ...>
[
  [
    4,
    1,
    2,
    3
  ]
]

nth(*indices: int | Sequence[int]) -> Expr

Creates an expression that references one or more columns by their index(es).

Notes

nth is not supported for Polars versions below 1.0.0. Please use narwhals.col instead.

Parameters:

Name Type Description Default
indices int | Sequence[int]

One or more indices representing the columns to retrieve.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2], "b": [3, 4]}
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_nth(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.nth(0) * 2).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_nth:

>>> agnostic_nth(df_pd)
   a
0  2
1  4
>>> agnostic_nth(df_pl)
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
│ 4   │
└─────┘
>>> agnostic_nth(df_pa)
pyarrow.Table
a: int64
----
a: [[2,4]]

read_csv(source: str, *, native_namespace: ModuleType, **kwargs: Any) -> DataFrame[Any]

Read a CSV file into a DataFrame.

Parameters:

Name Type Description Default
source str

Path to a file.

required
native_namespace ModuleType

The native library to use for DataFrame creation.

required
kwargs Any

Extra keyword arguments which are passed to the native CSV reader. For example, you could use nw.read_csv('file.csv', native_namespace=pd, engine='pyarrow').

{}

Returns:

Type Description
DataFrame[Any]

DataFrame.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoDataFrame
>>> from types import ModuleType

Let's create an agnostic function that reads a csv file with a specified native namespace:

>>> def agnostic_read_csv(native_namespace: ModuleType) -> IntoDataFrame:
...     return nw.read_csv(
...         "file.csv", native_namespace=native_namespace
...     ).to_native()

Then we can read the file by passing pandas, Polars or PyArrow namespaces:

>>> agnostic_read_csv(native_namespace=pd)
   a  b
0  1  4
1  2  5
2  3  6
>>> agnostic_read_csv(native_namespace=pl)
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 4   │
│ 2   ┆ 5   │
│ 3   ┆ 6   │
└─────┴─────┘
>>> agnostic_read_csv(native_namespace=pa)
pyarrow.Table
a: int64
b: int64
----
a: [[1,2,3]]
b: [[4,5,6]]

read_parquet(source: str, *, native_namespace: ModuleType, **kwargs: Any) -> DataFrame[Any]

Read into a DataFrame from a parquet file.

Parameters:

Name Type Description Default
source str

Path to a file.

required
native_namespace ModuleType

The native library to use for DataFrame creation.

required
kwargs Any

Extra keyword arguments which are passed to the native parquet reader. For example, you could use nw.read_parquet('file.parquet', native_namespace=pd, engine='pyarrow').

{}

Returns:

Type Description
DataFrame[Any]

DataFrame.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoDataFrame
>>> from types import ModuleType

Let's create an agnostic function that reads a parquet file with a specified native namespace:

>>> def agnostic_read_parquet(native_namespace: ModuleType) -> IntoDataFrame:
...     return nw.read_parquet(
...         "file.parquet", native_namespace=native_namespace
...     ).to_native()

Then we can read the file by passing pandas, Polars or PyArrow namespaces:

>>> agnostic_read_parquet(native_namespace=pd)
   a  b
0  1  4
1  2  5
2  3  6
>>> agnostic_read_parquet(native_namespace=pl)
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 4   │
│ 2   ┆ 5   │
│ 3   ┆ 6   │
└─────┴─────┘
>>> agnostic_read_parquet(native_namespace=pa)
pyarrow.Table
a: int64
b: int64
----
a: [[1,2,3]]
b: [[4,5,6]]

scan_csv(source: str, *, native_namespace: ModuleType, **kwargs: Any) -> LazyFrame[Any]

Lazily read from a CSV file.

For the libraries that do not support lazy dataframes, the function reads a csv file eagerly and then converts the resulting dataframe to a lazyframe.

Parameters:

Name Type Description Default
source str

Path to a file.

required
native_namespace ModuleType

The native library to use for DataFrame creation.

required
kwargs Any

Extra keyword arguments which are passed to the native CSV reader. For example, you could use nw.scan_csv('file.csv', native_namespace=pd, engine='pyarrow').

{}

Returns:

Type Description
LazyFrame[Any]

LazyFrame.

Examples:

>>> import dask.dataframe as dd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrame
>>> from types import ModuleType

Let's create an agnostic function that lazily reads a csv file with a specified native namespace:

>>> def agnostic_scan_csv(native_namespace: ModuleType) -> IntoFrame:
...     return nw.scan_csv(
...         "file.csv", native_namespace=native_namespace
...     ).to_native()

Then we can read the file by passing, for example, Polars or Dask namespaces:

>>> agnostic_scan_csv(native_namespace=pl).collect()
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 4   │
│ 2   ┆ 5   │
│ 3   ┆ 6   │
└─────┴─────┘
>>> agnostic_scan_csv(native_namespace=dd).compute()
   a  b
0  1  4
1  2  5
2  3  6

scan_parquet(source: str, *, native_namespace: ModuleType, **kwargs: Any) -> LazyFrame[Any]

Lazily read from a parquet file.

For the libraries that do not support lazy dataframes, the function reads a parquet file eagerly and then converts the resulting dataframe to a lazyframe.

Parameters:

Name Type Description Default
source str

Path to a file.

required
native_namespace ModuleType

The native library to use for DataFrame creation.

required
kwargs Any

Extra keyword arguments which are passed to the native parquet reader. For example, you could use nw.scan_parquet('file.parquet', native_namespace=pd, engine='pyarrow').

{}

Returns:

Type Description
LazyFrame[Any]

LazyFrame.

Examples:

>>> import dask.dataframe as dd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrame
>>> from types import ModuleType

Let's create an agnostic function that lazily reads a parquet file with a specified native namespace:

>>> def agnostic_scan_parquet(native_namespace: ModuleType) -> IntoFrame:
...     return nw.scan_parquet(
...         "file.parquet", native_namespace=native_namespace
...     ).to_native()

Then we can read the file by passing, for example, Polars or Dask namespaces:

>>> agnostic_scan_parquet(native_namespace=pl).collect()
shape: (3, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 4   │
│ 2   ┆ 5   │
│ 3   ┆ 6   │
└─────┴─────┘
>>> agnostic_scan_parquet(native_namespace=dd).compute()
   a  b
0  1  4
1  2  5
2  3  6

sum(*columns: str) -> Expr

Sum all values.

Note

Syntactic sugar for nw.col(columns).sum()

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2]}
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_sum(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.sum("a")).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_sum:

>>> agnostic_sum(df_pd)
   a
0  3
>>> agnostic_sum(df_pl)
shape: (1, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 3   │
└─────┘
>>> agnostic_sum(df_pa)
pyarrow.Table
a: int64
----
a: [[3]]

sum_horizontal(*exprs: IntoExpr | Iterable[IntoExpr]) -> Expr

Sum all values horizontally across columns.

Warning

Unlike Polars, we support horizontal sum over numeric columns only.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2, 3], "b": [5, 10, None]}
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_sum_horizontal(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.select(nw.sum_horizontal("a", "b")).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_sum_horizontal:

>>> agnostic_sum_horizontal(df_pd)
      a
0   6.0
1  12.0
2   3.0
>>> agnostic_sum_horizontal(df_pl)
shape: (3, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 6   │
│ 12  │
│ 3   │
└─────┘
>>> agnostic_sum_horizontal(df_pa)
pyarrow.Table
a: int64
----
a: [[6,12,3]]

show_versions() -> None

Print useful debugging information.

Examples:

>>> from narwhals import show_versions
>>> show_versions()

to_native(narwhals_object: DataFrame[IntoDataFrameT] | LazyFrame[IntoFrameT] | Series[IntoSeriesT], *, strict: bool | None = None, pass_through: bool | None = None) -> IntoDataFrameT | IntoFrameT | IntoSeriesT | Any

Convert Narwhals object to native one.

Parameters:

Name Type Description Default
narwhals_object DataFrame[IntoDataFrameT] | LazyFrame[IntoFrameT] | Series[IntoSeriesT]

Narwhals object.

required
strict bool | None

Determine what happens if narwhals_object isn't a Narwhals class:

  • True (default): raise an error
  • False: pass object through as-is

Deprecated (v1.13.0): Please use pass_through instead. Note that strict is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
pass_through bool | None

Determine what happens if narwhals_object isn't a Narwhals class:

  • False (default): raise an error
  • True: pass object through as-is
None

Returns:

Type Description
IntoDataFrameT | IntoFrameT | IntoSeriesT | Any

Object of the class that the user started with.

to_py_scalar(scalar_like: Any) -> Any

If a scalar is not Python-native, convert it to a Python-native scalar.

Parameters:

Name Type Description Default
scalar_like Any

Scalar-like value.

required

Returns:

Type Description
Any

Python scalar.

Raises:

Type Description
ValueError

If the object is not convertible to a scalar.

Examples:

>>> import narwhals as nw
>>> import pandas as pd
>>> df = nw.from_native(pd.DataFrame({"a": [1, 2, 3]}))
>>> nw.to_py_scalar(df["a"].item(0))
1
>>> import pyarrow as pa
>>> df = nw.from_native(pa.table({"a": [1, 2, 3]}))
>>> nw.to_py_scalar(df["a"].item(0))
1
>>> nw.to_py_scalar(1)
1

when(*predicates: IntoExpr | Iterable[IntoExpr]) -> When

Start a when-then-otherwise expression.

Expression similar to an if-else statement in Python. Always initiated by nw.when(<condition>).then(<value if condition>), and optionally followed by chaining one or more .when(<condition>).then(<value>) statements. Chained when-then operations should be read as Python if, elif, ... elif blocks, not as if, if, ... if; i.e. the first condition that evaluates to True will be picked. If none of the conditions are True, an optional .otherwise(<value if all statements are false>) can be appended at the end. If not appended, and none of the conditions are True, None will be returned.

Parameters:

Name Type Description Default
predicates IntoExpr | Iterable[IntoExpr]

Condition(s) that must be met in order to apply the subsequent statement. Accepts one or more boolean expressions, which are implicitly combined with &. String input is parsed as a column name.

()

Returns:

Type Description
When

A "when" object, on which .then can be called.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw
>>> from narwhals.typing import IntoFrameT
>>>
>>> data = {"a": [1, 2, 3], "b": [5, 10, 15]}
>>> df_pl = pl.DataFrame(data)
>>> df_pd = pd.DataFrame(data)
>>> df_pa = pa.table(data)

We define a dataframe-agnostic function:

>>> def agnostic_when_then_otherwise(df_native: IntoFrameT) -> IntoFrameT:
...     df = nw.from_native(df_native)
...     return df.with_columns(
...         nw.when(nw.col("a") < 3).then(5).otherwise(6).alias("a_when")
...     ).to_native()

We can pass any supported library such as Pandas, Polars, or PyArrow to agnostic_when_then_otherwise:

>>> agnostic_when_then_otherwise(df_pd)
   a   b  a_when
0  1   5       5
1  2  10       5
2  3  15       6
>>> agnostic_when_then_otherwise(df_pl)
shape: (3, 3)
┌─────┬─────┬────────┐
│ a   ┆ b   ┆ a_when │
│ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ i32    │
╞═════╪═════╪════════╡
│ 1   ┆ 5   ┆ 5      │
│ 2   ┆ 10  ┆ 5      │
│ 3   ┆ 15  ┆ 6      │
└─────┴─────┴────────┘
>>> agnostic_when_then_otherwise(df_pa)
pyarrow.Table
a: int64
b: int64
a_when: int64
----
a: [[1,2,3]]
b: [[5,10,15]]
a_when: [[5,5,6]]