Top-level functions

Here are the top-level functions available in Narwhals.

all

all() -> Expr

Instantiate an expression representing all columns.

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [3.14, 0.123]})
>>> nw.from_native(df_native).select(nw.all() * 2)
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|      a      b    |
|   0  2  6.280    |
|   1  4  0.246    |
└──────────────────┘

all_horizontal

all_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Compute the bitwise AND horizontally across columns.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> data = {
...     "a": [False, False, True, True, False, None],
...     "b": [False, True, True, None, None, None],
... }
>>> df_native = pa.table(data)
>>> nw.from_native(df_native).select("a", "b", all=nw.all_horizontal("a", "b"))
┌─────────────────────────────────────────┐
|           Narwhals DataFrame            |
|-----------------------------------------|
|pyarrow.Table                            |
|a: bool                                  |
|b: bool                                  |
|all: bool                                |
|----                                     |
|a: [[false,false,true,true,false,null]]  |
|b: [[false,true,true,null,null,null]]    |
|all: [[false,false,true,null,false,null]]|
└─────────────────────────────────────────┘
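
The null handling in the output above is consistent with Kleene (three-valued) logic: a single False decides the row, otherwise any null makes the result null. As a minimal pure-Python sketch of that rule (the kleene_and helper is hypothetical and is not how Narwhals or its backends implement the operation):

```python
from typing import Optional


def kleene_and(*values: Optional[bool]) -> Optional[bool]:
    """Three-valued AND: False wins, then null (None), then True."""
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None
    return True


a = [False, False, True, True, False, None]
b = [False, True, True, None, None, None]

# Apply the rule row by row, mirroring the "all" column shown above.
result = [kleene_and(x, y) for x, y in zip(a, b)]
print(result)  # [False, False, True, None, False, None]
```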

any_horizontal

any_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Compute the bitwise OR horizontally across columns.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> data = {
...     "a": [False, False, True, True, False, None],
...     "b": [False, True, True, None, None, None],
... }
>>> df_native = pl.DataFrame(data)
>>> nw.from_native(df_native).select("a", "b", any=nw.any_horizontal("a", "b"))
┌─────────────────────────┐
|   Narwhals DataFrame    |
|-------------------------|
|shape: (6, 3)            |
|┌───────┬───────┬───────┐|
|│ a     ┆ b     ┆ any   │|
|│ ---   ┆ ---   ┆ ---   │|
|│ bool  ┆ bool  ┆ bool  │|
|╞═══════╪═══════╪═══════╡|
|│ false ┆ false ┆ false │|
|│ false ┆ true  ┆ true  │|
|│ true  ┆ true  ┆ true  │|
|│ true  ┆ null  ┆ true  │|
|│ false ┆ null  ┆ null  │|
|│ null  ┆ null  ┆ null  │|
|└───────┴───────┴───────┘|
└─────────────────────────┘
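
The "any" column above can be reproduced with the three-valued OR rule: a single True decides the row, otherwise any null makes the result null. This is a minimal pure-Python sketch under that assumption; kleene_or is a hypothetical helper, not a Narwhals function:

```python
from typing import Optional


def kleene_or(*values: Optional[bool]) -> Optional[bool]:
    """Three-valued OR: True wins, then null (None), then False."""
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False


a = [False, False, True, True, False, None]
b = [False, True, True, None, None, None]

# Apply the rule row by row, mirroring the "any" column shown above.
result = [kleene_or(x, y) for x, y in zip(a, b)]
print(result)  # [False, True, True, True, None, None]
```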

col

col(*names: str | Iterable[str]) -> Expr

Creates an expression that references one or more columns by their name(s).

Parameters:

Name Type Description Default
names str | Iterable[str]

Name(s) of the columns to use.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2], "b": [3, 4], "c": ["x", "z"]})
>>> nw.from_native(df_native).select(nw.col("a", "b") * nw.col("b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (2, 2)   |
|  ┌─────┬─────┐   |
|  │ a   ┆ b   │   |
|  │ --- ┆ --- │   |
|  │ i64 ┆ i64 │   |
|  ╞═════╪═════╡   |
|  │ 3   ┆ 9   │   |
|  │ 8   ┆ 16  │   |
|  └─────┴─────┘   |
└──────────────────┘

concat

concat(
    items: Iterable[FrameT],
    *,
    how: ConcatMethod = "vertical"
) -> FrameT

Concatenate multiple DataFrames or LazyFrames into a single entity.

Parameters:

Name Type Description Default
items Iterable[FrameT]

DataFrames or LazyFrames to concatenate.

required
how ConcatMethod

Concatenation strategy:

  • vertical: Concatenate vertically. Column names must match.
  • horizontal: Concatenate horizontally. If lengths don't match, then missing rows are filled with null values. This is only supported when all inputs are (eager) DataFrames.
  • diagonal: Finds a union between the column schemas and fills missing column values with null.
'vertical'

Returns:

Type Description
FrameT

A new DataFrame or LazyFrame resulting from the concatenation.

Raises:

Type Description
TypeError

The items to concatenate should either all be eager or all be lazy; mixing the two raises this error.

Examples:

First, import the libraries used in the concatenation examples:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw

Let's look at a case of vertical concatenation (pandas-backed):

>>> df_pd_1 = nw.from_native(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
>>> df_pd_2 = nw.from_native(pd.DataFrame({"a": [5, 2], "b": [1, 4]}))
>>> nw.concat([df_pd_1, df_pd_2], how="vertical")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a  b      |
|     0  1  4      |
|     1  2  5      |
|     2  3  6      |
|     0  5  1      |
|     1  2  4      |
└──────────────────┘

Let's look at a case of horizontal concatenation (Polars-backed):

>>> df_pl_1 = nw.from_native(pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
>>> df_pl_2 = nw.from_native(pl.DataFrame({"c": [5, 2], "d": [1, 4]}))
>>> nw.concat([df_pl_1, df_pl_2], how="horizontal")
┌───────────────────────────┐
|    Narwhals DataFrame     |
|---------------------------|
|shape: (3, 4)              |
|┌─────┬─────┬──────┬──────┐|
|│ a   ┆ b   ┆ c    ┆ d    │|
|│ --- ┆ --- ┆ ---  ┆ ---  │|
|│ i64 ┆ i64 ┆ i64  ┆ i64  │|
|╞═════╪═════╪══════╪══════╡|
|│ 1   ┆ 4   ┆ 5    ┆ 1    │|
|│ 2   ┆ 5   ┆ 2    ┆ 4    │|
|│ 3   ┆ 6   ┆ null ┆ null │|
|└─────┴─────┴──────┴──────┘|
└───────────────────────────┘

Let's look at a case of diagonal concatenation (PyArrow-backed):

>>> df_pa_1 = nw.from_native(pa.table({"a": [1, 2], "b": [3.5, 4.5]}))
>>> df_pa_2 = nw.from_native(pa.table({"a": [3, 4], "z": ["x", "y"]}))
>>> nw.concat([df_pa_1, df_pa_2], how="diagonal")
┌──────────────────────────┐
|    Narwhals DataFrame    |
|--------------------------|
|pyarrow.Table             |
|a: int64                  |
|b: double                 |
|z: string                 |
|----                      |
|a: [[1,2],[3,4]]          |
|b: [[3.5,4.5],[null,null]]|
|z: [[null,null],["x","y"]]|
└──────────────────────────┘
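
To make the diagonal strategy concrete, here is a minimal pure-Python sketch that models each frame as a dict of column lists. The Table alias and the concat_diagonal helper are hypothetical illustrations, not part of Narwhals, and the sketch ignores dtypes entirely:

```python
from typing import Any

# Stand-in for a dataframe: column name -> list of values (assumption).
Table = dict[str, list[Any]]


def concat_diagonal(tables: list[Table]) -> Table:
    # Build the union of column names, in order of first appearance.
    columns: list[str] = []
    for table in tables:
        for name in table:
            if name not in columns:
                columns.append(name)

    result: Table = {name: [] for name in columns}
    for table in tables:
        n_rows = len(next(iter(table.values()), []))
        for name in columns:
            # Columns missing from this table are filled with nulls.
            result[name].extend(table.get(name, [None] * n_rows))
    return result


first = {"a": [1, 2], "b": [3.5, 4.5]}
second = {"a": [3, 4], "z": ["x", "y"]}
combined = concat_diagonal([first, second])
print(combined)
```

The resulting columns match the pyarrow-backed output above: "b" is null for the rows coming from the second table, and "z" is null for the rows coming from the first.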

concat_str

concat_str(
    exprs: IntoExpr | Iterable[IntoExpr],
    *more_exprs: IntoExpr,
    separator: str = "",
    ignore_nulls: bool = False
) -> Expr

Horizontally concatenate columns into a single string column.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Columns to concatenate into a single string column. Accepts expression input. Strings are parsed as column names, other non-expression inputs are parsed as literals. Non-String columns are cast to String.

required
*more_exprs IntoExpr

Additional columns to concatenate into a single string column, specified as positional arguments.

()
separator str

String that will be used to separate the values of each column.

''
ignore_nulls bool

Ignore null values (default is False). If set to False, null values are propagated: if a row contains any null value, the output for that row is null.

False

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> data = {
...     "a": [1, 2, 3],
...     "b": ["dogs", "cats", None],
...     "c": ["play", "swim", "walk"],
... }
>>> df_native = pd.DataFrame(data)
>>> (
...     nw.from_native(df_native).select(
...         nw.concat_str(
...             [nw.col("a") * 2, nw.col("b"), nw.col("c")], separator=" "
...         ).alias("full_sentence")
...     )
... )
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|   full_sentence  |
| 0   2 dogs play  |
| 1   4 cats swim  |
| 2          None  |
└──────────────────┘

exclude

exclude(*names: str | Iterable[str]) -> Expr

Creates an expression that excludes columns by their name(s).

Parameters:

Name Type Description Default
names str | Iterable[str]

Name(s) of the columns to exclude.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2], "b": [3, 4], "c": ["x", "z"]})
>>> nw.from_native(df_native).select(nw.exclude("c", "a"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (2, 1)   |
|  ┌─────┐         |
|  │ b   │         |
|  │ --- │         |
|  │ i64 │         |
|  ╞═════╡         |
|  │ 3   │         |
|  │ 4   │         |
|  └─────┘         |
└──────────────────┘

from_arrow

from_arrow(
    native_frame: IntoArrowTable,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> DataFrame[Any]

Construct a DataFrame from an object which supports the PyCapsule Interface.

Parameters:

Name Type Description Default
native_frame IntoArrowTable

Object which implements __arrow_c_stream__.

required
backend ModuleType | Implementation | str | None

Specifies which eager backend to instantiate.

backend can be specified in various ways:

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN or CUDF.
  • As a string: "pandas", "pyarrow", "polars", "modin" or "cudf".
  • Directly as a module pandas, pyarrow, polars, modin or cudf.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.31.0)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None

Returns:

Type Description
DataFrame[Any]

A new DataFrame.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [4.2, 5.1]})
>>> nw.from_arrow(df_native, backend="polars")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (2, 2)   |
|  ┌─────┬─────┐   |
|  │ a   ┆ b   │   |
|  │ --- ┆ --- │   |
|  │ i64 ┆ f64 │   |
|  ╞═════╪═════╡   |
|  │ 1   ┆ 4.2 │   |
|  │ 2   ┆ 5.1 │   |
|  └─────┴─────┘   |
└──────────────────┘

from_dict

from_dict(
    data: Mapping[str, Any],
    schema: Mapping[str, DType] | Schema | None = None,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> DataFrame[Any]

Instantiate DataFrame from dictionary.

Indexes (if present, for pandas-like backends) are aligned following the left-hand-rule.

Notes

For pandas-like dataframes, conversion to schema is applied after dataframe creation.

Parameters:

Name Type Description Default
data Mapping[str, Any]

Dictionary to create DataFrame from.

required
schema Mapping[str, DType] | Schema | None

The DataFrame schema as Schema or dict of {name: type}. If not specified, the schema will be inferred by the native library.

None
backend ModuleType | Implementation | str | None

Specifies which eager backend to instantiate. Only necessary if inputs are not Narwhals Series.

backend can be specified in various ways:

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN or CUDF.
  • As a string: "pandas", "pyarrow", "polars", "modin" or "cudf".
  • Directly as a module pandas, pyarrow, polars, modin or cudf.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.26.0)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None

Returns:

Type Description
DataFrame[Any]

A new DataFrame.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>> data = {"c": [5, 2], "d": [1, 4]}
>>> nw.from_dict(data, backend="pandas")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        c  d      |
|     0  5  1      |
|     1  2  4      |
└──────────────────┘

from_native

from_native(native_object: SeriesT, **kwds: Any) -> SeriesT
from_native(
    native_object: DataFrameT, **kwds: Any
) -> DataFrameT
from_native(
    native_object: LazyFrameT, **kwds: Any
) -> LazyFrameT
from_native(
    native_object: IntoDataFrameT | IntoSeriesT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: Literal[True]
) -> DataFrame[IntoDataFrameT] | Series[IntoSeriesT]
from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]
from_native(
    native_object: T,
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> T
from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]
from_native(
    native_object: T,
    *,
    pass_through: Literal[True],
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> T
from_native(
    native_object: (
        IntoFrameT | IntoLazyFrameT | IntoSeriesT
    ),
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: Literal[True]
) -> (
    DataFrame[IntoFrameT]
    | LazyFrame[IntoLazyFrameT]
    | Series[IntoSeriesT]
)
from_native(
    native_object: IntoSeriesT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[True],
    allow_series: None = ...
) -> Series[IntoSeriesT]
from_native(
    native_object: IntoLazyFrameT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> LazyFrame[IntoLazyFrameT]
from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]
from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]
from_native(
    native_object: IntoFrame | IntoSeries,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: Literal[True]
) -> DataFrame[Any] | LazyFrame[Any] | Series[Any]
from_native(
    native_object: IntoSeriesT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[True],
    allow_series: None = ...
) -> Series[IntoSeriesT]
from_native(
    native_object: Any,
    *,
    pass_through: bool,
    eager_only: bool,
    series_only: bool,
    allow_series: bool | None
) -> Any
from_native(
    native_object: (
        IntoLazyFrameT
        | IntoFrameT
        | IntoSeriesT
        | IntoFrame
        | IntoSeries
        | T
    ),
    *,
    strict: bool | None = None,
    pass_through: bool | None = None,
    eager_only: bool = False,
    series_only: bool = False,
    allow_series: bool | None = None,
    **kwds: Any
) -> (
    LazyFrame[IntoLazyFrameT]
    | DataFrame[IntoFrameT]
    | Series[IntoSeriesT]
    | T
)

Convert native_object to a Narwhals DataFrame, LazyFrame, or Series.

Parameters:

Name Type Description Default
native_object IntoLazyFrameT | IntoFrameT | IntoSeriesT | IntoFrame | IntoSeries | T

Raw object from the user. Depending on the other arguments, the input object can be:

  • a Dataframe / Lazyframe / Series supported by Narwhals (pandas, Polars, PyArrow, ...)
  • an object which implements __narwhals_dataframe__, __narwhals_lazyframe__, or __narwhals_series__
required
strict bool | None

Determine what happens if the object can't be converted to Narwhals

  • True or None (default): raise an error
  • False: pass object through as-is

Deprecated (v1.13.0)

Please use pass_through instead. Note that strict is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
pass_through bool | None

Determine what happens if the object can't be converted to Narwhals

  • False or None (default): raise an error
  • True: pass object through as-is
None
eager_only bool

Whether to only allow eager objects

  • False (default): don't require native_object to be eager
  • True: only convert to Narwhals if native_object is eager
False
series_only bool

Whether to only allow Series

  • False (default): don't require native_object to be a Series
  • True: only convert to Narwhals if native_object is a Series
False
allow_series bool | None

Whether to allow Series (default is only Dataframe / Lazyframe)

  • False or None (default): don't convert to Narwhals if native_object is a Series
  • True: allow native_object to be a Series
None

Returns:

Type Description
LazyFrame[IntoLazyFrameT] | DataFrame[IntoFrameT] | Series[IntoSeriesT] | T

DataFrame, LazyFrame, Series, or original object, depending on which combination of parameters was passed.
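
The way the flags combine can be hard to read off the overloads. The sketch below is a simplified pure-Python model built only from the parameter descriptions above; it is not the library's code. The from_native_outcome helper and the Kind labels are hypothetical, the deprecated strict flag is left out, and contradictory flag combinations are not modelled:

```python
from typing import Literal, Optional

# Hypothetical labels for what kind of object was passed in.
Kind = Literal["dataframe", "lazyframe", "series", "other"]


def from_native_outcome(
    kind: Kind,
    *,
    pass_through: bool = False,
    eager_only: bool = False,
    series_only: bool = False,
    allow_series: Optional[bool] = None,
) -> str:
    """Return 'convert', 'pass through' or 'raise' for a flag combination."""
    if series_only:
        convertible = kind == "series"
    elif kind == "series":
        # Series are only converted when explicitly allowed.
        convertible = bool(allow_series)
    elif kind == "lazyframe":
        # Lazy objects are rejected when only eager ones are allowed.
        convertible = not eager_only
    else:
        convertible = kind == "dataframe"

    if convertible:
        return "convert"
    # pass_through decides between returning the object as-is and an error.
    return "pass through" if pass_through else "raise"


print(from_native_outcome("dataframe"))
print(from_native_outcome("series"))
print(from_native_outcome("series", allow_series=True))
print(from_native_outcome("lazyframe", eager_only=True, pass_through=True))
```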

from_numpy

from_numpy(
    data: _2DArray,
    schema: (
        Mapping[str, DType] | Schema | Sequence[str] | None
    ) = None,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> DataFrame[Any]

Construct a DataFrame from a NumPy ndarray.

Notes

Only row orientation is currently supported.

For pandas-like dataframes, conversion to schema is applied after dataframe creation.

Parameters:

Name Type Description Default
data _2DArray

Two-dimensional data represented as a NumPy ndarray.

required
schema Mapping[str, DType] | Schema | Sequence[str] | None

The DataFrame schema as Schema, dict of {name: type}, or a sequence of str.

None
backend ModuleType | Implementation | str | None

Specifies which eager backend to instantiate.

backend can be specified in various ways:

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN or CUDF.
  • As a string: "pandas", "pyarrow", "polars", "modin" or "cudf".
  • Directly as a module pandas, pyarrow, polars, modin or cudf.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.31.0)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None

Returns:

Type Description
DataFrame[Any]

A new DataFrame.

Examples:

>>> import numpy as np
>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> arr = np.array([[5, 2, 1], [1, 4, 3]])
>>> schema = {"c": nw.Int16(), "d": nw.Float32(), "e": nw.Int8()}
>>> nw.from_numpy(arr, schema=schema, backend="pyarrow")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  pyarrow.Table   |
|  c: int16        |
|  d: float        |
|  e: int8         |
|  ----            |
|  c: [[5,1]]      |
|  d: [[2,4]]      |
|  e: [[1,3]]      |
└──────────────────┘

generate_temporary_column_name

generate_temporary_column_name(
    n_bytes: int, columns: Sequence[str]
) -> str

Generates a unique column name that is not present in the given list of columns.

It relies on the token_hex function from Python's secrets module, which returns a random hexadecimal string generated from n_bytes random bytes.

Parameters:

Name Type Description Default
n_bytes int

The number of bytes to generate for the token.

required
columns Sequence[str]

The list of columns to check for uniqueness.

required

Returns:

Type Description
str

A unique token that is not present in the given list of columns.

Raises:

Type Description
AssertionError

If a unique token cannot be generated after 100 attempts.

Examples:

>>> import narwhals as nw
>>> columns = ["abc", "xyz"]
>>> nw.generate_temporary_column_name(n_bytes=8, columns=columns) not in columns
True
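
The behaviour described above (draw a random token, retry on collision, give up after a bounded number of attempts) can be sketched with the standard library alone. This is an illustrative approximation and not the Narwhals source: temporary_column_name and its max_attempts parameter are hypothetical names, and the exact attempt count in the library may differ.

```python
from secrets import token_hex
from typing import Sequence


def temporary_column_name(
    n_bytes: int, columns: Sequence[str], max_attempts: int = 100
) -> str:
    existing = set(columns)
    for _ in range(max_attempts):
        # token_hex(n) returns 2 * n hexadecimal characters.
        token = token_hex(n_bytes)
        if token not in existing:
            return token
    msg = f"Could not generate a unique token after {max_attempts} attempts."
    raise AssertionError(msg)


columns = ["abc", "xyz"]
name = temporary_column_name(n_bytes=8, columns=columns)
print(len(name), name not in columns)
```

With n_bytes=8 there are 2**64 possible tokens, so a collision with an existing column name is very unlikely; the retry loop only matters for very small n_bytes.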

get_level

get_level(
    obj: (
        DataFrame[Any]
        | LazyFrame[Any]
        | Series[IntoSeriesT]
    ),
) -> Literal["full", "lazy", "interchange"]

Level of support Narwhals has for the current object.

Parameters:

Name Type Description Default
obj DataFrame[Any] | LazyFrame[Any] | Series[IntoSeriesT]

Dataframe or Series.

required

Returns:

Type Description
Literal['full', 'lazy', 'interchange']

This can be one of

  • 'full': full Narwhals API support
  • 'lazy': only lazy operations are supported. This excludes anything which involves iterating over rows in Python.
  • 'interchange': only metadata operations are supported (df.schema)

get_native_namespace

get_native_namespace(
    *obj: DataFrame[Any]
    | LazyFrame[Any]
    | Series[Any]
    | IntoFrame
    | IntoSeries,
) -> Any

Get native namespace from object.

Parameters:

Name Type Description Default
obj DataFrame[Any] | LazyFrame[Any] | Series[Any] | IntoFrame | IntoSeries

Dataframe, Lazyframe, or Series. Multiple objects can be passed positionally, in which case they must all have the same native namespace (else an error is raised).

()

Returns:

Type Description
Any

Native module.

Examples:

>>> import polars as pl
>>> import pandas as pd
>>> import narwhals as nw
>>> df = nw.from_native(pd.DataFrame({"a": [1, 2, 3]}))
>>> nw.get_native_namespace(df)
<module 'pandas'...>
>>> df = nw.from_native(pl.DataFrame({"a": [1, 2, 3]}))
>>> nw.get_native_namespace(df)
<module 'polars'...>

is_ordered_categorical

is_ordered_categorical(series: Series[Any]) -> bool

Return whether indices of categories are semantically meaningful.

This is a convenience function for accessing what would otherwise be the is_ordered property from the DataFrame Interchange Protocol, see https://data-apis.org/dataframe-protocol/latest/API.html.

  • For Polars:
      • Enums are always ordered.
      • Categoricals are ordered if dtype.ordering == "physical".
  • For pandas-like APIs:
      • Categoricals are ordered if dtype.cat.ordered == True.
  • For PyArrow table:
      • Categoricals are ordered if dtype.type.ordered == True.

Parameters:

Name Type Description Default
series Series[Any]

Input Series.

required

Returns:

Type Description
bool

Whether the Series is an ordered categorical.

Examples:

>>> import narwhals as nw
>>> import pandas as pd
>>> import polars as pl
>>> data = ["x", "y"]
>>> s_pd = pd.Series(data, dtype=pd.CategoricalDtype(ordered=True))
>>> s_pl = pl.Series(data, dtype=pl.Categorical(ordering="physical"))

Let's define a library-agnostic function:

>>> @nw.narwhalify
... def func(s):
...     return nw.is_ordered_categorical(s)

Then, we can pass any supported library to func:

>>> func(s_pd)
True
>>> func(s_pl)
True

len

len() -> Expr

Return the number of rows.

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2], "b": [5, None]})
>>> nw.from_native(df_native).select(nw.len())
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (1, 1)   |
|  ┌─────┐         |
|  │ len │         |
|  │ --- │         |
|  │ u32 │         |
|  ╞═════╡         |
|  │ 2   │         |
|  └─────┘         |
└──────────────────┘

lit

lit(
    value: NonNestedLiteral,
    dtype: DType | type[DType] | None = None,
) -> Expr

Return an expression representing a literal value.

Parameters:

Name Type Description Default
value NonNestedLiteral

The value to use as literal.

required
dtype DType | type[DType] | None

The data type of the literal value. If not provided, the data type will be inferred by the native library.

None

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2]})
>>> nw.from_native(df_native).with_columns(nw.lit(3))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|     a  literal   |
|  0  1        3   |
|  1  2        3   |
└──────────────────┘

max

max(*columns: str) -> Expr

Return the maximum value.

Note

Syntactic sugar for nw.col(columns).max().

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [5, 10]})
>>> nw.from_native(df_native).select(nw.max("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a   b     |
|     0  2  10     |
└──────────────────┘

max_horizontal

max_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Get the maximum value horizontally across columns.

Notes

We support max_horizontal over numeric columns only.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 8, 3], "b": [4, 5, None]})
>>> nw.from_native(df_native).with_columns(h_max=nw.max_horizontal("a", "b"))
┌──────────────────────┐
|  Narwhals DataFrame  |
|----------------------|
|shape: (3, 3)         |
|┌─────┬──────┬───────┐|
|│ a   ┆ b    ┆ h_max │|
|│ --- ┆ ---  ┆ ---   │|
|│ i64 ┆ i64  ┆ i64   │|
|╞═════╪══════╪═══════╡|
|│ 1   ┆ 4    ┆ 4     │|
|│ 8   ┆ 5    ┆ 8     │|
|│ 3   ┆ null ┆ 3     │|
|└─────┴──────┴───────┘|
└──────────────────────┘

maybe_align_index

maybe_align_index(
    lhs: FrameOrSeriesT,
    rhs: Series[Any] | DataFrame[Any] | LazyFrame[Any],
) -> FrameOrSeriesT

Align lhs to the Index of rhs, if they're both pandas-like.

Parameters:

Name Type Description Default
lhs FrameOrSeriesT

Dataframe or Series.

required
rhs Series[Any] | DataFrame[Any] | LazyFrame[Any]

Dataframe or Series to align with.

required

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this only checks that lhs and rhs are the same length.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2]}, index=[3, 4])
>>> s_pd = pd.Series([6, 7], index=[4, 3])
>>> df = nw.from_native(df_pd)
>>> s = nw.from_native(s_pd, series_only=True)
>>> nw.to_native(nw.maybe_align_index(df, s))
   a
4  2
3  1
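
Conceptually, aligning means reordering the rows of lhs so that its index labels appear in the same order as those of rhs. A minimal pure-Python sketch of that idea follows; align_to_index is a hypothetical helper, and it assumes both indexes hold the same unique labels:

```python
from typing import Any, Hashable, Sequence


def align_to_index(
    values: Sequence[Any],
    index: Sequence[Hashable],
    target_index: Sequence[Hashable],
) -> list[Any]:
    # Map each label to its current row position (labels assumed unique).
    position = {label: i for i, label in enumerate(index)}
    # Pick rows in the order dictated by the target index.
    return [values[position[label]] for label in target_index]


# Same data as the example above: column "a" with index [3, 4],
# aligned to a Series whose index is [4, 3].
aligned = align_to_index([1, 2], index=[3, 4], target_index=[4, 3])
print(aligned)  # [2, 1]
```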

maybe_convert_dtypes

maybe_convert_dtypes(
    obj: FrameOrSeriesT, *args: bool, **kwargs: bool | str
) -> FrameOrSeriesT

Convert columns or series to the best possible dtypes using dtypes supporting pd.NA, if obj is pandas-like.

Parameters:

Name Type Description Default
obj FrameOrSeriesT

DataFrame or Series.

required
*args bool

Additional positional arguments which get passed through.

()
**kwargs bool | str

Additional keyword arguments which get passed through.

{}

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Notes

For non-pandas-like inputs, this is a no-op. Also, args and kwargs just get passed down to the underlying library as-is.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> import numpy as np
>>> df_pd = pd.DataFrame(
...     {
...         "a": pd.Series([1, 2, 3], dtype=np.dtype("int32")),
...         "b": pd.Series([True, False, np.nan], dtype=np.dtype("O")),
...     }
... )
>>> df = nw.from_native(df_pd)
>>> nw.to_native(
...     nw.maybe_convert_dtypes(df)
... ).dtypes
a             Int32
b           boolean
dtype: object

maybe_get_index

maybe_get_index(
    obj: DataFrame[Any] | LazyFrame[Any] | Series[Any],
) -> Any | None

Get the index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name Type Description Default
obj DataFrame[Any] | LazyFrame[Any] | Series[Any]

Dataframe or Series.

required

Returns:

Type Description
Any | None

The index of obj if it is pandas-like, otherwise None.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this returns None.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]})
>>> df = nw.from_native(df_pd)
>>> nw.maybe_get_index(df)
RangeIndex(start=0, stop=2, step=1)
>>> series_pd = pd.Series([1, 2])
>>> series = nw.from_native(series_pd, series_only=True)
>>> nw.maybe_get_index(series)
RangeIndex(start=0, stop=2, step=1)

maybe_reset_index

maybe_reset_index(obj: FrameOrSeriesT) -> FrameOrSeriesT

Reset the index to the default integer index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name Type Description Default
obj FrameOrSeriesT

Dataframe or Series.

required

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already resets the index for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this is a no-op.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]}, index=([6, 7]))
>>> df = nw.from_native(df_pd)
>>> nw.to_native(nw.maybe_reset_index(df))
   a  b
0  1  4
1  2  5
>>> series_pd = pd.Series([1, 2])
>>> series = nw.from_native(series_pd, series_only=True)
>>> nw.maybe_get_index(series)
RangeIndex(start=0, stop=2, step=1)

maybe_set_index

maybe_set_index(
    obj: FrameOrSeriesT,
    column_names: str | list[str] | None = None,
    *,
    index: (
        Series[IntoSeriesT]
        | list[Series[IntoSeriesT]]
        | None
    ) = None
) -> FrameOrSeriesT

Set the index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name Type Description Default
obj FrameOrSeriesT

Object for which to maybe set the index (can be either a Narwhals DataFrame or Series).

required
column_names str | list[str] | None

Name or list of names of the columns to set as index. For dataframes, exactly one of column_names and index must be specified, not both. If column_names is passed and obj is a Series, then a ValueError is raised.

None
index Series[IntoSeriesT] | list[Series[IntoSeriesT]] | None

Series or list of Series to set as index.

None

Returns:

Type Description
FrameOrSeriesT

Same type as input.

Raises:

Type Description
ValueError

If any of the following conditions holds:

  • neither column_names nor index is provided
  • both column_names and index are provided
  • column_names is provided and obj is a Series
Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index.

For non-pandas-like inputs, this is a no-op.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]})
>>> df = nw.from_native(df_pd)
>>> nw.to_native(nw.maybe_set_index(df, "b"))
   a
b
4  1
5  2

mean

mean(*columns: str) -> Expr

Get the mean value.

Note

Syntactic sugar for nw.col(columns).mean()

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 8, 3], "b": [3.14, 6.28, 42.1]})
>>> nw.from_native(df_native).select(nw.mean("a", "b"))
┌─────────────────────────┐
|   Narwhals DataFrame    |
|-------------------------|
|pyarrow.Table            |
|a: double                |
|b: double                |
|----                     |
|a: [[4]]                 |
|b: [[17.173333333333336]]|
└─────────────────────────┘

mean_horizontal

mean_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Compute the mean of all values horizontally across columns.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> data = {"a": [1, 8, 3], "b": [4, 5, None], "c": ["x", "y", "z"]}
>>> df_native = pa.table(data)

We compute the horizontal mean of columns "a" and "b". Null values are skipped, so the last row averages only the non-null value:

>>> nw.from_native(df_native).select(nw.mean_horizontal("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| pyarrow.Table    |
| a: double        |
| ----             |
| a: [[2.5,6.5,3]] |
└──────────────────┘

median

median(*columns: str) -> Expr

Get the median value.

Notes
  • Syntactic sugar for nw.col(columns).median()
  • Results might slightly differ across backends due to differences in the underlying algorithms used to compute the median.

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [4, 5, 2]})
>>> nw.from_native(df_native).select(nw.median("a"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (1, 1)   |
|  ┌─────┐         |
|  │ a   │         |
|  │ --- │         |
|  │ f64 │         |
|  ╞═════╡         |
|  │ 4.0 │         |
|  └─────┘         |
└──────────────────┘

min

min(*columns: str) -> Expr

Return the minimum value.

Note

Syntactic sugar for nw.col(columns).min().

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 2], "b": [5, 10]})
>>> nw.from_native(df_native).select(nw.min("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  pyarrow.Table   |
|  a: int64        |
|  b: int64        |
|  ----            |
|  a: [[1]]        |
|  b: [[5]]        |
└──────────────────┘

min_horizontal

min_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Get the minimum value horizontally across columns.

Notes

We support min_horizontal over numeric columns only.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 8, 3], "b": [4, 5, None]})
>>> nw.from_native(df_native).with_columns(h_min=nw.min_horizontal("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| pyarrow.Table    |
| a: int64         |
| b: int64         |
| h_min: int64     |
| ----             |
| a: [[1,8,3]]     |
| b: [[4,5,null]]  |
| h_min: [[1,5,3]] |
└──────────────────┘

narwhalify

narwhalify(
    func: Callable[..., Any] | None = None,
    *,
    strict: bool | None = None,
    pass_through: bool | None = None,
    eager_only: bool = False,
    series_only: bool = False,
    allow_series: bool | None = True
) -> Callable[..., Any]

Decorate function so it becomes dataframe-agnostic.

This tries to convert any dataframe/series-like argument into the corresponding Narwhals DataFrame/Series, leaving other arguments unchanged. Similarly, if the function returns a Narwhals DataFrame or Series, it is converted back to the original dataframe/series type; any other return type is left as is. With pass_through=False, every input and every output is required to be a dataframe/series-like object.

Parameters:

Name Type Description Default
func Callable[..., Any] | None

Function to wrap in a from_native-to_native block.

None
strict bool | None

Determine what happens if the object can't be converted to Narwhals

Deprecated (v1.13.0)

Please use pass_through instead. Note that strict is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

  • True or None (default): raise an error
  • False: pass object through as-is
None
pass_through bool | None

Determine what happens if the object can't be converted to Narwhals

  • False or None (default): raise an error
  • True: pass object through as-is
None
eager_only bool

Whether to only allow eager objects

  • False (default): don't require native_object to be eager
  • True: only convert to Narwhals if native_object is eager
False
series_only bool

Whether to only allow Series

  • False (default): don't require native_object to be a Series
  • True: only convert to Narwhals if native_object is a Series
False
allow_series bool | None

Whether to allow Series in addition to DataFrame / LazyFrame

  • False or None: don't convert to Narwhals if native_object is a Series
  • True (default): allow native_object to be a Series
True

Returns:

Type Description
Callable[..., Any]

Decorated function.

Examples:

Instead of writing

>>> import narwhals as nw
>>> def agnostic_group_by_sum(df):
...     df = nw.from_native(df, pass_through=True)
...     df = df.group_by("a").agg(nw.col("b").sum())
...     return nw.to_native(df)

you can just write

>>> @nw.narwhalify
... def agnostic_group_by_sum(df):
...     return df.group_by("a").agg(nw.col("b").sum())

new_series

new_series(
    name: str,
    values: Any,
    dtype: DType | type[DType] | None = None,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> Series[Any]

Instantiate Narwhals Series from iterable (e.g. list or array).

Parameters:

Name Type Description Default
name str

Name of resulting Series.

required
values Any

Values to make the Series from.

required
dtype DType | type[DType] | None

(Narwhals) dtype. If not provided, the native library may auto-infer it from values.

None
backend ModuleType | Implementation | str | None

Specifies which eager backend to instantiate to.

backend can be specified in various ways

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN or CUDF.
  • As a string: "pandas", "pyarrow", "polars", "modin" or "cudf".
  • Directly as a module pandas, pyarrow, polars, modin or cudf.
None
native_namespace ModuleType | None

The native library to use for Series creation.

Deprecated (v1.31.0)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None

Returns:

Type Description
Series[Any]

A new Series

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> values = [4, 1, 2, 3]
>>> nw.new_series(name="a", values=values, dtype=nw.Int32, backend=pd)
┌─────────────────────┐
|   Narwhals Series   |
|---------------------|
|0    4               |
|1    1               |
|2    2               |
|3    3               |
|Name: a, dtype: int32|
└─────────────────────┘

nth

nth(*indices: int | Sequence[int]) -> Expr

Creates an expression that references one or more columns by their index(es).

Notes

nth is not supported for Polars version<1.0.0. Please use narwhals.col instead.

Parameters:

Name Type Description Default
indices int | Sequence[int]

One or more indices representing the columns to retrieve.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 2], "b": [3, 4], "c": [0.123, 3.14]})
>>> nw.from_native(df_native).select(nw.nth(0, 2) * 2)
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|pyarrow.Table     |
|a: int64          |
|c: double         |
|----              |
|a: [[2,4]]        |
|c: [[0.246,6.28]] |
└──────────────────┘

read_csv

read_csv(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> DataFrame[Any]

Read a CSV file into a DataFrame.

Parameters:

Name Type Description Default
source str

Path to a file.

required
backend ModuleType | Implementation | str | None

The eager backend for DataFrame creation. backend can be specified in various ways

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN or CUDF.
  • As a string: "pandas", "pyarrow", "polars", "modin" or "cudf".
  • Directly as a module pandas, pyarrow, polars, modin or cudf.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.27.2)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
kwargs Any

Extra keyword arguments which are passed to the native CSV reader. For example, you could use nw.read_csv('file.csv', backend='pandas', engine='pyarrow').

{}

Returns:

Type Description
DataFrame[Any]

DataFrame.

Examples:

>>> import narwhals as nw
>>> nw.read_csv("file.csv", backend="pandas")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a   b     |
|     0  1   4     |
|     1  2   5     |
└──────────────────┘

read_parquet

read_parquet(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> DataFrame[Any]

Read into a DataFrame from a parquet file.

Parameters:

Name Type Description Default
source str

Path to a file.

required
backend ModuleType | Implementation | str | None

The eager backend for DataFrame creation. backend can be specified in various ways

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN or CUDF.
  • As a string: "pandas", "pyarrow", "polars", "modin" or "cudf".
  • Directly as a module pandas, pyarrow, polars, modin or cudf.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.31.0)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
kwargs Any

Extra keyword arguments which are passed to the native parquet reader. For example, you could use nw.read_parquet('file.parquet', backend=pd, engine='pyarrow').

{}

Returns:

Type Description
DataFrame[Any]

DataFrame.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> nw.read_parquet("file.parquet", backend="pyarrow")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|pyarrow.Table     |
|a: int64          |
|c: double         |
|----              |
|a: [[1,2]]        |
|c: [[0.2,0.1]]    |
└──────────────────┘

scan_csv

scan_csv(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> LazyFrame[Any]

Lazily read from a CSV file.

For the libraries that do not support lazy dataframes, the function reads a csv file eagerly and then converts the resulting dataframe to a lazyframe.

Parameters:

Name Type Description Default
source str

Path to a file.

required
backend ModuleType | Implementation | str | None

The eager backend for DataFrame creation. backend can be specified in various ways

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN or CUDF.
  • As a string: "pandas", "pyarrow", "polars", "modin" or "cudf".
  • Directly as a module pandas, pyarrow, polars, modin or cudf.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.31.0)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
kwargs Any

Extra keyword arguments which are passed to the native CSV reader. For example, you could use nw.scan_csv('file.csv', backend=pd, engine='pyarrow').

{}

Returns:

Type Description
LazyFrame[Any]

LazyFrame.

Examples:

>>> import duckdb
>>> import narwhals as nw
>>>
>>> nw.scan_csv("file.csv", backend="duckdb").to_native()
┌─────────┬───────┐
│    a    │   b   │
│ varchar │ int32 │
├─────────┼───────┤
│ x       │     1 │
│ y       │     2 │
│ z       │     3 │
└─────────┴───────┘

scan_parquet

scan_parquet(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> LazyFrame[Any]

Lazily read from a parquet file.

For the libraries that do not support lazy dataframes, the function reads a parquet file eagerly and then converts the resulting dataframe to a lazyframe.

Note

Spark-like backends require a session object to be passed in kwargs.

For instance:

import narwhals as nw
from sqlframe.duckdb import DuckDBSession

nw.scan_parquet(source, backend="sqlframe", session=DuckDBSession())

Parameters:

Name Type Description Default
source str

Path to a file.

required
backend ModuleType | Implementation | str | None

The eager backend for DataFrame creation. backend can be specified in various ways

  • As Implementation.<BACKEND> with BACKEND being PANDAS, PYARROW, POLARS, MODIN, CUDF, PYSPARK or SQLFRAME.
  • As a string: "pandas", "pyarrow", "polars", "modin", "cudf", "pyspark" or "sqlframe".
  • Directly as a module pandas, pyarrow, polars, modin, cudf, pyspark.sql or sqlframe.
None
native_namespace ModuleType | None

The native library to use for DataFrame creation.

Deprecated (v1.31.0)

Please use backend instead. Note that native_namespace is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
kwargs Any

Extra keyword arguments which are passed to the native parquet reader. For example, you could use nw.scan_parquet('file.parquet', backend=pd, engine='pyarrow').

{}

Returns:

Type Description
LazyFrame[Any]

LazyFrame.

Examples:

>>> import dask.dataframe as dd
>>> from sqlframe.duckdb import DuckDBSession
>>> import narwhals as nw
>>>
>>> nw.scan_parquet("file.parquet", backend="dask").collect()
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a   b     |
|     0  1   4     |
|     1  2   5     |
└──────────────────┘
>>> nw.scan_parquet(
...     "file.parquet", backend="sqlframe", session=DuckDBSession()
... ).collect()
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  pyarrow.Table   |
|  a: int64        |
|  b: int64        |
|  ----            |
|  a: [[1,2]]      |
|  b: [[4,5]]      |
└──────────────────┘

sum

sum(*columns: str) -> Expr

Sum all values.

Note

Syntactic sugar for nw.col(columns).sum()

Parameters:

Name Type Description Default
columns str

Name(s) of the columns to use in the aggregation function

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [-1.4, 6.2]})
>>> nw.from_native(df_native).select(nw.sum("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|       a    b     |
|    0  3  4.8     |
└──────────────────┘

sum_horizontal

sum_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Sum all values horizontally across columns.

Warning

Unlike Polars, we support horizontal sum over numeric columns only.

Parameters:

Name Type Description Default
exprs IntoExpr | Iterable[IntoExpr]

Name(s) of the columns to use in the aggregation function. Accepts expression input.

()

Returns:

Type Description
Expr

A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2, 3], "b": [5, 10, None]})
>>> nw.from_native(df_native).with_columns(sum=nw.sum_horizontal("a", "b"))
┌────────────────────┐
| Narwhals DataFrame |
|--------------------|
|shape: (3, 3)       |
|┌─────┬──────┬─────┐|
|│ a   ┆ b    ┆ sum │|
|│ --- ┆ ---  ┆ --- │|
|│ i64 ┆ i64  ┆ i64 │|
|╞═════╪══════╪═════╡|
|│ 1   ┆ 5    ┆ 6   │|
|│ 2   ┆ 10   ┆ 12  │|
|│ 3   ┆ null ┆ 3   │|
|└─────┴──────┴─────┘|
└────────────────────┘

show_versions

show_versions() -> None

Print useful debugging information.

Examples:

>>> from narwhals import show_versions
>>> show_versions()

to_native

to_native(
    narwhals_object: DataFrame[IntoDataFrameT],
    *,
    pass_through: Literal[False] = ...
) -> IntoDataFrameT
to_native(
    narwhals_object: LazyFrame[IntoFrameT],
    *,
    pass_through: Literal[False] = ...
) -> IntoFrameT
to_native(
    narwhals_object: Series[IntoSeriesT],
    *,
    pass_through: Literal[False] = ...
) -> IntoSeriesT
to_native(
    narwhals_object: Any, *, pass_through: bool
) -> Any
to_native(
    narwhals_object: (
        DataFrame[IntoDataFrameT]
        | LazyFrame[IntoFrameT]
        | Series[IntoSeriesT]
    ),
    *,
    strict: bool | None = None,
    pass_through: bool | None = None
) -> IntoDataFrameT | IntoFrameT | IntoSeriesT | Any

Convert Narwhals object to native one.

Parameters:

Name Type Description Default
narwhals_object DataFrame[IntoDataFrameT] | LazyFrame[IntoFrameT] | Series[IntoSeriesT]

Narwhals object.

required
strict bool | None

Determine what happens if narwhals_object isn't a Narwhals class

  • True (default): raise an error
  • False: pass object through as-is

Deprecated (v1.13.0)

Please use pass_through instead. Note that strict is still available (and won't emit a deprecation warning) if you use narwhals.stable.v1, see perfect backwards compatibility policy.

None
pass_through bool | None

Determine what happens if narwhals_object isn't a Narwhals class

  • False (default): raise an error
  • True: pass object through as-is
None

Returns:

Type Description
IntoDataFrameT | IntoFrameT | IntoSeriesT | Any

Object of class that user started with.

to_py_scalar

to_py_scalar(scalar_like: Any) -> Any

If a scalar is not Python native, converts it to Python native.

Parameters:

Name Type Description Default
scalar_like Any

Scalar-like value.

required

Returns:

Type Description
Any

Python scalar.

Raises:

Type Description
ValueError

If the object is not convertible to a scalar.

Examples:

>>> import narwhals as nw
>>> import pandas as pd
>>> df = nw.from_native(pd.DataFrame({"a": [1, 2, 3]}))
>>> nw.to_py_scalar(df["a"].item(0))
1
>>> import pyarrow as pa
>>> df = nw.from_native(pa.table({"a": [1, 2, 3]}))
>>> nw.to_py_scalar(df["a"].item(0))
1
>>> nw.to_py_scalar(1)
1

when

when(*predicates: IntoExpr | Iterable[IntoExpr]) -> When

Start a when-then-otherwise expression.

Expression similar to an if-else statement in Python. It always starts with nw.when(<condition>).then(<value if condition>), and .otherwise(<value if condition is false>) can optionally be appended at the end. If it is not appended and the condition is not True, None is returned.

Info

Chaining multiple .when(<condition>).then(<value>) statements is currently not supported. See Narwhals#668.

Parameters:

Name Type Description Default
predicates IntoExpr | Iterable[IntoExpr]

Condition(s) that must be met in order to apply the subsequent statement. Accepts one or more boolean expressions, which are implicitly combined with &. String input is parsed as a column name.

()

Returns:

Type Description
When

A "when" object, which .then can be called on.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> data = {"a": [1, 2, 3], "b": [5, 10, 15]}
>>> df_native = pd.DataFrame(data)
>>> nw.from_native(df_native).with_columns(
...     nw.when(nw.col("a") < 3).then(5).otherwise(6).alias("a_when")
... )
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|    a   b  a_when |
| 0  1   5       5 |
| 1  2  10       5 |
| 2  3  15       6 |
└──────────────────┘