Top-level functions

Here are the top-level functions available in Narwhals.

all

all() -> Expr

Instantiate an expression representing all columns.

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [3.14, 0.123]})
>>> nw.from_native(df_native).select(nw.all() * 2)
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|      a      b    |
|   0  2  6.280    |
|   1  4  0.246    |
└──────────────────┘

all_horizontal

all_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Compute the bitwise AND horizontally across columns.

Parameters:

Name	Type	Description	Default
`exprs`	`IntoExpr \| Iterable[IntoExpr]`	Name(s) of the columns to use in the aggregation function. Accepts expression input.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> data = {
...     "a": [False, False, True, True, False, None],
...     "b": [False, True, True, None, None, None],
... }
>>> df_native = pa.table(data)
>>> nw.from_native(df_native).select("a", "b", all=nw.all_horizontal("a", "b"))
┌─────────────────────────────────────────┐
|           Narwhals DataFrame            |
|-----------------------------------------|
|pyarrow.Table                            |
|a: bool                                  |
|b: bool                                  |
|all: bool                                |
|----                                     |
|a: [[false,false,true,true,false,null]]  |
|b: [[false,true,true,null,null,null]]    |
|all: [[false,false,true,null,false,null]]|
└─────────────────────────────────────────┘

any_horizontal

any_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Compute the bitwise OR horizontally across columns.

Parameters:

Name	Type	Description	Default
`exprs`	`IntoExpr \| Iterable[IntoExpr]`	Name(s) of the columns to use in the aggregation function. Accepts expression input.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> data = {
...     "a": [False, False, True, True, False, None],
...     "b": [False, True, True, None, None, None],
... }
>>> df_native = pl.DataFrame(data)
>>> nw.from_native(df_native).select("a", "b", any=nw.any_horizontal("a", "b"))
┌─────────────────────────┐
|   Narwhals DataFrame    |
|-------------------------|
|shape: (6, 3)            |
|┌───────┬───────┬───────┐|
|│ a     ┆ b     ┆ any   │|
|│ ---   ┆ ---   ┆ ---   │|
|│ bool  ┆ bool  ┆ bool  │|
|╞═══════╪═══════╪═══════╡|
|│ false ┆ false ┆ false │|
|│ false ┆ true  ┆ true  │|
|│ true  ┆ true  ┆ true  │|
|│ true  ┆ null  ┆ true  │|
|│ false ┆ null  ┆ null  │|
|│ null  ┆ null  ┆ null  │|
|└───────┴───────┴───────┘|
└─────────────────────────┘
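
The null handling in the `all_horizontal` and `any_horizontal` outputs above is consistent with three-valued (Kleene) logic. The following pure-Python sketch is not the Narwhals implementation: `kleene_and` and `kleene_or` are hypothetical helpers that only reproduce the two result columns shown in the examples, with `None` standing in for null.

```python
from typing import Optional

Bool3 = Optional[bool]  # True, False, or None (null)


def kleene_and(x: Bool3, y: Bool3) -> Bool3:
    """Three-valued AND: a definite False wins; otherwise null propagates."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def kleene_or(x: Bool3, y: Bool3) -> Bool3:
    """Three-valued OR: a definite True wins; otherwise null propagates."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False


# Same data as in the examples above.
a = [False, False, True, True, False, None]
b = [False, True, True, None, None, None]

all_result = [kleene_and(x, y) for x, y in zip(a, b)]
any_result = [kleene_or(x, y) for x, y in zip(a, b)]

print(all_result)  # [False, False, True, None, False, None]
print(any_result)  # [False, True, True, True, None, None]
```

In short, `False AND null` is `False` and `True OR null` is `True`, because the missing value cannot change the outcome; every other combination involving null stays null.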

col

col(*names: str | Iterable[str]) -> Expr

Creates an expression that references one or more columns by their name(s).

Parameters:

Name	Type	Description	Default
`names`	`str \| Iterable[str]`	Name(s) of the columns to use.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2], "b": [3, 4], "c": ["x", "z"]})
>>> nw.from_native(df_native).select(nw.col("a", "b") * nw.col("b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (2, 2)   |
|  ┌─────┬─────┐   |
|  │ a   ┆ b   │   |
|  │ --- ┆ --- │   |
|  │ i64 ┆ i64 │   |
|  ╞═════╪═════╡   |
|  │ 3   ┆ 9   │   |
|  │ 8   ┆ 16  │   |
|  └─────┴─────┘   |
└──────────────────┘

concat

concat(
    items: Iterable[FrameT],
    *,
    how: ConcatMethod = "vertical"
) -> FrameT

Concatenate multiple DataFrames or LazyFrames into a single one.

Parameters:

Name	Type	Description	Default
`items`	`Iterable[FrameT]`	DataFrames or LazyFrames to concatenate.	required
`how`	`ConcatMethod`	Concatenation strategy. `vertical`: concatenate vertically; column names must match. `horizontal`: concatenate horizontally; if lengths don't match, missing rows are filled with null values. This is only supported when all inputs are (eager) DataFrames. `diagonal`: find the union of the column schemas and fill missing column values with null.	`'vertical'`

Returns:

Type	Description
`FrameT`	A new DataFrame or LazyFrame resulting from the concatenation.

Raises:

Type	Description
`TypeError`	If the items to concatenate are not either all eager or all lazy.

Examples:

First, import the libraries used in the examples below:

>>> import pandas as pd
>>> import polars as pl
>>> import pyarrow as pa
>>> import narwhals as nw

Let's look at a case of vertical concatenation (pandas-backed):

>>> df_pd_1 = nw.from_native(pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
>>> df_pd_2 = nw.from_native(pd.DataFrame({"a": [5, 2], "b": [1, 4]}))
>>> nw.concat([df_pd_1, df_pd_2], how="vertical")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a  b      |
|     0  1  4      |
|     1  2  5      |
|     2  3  6      |
|     0  5  1      |
|     1  2  4      |
└──────────────────┘

Let's look at a case of horizontal concatenation (Polars-backed):

>>> df_pl_1 = nw.from_native(pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}))
>>> df_pl_2 = nw.from_native(pl.DataFrame({"c": [5, 2], "d": [1, 4]}))
>>> nw.concat([df_pl_1, df_pl_2], how="horizontal")
┌───────────────────────────┐
|    Narwhals DataFrame     |
|---------------------------|
|shape: (3, 4)              |
|┌─────┬─────┬──────┬──────┐|
|│ a   ┆ b   ┆ c    ┆ d    │|
|│ --- ┆ --- ┆ ---  ┆ ---  │|
|│ i64 ┆ i64 ┆ i64  ┆ i64  │|
|╞═════╪═════╪══════╪══════╡|
|│ 1   ┆ 4   ┆ 5    ┆ 1    │|
|│ 2   ┆ 5   ┆ 2    ┆ 4    │|
|│ 3   ┆ 6   ┆ null ┆ null │|
|└─────┴─────┴──────┴──────┘|
└───────────────────────────┘

Let's look at a case of diagonal concatenation (PyArrow-backed):

>>> df_pa_1 = nw.from_native(pa.table({"a": [1, 2], "b": [3.5, 4.5]}))
>>> df_pa_2 = nw.from_native(pa.table({"a": [3, 4], "z": ["x", "y"]}))
>>> nw.concat([df_pa_1, df_pa_2], how="diagonal")
┌──────────────────────────┐
|    Narwhals DataFrame    |
|--------------------------|
|pyarrow.Table             |
|a: int64                  |
|b: double                 |
|z: string                 |
|----                      |
|a: [[1,2],[3,4]]          |
|b: [[3.5,4.5],[null,null]]|
|z: [[null,null],["x","y"]]|
└──────────────────────────┘

concat_str

concat_str(
    exprs: IntoExpr | Iterable[IntoExpr],
    *more_exprs: IntoExpr,
    separator: str = "",
    ignore_nulls: bool = False
) -> Expr

Horizontally concatenate columns into a single string column.

Parameters:

Name	Type	Description	Default
`exprs`	`IntoExpr \| Iterable[IntoExpr]`	Columns to concatenate into a single string column. Accepts expression input. Strings are parsed as column names, other non-expression inputs are parsed as literals. Non-`String` columns are cast to `String`.	required
`*more_exprs`	`IntoExpr`	Additional columns to concatenate into a single string column, specified as positional arguments.	`()`
`separator`	`str`	String that will be used to separate the values of each column.	`''`
`ignore_nulls`	`bool`	Ignore null values (default is `False`). If set to `False`, null values will be propagated and if the row contains any null values, the output is null.	`False`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> data = {
...     "a": [1, 2, 3],
...     "b": ["dogs", "cats", None],
...     "c": ["play", "swim", "walk"],
... }
>>> df_native = pd.DataFrame(data)
>>> (
...     nw.from_native(df_native).select(
...         nw.concat_str(
...             [nw.col("a") * 2, nw.col("b"), nw.col("c")], separator=" "
...         ).alias("full_sentence")
...     )
... )
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|   full_sentence  |
| 0   2 dogs play  |
| 1   4 cats swim  |
| 2          None  |
└──────────────────┘

exclude

exclude(*names: str | Iterable[str]) -> Expr

Creates an expression that excludes columns by their name(s).

Parameters:

Name	Type	Description	Default
`names`	`str \| Iterable[str]`	Name(s) of the columns to exclude.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2], "b": [3, 4], "c": ["x", "z"]})
>>> nw.from_native(df_native).select(nw.exclude("c", "a"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (2, 1)   |
|  ┌─────┐         |
|  │ b   │         |
|  │ --- │         |
|  │ i64 │         |
|  ╞═════╡         |
|  │ 3   │         |
|  │ 4   │         |
|  └─────┘         |
└──────────────────┘

from_arrow

from_arrow(
    native_frame: IntoArrowTable,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> DataFrame[Any]

Construct a DataFrame from an object which supports the PyCapsule Interface.

Parameters:

Name	Type	Description	Default
`native_frame`	`IntoArrowTable`	Object which implements `__arrow_c_stream__`.	required
`backend`	`ModuleType \| Implementation \| str \| None`	Specifies which eager backend to instantiate. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN` or `CUDF`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for DataFrame creation. Deprecated (v1.31.0): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`; see the perfect backwards compatibility policy.	`None`

Returns:

Type	Description
`DataFrame[Any]`	A new DataFrame.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [4.2, 5.1]})
>>> nw.from_arrow(df_native, backend="polars")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (2, 2)   |
|  ┌─────┬─────┐   |
|  │ a   ┆ b   │   |
|  │ --- ┆ --- │   |
|  │ i64 ┆ f64 │   |
|  ╞═════╪═════╡   |
|  │ 1   ┆ 4.2 │   |
|  │ 2   ┆ 5.1 │   |
|  └─────┴─────┘   |
└──────────────────┘

from_dict

from_dict(
    data: Mapping[str, Any],
    schema: Mapping[str, DType] | Schema | None = None,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> DataFrame[Any]

Instantiate DataFrame from dictionary.

Indexes (if present, for pandas-like backends) are aligned following the left-hand-rule.

Notes

For pandas-like dataframes, conversion to schema is applied after dataframe creation.

Parameters:

Name	Type	Description	Default
`data`	`Mapping[str, Any]`	Dictionary to create DataFrame from.	required
`schema`	`Mapping[str, DType] \| Schema \| None`	The DataFrame schema as Schema or dict of {name: type}. If not specified, the schema will be inferred by the native library.	`None`
`backend`	`ModuleType \| Implementation \| str \| None`	Specifies which eager backend to instantiate. Only necessary if inputs are not Narwhals Series. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN` or `CUDF`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for DataFrame creation. Deprecated (v1.26.0): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`; see the perfect backwards compatibility policy.	`None`

Returns:

Type	Description
`DataFrame[Any]`	A new DataFrame.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>> data = {"c": [5, 2], "d": [1, 4]}
>>> nw.from_dict(data, backend="pandas")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        c  d      |
|     0  5  1      |
|     1  2  4      |
└──────────────────┘

from_native

from_native(native_object: SeriesT, **kwds: Any) -> SeriesT

from_native(
    native_object: DataFrameT, **kwds: Any
) -> DataFrameT

from_native(
    native_object: LazyFrameT, **kwds: Any
) -> LazyFrameT

from_native(
    native_object: IntoDataFrameT | IntoSeriesT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: Literal[True]
) -> DataFrame[IntoDataFrameT] | Series[IntoSeriesT]

from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]

from_native(
    native_object: T,
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> T

from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]

from_native(
    native_object: T,
    *,
    pass_through: Literal[True],
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> T

from_native(
    native_object: (
        IntoFrameT | IntoLazyFrameT | IntoSeriesT
    ),
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: Literal[True]
) -> (
    DataFrame[IntoFrameT]
    | LazyFrame[IntoLazyFrameT]
    | Series[IntoSeriesT]
)

from_native(
    native_object: IntoSeriesT,
    *,
    pass_through: Literal[True],
    eager_only: Literal[False] = ...,
    series_only: Literal[True],
    allow_series: None = ...
) -> Series[IntoSeriesT]

from_native(
    native_object: IntoLazyFrameT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> LazyFrame[IntoLazyFrameT]

from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]

from_native(
    native_object: IntoDataFrameT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[True],
    series_only: Literal[False] = ...,
    allow_series: None = ...
) -> DataFrame[IntoDataFrameT]

from_native(
    native_object: IntoFrame | IntoSeries,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[False] = ...,
    allow_series: Literal[True]
) -> DataFrame[Any] | LazyFrame[Any] | Series[Any]

from_native(
    native_object: IntoSeriesT,
    *,
    pass_through: Literal[False] = ...,
    eager_only: Literal[False] = ...,
    series_only: Literal[True],
    allow_series: None = ...
) -> Series[IntoSeriesT]

from_native(
    native_object: Any,
    *,
    pass_through: bool,
    eager_only: bool,
    series_only: bool,
    allow_series: bool | None
) -> Any

from_native(
    native_object: (
        IntoLazyFrameT
        | IntoFrameT
        | IntoSeriesT
        | IntoFrame
        | IntoSeries
        | T
    ),
    *,
    strict: bool | None = None,
    pass_through: bool | None = None,
    eager_only: bool = False,
    series_only: bool = False,
    allow_series: bool | None = None,
    **kwds: Any
) -> (
    LazyFrame[IntoLazyFrameT]
    | DataFrame[IntoFrameT]
    | Series[IntoSeriesT]
    | T
)

Convert native_object to a Narwhals DataFrame, LazyFrame, or Series.

Parameters:

Name	Type	Description	Default
`native_object`	`IntoLazyFrameT \| IntoFrameT \| IntoSeriesT \| IntoFrame \| IntoSeries \| T`	Raw object from user. Depending on the other arguments, the input object can be: a DataFrame / LazyFrame / Series supported by Narwhals (pandas, Polars, PyArrow, ...); or an object which implements `__narwhals_dataframe__`, `__narwhals_lazyframe__`, or `__narwhals_series__`.	required
`strict`	`bool \| None`	Determine what happens if the object can't be converted to Narwhals. `True` or `None` (default): raise an error. `False`: pass object through as-is. Deprecated (v1.13.0): please use `pass_through` instead. Note that `strict` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`; see the perfect backwards compatibility policy.	`None`
`pass_through`	`bool \| None`	Determine what happens if the object can't be converted to Narwhals. `False` or `None` (default): raise an error. `True`: pass object through as-is.	`None`
`eager_only`	`bool`	Whether to only allow eager objects. `False` (default): don't require `native_object` to be eager. `True`: only convert to Narwhals if `native_object` is eager.	`False`
`series_only`	`bool`	Whether to only allow Series. `False` (default): don't require `native_object` to be a Series. `True`: only convert to Narwhals if `native_object` is a Series.	`False`
`allow_series`	`bool \| None`	Whether to allow Series (default is only DataFrame / LazyFrame). `False` or `None` (default): don't convert to Narwhals if `native_object` is a Series. `True`: allow `native_object` to be a Series.	`None`

Returns:

Type	Description
`LazyFrame[IntoLazyFrameT] \| DataFrame[IntoFrameT] \| Series[IntoSeriesT] \| T`	DataFrame, LazyFrame, Series, or original object, depending on which combination of parameters was passed.

from_numpy

from_numpy(
    data: _2DArray,
    schema: (
        Mapping[str, DType] | Schema | Sequence[str] | None
    ) = None,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> DataFrame[Any]

Construct a DataFrame from a NumPy ndarray.

Notes

Only row orientation is currently supported.

For pandas-like dataframes, conversion to schema is applied after dataframe creation.

Parameters:

Name	Type	Description	Default
`data`	`_2DArray`	Two-dimensional data represented as a NumPy ndarray.	required
`schema`	`Mapping[str, DType] \| Schema \| Sequence[str] \| None`	The DataFrame schema as Schema, dict of {name: type}, or a sequence of str.	`None`
`backend`	`ModuleType \| Implementation \| str \| None`	Specifies which eager backend to instantiate. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN` or `CUDF`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for DataFrame creation. Deprecated (v1.31.0): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`; see the perfect backwards compatibility policy.	`None`

Returns:

Type	Description
`DataFrame[Any]`	A new DataFrame.

Examples:

>>> import numpy as np
>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> arr = np.array([[5, 2, 1], [1, 4, 3]])
>>> schema = {"c": nw.Int16(), "d": nw.Float32(), "e": nw.Int8()}
>>> nw.from_numpy(arr, schema=schema, backend="pyarrow")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  pyarrow.Table   |
|  c: int16        |
|  d: float        |
|  e: int8         |
|  ----            |
|  c: [[5,1]]      |
|  d: [[2,4]]      |
|  e: [[1,3]]      |
└──────────────────┘

generate_temporary_column_name

generate_temporary_column_name(
    n_bytes: int, columns: Sequence[str]
) -> str

Generates a unique column name that is not present in the given list of columns.

It relies on Python's secrets.token_hex function, which returns a random hexadecimal string generated from n_bytes random bytes.

Parameters:

Name	Type	Description	Default
`n_bytes`	`int`	The number of bytes to generate for the token.	required
`columns`	`Sequence[str]`	The list of columns to check for uniqueness.	required

Returns:

Type	Description
`str`	A unique token that is not present in the given list of columns.

Raises:

Type	Description
`AssertionError`	If a unique token cannot be generated after 100 attempts.

Examples:

>>> import narwhals as nw
>>> columns = ["abc", "xyz"]
>>> nw.generate_temporary_column_name(n_bytes=8, columns=columns) not in columns
True
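
The behaviour described above (draw a random token, retry on collision, give up after 100 attempts) can be sketched with the standard library alone. This is a hypothetical re-implementation for illustration, not the Narwhals source; the function name `temporary_name_sketch` and the `max_attempts` parameter are assumptions.

```python
import secrets
from typing import Sequence


def temporary_name_sketch(
    n_bytes: int, columns: Sequence[str], max_attempts: int = 100
) -> str:
    """Hypothetical sketch: return a random hex token not present in `columns`."""
    taken = set(columns)
    for _ in range(max_attempts):
        # token_hex(n) returns a string of 2 * n hexadecimal characters.
        token = secrets.token_hex(n_bytes)
        if token not in taken:
            return token
    msg = f"Could not generate a unique token after {max_attempts} attempts."
    raise AssertionError(msg)


name = temporary_name_sketch(n_bytes=8, columns=["abc", "xyz"])
print(len(name))  # 16
```

With a reasonable `n_bytes`, a collision with existing column names is extremely unlikely, so the retry limit mainly guards against pathological inputs.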

get_level

get_level(
    obj: (
        DataFrame[Any]
        | LazyFrame[Any]
        | Series[IntoSeriesT]
    ),
) -> Literal["full", "lazy", "interchange"]

Level of support Narwhals has for current object.

Parameters:

Name	Type	Description	Default
`obj`	`DataFrame[Any] \| LazyFrame[Any] \| Series[IntoSeriesT]`	Dataframe or Series.	required

Returns:

Type	Description
`Literal['full', 'lazy', 'interchange']`	One of: `'full'`: full Narwhals API support; `'lazy'`: only lazy operations are supported, which excludes anything that involves iterating over rows in Python; `'interchange'`: only metadata operations are supported (`df.schema`).

get_native_namespace

get_native_namespace(
    *obj: DataFrame[Any]
    | LazyFrame[Any]
    | Series[Any]
    | IntoFrame
    | IntoSeries,
) -> Any

Get native namespace from object.

Parameters:

Name	Type	Description	Default
`obj`	`DataFrame[Any] \| LazyFrame[Any] \| Series[Any] \| IntoFrame \| IntoSeries`	Dataframe, Lazyframe, or Series. Multiple objects can be passed positionally, in which case they must all have the same native namespace (else an error is raised).	`()`

Returns:

Type	Description
`Any`	Native module.

Examples:

>>> import polars as pl
>>> import pandas as pd
>>> import narwhals as nw
>>> df = nw.from_native(pd.DataFrame({"a": [1, 2, 3]}))
>>> nw.get_native_namespace(df)
<module 'pandas'...>
>>> df = nw.from_native(pl.DataFrame({"a": [1, 2, 3]}))
>>> nw.get_native_namespace(df)
<module 'polars'...>

is_ordered_categorical

is_ordered_categorical(series: Series[Any]) -> bool

Return whether indices of categories are semantically meaningful.

This is a convenience function for accessing what would otherwise be the is_ordered property from the DataFrame Interchange Protocol; see https://data-apis.org/dataframe-protocol/latest/API.html.

For Polars:
Enums are always ordered.
Categoricals are ordered if dtype.ordering == "physical".
For pandas-like APIs:
Categoricals are ordered if dtype.cat.ordered == True.
For PyArrow table:
Categoricals are ordered if dtype.type.ordered == True.

Parameters:

Name	Type	Description	Default
`series`	`Series[Any]`	Input Series.	required

Returns:

Type	Description
`bool`	Whether the Series is an ordered categorical.

Examples:

>>> import narwhals as nw
>>> import pandas as pd
>>> import polars as pl
>>> data = ["x", "y"]
>>> s_pd = pd.Series(data, dtype=pd.CategoricalDtype(ordered=True))
>>> s_pl = pl.Series(data, dtype=pl.Categorical(ordering="physical"))

Let's define a library-agnostic function:

>>> @nw.narwhalify
... def func(s):
...     return nw.is_ordered_categorical(s)

Then, we can pass any supported library to func:

>>> func(s_pd)
True
>>> func(s_pl)
True

len

len() -> Expr

Return the number of rows.

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2], "b": [5, None]})
>>> nw.from_native(df_native).select(nw.len())
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (1, 1)   |
|  ┌─────┐         |
|  │ len │         |
|  │ --- │         |
|  │ u32 │         |
|  ╞═════╡         |
|  │ 2   │         |
|  └─────┘         |
└──────────────────┘

lit

lit(
    value: NonNestedLiteral,
    dtype: DType | type[DType] | None = None,
) -> Expr

Return an expression representing a literal value.

Parameters:

Name	Type	Description	Default
`value`	`NonNestedLiteral`	The value to use as literal.	required
`dtype`	`DType \| type[DType] \| None`	The data type of the literal value. If not provided, the data type will be inferred by the native library.	`None`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2]})
>>> nw.from_native(df_native).with_columns(nw.lit(3))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|     a  literal   |
|  0  1        3   |
|  1  2        3   |
└──────────────────┘

max

max(*columns: str) -> Expr

Return the maximum value.

Note

Syntactic sugar for nw.col(columns).max().

Parameters:

Name	Type	Description	Default
`columns`	`str`	Name(s) of the columns to use in the aggregation function.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [5, 10]})
>>> nw.from_native(df_native).select(nw.max("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a   b     |
|     0  2  10     |
└──────────────────┘

max_horizontal

max_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Get the maximum value horizontally across columns.

Notes

We support max_horizontal over numeric columns only.

Parameters:

Name	Type	Description	Default
`exprs`	`IntoExpr \| Iterable[IntoExpr]`	Name(s) of the columns to use in the aggregation function. Accepts expression input.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 8, 3], "b": [4, 5, None]})
>>> nw.from_native(df_native).with_columns(h_max=nw.max_horizontal("a", "b"))
┌──────────────────────┐
|  Narwhals DataFrame  |
|----------------------|
|shape: (3, 3)         |
|┌─────┬──────┬───────┐|
|│ a   ┆ b    ┆ h_max │|
|│ --- ┆ ---  ┆ ---   │|
|│ i64 ┆ i64  ┆ i64   │|
|╞═════╪══════╪═══════╡|
|│ 1   ┆ 4    ┆ 4     │|
|│ 8   ┆ 5    ┆ 8     │|
|│ 3   ┆ null ┆ 3     │|
|└─────┴──────┴───────┘|
└──────────────────────┘

maybe_align_index

maybe_align_index(
    lhs: FrameOrSeriesT,
    rhs: Series[Any] | DataFrame[Any] | LazyFrame[Any],
) -> FrameOrSeriesT

Align lhs to the Index of rhs, if they're both pandas-like.

Parameters:

Name	Type	Description	Default
`lhs`	`FrameOrSeriesT`	Dataframe or Series.	required
`rhs`	`Series[Any] \| DataFrame[Any] \| LazyFrame[Any]`	Dataframe or Series to align with.	required

Returns:

Type	Description
`FrameOrSeriesT`	Same type as input.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this only checks that lhs and rhs are the same length.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2]}, index=[3, 4])
>>> s_pd = pd.Series([6, 7], index=[4, 3])
>>> df = nw.from_native(df_pd)
>>> s = nw.from_native(s_pd, series_only=True)
>>> nw.to_native(nw.maybe_align_index(df, s))
   a
4  2
3  1

maybe_convert_dtypes

maybe_convert_dtypes(
    obj: FrameOrSeriesT, *args: bool, **kwargs: bool | str
) -> FrameOrSeriesT

Convert columns or series to the best possible dtypes using dtypes supporting pd.NA, if obj is pandas-like.

Parameters:

Name	Type	Description	Default
`obj`	`FrameOrSeriesT`	DataFrame or Series.	required
`*args`	`bool`	Additional positional arguments that get passed through.	`()`
`**kwargs`	`bool \| str`	Additional keyword arguments that get passed through.	`{}`

Returns:

Type	Description
`FrameOrSeriesT`	Same type as input.

Notes

For non-pandas-like inputs, this is a no-op. Also, args and kwargs just get passed down to the underlying library as-is.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> import numpy as np
>>> df_pd = pd.DataFrame(
...     {
...         "a": pd.Series([1, 2, 3], dtype=np.dtype("int32")),
...         "b": pd.Series([True, False, np.nan], dtype=np.dtype("O")),
...     }
... )
>>> df = nw.from_native(df_pd)
>>> nw.to_native(
...     nw.maybe_convert_dtypes(df)
... ).dtypes
a             Int32
b           boolean
dtype: object

maybe_get_index

maybe_get_index(
    obj: DataFrame[Any] | LazyFrame[Any] | Series[Any],
) -> Any | None

Get the index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name	Type	Description	Default
`obj`	`DataFrame[Any] \| LazyFrame[Any] \| Series[Any]`	Dataframe or Series.	required

Returns:

Type	Description
`Any \| None`	The native index if the input is pandas-like, otherwise `None`.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this returns None.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]})
>>> df = nw.from_native(df_pd)
>>> nw.maybe_get_index(df)
RangeIndex(start=0, stop=2, step=1)
>>> series_pd = pd.Series([1, 2])
>>> series = nw.from_native(series_pd, series_only=True)
>>> nw.maybe_get_index(series)
RangeIndex(start=0, stop=2, step=1)

maybe_reset_index

maybe_reset_index(obj: FrameOrSeriesT) -> FrameOrSeriesT

Reset the index to the default integer index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name	Type	Description	Default
`obj`	`FrameOrSeriesT`	Dataframe or Series.	required

Returns:

Type	Description
`FrameOrSeriesT`	Same type as input.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already resets the index for users. If you're designing a new library, we highly encourage you to not rely on the Index. For non-pandas-like inputs, this is a no-op.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]}, index=([6, 7]))
>>> df = nw.from_native(df_pd)
>>> nw.to_native(nw.maybe_reset_index(df))
   a  b
0  1  4
1  2  5
>>> series_pd = pd.Series([1, 2], index=[6, 7])
>>> series = nw.from_native(series_pd, series_only=True)
>>> nw.maybe_get_index(nw.maybe_reset_index(series))
RangeIndex(start=0, stop=2, step=1)

maybe_set_index

maybe_set_index(
    obj: FrameOrSeriesT,
    column_names: str | list[str] | None = None,
    *,
    index: (
        Series[IntoSeriesT]
        | list[Series[IntoSeriesT]]
        | None
    ) = None
) -> FrameOrSeriesT

Set the index of a DataFrame or a Series, if it's pandas-like.

Parameters:

Name	Type	Description	Default
`obj`	`FrameOrSeriesT`	Object whose index may be set (either a Narwhals `DataFrame` or `Series`).	required
`column_names`	`str \| list[str] \| None`	Name or list of names of the columns to set as index. For dataframes, exactly one of `column_names` and `index` must be specified. If `column_names` is passed and `obj` is a Series, then a `ValueError` is raised.	`None`
`index`	`Series[IntoSeriesT] \| list[Series[IntoSeriesT]] \| None`	Series or list of Series to set as index.	`None`

Returns:

Type	Description
`FrameOrSeriesT`	Same type as input.

Raises:

Type	Description
`ValueError`	If any of the following holds: neither `column_names` nor `index` is provided; both `column_names` and `index` are provided; `column_names` is provided and `obj` is a Series.

Notes

This is only really intended for backwards-compatibility purposes, for example if your library already aligns indices for users. If you're designing a new library, we highly encourage you to not rely on the Index.

For non-pandas-like inputs, this is a no-op.

Examples:

>>> import pandas as pd
>>> import polars as pl
>>> import narwhals as nw
>>> df_pd = pd.DataFrame({"a": [1, 2], "b": [4, 5]})
>>> df = nw.from_native(df_pd)
>>> nw.to_native(nw.maybe_set_index(df, "b"))
   a
b
4  1
5  2

mean

mean(*columns: str) -> Expr

Get the mean value.

Note

Syntactic sugar for nw.col(columns).mean()

Parameters:

Name	Type	Description	Default
`columns`	`str`	Name(s) of the columns to use in the aggregation function	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 8, 3], "b": [3.14, 6.28, 42.1]})
>>> nw.from_native(df_native).select(nw.mean("a", "b"))
┌─────────────────────────┐
|   Narwhals DataFrame    |
|-------------------------|
|pyarrow.Table            |
|a: double                |
|b: double                |
|----                     |
|a: [[4]]                 |
|b: [[17.173333333333336]]|
└─────────────────────────┘

mean_horizontal

mean_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Compute the mean of all values horizontally across columns.

Parameters:

Name	Type	Description	Default
`exprs`	`IntoExpr \| Iterable[IntoExpr]`	Name(s) of the columns to use in the aggregation function. Accepts expression input.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> data = {"a": [1, 8, 3], "b": [4, 5, None], "c": ["x", "y", "z"]}
>>> df_native = pa.table(data)

We compute the horizontal mean of columns "a" and "b":

>>> nw.from_native(df_native).select(nw.mean_horizontal("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| pyarrow.Table    |
| a: double        |
| ----             |
| a: [[2.5,6.5,3]] |
└──────────────────┘

median

median(*columns: str) -> Expr

Get the median value.

Notes

Syntactic sugar for nw.col(columns).median()
Results might slightly differ across backends due to differences in the underlying algorithms used to compute the median.

Parameters:

Name	Type	Description	Default
`columns`	`str`	Name(s) of the columns to use in the aggregation function	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [4, 5, 2]})
>>> nw.from_native(df_native).select(nw.median("a"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  shape: (1, 1)   |
|  ┌─────┐         |
|  │ a   │         |
|  │ --- │         |
|  │ f64 │         |
|  ╞═════╡         |
|  │ 4.0 │         |
|  └─────┘         |
└──────────────────┘

min

min(*columns: str) -> Expr

Return the minimum value.

Note

Syntactic sugar for nw.col(columns).min().

Parameters:

Name	Type	Description	Default
`columns`	`str`	Name(s) of the columns to use in the aggregation function.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 2], "b": [5, 10]})
>>> nw.from_native(df_native).select(nw.min("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  pyarrow.Table   |
|  a: int64        |
|  b: int64        |
|  ----            |
|  a: [[1]]        |
|  b: [[5]]        |
└──────────────────┘

min_horizontal

min_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Get the minimum value horizontally across columns.

Notes

We support min_horizontal over numeric columns only.

Parameters:

Name	Type	Description	Default
`exprs`	`IntoExpr \| Iterable[IntoExpr]`	Name(s) of the columns to use in the aggregation function. Accepts expression input.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 8, 3], "b": [4, 5, None]})
>>> nw.from_native(df_native).with_columns(h_min=nw.min_horizontal("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| pyarrow.Table    |
| a: int64         |
| b: int64         |
| h_min: int64     |
| ----             |
| a: [[1,8,3]]     |
| b: [[4,5,null]]  |
| h_min: [[1,5,3]] |
└──────────────────┘

narwhalify

narwhalify(
    func: Callable[..., Any] | None = None,
    *,
    strict: bool | None = None,
    pass_through: bool | None = None,
    eager_only: bool = False,
    series_only: bool = False,
    allow_series: bool | None = True
) -> Callable[..., Any]

Decorate function so it becomes dataframe-agnostic.

This will try to convert any dataframe/series-like object into the respective Narwhals DataFrame/Series, while leaving the other parameters as they are. Similarly, if the output of the function is a Narwhals DataFrame or Series, it will be converted back to the original dataframe/series type, while any other output type is left as is. With pass_through=False, every input and every output is required to be a dataframe/series-like object.

Parameters:

Name	Type	Description	Default
`func`	`Callable[..., Any] \| None`	Function to wrap in a `from_native`-`to_native` block.	`None`
`strict`	`bool \| None`	Determine what happens if the object can't be converted to Narwhals. `True` or `None` (default): raise an error. `False`: pass object through as-is. Deprecated (v1.13.0): please use `pass_through` instead. Note that `strict` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`, see perfect backwards compatibility policy.	`None`
`pass_through`	`bool \| None`	Determine what happens if the object can't be converted to Narwhals. `False` or `None` (default): raise an error. `True`: pass object through as-is.	`None`
`eager_only`	`bool`	Whether to only allow eager objects. `False` (default): don't require `native_object` to be eager. `True`: only convert to Narwhals if `native_object` is eager.	`False`
`series_only`	`bool`	Whether to only allow Series. `False` (default): don't require `native_object` to be a Series. `True`: only convert to Narwhals if `native_object` is a Series.	`False`
`allow_series`	`bool \| None`	Whether to allow Series in addition to DataFrame / LazyFrame. `False` or `None`: don't convert to Narwhals if `native_object` is a Series. `True` (default): allow `native_object` to be a Series.	`True`

Returns:

Type	Description
`Callable[..., Any]`	Decorated function.

Examples:

Instead of writing

>>> import narwhals as nw
>>> def agnostic_group_by_sum(df):
...     df = nw.from_native(df, pass_through=True)
...     df = df.group_by("a").agg(nw.col("b").sum())
...     return nw.to_native(df)

you can just write

>>> @nw.narwhalify
... def agnostic_group_by_sum(df):
...     return df.group_by("a").agg(nw.col("b").sum())

new_series

new_series(
    name: str,
    values: Any,
    dtype: DType | type[DType] | None = None,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None
) -> Series[Any]

Instantiate Narwhals Series from iterable (e.g. list or array).

Parameters:

Name	Type	Description	Default
`name`	`str`	Name of resulting Series.	required
`values`	`Any`	Values to make Series from.	required
`dtype`	`DType \| type[DType] \| None`	(Narwhals) dtype. If not provided, the native library may auto-infer it from `values`.	`None`
`backend`	`ModuleType \| Implementation \| str \| None`	Specifies which eager backend to instantiate to. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN` or `CUDF`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for Series creation. Deprecated (v1.31.0): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`, see perfect backwards compatibility policy.	`None`

Returns:

Type	Description
`Series[Any]`	A new Series

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> values = [4, 1, 2, 3]
>>> nw.new_series(name="a", values=values, dtype=nw.Int32, backend=pd)
┌─────────────────────┐
|   Narwhals Series   |
|---------------------|
|0    4               |
|1    1               |
|2    2               |
|3    3               |
|Name: a, dtype: int32|
└─────────────────────┘

nth

nth(*indices: int | Sequence[int]) -> Expr

Creates an expression that references one or more columns by their index(es).

Notes

nth is not supported for Polars versions below 1.0.0. Please use narwhals.col instead.

Parameters:

Name	Type	Description	Default
`indices`	`int \| Sequence[int]`	One or more indices representing the columns to retrieve.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> df_native = pa.table({"a": [1, 2], "b": [3, 4], "c": [0.123, 3.14]})
>>> nw.from_native(df_native).select(nw.nth(0, 2) * 2)
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|pyarrow.Table     |
|a: int64          |
|c: double         |
|----              |
|a: [[2,4]]        |
|c: [[0.246,6.28]] |
└──────────────────┘

read_csv

read_csv(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> DataFrame[Any]

Read a CSV file into a DataFrame.

Parameters:

Name	Type	Description	Default
`source`	`str`	Path to a file.	required
`backend`	`ModuleType \| Implementation \| str \| None`	The eager backend for DataFrame creation. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN` or `CUDF`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for DataFrame creation. Deprecated (v1.27.2): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`, see perfect backwards compatibility policy.	`None`
`kwargs`	`Any`	Extra keyword arguments which are passed to the native CSV reader. For example, you could use `nw.read_csv('file.csv', backend='pandas', engine='pyarrow')`.	`{}`

Returns:

Type	Description
`DataFrame[Any]`	DataFrame.

Examples:

>>> import narwhals as nw
>>> nw.read_csv("file.csv", backend="pandas")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a   b     |
|     0  1   4     |
|     1  2   5     |
└──────────────────┘

read_parquet

read_parquet(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> DataFrame[Any]

Read into a DataFrame from a parquet file.

Parameters:

Name	Type	Description	Default
`source`	`str`	Path to a file.	required
`backend`	`ModuleType \| Implementation \| str \| None`	The eager backend for DataFrame creation. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN` or `CUDF`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for DataFrame creation. Deprecated (v1.31.0): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`, see perfect backwards compatibility policy.	`None`
`kwargs`	`Any`	Extra keyword arguments which are passed to the native parquet reader. For example, you could use `nw.read_parquet('file.parquet', backend=pd, engine='pyarrow')`.	`{}`

Returns:

Type	Description
`DataFrame[Any]`	DataFrame.

Examples:

>>> import pyarrow as pa
>>> import narwhals as nw
>>>
>>> nw.read_parquet("file.parquet", backend="pyarrow")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|pyarrow.Table     |
|a: int64          |
|c: double         |
|----              |
|a: [[1,2]]        |
|c: [[0.2,0.1]]    |
└──────────────────┘

scan_csv

scan_csv(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> LazyFrame[Any]

Lazily read from a CSV file.

For the libraries that do not support lazy dataframes, the function reads a csv file eagerly and then converts the resulting dataframe to a lazyframe.

Parameters:

Name	Type	Description	Default
`source`	`str`	Path to a file.	required
`backend`	`ModuleType \| Implementation \| str \| None`	The eager backend for DataFrame creation. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN` or `CUDF`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"` or `"cudf"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin` or `cudf`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for DataFrame creation. Deprecated (v1.31.0): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`, see perfect backwards compatibility policy.	`None`
`kwargs`	`Any`	Extra keyword arguments which are passed to the native CSV reader. For example, you could use `nw.scan_csv('file.csv', backend=pd, engine='pyarrow')`.	`{}`

Returns:

Type	Description
`LazyFrame[Any]`	LazyFrame.

Examples:

>>> import duckdb
>>> import narwhals as nw
>>>
>>> nw.scan_csv("file.csv", backend="duckdb").to_native()
┌─────────┬───────┐
│    a    │   b   │
│ varchar │ int32 │
├─────────┼───────┤
│ x       │     1 │
│ y       │     2 │
│ z       │     3 │
└─────────┴───────┘

scan_parquet

scan_parquet(
    source: str,
    *,
    backend: (
        ModuleType | Implementation | str | None
    ) = None,
    native_namespace: ModuleType | None = None,
    **kwargs: Any
) -> LazyFrame[Any]

Lazily read from a parquet file.

For the libraries that do not support lazy dataframes, the function reads a parquet file eagerly and then converts the resulting dataframe to a lazyframe.

Note

Spark-like backends require a session object to be passed in kwargs.

For instance:

import narwhals as nw
from sqlframe.duckdb import DuckDBSession

nw.scan_parquet(source, backend="sqlframe", session=DuckDBSession())

Parameters:

Name	Type	Description	Default
`source`	`str`	Path to a file.	required
`backend`	`ModuleType \| Implementation \| str \| None`	The eager backend for DataFrame creation. `backend` can be specified in various ways: as `Implementation.<BACKEND>`, with `BACKEND` being `PANDAS`, `PYARROW`, `POLARS`, `MODIN`, `CUDF`, `PYSPARK` or `SQLFRAME`; as a string: `"pandas"`, `"pyarrow"`, `"polars"`, `"modin"`, `"cudf"`, `"pyspark"` or `"sqlframe"`; or directly as a module: `pandas`, `pyarrow`, `polars`, `modin`, `cudf`, `pyspark.sql` or `sqlframe`.	`None`
`native_namespace`	`ModuleType \| None`	The native library to use for DataFrame creation. Deprecated (v1.31.0): please use `backend` instead. Note that `native_namespace` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`, see perfect backwards compatibility policy.	`None`
`kwargs`	`Any`	Extra keyword arguments which are passed to the native parquet reader. For example, you could use `nw.scan_parquet('file.parquet', backend=pd, engine='pyarrow')`.	`{}`

Returns:

Type	Description
`LazyFrame[Any]`	LazyFrame.

Examples:

>>> import dask.dataframe as dd
>>> from sqlframe.duckdb import DuckDBSession
>>> import narwhals as nw
>>>
>>> nw.scan_parquet("file.parquet", backend="dask").collect()
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|        a   b     |
|     0  1   4     |
|     1  2   5     |
└──────────────────┘
>>> nw.scan_parquet(
...     "file.parquet", backend="sqlframe", session=DuckDBSession()
... ).collect()
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|  pyarrow.Table   |
|  a: int64        |
|  b: int64        |
|  ----            |
|  a: [[1,2]]      |
|  b: [[4,5]]      |
└──────────────────┘

sum

sum(*columns: str) -> Expr

Sum all values.

Note

Syntactic sugar for nw.col(columns).sum()

Parameters:

Name	Type	Description	Default
`columns`	`str`	Name(s) of the columns to use in the aggregation function	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> df_native = pd.DataFrame({"a": [1, 2], "b": [-1.4, 6.2]})
>>> nw.from_native(df_native).select(nw.sum("a", "b"))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|       a    b     |
|    0  3  4.8     |
└──────────────────┘

sum_horizontal

sum_horizontal(
    *exprs: IntoExpr | Iterable[IntoExpr],
) -> Expr

Sum all values horizontally across columns.

Warning

Unlike Polars, we support horizontal sum over numeric columns only.

Parameters:

Name	Type	Description	Default
`exprs`	`IntoExpr \| Iterable[IntoExpr]`	Name(s) of the columns to use in the aggregation function. Accepts expression input.	`()`

Returns:

Type	Description
`Expr`	A new expression.

Examples:

>>> import polars as pl
>>> import narwhals as nw
>>>
>>> df_native = pl.DataFrame({"a": [1, 2, 3], "b": [5, 10, None]})
>>> nw.from_native(df_native).with_columns(sum=nw.sum_horizontal("a", "b"))
┌────────────────────┐
| Narwhals DataFrame |
|--------------------|
|shape: (3, 3)       |
|┌─────┬──────┬─────┐|
|│ a   ┆ b    ┆ sum │|
|│ --- ┆ ---  ┆ --- │|
|│ i64 ┆ i64  ┆ i64 │|
|╞═════╪══════╪═════╡|
|│ 1   ┆ 5    ┆ 6   │|
|│ 2   ┆ 10   ┆ 12  │|
|│ 3   ┆ null ┆ 3   │|
|└─────┴──────┴─────┘|
└────────────────────┘

show_versions

show_versions() -> None

Print useful debugging information.

Examples:

>>> from narwhals import show_versions
>>> show_versions()

to_native

to_native(
    narwhals_object: DataFrame[IntoDataFrameT],
    *,
    pass_through: Literal[False] = ...
) -> IntoDataFrameT

to_native(
    narwhals_object: LazyFrame[IntoFrameT],
    *,
    pass_through: Literal[False] = ...
) -> IntoFrameT

to_native(
    narwhals_object: Series[IntoSeriesT],
    *,
    pass_through: Literal[False] = ...
) -> IntoSeriesT

to_native(
    narwhals_object: Any, *, pass_through: bool
) -> Any

to_native(
    narwhals_object: (
        DataFrame[IntoDataFrameT]
        | LazyFrame[IntoFrameT]
        | Series[IntoSeriesT]
    ),
    *,
    strict: bool | None = None,
    pass_through: bool | None = None
) -> IntoDataFrameT | IntoFrameT | IntoSeriesT | Any

Convert Narwhals object to native one.

Parameters:

Name	Type	Description	Default
`narwhals_object`	`DataFrame[IntoDataFrameT] \| LazyFrame[IntoFrameT] \| Series[IntoSeriesT]`	Narwhals object.	required
`strict`	`bool \| None`	Determine what happens if `narwhals_object` isn't a Narwhals class. `True` (default): raise an error. `False`: pass object through as-is. Deprecated (v1.13.0): please use `pass_through` instead. Note that `strict` is still available (and won't emit a deprecation warning) if you use `narwhals.stable.v1`, see perfect backwards compatibility policy.	`None`
`pass_through`	`bool \| None`	Determine what happens if `narwhals_object` isn't a Narwhals class. `False` (default): raise an error. `True`: pass object through as-is.	`None`

Returns:

Type	Description
`IntoDataFrameT \| IntoFrameT \| IntoSeriesT \| Any`	Object of class that user started with.

to_py_scalar

to_py_scalar(scalar_like: Any) -> Any

If a scalar is not Python native, converts it to Python native.

Parameters:

Name	Type	Description	Default
`scalar_like`	`Any`	Scalar-like value.	required

Returns:

Type	Description
`Any`	Python scalar.

Raises:

Type	Description
`ValueError`	If the object is not convertible to a scalar.

Examples:

>>> import narwhals as nw
>>> import pandas as pd
>>> df = nw.from_native(pd.DataFrame({"a": [1, 2, 3]}))
>>> nw.to_py_scalar(df["a"].item(0))
1
>>> import pyarrow as pa
>>> df = nw.from_native(pa.table({"a": [1, 2, 3]}))
>>> nw.to_py_scalar(df["a"].item(0))
1
>>> nw.to_py_scalar(1)
1

when

when(*predicates: IntoExpr | Iterable[IntoExpr]) -> When

Start a when-then-otherwise expression.

Expression similar to an if-else statement in Python. Always initiated by nw.when(<condition>).then(<value if condition>); a .otherwise(<value if condition is false>) can optionally be appended at the end. If it is not appended and the condition is not True, None is returned.

Info

Chaining multiple .when(<condition>).then(<value>) statements is currently not supported. See Narwhals#668.

Parameters:

Name	Type	Description	Default
`predicates`	`IntoExpr \| Iterable[IntoExpr]`	Condition(s) that must be met in order to apply the subsequent statement. Accepts one or more boolean expressions, which are implicitly combined with `&`. String input is parsed as a column name.	`()`

Returns:

Type	Description
`When`	A "when" object, which `.then` can be called on.

Examples:

>>> import pandas as pd
>>> import narwhals as nw
>>>
>>> data = {"a": [1, 2, 3], "b": [5, 10, 15]}
>>> df_native = pd.DataFrame(data)
>>> nw.from_native(df_native).with_columns(
...     nw.when(nw.col("a") < 3).then(5).otherwise(6).alias("a_when")
... )
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
|    a   b  a_when |
| 0  1   5       5 |
| 1  2  10       5 |
| 2  3  15       6 |
└──────────────────┘