Conversion between libraries

Some library maintainers must apply complex dataframe operations, using methods and functions that may not (yet) be implemented in Narwhals. In such cases, Narwhals can still be highly beneficial, by allowing easy dataframe conversion.

Dataframe X in, pandas out

Imagine that you maintain a library with a function that operates on pandas dataframes to produce automated reports. You want to allow users to supply a dataframe in any format to that function (pandas, Polars, DuckDB, cuDF, Modin, etc.) without adding all those dependencies to your own project and without special-casing each input library's variation of to_pandas / toPandas / to_pandas_df / df ...

One solution is to use Narwhals as a thin dataframe ingestion layer that converts a user-supplied dataframe to the format your library uses internally. Since Narwhals is zero-dependency, this is much more lightweight than including all the dataframe libraries as dependencies, and easier to write than special-casing each input library's to_pandas method (if it even exists!).
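For comparison, here is a rough sketch of what the special-casing approach might look like without Narwhals. The converter names come from the list above; `to_pandas_by_hand` and the two stand-in classes are hypothetical and only mimic the method names, so nothing here touches a real dataframe library.

```python
from typing import Any

# Converter names from the list above; each library spells its method differently.
CONVERTER_NAMES = ("to_pandas", "toPandas", "to_pandas_df", "df")


def to_pandas_by_hand(df: Any) -> Any:
    """Hypothetical hand-rolled dispatcher: try each known converter name in turn."""
    for name in CONVERTER_NAMES:
        method = getattr(df, name, None)
        if callable(method):
            return method()
    msg = f"Don't know how to convert {type(df).__name__} to pandas"
    raise TypeError(msg)


class FakePolarsFrame:
    """Stand-in exposing `to_pandas` (not a real dataframe)."""

    def to_pandas(self) -> str:
        return "pandas frame via to_pandas"


class FakeDuckDBRelation:
    """Stand-in exposing `df` (not a real relation)."""

    def df(self) -> str:
        return "pandas frame via df"


print(to_pandas_by_hand(FakePolarsFrame()))
print(to_pandas_by_hand(FakeDuckDBRelation()))
```

Each additional input library means another name to track and test, which is the maintenance burden a single ingestion layer is meant to remove.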

To illustrate, we create dataframes in various formats:

import narwhals as nw
from narwhals.typing import IntoDataFrame
from typing import Any

import duckdb
import polars as pl
import pandas as pd

df_polars = pl.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "fruits": ["banana", "banana", "apple", "apple", "banana"],
        "B": [5, 4, 3, 2, 1],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
    }
)
df_pandas = df_polars.to_pandas()
df_duckdb = duckdb.sql("SELECT * FROM df_polars")

Now, we define a function that can ingest any dataframe type supported by Narwhals, and convert it to a pandas DataFrame for internal use:

def df_to_pandas(df: IntoDataFrame) -> pd.DataFrame:
    return nw.from_native(df).to_pandas()


print(df_to_pandas(df_polars))
   A  fruits  B    cars
0  1  banana  5  beetle
1  2  banana  4    audi
2  3   apple  3  beetle
3  4   apple  2  beetle
4  5  banana  1  beetle

Dataframe X in, Polars out

Via PyCapsule Interface

Similarly, if your library uses Polars internally, you can convert any user-supplied dataframe which implements __arrow_c_stream__:

def df_to_polars(df_native: Any) -> pl.DataFrame:
    if hasattr(df_native, "__arrow_c_stream__"):
        return nw.from_arrow(df_native, native_namespace=pl).to_native()
    msg = (
        f"Expected object which implements '__arrow_c_stream__', got: {type(df_native)}"
    )
    raise TypeError(msg)


print(df_to_polars(df_duckdb))  # You can only execute this line of code once.
shape: (5, 4)
┌─────┬────────┬─────┬────────┐
│ A   ┆ fruits ┆ B   ┆ cars   │
│ --- ┆ ---    ┆ --- ┆ ---    │
│ i64 ┆ str    ┆ i64 ┆ str    │
╞═════╪════════╪═════╪════════╡
│ 1   ┆ banana ┆ 5   ┆ beetle │
│ 2   ┆ banana ┆ 4   ┆ audi   │
│ 3   ┆ apple  ┆ 3   ┆ beetle │
│ 4   ┆ apple  ┆ 2   ┆ beetle │
│ 5   ┆ banana ┆ 1   ┆ beetle │
└─────┴────────┴─────┴────────┘

Passing Polars as native_namespace works here because Polars supports importing data through the PyCapsule Interface.
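The `hasattr` guard used above can also be written as a runtime-checkable protocol, which gives type checkers something to work with. This is only a sketch: `SupportsArrowCStream` and `supports_arrow_c_stream` are hypothetical names rather than part of Narwhals, and the dummy class merely stands in for a real Arrow-compatible object.

```python
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsArrowCStream(Protocol):
    """Hypothetical protocol for objects that export an Arrow C stream."""

    def __arrow_c_stream__(self, requested_schema: Optional[object] = None) -> object: ...


def supports_arrow_c_stream(obj: Any) -> bool:
    # isinstance on a runtime-checkable protocol only checks that the method
    # is present, much like the hasattr guard; it never calls the method.
    return isinstance(obj, SupportsArrowCStream)


class DummyStreamExporter:
    """Stand-in with the right method name; it does not produce a real capsule."""

    def __arrow_c_stream__(self, requested_schema: Optional[object] = None) -> object:
        raise NotImplementedError


print(supports_arrow_c_stream(DummyStreamExporter()))
print(supports_arrow_c_stream(object()))
```

Either form only proves that the method exists; whether the export actually succeeds is still up to the producing library.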

Note that the PyCapsule Interface makes no guarantee that an object can be exported more than once, so the approach above is only suitable if you expect to convert each input object a single time.
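As a loose analogy in plain Python (no Arrow involved), a single-use stream behaves much like a generator: the first consumer drains it and a second pass finds nothing left. The names below are illustrative only, and what a real producer does on a second `__arrow_c_stream__` call depends on that producer.

```python
from typing import Iterator, List


def make_stream() -> Iterator[int]:
    """Hypothetical one-shot data source."""
    yield from [1, 2, 3]


def consume(stream: Iterator[int]) -> List[int]:
    """Read everything that is still available from the stream."""
    return list(stream)


stream = make_stream()
first_pass = consume(stream)   # drains the stream
second_pass = consume(stream)  # nothing left to read

print(first_pass)
print(second_pass)
```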

Via PyArrow

If you need to ingest the same dataframe multiple times, then you may want to go via PyArrow instead. This may be less efficient than the PyCapsule approach above (and always requires PyArrow!), but is more forgiving:

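def df_to_polars(df_native: IntoDataFrame) -> pl.DataFrame:
    # Collect lazy inputs (such as a DuckDB relation) into an eager frame first.
    df = nw.from_native(df_native).lazy().collect()
    # Export the eager frame to a PyArrow table and build the Polars frame from it.
    return pl.DataFrame(nw.from_native(df, eager_only=True).to_arrow())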


df_duckdb = duckdb.sql("SELECT * FROM df_polars")
print(df_to_polars(df_duckdb))  # We can execute this...
print(df_to_polars(df_duckdb))  # ...as many times as we like!
shape: (5, 4)
┌─────┬────────┬─────┬────────┐
│ A   ┆ fruits ┆ B   ┆ cars   │
│ --- ┆ ---    ┆ --- ┆ ---    │
│ i64 ┆ str    ┆ i64 ┆ str    │
╞═════╪════════╪═════╪════════╡
│ 1   ┆ banana ┆ 5   ┆ beetle │
│ 2   ┆ banana ┆ 4   ┆ audi   │
│ 3   ┆ apple  ┆ 3   ┆ beetle │
│ 4   ┆ apple  ┆ 2   ┆ beetle │
│ 5   ┆ banana ┆ 1   ┆ beetle │
└─────┴────────┴─────┴────────┘
shape: (5, 4)
┌─────┬────────┬─────┬────────┐
│ A   ┆ fruits ┆ B   ┆ cars   │
│ --- ┆ ---    ┆ --- ┆ ---    │
│ i64 ┆ str    ┆ i64 ┆ str    │
╞═════╪════════╪═════╪════════╡
│ 1   ┆ banana ┆ 5   ┆ beetle │
│ 2   ┆ banana ┆ 4   ┆ audi   │
│ 3   ┆ apple  ┆ 3   ┆ beetle │
│ 4   ┆ apple  ┆ 2   ┆ beetle │
│ 5   ┆ banana ┆ 1   ┆ beetle │
└─────┴────────┴─────┴────────┘