What about the pandas Index?

There are two types of pandas users:

  • The ones who make full use of the Index's power.
  • The .reset_index(drop=True) ones, who would rather not think about the Index.

Narwhals aims to accommodate both!

  • If you'd rather not think about the Index, then don't worry: it's not part of the Narwhals public API, and you'll never have to worry about resetting the index or about pandas doing funky index alignment for you.
  • If you want your library to cater to Index power users who would be very angry if you reset their beautiful Index on their behalf, then don't worry: Narwhals makes certain promises regarding the Index.

Let's learn about what Narwhals promises.

1. Narwhals will preserve your index for dataframe operations

import narwhals as nw


def my_func(df):
    df = nw.from_native(df)
    df = df.with_columns(a_plus_one=nw.col("a") + 1)
    return nw.to_native(df)

Let's start with a dataframe with an Index with values [7, 8, 9].

import pandas as pd

df = pd.DataFrame({"a": [2, 1, 3], "b": [3, 5, -3]}, index=[7, 8, 9])
print(my_func(df))
   a  b  a_plus_one
7  2  3           3
8  1  5           2
9  3 -3           4

Note how the result still has the original index - Narwhals did not modify it.

2. Index alignment follows the left-hand rule

pandas automatically aligns indices for users. For example:

import pandas as pd

df_pd = pd.DataFrame({"a": [2, 1, 3], "b": [4, 5, 6]})
s_pd = df_pd["a"].sort_values()
df_pd["a_sorted"] = s_pd

Reading the code, you might expect that 'a_sorted' will contain the values [1, 2, 3].

However, here's what actually happens:

print(df_pd)
   a  b  a_sorted
0  2  4         2
1  1  5         1
2  3  6         3

In other words, pandas' index alignment undid the sort_values operation!
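To see why this happens, it can help to look at the labels the sorted Series carries. The following pandas-only sketch uses illustrative names (`df_demo`, `s_sorted`, `aligned`, `positional`) that are not part of the original example. It contrasts label-based assignment with assigning a plain NumPy array, which has no index and is therefore inserted by position.

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [2, 1, 3], "b": [4, 5, 6]})
s_sorted = df_demo["a"].sort_values()

# The sorted Series keeps its original labels, now in a different order.
print(s_sorted.index.tolist())  # [1, 0, 2]

# Assigning the Series aligns on labels, which restores the original order.
df_demo["aligned"] = s_sorted

# Assigning a plain NumPy array bypasses alignment: values go in by position.
df_demo["positional"] = s_sorted.to_numpy()

print(df_demo["aligned"].tolist())     # [2, 1, 3]
print(df_demo["positional"].tolist())  # [1, 2, 3]
```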

Narwhals, on the other hand, preserves the index of the left-hand-side argument. Everything else will be inserted positionally, just like Polars would do:

import narwhals as nw

df = nw.from_native(df_pd)
s = nw.from_native(s_pd, allow_series=True)
df = df.with_columns(a_sorted=s.sort())
print(nw.to_native(df))
   a  b  a_sorted
0  2  4         1
1  1  5         2
2  3  6         3

If you keep these two rules in mind, Narwhals will help you avoid Index-related surprises while still letting you preserve the Index for the subset of your users who consciously make great use of it.