docs: use IntoDataFrameT/IntoFrameT in docs (#1664)

FBruzzesi · MarcoGorelli · web-flow · commit 0a67642a5076 · 2024-12-29T22:04:13.000Z
* docs: use intoDataFrame in docs

* fixup

---------

Co-authored-by: Marco Gorelli &lt;33491632+MarcoGorelli@users.noreply.github.com&gt;
diff --git a/docs/pandas_like_concepts/boolean.md b/docs/pandas_like_concepts/boolean.md
@@ -8,12 +8,12 @@ For example, if you do `nw.col('a')*2`, then:
 
 ```python exec="1" source="above" session="boolean"
 import narwhals as nw
-from narwhals.typing import FrameT
+from narwhals.typing import IntoFrameT
 
 data = {"a": [1.4, None, 4.2]}
 
 
-def multiplication(df: FrameT) -> FrameT:
+def multiplication(df: IntoFrameT) -> IntoFrameT:
     return nw.from_native(df).with_columns((nw.col("a") * 2).alias("a*2")).to_native()
 ```
 
@@ -57,6 +57,9 @@ be a temporary legacy pandas issue which will eventually go
 away anyway.
 
 ```python exec="1" source="above" session="boolean"
+from narwhals.typing import FrameT
+
+
 def comparison(df: FrameT) -> FrameT:
     return nw.from_native(df).with_columns((nw.col("a") > 2).alias("a>2")).to_native()
 ```
diff --git a/docs/pandas_like_concepts/column_names.md b/docs/pandas_like_concepts/column_names.md
@@ -2,12 +2,11 @@
 
 Polars and PyArrow only allow for string column names. What about pandas?
 
-```python
->>> import pandas as pd
->>> pd.concat([pd.Series([1, 2], name=0), pd.Series([1, 3], name=0)], axis=1)
-   0  0
-0  1  1
-1  2  3
+```python exec="true" source="above" result="python" session="col_names"
+import pandas as pd
+
+df = pd.concat([pd.Series([1, 2], name=0), pd.Series([1, 3], name=0)], axis=1)
+print(df)
 ```
 
 Oh...not only does it let us create a dataframe with a column named `0` - it lets us
diff --git a/docs/pandas_like_concepts/pandas_index.md b/docs/pandas_like_concepts/pandas_index.md
@@ -20,9 +20,10 @@ Let's learn about what Narwhals promises.
 
 ```python exec="1" source="above" session="ex1"
 import narwhals as nw
+from narwhals.typing import IntoFrameT
 
 
-def my_func(df):
+def my_func(df: IntoFrameT) -> IntoFrameT:
     df = nw.from_native(df)
     df = df.with_columns(a_plus_one=nw.col("a") + 1)
     return nw.to_native(df)
@@ -51,13 +52,16 @@ df_pd = pd.DataFrame({"a": [2, 1, 3], "b": [4, 5, 6]})
 s_pd = df_pd["a"].sort_values()
 df_pd["a_sorted"] = s_pd
 ```
+
 Reading the code, you might expect that `'a_sorted'` will contain the
 values `[1, 2, 3]`.
 
 **However**, here's what actually happens:
+
 ```python exec="1" source="material-block" session="ex2" result="python"
 print(df_pd)
 ```
+
 In other words, pandas' index alignment undid the `sort_values` operation!
 
 Narwhals, on the other hand, preserves the index of the left-hand-side argument.
diff --git a/docs/pandas_like_concepts/user_warning.md b/docs/pandas_like_concepts/user_warning.md
@@ -14,14 +14,15 @@ The pandas API most likely cannot efficiently handle the complexity of the aggre
     ```python exec="true" source="above" result="python" session="df_ex1"
     import narwhals as nw
     import pandas as pd
+    from narwhals.typing import IntoFrameT
 
     data = {"a": [1, 2, 3, 4, 5], "b": [5, 4, 3, 2, 1], "c": [10, 20, 30, 40, 50]}
 
     df_pd = pd.DataFrame(data)
 
 
     @nw.narwhalify
-    def approach_1(df):
+    def approach_1(df: IntoFrameT) -> IntoFrameT:
 
         # Pay attention to this next line
         df = df.group_by("a").agg(d=(nw.col("b") + nw.col("c")).sum())
@@ -43,7 +44,7 @@ The pandas API most likely cannot efficiently handle the complexity of the aggre
 
 
     @nw.narwhalify
-    def approach_2(df):
+    def approach_2(df: IntoFrameT) -> IntoFrameT:
 
         # Pay attention to this next line
         df = df.with_columns(d=nw.col("b") + nw.col("c")).group_by("a").agg(nw.sum("d"))
@@ -54,40 +55,45 @@ The pandas API most likely cannot efficiently handle the complexity of the aggre
     print(approach_2(df_pd))
     ```
 
-
 Both Approaches shown above return the exact same result, but Approach 1 is inefficient and returns the warning message
 we showed at the top.
 
 What makes the first approach inefficient and the second approach efficient? It comes down to what the
 pandas API lets us express.
 
 ## Approach 1
+
 ```python
 # From line 11
 
 return df.group_by("a").agg((nw.col("b") + nw.col("c")).sum().alias("d"))
 ```
 
 To translate this to pandas, we would do:
+
 ```python
 df.groupby("a").apply(
     lambda df: pd.Series([(df["b"] + df["c"]).sum()], index=["d"]), include_groups=False
 )
 ```
+
 Any time you use `apply` in pandas, that's a performance footgun - best to avoid it and use vectorised operations instead.
 Let's take a look at how "approach 2" gets translated to pandas to see the difference.
 
 ## Approach 2
+
 ```python
 # Line 11 in Approach 2
 
 return df.with_columns(d=nw.col("b") + nw.col("c")).group_by("a").agg({"d": "sum"})
 ```
 
 This gets roughly translated to:
+
 ```python
 df.assign(d=lambda df: df["b"] + df["c"]).groupby("a").agg({"d": "sum"})
 ```
+
 Because we're using pandas' own API, as opposed to `apply` and a custom `lambda` function, then this is going to be much more efficient.
 
 ## Tips for Avoiding the `UserWarning`