Commit c7f2848

rename variables
1 parent 3fe8ebe commit c7f2848

1 file changed: +54 -54 lines changed

book/dataframes.md

Lines changed: 54 additions & 54 deletions
@@ -63,7 +63,7 @@ The dataset has 5 columns and 5,429,252 rows. We can check that by using the
 `polars store-ls` command:
 
 ```nu
-> let df = polars open Data7602DescendingYearOrder.csv
+> let df_0 = polars open Data7602DescendingYearOrder.csv
 > polars store-ls | select key type columns rows estimated_size
 ╭──────────────────────────────────────┬───────────┬─────────┬─────────┬────────────────╮
 │ key │ type │ columns │ rows │ estimated_size │
@@ -75,7 +75,7 @@ The dataset has 5 columns and 5,429,252 rows. We can check that by using the
 We can have a look at the first lines of the file using [`first`](/commands/docs/first.md):
 
 ```nu
-> $df | polars first
+> $df_0 | polars first
 ╭───┬──────────┬─────────┬──────┬───────────┬──────────╮
 │ # │ anzsic06 │ Area │ year │ geo_count │ ec_count │
 ├───┼──────────┼─────────┼──────┼───────────┼──────────┤
@@ -86,7 +86,7 @@ We can have a look at the first lines of the file using [`first`](/commands/docs
 ...and finally, we can get an idea of the inferred data types:
 
 ```nu
-> $df | polars schema
+> $df_0 | polars schema
 ╭───────────┬─────╮
 │ anzsic06 │ str │
 │ Area │ str │
@@ -218,10 +218,10 @@ Now, to read that file as a dataframe use the `polars open` command like
 this:
 
 ```nu
-> let df = polars open test_small.csv
+> let df_1 = polars open test_small.csv
 ```
 
-This should create the value `$df` in memory which holds the data we just
+This should create the value `$df_1` in memory which holds the data we just
 created.
 
 ::: tip
@@ -246,7 +246,7 @@ And if you want to see a preview of the loaded dataframe you can send the
 dataframe variable to the stream
 
 ```nu
-> $df
+> $df_1
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -277,7 +277,7 @@ Let's start with basic aggregations on the dataframe. Let's sum all the columns
 that exist in `df` by using the `aggregate` command
 
 ```nu
-> $df | polars sum
+> $df_1 | polars sum
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬──────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼──────┤
@@ -290,7 +290,7 @@ a sum makes sense. If you want to filter out the text column, you can select
 the columns you want by using the [`polars select`](/commands/docs/polars_select.md) command
 
 ```nu
-> $df | polars sum | polars select int_1 int_2 float_1 float_2
+> $df_1 | polars sum | polars select int_1 int_2 float_1 float_2
 ╭───┬───────┬───────┬─────────┬─────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │
 ├───┼───────┼───────┼─────────┼─────────┤
@@ -302,7 +302,7 @@ You can even store the result from this aggregation as you would store any
 other Nushell variable
 
 ```nu
-> let res = $df | polars sum | polars select int_1 int_2 float_1 float_2
+> let res = $df_1 | polars sum | polars select int_1 int_2 float_1 float_2
 ```
 
 ::: tip
@@ -347,15 +347,15 @@ are going to call it `test_small_a.csv`)
 We use the `polars open` command to create the new variable
 
 ```nu
-> let df_a = polars open test_small_a.csv
+> let df_2 = polars open test_small_a.csv
 ```
 
 Now, with the second dataframe loaded in memory we can join them using the
 column called `int_1` from the left dataframe and the column `int_1` from the
 right dataframe
 
 ```nu
-> $df | polars join $df_a int_1 int_1
+> $df_1 | polars join $df_2 int_1 int_1
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────┬─────────┬───────────┬───────────┬─────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │ int_2_x │ float_1_x │ float_2_x │ first_x │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┼─────────┼───────────┼───────────┼─────────┤
@@ -376,7 +376,7 @@ as long as they have the same type.
 For example:
 
 ```nu
-> $df | polars join $df_a [int_1 first] [int_1 first]
+> $df_1 | polars join $df_2 [int_1 first] [int_1 first]
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────┬─────────┬───────────┬───────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │ int_2_x │ float_1_x │ float_2_x │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┼─────────┼───────────┼───────────┤
@@ -402,7 +402,7 @@ operations with the same group condition.
 To create a `GroupBy` object you only need to use the [`polars group-by`](/commands/docs/polars_group-by.md) command
 
 ```nu
-> let group = $df | polars group-by first
+> let group = $df_1 | polars group-by first
 > $group
 ╭─────────────┬──────────────────────────────────────────────╮
 │ LazyGroupBy │ apply aggregation to complete execution plan │
@@ -463,8 +463,8 @@ as integers, decimals, or strings. Let's create a small dataframe using the
 command `polars into-df`.
 
 ```nu
-> let a = [[a b]; [1 2] [3 4] [5 6]] | polars into-df
-> $a
+> let df_3 = [[a b]; [1 2] [3 4] [5 6]] | polars into-df
+> $df_3
 ╭───┬───┬───╮
 │ # │ a │ b │
 ├───┼───┼───┤
@@ -480,11 +480,11 @@ a dataframe. This will change in the future, as the dataframe feature matures
 :::
 
 We can append columns to a dataframe in order to create a new variable. As an
-example, let's append two columns to our mini dataframe `$a`
+example, let's append two columns to our mini dataframe `$df_3`
 
 ```nu
-> let a2 = $a | polars with-column $a.a --name a2 | polars with-column $a.a --name a3
-> $a2
+> let df_4 = $df_3 | polars with-column $df_3.a --name a2 | polars with-column $df_3.a --name a3
+> $df_4
 ╭───┬───┬───┬────┬────╮
 │ # │ a │ b │ a2 │ a3 │
 ├───┼───┼───┼────┼────┤
@@ -520,7 +520,7 @@ the data as packed as possible (check [Arrow columnar
 format](https://arrow.apache.org/docs/format/Columnar.html)). The other
 optimization trick is the fact that whenever possible, the columns from the
 dataframes are shared between dataframes, avoiding memory duplication for the
-same data. This means that dataframes `$a` and `$a2` are sharing the same two
+same data. This means that dataframes `$df_3` and `$df_4` are sharing the same two
 columns we created using the `polars into-df` command. For this reason, it isn't
 possible to change the value of a column in a dataframe. However, you can
 create new columns based on data from other columns or dataframes.
@@ -535,8 +535,8 @@ Let's start our exploration with Series by creating one using the `polars into-d
 command:
 
 ```nu
-> let new = [9 8 4] | polars into-df
-> $new
+> let df_5 = [9 8 4] | polars into-df
+> $df_5
 ╭───┬───╮
 │ # │ 0 │
 ├───┼───┤
@@ -554,8 +554,8 @@ other Series. Let's create a new Series by doing some arithmetic on the
 previously created column.
 
 ```nu
-> let new_2 = $new * 3 + 10
-> $new_2
+> let df_6 = $df_5 * 3 + 10
+> $df_6
 ╭───┬────╮
 │ # │ 0 │
 ├───┼────┤
@@ -576,8 +576,8 @@ use `scope variables`
 Let's rename our previous Series so it has a memorable name
 
 ```nu
-> let new_2a = $new_2 | polars rename "0" memorable
-> $new_2a
+> let df_7 = $df_6 | polars rename "0" memorable
+> $df_7
 ╭───┬───────────╮
 │ # │ memorable │
 ├───┼───────────┤
@@ -591,7 +591,7 @@ We can also do basic operations with two Series as long as they have the same
 data type
 
 ```nu
-> $new - $new_2a
+> $df_5 - $df_7
 ╭───┬─────────────────╮
 │ # │ sub_0_memorable │
 ├───┼─────────────────┤
@@ -604,8 +604,8 @@ data type
 And we can add them to previously defined dataframes
 
 ```nu
-> let new_df = $a | polars with-column $new --name new_col
-> $new_df
+> let df_8 = $df_3 | polars with-column $df_5 --name new_col
+> $df_8
 ╭───┬───┬───┬─────────╮
 │ # │ a │ b │ new_col │
 ├───┼───┼───┼─────────┤
@@ -619,7 +619,7 @@ The Series stored in a Dataframe can also be used directly, for example,
 we can multiply columns `a` and `b` to create a new Series
 
 ```nu
-> $new_df.a * $new_df.b
+> $df_8.a * $df_8.b
 ╭───┬─────────╮
 │ # │ mul_a_b │
 ├───┼─────────┤
@@ -632,8 +632,8 @@ we can multiply columns `a` and `b` to create a new Series
 and we can start piping things in order to create new columns and dataframes
 
 ```nu
-> let $new_df_a = $new_df | polars with-column ($new_df.a * $new_df.b / $new_df.new_col) --name my_sum
-> $new_df_a
+> let df_9 = $df_8 | polars with-column ($df_8.a * $df_8.b / $df_8.new_col) --name my_sum
+> $df_9
 ╭───┬───┬───┬─────────┬────────╮
 │ # │ a │ b │ new_col │ my_sum │
 ├───┼───┼───┼─────────┼────────┤
@@ -652,8 +652,8 @@ that we can build boolean masks out of them. Let's start by creating a simple
 mask using the equality operator
 
 ```nu
-> let mask = $new == 8
-> $mask
+> let mask_0 = $df_5 == 8
+> $mask_0
 ╭───┬───────╮
 │ # │ 0 │
 ├───┼───────┤
@@ -666,7 +666,7 @@ mask using the equality operator
 and with this mask we can now filter a dataframe, like this
 
 ```nu
-> $new_df_a | polars filter-with $mask
+> $df_9 | polars filter-with $mask_0
 ╭───┬───┬───┬─────────┬────────╮
 │ # │ a │ b │ new_col │ my_sum │
 ├───┼───┼───┼─────────┼────────┤
@@ -679,8 +679,8 @@ Now we have a new dataframe with only the values where the mask was true.
 The masks can also be created from Nushell lists, for example:
 
 ```nu
-> let mask1 = [true true false] | polars into-df
-> $new_df_a | polars filter-with $mask1
+> let mask_1 = [true true false] | polars into-df
+> $df_9 | polars filter-with $mask_1
 ╭───┬───┬───┬─────────┬────────╮
 │ # │ a │ b │ new_col │ my_sum │
 ├───┼───┼───┼─────────┼────────┤
@@ -692,7 +692,7 @@ The masks can also be created from Nushell lists, for example:
 To create complex masks, we have the `AND`
 
 ```nu
-> $mask and $mask1
+> $mask_0 and $mask_1
 ╭───┬─────────╮
 │ # │ and_0_0 │
 ├───┼─────────┤
@@ -705,7 +705,7 @@ To create complex masks, we have the `AND`
 and `OR` operations
 
 ```nu
-> $mask or $mask1
+> $mask_0 or $mask_1
 ╭───┬────────╮
 │ # │ or_0_0 │
 ├───┼────────┤
@@ -719,8 +719,8 @@ We can also create a mask by checking if some values exist in other Series.
 Using the first dataframe that we created we can do something like this
 
 ```nu
-> let mask3 = $df | polars col first | polars is-in [b c]
-> $mask3
+> let mask_2 = $df_1 | polars col first | polars is-in [b c]
+> $mask_2
 ╭──────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────╮
 │ input │ [table 2 rows] │
 │ function │ Boolean(IsIn) │
@@ -733,7 +733,7 @@ Using the first dataframe that we created we can do something like this
 and this new mask can be used to filter the dataframe
 
 ```nu
-> $df | polars filter-with $mask3
+> $df_1 | polars filter-with $mask_2
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -756,7 +756,7 @@ This example is not updated to recent Nushell versions.
 :::
 
 ```nu
-> $df | polars get first | polars set new --mask ($df.first =~ a)
+> $df_1 | polars get first | polars set new --mask ($df_1.first =~ a)
 ╭───┬────────╮
 │ # │ string │
 ├───┼────────┤
@@ -781,8 +781,8 @@ from our original dataframe. With that in mind, we can use the next command to
 extract that information
 
 ```nu
-> let indices = [1 4 6] | polars into-df
-> $df | polars take $indices
+> let indices_0 = [1 4 6] | polars into-df
+> $df_1 | polars take $indices_0
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -798,8 +798,8 @@ column `first`. In order to do that, we can use the command `polars arg-unique`
 shown in the next example
 
 ```nu
-> let indices = $df | polars get first | polars arg-unique
-> $df | polars take $indices
+> let indices_1 = $df_1 | polars get first | polars arg-unique
+> $df_1 | polars take $indices_1
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -818,8 +818,8 @@ The same result could be accomplished using the command [`sort`](/commands/docs/
 :::
 
 ```nu
-> let indices_1 = $df | polars get word | polars arg-sort
-> $df | polars take $indices_1
+> let indices_2 = $df_1 | polars get word | polars arg-sort
+> $df_1 | polars take $indices_2
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
 │ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
 ├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -840,8 +840,8 @@ And finally, we can create new Series by setting a new value in the marked
 indices. Have a look at the next command
 
 ```nu
-> let indices_2 = [0 2] | polars into-df
-> $df | polars get int_1 | polars set-with-idx 123 --indices $indices_2
+> let indices_3 = [0 2] | polars into-df
+> $df_1 | polars get int_1 | polars set-with-idx 123 --indices $indices_3
 ╭───┬───────╮
 │ # │ int_1 │
 ├───┼───────┤
@@ -870,7 +870,7 @@ example, we can use it to count how many occurrences we have in the column
 `first`
 
 ```nu
-> $df | polars get first | polars value-counts
+> $df_1 | polars get first | polars value-counts
 ╭───┬───────┬───────╮
 │ # │ first │ count │
 ├───┼───────┼───────┤
@@ -887,7 +887,7 @@ Continuing with our exploration of `Series`, the next thing that we can do is
 to only get the unique values from a series, like this
 
 ```nu
-> $df | polars get first | polars unique
+> $df_1 | polars get first | polars unique
 ╭───┬───────╮
 │ # │ first │
 ├───┼───────┤
@@ -902,7 +902,7 @@ unique or duplicated. For example, we can select the rows for unique values
 in column `word`
 
 ```nu
-$df | polars filter-with ($in.word | polars is-unique)
+$df_1 | polars filter-with ($in.word | polars is-unique)
 ```
 ```output-numd
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬───────╮
@@ -916,7 +916,7 @@ $df | polars filter-with ($in.word | polars is-unique)
 Or all the duplicated ones
 
 ```nu
-$df | polars filter-with ($in.word | polars is-duplicated)
+$df_1 | polars filter-with ($in.word | polars is-duplicated)
 ```
 ```output-numd
 ╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
