@@ -63,7 +63,7 @@ The dataset has 5 columns and 5,429,252 rows. We can check that by using the
`polars store-ls` command:

```nu
- > let df = polars open Data7602DescendingYearOrder.csv
+ > let df_0 = polars open Data7602DescendingYearOrder.csv
> polars store-ls | select key type columns rows estimated_size
╭──────────────────────────────────────┬───────────┬─────────┬─────────┬────────────────╮
│                 key                  │   type    │ columns │  rows   │ estimated_size │
@@ -75,7 +75,7 @@ The dataset has 5 columns and 5,429,252 rows. We can check that by using the
We can have a look at the first lines of the file using [`first`](/commands/docs/first.md):

```nu
- > $df | polars first
+ > $df_0 | polars first
╭───┬──────────┬─────────┬──────┬───────────┬──────────╮
│ # │ anzsic06 │  Area   │ year │ geo_count │ ec_count │
├───┼──────────┼─────────┼──────┼───────────┼──────────┤
@@ -86,7 +86,7 @@ We can have a look at the first lines of the file using [`first`](/commands/docs
...and finally, we can get an idea of the inferred data types:

```nu
- > $df | polars schema
+ > $df_0 | polars schema
╭───────────┬─────╮
│ anzsic06  │ str │
│ Area      │ str │
@@ -218,10 +218,10 @@ Now, to read that file as a dataframe use the `polars open` command like
this:

```nu
- > let df = polars open test_small.csv
+ > let df_1 = polars open test_small.csv
```

- This should create the value `$df` in memory which holds the data we just
+ This should create the value `$df_1` in memory which holds the data we just
created.

::: tip
@@ -246,7 +246,7 @@ And if you want to see a preview of the loaded dataframe you can send the
dataframe variable to the stream

```nu
- > $df
+ > $df_1
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │  word  │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -277,7 +277,7 @@ Let's start with basic aggregations on the dataframe. Let's sum all the columns
that exist in `df` by using the `aggregate` command

```nu
- > $df | polars sum
+ > $df_1 | polars sum
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬──────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │ word │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼──────┤
@@ -290,7 +290,7 @@ a sum makes sense. If you want to filter out the text column, you can select
the columns you want by using the [`polars select`](/commands/docs/polars_select.md) command

```nu
- > $df | polars sum | polars select int_1 int_2 float_1 float_2
+ > $df_1 | polars sum | polars select int_1 int_2 float_1 float_2
╭───┬───────┬───────┬─────────┬─────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │
├───┼───────┼───────┼─────────┼─────────┤
@@ -302,7 +302,7 @@ You can even store the result from this aggregation as you would store any
other Nushell variable

```nu
- > let res = $df | polars sum | polars select int_1 int_2 float_1 float_2
+ > let res = $df_1 | polars sum | polars select int_1 int_2 float_1 float_2
```

::: tip
@@ -347,15 +347,15 @@ are going to call it `test_small_a.csv`)
We use the `polars open` command to create the new variable

```nu
- > let df_a = polars open test_small_a.csv
+ > let df_2 = polars open test_small_a.csv
```

Now, with the second dataframe loaded in memory we can join them using the
column called `int_1` from the left dataframe and the column `int_1` from the
right dataframe

```nu
- > $df | polars join $df_a int_1 int_1
+ > $df_1 | polars join $df_2 int_1 int_1
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────┬─────────┬───────────┬───────────┬─────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │  word  │ int_2_x │ float_1_x │ float_2_x │ first_x │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┼─────────┼───────────┼───────────┼─────────┤
@@ -376,7 +376,7 @@ as long as they have the same type.
For example:

```nu
- > $df | polars join $df_a [int_1 first] [int_1 first]
+ > $df_1 | polars join $df_2 [int_1 first] [int_1 first]
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────┬─────────┬───────────┬───────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │  word  │ int_2_x │ float_1_x │ float_2_x │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┼─────────┼───────────┼───────────┤
@@ -402,7 +402,7 @@ operations with the same group condition.
To create a `GroupBy` object you only need to use the [`polars_group-by`](/commands/docs/polars_group-by.md) command

```nu
- > let group = $df | polars group-by first
+ > let group = $df_1 | polars group-by first
> $group
╭─────────────┬──────────────────────────────────────────────╮
│ LazyGroupBy │ apply aggregation to complete execution plan │
@@ -463,8 +463,8 @@ as integers, decimals, or strings. Let's create a small dataframe using the
command `polars into-df`.

```nu
- > let a = [[a b]; [1 2] [3 4] [5 6]] | polars into-df
- > $a
+ > let df_3 = [[a b]; [1 2] [3 4] [5 6]] | polars into-df
+ > $df_3
╭───┬───┬───╮
│ # │ a │ b │
├───┼───┼───┤
@@ -480,11 +480,11 @@ a dataframe. This will change in the future, as the dataframe feature matures
:::

We can append columns to a dataframe in order to create a new variable. As an
- example, let's append two columns to our mini dataframe `$a`
+ example, let's append two columns to our mini dataframe `$df_3`

```nu
- > let a2 = $a | polars with-column $a.a --name a2 | polars with-column $a.a --name a3
- > $a2
+ > let df_4 = $df_3 | polars with-column $df_3.a --name a2 | polars with-column $df_3.a --name a3
+ > $df_4
╭───┬───┬───┬────┬────╮
│ # │ a │ b │ a2 │ a3 │
├───┼───┼───┼────┼────┤
@@ -520,7 +520,7 @@ the data as packed as possible (check [Arrow columnar
format](https://arrow.apache.org/docs/format/Columnar.html)). The other
optimization trick is the fact that whenever possible, the columns from the
dataframes are shared between dataframes, avoiding memory duplication for the
- same data. This means that dataframes `$a` and `$a2` are sharing the same two
+ same data. This means that dataframes `$df_3` and `$df_4` are sharing the same two
columns we created using the `polars into-df` command. For this reason, it isn't
possible to change the value of a column in a dataframe. However, you can
create new columns based on data from other columns or dataframes.
@@ -535,8 +535,8 @@ Let's start our exploration with Series by creating one using the `polars into-d
command:

```nu
- > let new = [9 8 4] | polars into-df
- > $new
+ > let df_5 = [9 8 4] | polars into-df
+ > $df_5
╭───┬───╮
│ # │ 0 │
├───┼───┤
@@ -554,8 +554,8 @@ other Series. Let's create a new Series by doing some arithmetic on the
previously created column.

```nu
- > let new_2 = $new * 3 + 10
- > $new_2
+ > let df_6 = $df_5 * 3 + 10
+ > $df_6
╭───┬────╮
│ # │ 0  │
├───┼────┤
@@ -576,8 +576,8 @@ use `scope variables`
Let's rename our previous Series so it has a memorable name

```nu
- > let new_2a = $new_2 | polars rename "0" memorable
- > $new_2a
+ > let df_7 = $df_6 | polars rename "0" memorable
+ > $df_7
╭───┬───────────╮
│ # │ memorable │
├───┼───────────┤
@@ -591,7 +591,7 @@ We can also do basic operations with two Series as long as they have the same
data type

```nu
- > $new - $new_2a
+ > $df_5 - $df_7
╭───┬─────────────────╮
│ # │ sub_0_memorable │
├───┼─────────────────┤
@@ -604,8 +604,8 @@ data type
And we can add them to previously defined dataframes

```nu
- > let new_df = $a | polars with-column $new --name new_col
- > $new_df
+ > let df_8 = $df_3 | polars with-column $df_5 --name new_col
+ > $df_8
╭───┬───┬───┬─────────╮
│ # │ a │ b │ new_col │
├───┼───┼───┼─────────┤
@@ -619,7 +619,7 @@ The Series stored in a Dataframe can also be used directly, for example,
we can multiply columns `a` and `b` to create a new Series

```nu
- > $new_df.a * $new_df.b
+ > $df_8.a * $df_8.b
╭───┬─────────╮
│ # │ mul_a_b │
├───┼─────────┤
@@ -632,8 +632,8 @@ we can multiply columns `a` and `b` to create a new Series
and we can start piping things in order to create new columns and dataframes

```nu
- > let $new_df_a = $new_df | polars with-column ($new_df.a * $new_df.b / $new_df.new_col) --name my_sum
- > $new_df_a
+ > let df_9 = $df_8 | polars with-column ($df_8.a * $df_8.b / $df_8.new_col) --name my_sum
+ > $df_9
╭───┬───┬───┬─────────┬────────╮
│ # │ a │ b │ new_col │ my_sum │
├───┼───┼───┼─────────┼────────┤
@@ -652,8 +652,8 @@ that we can build boolean masks out of them. Let's start by creating a simple
mask using the equality operator

```nu
- > let mask = $new == 8
- > $mask
+ > let mask_0 = $df_5 == 8
+ > $mask_0
╭───┬───────╮
│ # │   0   │
├───┼───────┤
@@ -666,7 +666,7 @@ mask using the equality operator
and with this mask we can now filter a dataframe, like this

```nu
- > $new_df_a | polars filter-with $mask
+ > $df_9 | polars filter-with $mask_0
╭───┬───┬───┬─────────┬────────╮
│ # │ a │ b │ new_col │ my_sum │
├───┼───┼───┼─────────┼────────┤
@@ -679,8 +679,8 @@ Now we have a new dataframe with only the values where the mask was true.
The masks can also be created from Nushell lists, for example:

```nu
- > let mask1 = [true true false] | polars into-df
- > $new_df_a | polars filter-with $mask1
+ > let mask_1 = [true true false] | polars into-df
+ > $df_9 | polars filter-with $mask_1
╭───┬───┬───┬─────────┬────────╮
│ # │ a │ b │ new_col │ my_sum │
├───┼───┼───┼─────────┼────────┤
@@ -692,7 +692,7 @@ The masks can also be created from Nushell lists, for example:
To create complex masks, we have the `AND`

```nu
- > $mask and $mask1
+ > $mask_0 and $mask_1
╭───┬─────────╮
│ # │ and_0_0 │
├───┼─────────┤
@@ -705,7 +705,7 @@ To create complex masks, we have the `AND`
and `OR` operations

```nu
- > $mask or $mask1
+ > $mask_0 or $mask_1
╭───┬────────╮
│ # │ or_0_0 │
├───┼────────┤
@@ -719,8 +719,8 @@ We can also create a mask by checking if some values exist in other Series.
Using the first dataframe that we created we can do something like this

```nu
- > let mask3 = $df | polars col first | polars is-in [b c]
- > $mask3
+ > let mask_2 = $df_1 | polars col first | polars is-in [b c]
+ > $mask_2
╭──────────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ input │ [table 2 rows] │
│ function │ Boolean(IsIn) │
@@ -733,7 +733,7 @@ Using the first dataframe that we created we can do something like this
and this new mask can be used to filter the dataframe

```nu
- > $df | polars filter-with $mask3
+ > $df_1 | polars filter-with $mask_2
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │  word  │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -756,7 +756,7 @@ This is example is not updated to recent Nushell versions.
:::

```nu
- > $df | polars get first | polars set new --mask ($df.first =~ a)
+ > $df_1 | polars get first | polars set new --mask ($df_1.first =~ a)
╭───┬────────╮
│ # │ string │
├───┼────────┤
@@ -781,8 +781,8 @@ from our original dataframe. With that in mind, we can use the next command to
extract that information

```nu
- > let indices = [1 4 6] | polars into-df
- > $df | polars take $indices
+ > let indices_0 = [1 4 6] | polars into-df
+ > $df_1 | polars take $indices_0
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │  word  │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -798,8 +798,8 @@ column `first`. In order to do that, we can use the command `polars arg-unique`
shown in the next example

```nu
- > let indices = $df | polars get first | polars arg-unique
- > $df | polars take $indices
+ > let indices_1 = $df_1 | polars get first | polars arg-unique
+ > $df_1 | polars take $indices_1
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │  word  │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -818,8 +818,8 @@ The same result could be accomplished using the command [`sort`](/commands/docs/
:::

```nu
- > let indices_1 = $df | polars get word | polars arg-sort
- > $df | polars take $indices_1
+ > let indices_2 = $df_1 | polars get word | polars arg-sort
+ > $df_1 | polars take $indices_2
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮
│ # │ int_1 │ int_2 │ float_1 │ float_2 │ first │ second │ third │  word  │
├───┼───────┼───────┼─────────┼─────────┼───────┼────────┼───────┼────────┤
@@ -840,8 +840,8 @@ And finally, we can create new Series by setting a new value in the marked
indices. Have a look at the next command

```nu
- > let indices_2 = [0 2] | polars into-df
- > $df | polars get int_1 | polars set-with-idx 123 --indices $indices_2
+ > let indices_3 = [0 2] | polars into-df
+ > $df_1 | polars get int_1 | polars set-with-idx 123 --indices $indices_3
╭───┬───────╮
│ # │ int_1 │
├───┼───────┤
@@ -870,7 +870,7 @@ example, we can use it to count how many occurrences we have in the column
`first`

```nu
- > $df | polars get first | polars value-counts
+ > $df_1 | polars get first | polars value-counts
╭───┬───────┬───────╮
│ # │ first │ count │
├───┼───────┼───────┤
@@ -887,7 +887,7 @@ Continuing with our exploration of `Series`, the next thing that we can do is
to only get the unique unique values from a series, like this

```nu
- > $df | polars get first | polars unique
+ > $df_1 | polars get first | polars unique
╭───┬───────╮
│ # │ first │
├───┼───────┤
@@ -902,7 +902,7 @@ unique or duplicated. For example, we can select the rows for unique values
in column `word`

```nu
- $df | polars filter-with ($in.word | polars is-unique)
+ $df_1 | polars filter-with ($in.word | polars is-unique)
```
```output-numd
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬───────╮
@@ -916,7 +916,7 @@ $df | polars filter-with ($in.word | polars is-unique)
Or all the duplicated ones

```nu
- $df | polars filter-with ($in.word | polars is-duplicated)
+ $df_1 | polars filter-with ($in.word | polars is-duplicated)
```
```output-numd
╭───┬───────┬───────┬─────────┬─────────┬───────┬────────┬───────┬────────╮