A small comment about new variables in data frames #42
Comments
I've always wondered why does This should / could have been enough, IMO:
within(mtcars, hwratio <- hp/wt)
EDIT: I've actually put together a quick function in the development version of package admisc, and it seems to work:
mt <- mtcars
inside(mt, hwratio <- hp/wt)
dim(mtcars) # 32 11
dim(mt) # 32 12 |
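For readers wondering how a function like that could work at all, here is a minimal sketch. It is not the admisc implementation: `inside_sketch` is a hypothetical name, and the approach (evaluate `within()` in the caller's frame, then assign the result back under the same name) is only an assumption about one possible design.

```r
# Hypothetical helper, NOT the admisc code: a sketch of update-by-name.
inside_sketch <- function(data, expr) {
  target <- substitute(data)            # the symbol the caller passed, e.g. mt
  stopifnot(is.name(target))            # only plain variable names are supported
  caller <- parent.frame()
  # Build the call within(<data>, <expr>) and evaluate it where the caller lives
  updated <- eval(call("within", target, substitute(expr)), caller)
  # Rebind the caller's variable to the updated copy
  assign(as.character(target), updated, envir = caller)
  invisible(updated)
}

mt <- mtcars
inside_sketch(mt, hwratio <- hp / wt)
dim(mtcars)  # 32 11
dim(mt)      # 32 12
```

Note that the assignment still happens, it is just hidden inside the helper via `assign()`, which is the design trade-off being debated in this thread.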
Within the tidyverse, I would go for the shorthand:
There is no reason to do so, though, because a very simple assignment in base R does the trick neatly. That said, a pipe over multiple lines is where I would go for a "tidy" version: I can add a single line or comment one out, and still have linearly legible code. 🤷 |
I think, strictly speaking, tidy proponents don't even favour the |
One of the easiest ways to make a program difficult to understand is to modify an object without using an assignment operator. That is deliberately difficult in R. |
I agree, but really, what is the purpose of creating a new variable without overwriting the object? These are all equivalent, in my mind:
mtcars$hwratio <- mtcars$hp / mtcars$wt
# two assignment operators are definitely more difficult to understand for beginners
mtcars <- within(mtcars, hwratio <- hp/wt)
# perhaps this is more comprehensible
mtcars$hwratio <- with(mtcars, hp/wt)
# or even better
inside(mtcars, hwratio <- hp/wt)
Note the latter does have an assignment operator that signals (or should signal) creating a new variable. Anyway, if such a function is clearly documented, users should be aware and decide accordingly. |
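Since `with()` and `within()` are easy to mix up, a small sketch of the difference may help. The two-row data frame `df` is made up for illustration. `with()` returns the value of the expression, while `within()` returns the modified data frame, so only the `within()` result can sensibly be assigned back to the data frame's name.

```r
# Toy data frame (illustrative values only)
df <- data.frame(hp = c(110, 93), wt = c(2.62, 2.32))

# with() evaluates the expression and returns its value (a numeric vector here)
ratio <- with(df, hp / wt)
is.data.frame(ratio)  # FALSE

# within() evaluates the expression and returns the modified data frame
df2 <- within(df, hwratio <- hp / wt)
is.data.frame(df2)    # TRUE
names(df2)            # "hp" "wt" "hwratio"
```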
Oh my... |
No, it doesn't modify in place. I was just referring to the tidyverse discourse. 👍 I think the semantics in R do not provide an advantage to such modifications since, internally, every data frame is copied once it's modified. |
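A quick sketch of the copy-on-modify behaviour mentioned here, using a made-up data frame `df` and a hypothetical helper `add_ratio`. Neither `within()` nor a function that edits its argument changes the caller's object unless the result is assigned.

```r
# Toy data frame (illustrative values only)
df <- data.frame(hp = c(110, 93), wt = c(2.62, 2.32))

# within() returns a modified copy; df itself is untouched
res <- within(df, hwratio <- hp / wt)
ncol(df)   # 2
ncol(res)  # 3

# Same inside a function: the edit applies to a local copy of the argument
add_ratio <- function(d) {
  d$hwratio <- d$hp / d$wt
  d
}
out <- add_ratio(df)
"hwratio" %in% names(df)   # FALSE
"hwratio" %in% names(out)  # TRUE
```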
I really don't think use of within() is standard base R. I've never seen an R book or tutorial use it. And I certainly would not recommend teaching it, for exactly the same reason. Sorry for the long delay in replying. |
This is a base R function. If that is not standard, then I am afraid I don't understand what the purpose of the discussion is... |
My own view -- different people have different views -- is that in teaching
R learners who lack prior coding background, one should keep it as simple
as possible. Abstractions that are second nature to experienced coders are
not easy for such learners to understand, let alone use. So to me, just
because a construct is part of base-R does not mean it is appropriate for
these learners.
On Tue, Feb 25, 2025 at 6:27 AM Adrian Dușa wrote:
I really don't think use of within() is standard base R. I've never seen
an R book or tutorial use it. And I certainly would not recommend teaching
it, for exactly the same reason. Sorry for the long delay in replying.
This is a base R function. If that is not standard, then I am afraid I
don't understand what the purpose of the discussion is...
(and, to agree to disagree, I would definitely recommend teaching it)
dusadrian left a comment (matloff/TidyverseSkeptic#42)
|
Completely agree, with only one observation: beginners don't do coding, but scripting. |
Hard to distinguish between scripting and coding, though. Some R users may never run something consisting of more than 5 or 10 lines, but to me it's coding. I don't mean to harp on this, but I really believe the main issue is degrees of abstraction that an R user is capable of at any given time. The more examples they see, even if as you say just substituting new names, the deeper that capability becomes. |
Some examples give this code for creating new variables in a data frame:
mtcars$hwratio <- mtcars$hp / mtcars$wt
The corresponding Tidy version is this:
mtcars %>% mutate(hwratio=hp/wt) -> mtcars
Maybe it's not a "beginner" topic, but I think the typical base R way is this:
mtcars <- within(mtcars, hwratio <- hp/wt)
There's really no difference between that and the Tidy version, since as far as I can tell, mutate and within do the same thing. Tidy insists on the %>% operator though. Without that, it's nearly identical:
mtcars <- mutate(mtcars, hwratio=hp/wt)
and one wonders why you would add a slew of dependencies simply to rename "within" to "mutate".
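To make the comparison concrete, here is a minimal base-R sketch that builds the same column three ways and checks that the results agree. `transform()` is not mentioned above; it is included only as another base R option. The dplyr version is left out so the snippet runs without extra packages.

```r
# 1. Direct assignment with $
a <- mtcars
a$hwratio <- a$hp / a$wt

# 2. within(): the expression is evaluated inside the data frame
b <- within(mtcars, hwratio <- hp / wt)

# 3. transform(): name = value pairs, also base R
c3 <- transform(mtcars, hwratio = hp / wt)

# All three produce the same new column, appended as the last one
identical(a$hwratio, b$hwratio)
identical(a$hwratio, c3$hwratio)
identical(names(a), names(b))
```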