-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support DATETIME function in PyDough #267
Conversation
…ts and fill out comments/docstrings
FROM (VALUES | ||
()) | ||
(NULL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just having VALUES ()
is invalid, so using VALUES (NULL)
helps
@@ -0,0 +1,47 @@ | |||
SELECT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are now automatically generated along with the XXX_ansi.sql
file for all the tests in test_pydough_to_sql.py
, so we can verify that each supported dialect works on the expected functions.
@@ -153,14 +154,14 @@ def impl(file_name: str) -> str: | |||
|
|||
|
|||
@pytest.fixture(scope="session") | |||
def get_sql_test_filename() -> Callable[[str], str]: | |||
def get_sql_test_filename() -> Callable[[str, DatabaseDialect], str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change allows the SQL files to be dialect-specific.
pytest.param(DatabaseDialect.SQLITE, id="sqlite"), | ||
] | ||
) | ||
def empty_context_database(request) -> DatabaseContext: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates a new fixture with a DatabaseContext
w/o an actual connection for every dialect, so we can test SQL generation for all dialects.
) | ||
|
||
|
||
def datetime_sampler(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't cover every edge case for the string parsing of DATETIME
, but covers a large number of them that were generated to account for various units, aliases, signs, spacings, capitalizations, etc.
""" | ||
|
||
|
||
class DateTimeUnit(Enum): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved logic into this enum so that DATETIME
wouldn't duplicate its unit-handling logic with DATEDIFF
.
# Set of names of of fields that can be included in the JSON | ||
# Set of names of fields that can be included in the JSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a typo I mass-fixed
if isinstance(base, sqlglot_expressions.Datetime): | ||
base.this.append(trunc_expr) | ||
return base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This edge case allows us to flatten DATETIME(DATETIME('now', '-1 day'), 'start of year')
into DATETIME('now', '-1 day', 'start of year')
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarity purposes: Does the visitor revisit after appending? Maybe we can keep a small comment for further programmers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The visitor doesn't visit SQLGlot nodes, it returns them. In this case, base
is a SQLGlot expression node returned earlier by the SQLGlot relational expression visitor (which created base
from a relational expression). All we're doing is modifying base
after the fact, instead of returning a new object containing it.
if isinstance(base, sqlglot_expressions.Datetime): | ||
base.this.append(offset_expr) | ||
return base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This edge case allows us to flatten DATETIME(DATETIME('now', 'start of year'), '-1 day')
into DATETIME('now', 'start of year', '-1 day')
.
Returns: | ||
The SQLGlot expression to truncate `base`. | ||
""" | ||
if dialect == DatabaseDialect.SQLITE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with other SQLite bindings, we should either create a new function here or refactor the other bindings to align with the current conditional approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the core behavior is the same, but the helper function changes. What I probably want, in future, is to overhaul how the bindings work so there is a measure of inheritance. That way, there can be general superclasses that deal with most of the behavior, and subclasses can override specific helpers, and there can be inheritance as fallbacks. Whether this is literally via classes, or just a stack of objects, needs to be designed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, I agree with your sentiment, but I want fixing this to be part of a larger overhaul/refactor of the file.
" \n \t312 \nD \n\r ", | ||
), | ||
DATETIME(" Current_Date ", "+ 45 MM", "-135 s"), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Might be far-fetched but how about adding tests for camel-cases such as DATETIME("cUrReNt_Timestamp")
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷 might as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just added a few weird ones.
Resolves #208. See issue for more details. Full list of additional changes:
DATETIME
operator works for both sqlite & ANSI with very permissive spacing/capitalization/alias rules for the various string literals.now
(& similar aliases): for ANSI generatesCURRENT_TIMESTAMP()
, for sqlite generatesDATETIME('now')
start of <unit>
(year/month/day): for ANSI generatesDATE_TRUNC(<unit>, dt)
, for sqlite generatesDATETIME(dt, 'start of <unit>')
start of <unit>
(hour/minute/second): for ANSI generatesDATE_TRUNC(<unit>, dt)
, for sqlite generatesSTRFTIME(fmt_str, dt)
where thefmt_str
truncates to that unit.±<amt> <unit>
: for ANSI generatesDATEADD(<unit>, ±<amt>dt)
, for sqlite generatesDATETIME(dt, '±<amt> <unit>)
DateTimeUnit
enum to handle common logic, such as converting a unit alias into a cannonical form or identifying the format string used to truncate to that unit.test_pydough_to_sql.py
to be parameterized with the afformentioned new fixture so sqlite & ANSI are both tested.sql
files to contain the dialect (e.g.xxx_ansi.sql
vsxxx_sqlite.sql
)VALUES
clause