Support DATETIME function in PyDough #267

Merged
merged 15 commits into from
Feb 19, 2025
Changes from 13 commits
55 changes: 45 additions & 10 deletions documentation/functions.md
@@ -22,6 +22,7 @@ Below is the list of every function/operator currently supported in PyDough as a
* [LIKE](#like)
* [JOIN_STRINGS](#join_strings)
- [Datetime Functions](#datetime-functions)
* [DATETIME](#datetime)
* [YEAR](#year)
* [MONTH](#month)
* [DAY](#day)
@@ -234,6 +235,49 @@ For instance, `JOIN_STRINGS("; ", "Alpha", "Beta", "Gamma)` returns `"Alpha; Bet

Below is each function currently supported in PyDough that operates on date/time/timestamp values.

<!-- TOC --><a name="datetime"></a>
### DATETIME

The `DATETIME` function is used to build/augment date/timestamp values. The first argument is the base date/timestamp, and it can optionally take in a variable number of modifier arguments.

The base argument can be one of the following:

- A string literal indicating that the current timestamp should be built, which has to be one of the following: `now`, `current_date`, `current_timestamp`, `current date`, `current timestamp`. All of these aliases are equivalent, case-insensitive, and ignore leading/trailing whitespace.
- A column of datetime data.

The modifier arguments can be the following (all of the options are case-insensitive and ignore leading/trailing/extra whitespace):
- A string literal in the format `start of <UNIT>`, which truncates the datetime value to the given unit. The unit can be one of the following:
    - **Years**: Supported aliases are `"years"`, `"year"`, and `"y"`.
    - **Months**: Supported aliases are `"months"`, `"month"`, and `"mm"`.
    - **Days**: Supported aliases are `"days"`, `"day"`, and `"d"`.
    - **Hours**: Supported aliases are `"hours"`, `"hour"`, and `"h"`.
    - **Minutes**: Supported aliases are `"minutes"`, `"minute"`, and `"m"`.
    - **Seconds**: Supported aliases are `"seconds"`, `"second"`, and `"s"`.
- A string literal in the form `±<AMT> <UNIT>`, which adds an interval of time to the datetime value or subtracts one from it. The sign can be `+` or `-`, and defaults to `+` if omitted. The amount must be an integer. The unit must be one of the unit strings allowed for truncation.

For example, `"Days"`, `"DAYS"`, and `"d"` are all treated the same due to case insensitivity.

If there are multiple modifiers, they operate left-to-right.

```py
# Returns the following datetime moments:
# 1. The current timestamp
# 2. The start of the current month
# 3. Exactly 12 hours from now
# 4. The last day of the previous year
# 5. The current day, at midnight
TPCH(
    ts_1=DATETIME('now'),
    ts_2=DATETIME('NoW', 'start of month'),
    ts_3=DATETIME(' CURRENT_DATE ', '12 hours'),
    ts_4=DATETIME('Current Timestamp', 'start of y', '- 1 D'),
    ts_5=DATETIME('NOW', ' Start of Day '),
)

# For each order, truncates the order date to the first day of the year
Orders(order_year=DATETIME(order_date, 'START OF Y'))
```
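
The modifier rules above can be sketched in plain Python. This is a minimal illustration, not PyDough's implementation: the names `apply_modifiers`, `normalize_unit`, `truncate`, and `shift` are hypothetical, the unit aliases are copied from the list above, and clamping the day when a month or year shift lands on a shorter month is an assumption made here, since the documentation does not say how that case is handled.

```python
import calendar
import re
from datetime import datetime, timedelta

# Unit aliases copied from the documentation; the canonical keys are chosen here.
UNIT_ALIASES = {
    "year": ("years", "year", "y"),
    "month": ("months", "month", "mm"),
    "day": ("days", "day", "d"),
    "hour": ("hours", "hour", "h"),
    "minute": ("minutes", "minute", "m"),
    "second": ("seconds", "second", "s"),
}
ALIAS_TO_UNIT = {a: unit for unit, aliases in UNIT_ALIASES.items() for a in aliases}
UNIT_ORDER = ["year", "month", "day", "hour", "minute", "second"]
# Value each field is reset to when it is finer than the truncation unit.
FIELD_RESET = {"month": 1, "day": 1, "hour": 0, "minute": 0, "second": 0}

# Both patterns run on text that is already lowercased with whitespace collapsed.
TRUNC_RE = re.compile(r"start of (\w+)")
OFFSET_RE = re.compile(r"([+-]?) ?(\d+) (\w+)")


def normalize_unit(text: str) -> str:
    """Map a lowercased alias to its canonical unit, or raise on unknown units."""
    unit = ALIAS_TO_UNIT.get(text)
    if unit is None:
        raise ValueError(f"Unrecognized unit: {text!r}")
    return unit


def truncate(ts: datetime, unit: str) -> datetime:
    """Reset every field finer than `unit` to its smallest value."""
    finer = UNIT_ORDER[UNIT_ORDER.index(unit) + 1:]
    return ts.replace(microsecond=0, **{field: FIELD_RESET[field] for field in finer})


def shift(ts: datetime, amount: int, unit: str) -> datetime:
    """Add `amount` units to `ts`; `amount` may be negative."""
    if unit in ("year", "month"):
        total = ts.year * 12 + (ts.month - 1) + amount * (12 if unit == "year" else 1)
        year, month_index = divmod(total, 12)
        month = month_index + 1
        # Assumption: clamp the day to the last day of the target month.
        day = min(ts.day, calendar.monthrange(year, month)[1])
        return ts.replace(year=year, month=month, day=day)
    return ts + timedelta(**{unit + "s": amount})


def apply_modifiers(base: datetime, *modifiers: str) -> datetime:
    """Apply each modifier string to `base`, left to right."""
    ts = base
    for raw in modifiers:
        text = " ".join(raw.lower().split())  # ignore case and extra whitespace
        match = TRUNC_RE.fullmatch(text)
        if match:
            ts = truncate(ts, normalize_unit(match.group(1)))
            continue
        match = OFFSET_RE.fullmatch(text)
        if match:
            sign = -1 if match.group(1) == "-" else 1
            ts = shift(ts, sign * int(match.group(2)), normalize_unit(match.group(3)))
            continue
        raise ValueError(f"Unrecognized modifier: {raw!r}")
    return ts
```

Because modifiers apply left to right, order matters: with a base of `datetime(2025, 2, 19, 13, 45, 10)`, the modifiers `'start of y', '- 1 D'` land on the last day of 2024, while `'- 1 D', 'start of y'` land on the first day of 2025.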

<!-- TOC --><a name="year"></a>
### YEAR

@@ -311,16 +355,7 @@ orders(
)
```

The first argument in the `DATEDIFF` function supports the following aliases for each unit of time. The argument is **case-insensitive**, and if a unit is not one of the provided options, an error will be thrown:

- **Years**: Supported aliases are `"years"`, `"year"`, and `"y"`.
- **Months**: Supported aliases are `"months"`, `"month"`, and `"mm"`.
- **Days**: Supported aliases are `"days"`, `"day"`, and `"d"`.
- **Hours**: Supported aliases are `"hours"`, `"hour"`, and `"h"`.
- **Minutes**: Supported aliases are `"minutes"`, `"minute"`, and `"m"`.
- **Seconds**: Supported aliases are `"seconds"`, `"second"`, and `"s"`.

Invalid or unrecognized units will result in an error. For example, `"Days"`, `"DAYS"`, and `"d"` are all treated the same due to case insensitivity.
The first argument in the `DATEDIFF` function is the unit of time, which is **case-insensitive**. See [`DATETIME`](#datetime) for the supported units and their aliases. Invalid or unrecognized units will result in an error.

<!-- TOC --><a name="conditional-functions"></a>
## Conditional Functions
2 changes: 1 addition & 1 deletion pydough/metadata/collections/collection_metadata.py
@@ -30,7 +30,7 @@ class CollectionMetadata(AbstractMetadata):
- `parse_from_json`
"""

# Set of names of of fields that can be included in the JSON
# Set of names of fields that can be included in the JSON
Contributor Author

This was a typo I mass-fixed

# object describing a collection. Implementations should extend this.
allowed_fields: set[str] = {"type", "properties"}

2 changes: 1 addition & 1 deletion pydough/metadata/collections/simple_table_metadata.py
@@ -31,7 +31,7 @@ class SimpleTableMetadata(CollectionMetadata):
of other such tables created from joins.
"""

# Set of names of of fields that can be included in the JSON
# Set of names of fields that can be included in the JSON
# object describing a simple table collection.
allowed_fields: set[str] = CollectionMetadata.allowed_fields | {
"table_path",
2 changes: 1 addition & 1 deletion pydough/metadata/properties/cartesian_product_metadata.py
@@ -19,7 +19,7 @@ class CartesianProductMetadata(ReversiblePropertyMetadata):
cartesian product between a collection and its subcollection.
"""

# Set of names of of fields that can be included in the JSON object
# Set of names of fields that can be included in the JSON object
# describing a cartesian product property.
allowed_fields: set[str] = PropertyMetadata.allowed_fields | {
"other_collection_name",
@@ -33,7 +33,7 @@ class CompoundRelationshipMetadata(ReversiblePropertyMetadata):
certain inherited properties derived from the middle collection.
"""

# Set of names of of fields that can be included in the JSON object
# Set of names of fields that can be included in the JSON object
# describing a compound relationship property.
allowed_fields: set[str] = PropertyMetadata.allowed_fields | {
"primary_property",
2 changes: 1 addition & 1 deletion pydough/metadata/properties/property_metadata.py
@@ -32,7 +32,7 @@ class PropertyMetadata(AbstractMetadata):
- `parse_from_json`
"""

# Set of names of of fields that can be included in the JSON object
# Set of names of fields that can be included in the JSON object
# describing a property. Implementations should extend this.
allowed_fields: set[str] = {"type"}

2 changes: 1 addition & 1 deletion pydough/metadata/properties/simple_join_metadata.py
@@ -28,7 +28,7 @@ class SimpleJoinMetadata(ReversiblePropertyMetadata):
join between a collection and its subcollection based on equi-join keys.
"""

# Set of names of of fields that can be included in the JSON object
# Set of names of fields that can be included in the JSON object
# describing a simple join property.
allowed_fields: set[str] = PropertyMetadata.allowed_fields | {
"other_collection_name",
2 changes: 1 addition & 1 deletion pydough/metadata/properties/table_column_metadata.py
@@ -26,7 +26,7 @@ class TableColumnMetadata(ScalarAttributeMetadata):
column of data from a relational table.
"""

# Set of names of of fields that can be included in the JSON object
# Set of names of fields that can be included in the JSON object
# describing a table column property.
allowed_fields: set[str] = PropertyMetadata.allowed_fields | {
"data_type",
2 changes: 2 additions & 0 deletions pydough/pydough_operators/__init__.py
@@ -18,6 +18,7 @@
"COUNT",
"ConstantType",
"DATEDIFF",
"DATETIME",
"DAY",
"DEFAULT_TO",
"DIV",
@@ -85,6 +86,7 @@
CONTAINS,
COUNT,
DATEDIFF,
DATETIME,
DAY,
DEFAULT_TO,
DIV,
1 change: 1 addition & 0 deletions pydough/pydough_operators/expression_operators/README.md
@@ -78,6 +78,7 @@ These functions must be called on singular data as a function.

##### Datetime Functions

- `DATETIME`: constructs a new datetime, either from an existing one or the current datetime, and augments it by adding/subtracting intervals of time and/or truncating it to various units.
- `YEAR`: returns the year component of a datetime.
- `MONTH`: returns the month component of a datetime.
- `DAY`: returns the day component of a datetime.
2 changes: 2 additions & 0 deletions pydough/pydough_operators/expression_operators/__init__.py
@@ -16,6 +16,7 @@
"CONTAINS",
"COUNT",
"DATEDIFF",
"DATETIME",
"DAY",
"DEFAULT_TO",
"DIV",
@@ -79,6 +80,7 @@
CONTAINS,
COUNT,
DATEDIFF,
DATETIME,
DAY,
DEFAULT_TO,
DIV,
@@ -13,6 +13,7 @@
"CONTAINS",
"COUNT",
"DATEDIFF",
"DATETIME",
"DAY",
"DEFAULT_TO",
"DIV",
@@ -65,7 +66,7 @@
RequireNumArgs,
SelectArgumentType,
)
from pydough.types import BooleanType, Float64Type, Int64Type, StringType
from pydough.types import BooleanType, DateType, Float64Type, Int64Type, StringType

from .binary_operators import BinaryOperator, BinOp
from .expression_function_operators import ExpressionFunctionOperator
@@ -132,6 +133,9 @@
MIN = ExpressionFunctionOperator("MIN", True, RequireNumArgs(1), SelectArgumentType(0))
MAX = ExpressionFunctionOperator("MAX", True, RequireNumArgs(1), SelectArgumentType(0))
IFF = ExpressionFunctionOperator("IFF", False, RequireNumArgs(3), SelectArgumentType(1))
DATETIME = ExpressionFunctionOperator(
    "DATETIME", False, AllowAny(), ConstantType(DateType())
)
YEAR = ExpressionFunctionOperator(
    "YEAR", False, RequireNumArgs(1), ConstantType(Int64Type())
)
5 changes: 4 additions & 1 deletion pydough/sqlglot/sqlglot_relational_visitor.py
@@ -13,6 +13,7 @@
from sqlglot.expressions import Identifier, Select, Subquery, values
from sqlglot.expressions import Literal as SQLGlotLiteral
from sqlglot.expressions import Star as SQLGlotStar
from sqlglot.expressions import convert as sqlglot_convert

from pydough.relational import (
Aggregate,
@@ -488,7 +489,9 @@ def visit_limit(self, limit: Limit) -> None:
self._stack.append(query)

def visit_empty_singleton(self, singleton: EmptySingleton) -> None:
self._stack.append(Select().from_(values([()])))
self._stack.append(
Select().select(SQLGlotStar()).from_(values([sqlglot_convert((None,))]))
)

def visit_root(self, root: RelationalRoot) -> None:
self.visit_inputs(root)
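
The updated `visit_empty_singleton` builds a `SELECT *` over a `VALUES` clause holding one row with a single `NULL`. The sketch below uses hand-written SQL and Python's standard `sqlite3` module to show why a one-row source like this is useful for expressions such as `DATETIME('now', ...)` that read no table columns. The SQL text is an assumption for illustration and is not claimed to be what PyDough emits; the fixed timestamp stands in for `'now'` so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A one-row, one-column source: the shape that the new singleton select produces.
singleton_rows = conn.execute("SELECT * FROM (VALUES (NULL))").fetchall()

# A scalar expression selected from the singleton is evaluated exactly once.
# SQLite's own datetime modifiers mirror "start of <unit>" and "±<amt> <unit>".
row = conn.execute(
    "SELECT datetime('2025-02-19 13:45:10', 'start of year', '-1 day') AS ts_4 "
    "FROM (VALUES (NULL))"
).fetchone()

conn.close()
```

Here `singleton_rows` holds a single row containing `None`, and `row` holds the last day of the previous year at midnight.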