Saving data links to a DB_Table #11371

radeusgd · 2024-10-21T16:16:41Z

Pull Request Description

Closes Save a DB_Table as a data link #11295

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

The documentation has been updated, if necessary.
Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
Unit tests have been written where possible.
If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries, or the Snowflake database integration, a run of the Extra Tests has been scheduled.
- If applicable, it is suggested to paste a link to a successful run of the Extra Tests.

somebody1234

approving new schemas in datalinkSchema.json (code review, schema may want to be reviewed separately by libs team)

radeusgd · 2024-10-21T16:20:20Z

This is what the datalinks look like in the GUI:

Saving a table directly stores it by-name:

t.save_as_data_link (Enso_File.home / "postgres-table.datalink")

Simple query without interpolations (this could also be typed in by the user creating the datalink inside of the GUI):

t.rename_columns ["Y"] . save_as_data_link (Enso_File.home / "postgres-simple-query.datalink")

More complex serialized queries - these are not meant to be typed in by the user, but we still need to display them somehow.

t.set (t.at "X" * 1000 + 45) "Z" . save_as_data_link (Enso_File.home / "postgres-query.datalink")
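
The three cases above suggest a dispatch on how complex the underlying query is. Below is a minimal sketch of that idea, written in Python only so that it runs standalone (the PR itself is Enso). The `Query` class, the `link_payload` function, and the dictionary field names are all hypothetical; they are not the schema added to `datalinkSchema.json`, and the real `is_trivial_query` check may well differ.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical stand-in for the query behind a DB_Table."""
    table_name: str
    sql: str                                      # SQL text, "?" marks interpolated values
    params: list = field(default_factory=list)    # values for the "?" placeholders
    is_trivial: bool = False                      # True if the query just selects the whole table


def link_payload(q: Query) -> dict:
    """Pick the simplest representation that can still reproduce the query."""
    if q.is_trivial:
        # An unmodified table is stored by name.
        return {"kind": "table", "name": q.table_name}
    if not q.params:
        # No interpolations: plain SQL text that a user could also type by hand.
        return {"kind": "query", "sql": q.sql}
    # Otherwise keep the SQL together with its interpolated values.
    return {"kind": "sql_statement", "sql": q.sql, "params": list(q.params)}
```

Under these assumptions, a renamed-column query with no parameters would be stored as plain SQL text, while an expression such as `X * 1000 + 45` (whose constants are interpolated) would fall into the serialized case.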

radeusgd · 2024-10-21T18:03:57Z

I've added a warning if the query that is being saved refers to temporary tables - as after a session is closed those are destroyed and the datalink will most likely become invalid.

and after pressing the play button:

This also works if the query is more complex and comes from more tables:

(warning is also displayed if only some of the set of tables are temporary)

radeusgd · 2024-10-21T18:06:25Z

Running Snowflake tests:

GregoryTravis · 2024-10-21T20:57:42Z

distribution/lib/Standard/Database/0.0.0-dev/src/Internal/DB_Data_Link_Helpers.enso

+    referred_temporary_tables = _find_referred_temporary_tables table.connection table.context
+    if referred_temporary_tables.is_nothing then result else
+        warning = Illegal_State.Error "The saved query seems to refer to tables "+referred_temporary_tables.to_text+" which are temporary. Such tables may cease to exist once the session is closed, so the saved data link will no longer be valid and will fail to open."
+        Warning.attach warning result


I wonder if this should just be an error. Datalinks are mainly used to store things for the long term over many sessions, so I assume there isn't really a use case for storing a datalink just during one session (but perhaps I'm wrong about that).

Ah I wanted to mention it, but I forgot.

In the doc comment I noted:

Note that this is a heuristic and it may potentially lead to false positives if aliasing table names exist across schemas. Supporting tables with clashing names across schemas is something that may need to be revised overall in the Database library.

Indeed I'm not 100% sure of our table mapping in cases where one has two tables with the same name across two accessible schemas. This heuristic is simple and it could lead to false positives. If we do more 'reading tables across schemas' it'd probably have to be updated.

That's why I wanted to make it only a warning. If the warning is a false positive, it can be dismissed and one can continue working.

But if it were an error, it could prevent users from creating a valid datalink.

Of course, this is not ideal - ideally we'd fix the heuristic to make sure it always works - and then error would make sense.

But I'm not sure if I know how to fix this heuristic properly. The whole idea of warning on temporary tables is something I added as an addition to this PR. Trying to fix the heuristic to be 100% accurate is well out of scope of this PR as it would require much more work and will probably be backend-specific. This was making me reconsider if I should add this check at all, but I decided that a relatively accurate check as a warning is better than no check at all.

Alternatively, I can revert the check for now and create a ticket to create a more accurate implementation - as it can take more time to do it, so it would need to be scheduled.
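
To illustrate the trade-off discussed above, here is a minimal sketch of a name-only heuristic, in Python rather than Enso so it can be run standalone. The function names, the `(schema, name)` tuples, and the `save` callback are assumptions made for illustration; this is not the implementation of `_find_referred_temporary_tables`. It shows why a match on table name alone can produce a false positive across schemas, and why emitting a warning still lets the save succeed.

```python
import warnings


def find_referred_temporary_tables(referenced, temporary_names):
    """Return referenced table names that match a known temporary table, else None.

    `referenced` is an iterable of (schema, name) pairs; only the name is compared,
    which is exactly what makes the check a heuristic.
    """
    hits = sorted({name for (_schema, name) in referenced if name in temporary_names})
    return hits or None


def save_with_check(referenced, temporary_names, save):
    """Run `save()` and warn, rather than fail, if temporary tables seem to be used."""
    result = save()
    hits = find_referred_temporary_tables(referenced, temporary_names)
    if hits is not None:
        warnings.warn(
            f"The saved query seems to refer to tables {hits} which are temporary."
        )
    return result
```

For example, a query on `("public", "orders")` is flagged when a temporary table that is also named `orders` exists elsewhere. With a warning the link is still saved and the user can dismiss the message; with an error the same false positive would block a valid data link.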

jdunkerley

Approving but with some comments that are worth scanning over and implementing as you agree

jdunkerley · 2024-10-23T16:57:19Z

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso

@@ -300,7 +300,7 @@ Text.find self pattern:(Regex | Text)=".*" case_sensitivity=Case_Sensitivity.Sen
              ## This matches `aABbbbc` @ character 0 and `aBC` @ character 11
             "aABbbbccccaaBCaaaa".find_all "a[ab]+c" Case_Sensitivity.Insensitive
 Text.find_all : Text -> Case_Sensitivity -> Vector Match ! Regex_Syntax_Error | Illegal_Argument
-Text.find_all self pattern=".*" case_sensitivity=Case_Sensitivity.Sensitive =
+Text.find_all self pattern:Text=".*" case_sensitivity:Case_Sensitivity=..Sensitive =


Suggested change

-Text.find_all self pattern:Text=".*" case_sensitivity:Case_Sensitivity=..Sensitive =
+Text.find_all self pattern:(Regex | Text)=".*" case_sensitivity:Case_Sensitivity=..Sensitive =

otherwise we can't feed a regex object in.

Ooops, I did not realize that Regex is allowed here.

This should have tripped some tests, but didn't. Shows that we should up our test coverage a bit.

jdunkerley · 2024-10-23T16:58:25Z

distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Extensions.enso

@@ -394,7 +394,7 @@ Text.to_regex self case_insensitive=False = Regex.compile self case_insensitive
         'azbzczdzezfzg'.split ['b', 'zez'] == ['az', 'zczd', 'fzg']
 @delimiter make_delimiter_selector
 Text.split : Text | Vector Text -> Case_Sensitivity -> Boolean -> Vector Text ! Illegal_Argument
-Text.split self delimiter="," case_sensitivity=Case_Sensitivity.Sensitive use_regex=False =
+Text.split self delimiter="," case_sensitivity:Case_Sensitivity=..Sensitive use_regex=False =


Suggested change

-Text.split self delimiter="," case_sensitivity:Case_Sensitivity=..Sensitive use_regex=False =
+Text.split self delimiter="," case_sensitivity:Case_Sensitivity=..Sensitive use_regex:Boolean=False =

We should get rid of use_regex and take Regex or Text.

Seems like an out of scope change for this PR IMHO. Maybe I shouldn't have started the type changes at all - but I just wanted to do the trivial ones and thought they could be done 'in the meantime'.

But this requires logical changes and changes to the tests, I'd prefer to do it separately.
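
For reference, the shape of the follow-up proposed above (dropping the `use_regex` flag and letting the argument's type decide) might look roughly like this. The sketch is in Python so it can be run standalone; the `split` function here is hypothetical and is not the Enso `Text.split`, and it ignores case sensitivity and vector delimiters.

```python
import re
from typing import List, Union


def split(text: str, delimiter: Union[str, re.Pattern] = ",") -> List[str]:
    """Split on a literal delimiter or on a compiled regex, chosen by argument type."""
    if isinstance(delimiter, re.Pattern):
        # A compiled pattern is treated as a regex; note that capture groups
        # in the pattern would be included in the result by re.Pattern.split.
        return delimiter.split(text)
    # A plain string is always treated literally, so no flag is needed.
    return text.split(delimiter)
```

With this shape, `split("a.b", ".")` splits on a literal dot, while `split("a1b22c", re.compile(r"\d+"))` splits on runs of digits. As noted in the reply, such a change touches logic and tests, so it would be a separate piece of work.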

jdunkerley · 2024-10-23T17:11:28Z

distribution/lib/Standard/Database/0.0.0-dev/src/Internal/Base_Generator.enso

+                prepared_statement : SQL_Statement ->
+                    SQL_Builder.from_fragments prepared_statement.fragments
+                raw_code : Text ->
+                    SQL_Builder.code raw_code


Could we do this as conversion on SQL_Builder?

I think I'd prefer this to be explicit.

jdunkerley · 2024-10-23T17:12:52Z

distribution/lib/Standard/Database/0.0.0-dev/src/DB_Table.enso

+       Creates a Data Link that will act as a view into the query represented by
+       this table.
+    @on_existing_file (Existing_File_Behavior.widget include_backup=False include_append=False)
+    save_as_data_link self destination (on_existing_file:Existing_File_Behavior = Existing_File_Behavior.Error) =


We need a type on destination (File?).

We should also have an Arguments section in the documentation.

I did not add a type because currently it only accepts Enso_File, but in the future we could allow saving datalinks to other destinations, so I wanted to keep it open-ended.

I guess Writable_File should do the trick.

Or I should not be too open ended yet and just do : Enso_File. We can always expand the type later as that's a compatible change.

jdunkerley · 2024-10-23T17:19:08Z

distribution/lib/Standard/Database/0.0.0-dev/src/Internal/Data_Link_Setup.enso

+            word = case link_type of
+                DB_Data_Link_Type.Database -> "connection"
+                DB_Data_Link_Type.Table _ -> "table"
+                DB_Data_Link_Type.Query _ -> "query"
+                DB_Data_Link_Type.SQL_Statement _ -> "query"


Method on DB_Data_Link?

Not sure what this method should be called; how this conversion is handled is very context-dependent.

radeusgd · 2024-10-24T11:43:04Z

Snowflake run after code review updates:

✅ https://github.com/enso-org/enso/actions/runs/11498558765

radeusgd added 17 commits October 21, 2024 17:55

(de)serialization of SQL_Statement

8e37018

WIP saving as data link

787530f

updating schema

86d0b9d

enable Cloud auth in Snowflake tests, to be able to test the data link integration

6110c05

regenerate workflow after change

cfec370

checkpoint - interpreting datalinks, creating tables from SQL_Statement

fd643ca

TODO: add DB_Data_Link_Type to each data link and merge into single constructor, add tests

add custom tests for Postgres

fb743cf

add common tests for saving table/query as data link

10eed0f

simplify Postgres datalink - now one constructor with link type

7e7ac9e

parsing data link type

ff20d4b

fix compiler errors

5d5e609

shorter table name, typo

8011e7d

fixing

ec0d661

fix typo

1e23ec6

fix

3e4cc58

better names

cbd3f63

integrate Snowflake and SQLServer

3f1c71a

radeusgd self-assigned this Oct 21, 2024

radeusgd requested review from jdunkerley, GregoryTravis, AdRiley, marthasharkey, PabloBuchu, indiv0, somebody1234, MrFlashAccount, hubertp and Frizi as code owners October 21, 2024 16:16

somebody1234 approved these changes Oct 21, 2024

radeusgd added 5 commits October 21, 2024 19:18

add types - enable autoscoping for Case Sensitivity in Text

97dbecf

warn if temporary table is saved as data link

b0e6bec

tests for the temp warning

ab02438

fix for various temp table styles

9b58c89

test and fix edge case

f1e7b0a

changelog

3fc114e

GregoryTravis approved these changes Oct 21, 2024

radeusgd added 6 commits October 22, 2024 13:15

fix tests, more signatures

e0efc3c

further tweaking to is_trivial_query

1400bdc

more tweaking

cae6fd8

fix test

a768c0b

check invariants, run Snowflake on GH to have aws cli

bf4b8ef

better encapsulation and responsibility delegation

5fec3c3

radeusgd added CI: Keep up to date Automatically update this PR to the latest develop. and removed CI: Keep up to date Automatically update this PR to the latest develop. labels Oct 23, 2024

jdunkerley approved these changes Oct 23, 2024

radeusgd added 3 commits October 24, 2024 10:59

Merge branch 'develop' into wip/radeusgd/11295-save-table-as-datalink

15edb3a

CR1

8e7d237

CR2

034e490

radeusgd force-pushed the wip/radeusgd/11295-save-table-as-datalink branch from 69994d4 to 034e490 Compare October 24, 2024 11:41

radeusgd added the CI: Ready to merge This PR is eligible for automatic merge label Oct 24, 2024

mergify bot merged commit ca9df70 into develop Oct 24, 2024
39 checks passed

mergify bot deleted the wip/radeusgd/11295-save-table-as-datalink branch October 24, 2024 13:18
