Patterns

pchemguy · pchemguy · commit cf42abaa4cbd · 2022-07-20T12:46:16.000+03:00
diff --git a/Patterns JSON and DSV Output.md b/Patterns JSON and DSV Output.md
@@ -0,0 +1,167 @@
+---
+layout: default
+title: JSON and DSV Output
+nav_order: 5
+parent: Design Patterns
+permalink: /patterns/json-sql-output
+---
+
+Using JSON, rather than a record set, to encode the returned data may be more efficient because it reduces the number of required API calls. While the SQLite API is usually fast, each returned value still costs several API calls. (Under certain circumstances, however, each API call may carry a high SQLite-independent overhead.) Another important consideration is whether converting data between numeric and textual formats has any side effects. Let us show a few examples.
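+
+As a minimal sketch of the conversion concern, the following standalone query (not part of the original examples; the column aliases are made up for illustration) compares a quoted and an unquoted *bin_id* value. It suggests that the type seen by SQL depends on how the value is written in JSON and that, in the opposite direction, the JSON output depends on the SQL type of the source value:
+
+~~~sql
+SELECT
+    -- JSON string -> SQL text
+    typeof(json_extract('{"bin_id": "239"}', '$.bin_id')) AS from_json_string, -- text
+    -- JSON number -> SQL integer
+    typeof(json_extract('{"bin_id": 239}',   '$.bin_id')) AS from_json_number, -- integer
+    -- SQL text -> quoted JSON string
+    json_object('bin_id', '239')                          AS text_to_json,     -- {"bin_id":"239"}
+    -- SQL integer -> unquoted JSON number
+    json_object('bin_id', 239)                            AS integer_to_json;  -- {"bin_id":239}
+~~~
+
+Whether such differences matter depends on how the application and the table columns treat the values, so it is worth checking both directions of the round trip.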
+
+Consider a modified query from the [Surrogate Variables](/patterns/variables#DSV-Query) section:
+
+~~~sql
+WITH
+    delimiters(delimiter) AS (VALUES ('/')),
+    strings(string_id, string) AS (
+        VALUES
+            ('abc', 'C:/Winows/System32/drivers/etc/'),
+            ('def', 'C:/Users/Public/Desktop')
+    ),
+    folders AS (
+        SELECT string_id, "terms"."key" AS term_id, "terms"."value" AS term
+        FROM
+            delimiters, strings,
+            json_each('["' || replace(trim(string, delimiter), delimiter, '", "') || '"]') AS terms
+        ORDER BY string_id, term_id
+    ),
+    json_folders AS (
+        SELECT string_id, json_group_array(term) AS path_json
+        FROM folders
+        GROUP BY string_id
+    )
+SELECT * FROM json_folders;
+~~~
+
+which outputs:
+
+| string_id | path_json                                  |
+|-----------|--------------------------------------------|
+| abc       | ["C:","Winows","System32","drivers","etc"] |
+| def       | ["C:","Users","Public","Desktop"]          |
+
+This query has a new section, *json_folders*, at the end, which uses *json_group_array* to collect folders belonging to the same path. Note that the ordering clause is added to the *folders* section to ensure that the order of individual folders within JSON arrays after grouping reflects their positions in original paths.
+
+---
+
+Similarly, the following modified query:
+
+~~~sql
+WITH
+    folders AS (
+        SELECT
+            json_extract(dirs.value, '$.bin_id') AS bin_id,
+            json_extract(dirs.value, '$.prefix') AS prefix,
+            json_extract(dirs.value, '$.name')   AS name
+        FROM
+            json_each(
+                '['                                                                                    ||
+                    '{"bin_id": "239", "prefix": "C:/Winows/System32/drivers/etc", "name": "hosts"},'  ||
+                    '{"bin_id": "876", "prefix": "C:/Users/Public/Desktop",        "name": "pic"  },'  ||
+                    '{"bin_id": "374", "prefix": "C:/Users/Default/Music",         "name": "drum" }'   ||
+                ']'
+            ) AS dirs
+    ),
+    json_fs_objects AS (
+        SELECT
+            json_group_array(json_object('bin_id', bin_id, 'prefix', prefix, 'name', name)) AS fs_objects
+        FROM folders
+    )
+SELECT * FROM json_fs_objects;
+~~~
+
+returns a scalar string containing a set of records in the JSON format (shown here with added whitespace for readability):
+
+~~~json
+[
+    {"bin_id": "239", "prefix": "C:/Winows/System32/drivers/etc", "name": "hosts"},
+    {"bin_id": "876", "prefix": "C:/Users/Public/Desktop",        "name": "pic"  },
+    {"bin_id": "374", "prefix": "C:/Users/Default/Music",         "name": "drum" }
+]
+~~~
+
+JSON objects are a bit too verbose for returning a record set because the keys repeat in every record, but there are a few other options. For example:
+
+~~~sql
+WITH
+    folders AS (
+        SELECT
+            json_extract(dirs.value, '$.bin_id') AS bin_id,
+            json_extract(dirs.value, '$.prefix') AS prefix,
+            json_extract(dirs.value, '$.name')   AS name
+        FROM
+            json_each(
+                '['                                                                                    ||
+                    '{"bin_id": "239", "prefix": "C:/Winows/System32/drivers/etc", "name": "hosts"},'  ||
+                    '{"bin_id": "876", "prefix": "C:/Users/Public/Desktop",        "name": "pic"  },'  ||
+                    '{"bin_id": "374", "prefix": "C:/Users/Default/Music",         "name": "drum" }'   ||
+                ']'
+            ) AS dirs
+    ),
+    tsv_fs_objects AS (
+        SELECT group_concat(bin_id || x'09' || prefix || x'09' || name, x'0A') AS fs_objects
+        FROM folders
+    )
+SELECT * FROM tsv_fs_objects;
+~~~
+
+or
+
+~~~sql
+WITH
+    folders AS (
+        SELECT
+            json_extract(dirs.value, '$.bin_id') AS bin_id,
+            json_extract(dirs.value, '$.prefix') AS prefix,
+            json_extract(dirs.value, '$.name')   AS name
+        FROM
+            json_each(
+                '['                                                                                    ||
+                    '{"bin_id": "239", "prefix": "C:/Winows/System32/drivers/etc", "name": "hosts"},'  ||
+                    '{"bin_id": "876", "prefix": "C:/Users/Public/Desktop",        "name": "pic"  },'  ||
+                    '{"bin_id": "374", "prefix": "C:/Users/Default/Music",         "name": "drum" }'   ||
+                ']'
+            ) AS dirs
+    ),
+    tsv_fs_objects AS (
+        SELECT
+            group_concat(printf('%s' || x'09' || '%s' || x'09' || '%s', bin_id, prefix, name), x'0A') AS fs_objects
+        FROM folders
+    )
+SELECT * FROM tsv_fs_objects;
+~~~
+
+or
+
+~~~sql
+WITH
+    folders AS (
+        SELECT
+            json_extract(dirs.value, '$.bin_id') AS bin_id,
+            json_extract(dirs.value, '$.prefix') AS prefix,
+            json_extract(dirs.value, '$.name')   AS name
+        FROM
+            json_each(
+                '['                                                                                    ||
+                    '{"bin_id": "239", "prefix": "C:/Winows/System32/drivers/etc", "name": "hosts"},'  ||
+                    '{"bin_id": "876", "prefix": "C:/Users/Public/Desktop",        "name": "pic"  },'  ||
+                    '{"bin_id": "374", "prefix": "C:/Users/Default/Music",         "name": "drum" }'   ||
+                ']'
+            ) AS dirs
+    ),
+    records AS (
+            SELECT 
+                'bin_id' || x'09' || 'prefix' || x'09' || 'name' || x'0A' ||
+                'str'    || x'09' || 'str'    || x'09' || 'str' AS record
+        UNION ALL
+            SELECT bin_id || x'09' || prefix || x'09' || name AS record
+            FROM folders
+    ),
+    tsv_fs_objects AS (
+        SELECT group_concat(record, x'0A') AS fs_objects
+        FROM records
+    )
+SELECT * FROM tsv_fs_objects;
+~~~
+
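+Each of these three queries returns a scalar string containing a set of records in the tab-separated values (TSV) format. The last query also prepends a table header consisting of two rows: the column names and the column types.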
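+
+One caveat of building records with the `||` operator is worth a short sketch. In SQLite, concatenation with NULL yields NULL, and *group_concat* skips NULL values, so a record with a single missing field would silently disappear from the output. The query below uses hypothetical data (the second row has a NULL *prefix*, which does not occur in the examples above) and made-up column aliases to illustrate the effect, along with one possible workaround based on *coalesce*:
+
+~~~sql
+WITH
+    folders(bin_id, prefix, name) AS (
+        VALUES
+            ('876', 'C:/Users/Public/Desktop', 'pic'),
+            ('374', NULL,                      'drum')  -- hypothetical missing prefix
+    )
+SELECT
+    -- The second record evaluates to NULL and is skipped by group_concat.
+    group_concat(bin_id || x'09' || prefix || x'09' || name, x'0A') AS dropped_record,
+    -- Replacing NULL with an empty string keeps both records.
+    group_concat(bin_id || x'09' || coalesce(prefix, '') || x'09' || name, x'0A') AS kept_record
+FROM folders;
+~~~
+
+Replacing NULL with an empty string is only one option; the appropriate placeholder depends on how the receiving application distinguishes missing values from empty ones.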
diff --git a/Patterns SQL Interface - JSON Input.md b/Patterns SQL Interface - JSON Input.md
@@ -1,14 +1,14 @@
 ---
 layout: default
-title: JSON SQL Interface
+title: SQL Interface - JSON Input
 nav_order: 4
 parent: Design Patterns
-permalink: /patterns/json-sql
+permalink: /patterns/json-sql-input
 ---
 
 The JSON is a popular and robust format for passing structured information in text form. JSON libraries are broadly available in various environments, including RDBMS engines. The JSON format can serve as a kind of SQL interface for parameterized queries. When an application needs to pass a 1D vector of arbitrary length to the script, it is impossible to have a fixed parameterized script with dedicated query parameters assigned to individual values. Instead, the application packs the data into the JSON format and passes these data in a single string query parameter. The SQL script, in turn, incorporates the code for unpacking the JSON format before further processing.
 
-JSON encoding of the passed data has several benefits. It permits having a fixed SQL script accepting an array of arbitrary length as a query parameter. Also, it establishes a relatively simple, robust, and well-defined SQL interface based on a broadly supported format. Finally, because each query parameter may cost an additional API call, this approach may also improve the overall performance of the database call. For the same reasons, using JSON for encoding the returned data may be more efficient than using a record set. While SQLite API calls are usually fast, each returned value still costs several API calls in SQLite. (Under certain circumstances, however, there might be a high SQLite-independent overhead for each API call.) An important consideration to bear in mind related to the JSON containers is the potential side effects of data conversion between numeric and textual formats.
+JSON encoding of the passed data has several benefits. It permits having a fixed SQL script accepting an array of arbitrary length as a query parameter. Also, it establishes a relatively simple, robust, and well-defined SQL interface based on a broadly supported format. Finally, because each query parameter may cost an additional API call, this approach may also improve the overall performance of the database call. An important consideration to bear in mind when using JSON containers is the potential for side effects of data conversion between numeric and textual formats.
 
 Consider a table *fs_objects(bin_id, prefix, name)* containing a list of file system objects, uniquely identified by their absolute paths (*prefix* || *path_sep* || *name*) and a unique *bin_id*. Suppose an application needs to pass a set of new objects for insertion into this table. Such a set is a 1D vector of arbitrary size, which may contain:
 
@@ -142,7 +142,7 @@ WITH
             json_extract(dirs.value, '$.name')   AS name
         FROM
             json_each(
-                '['                                                                                     ||
+                '['                                                                                    ||
                     '{"bin_id": "239", "prefix": "C:/Winows/System32/drivers/etc", "name": "hosts"},'  ||
                     '{"bin_id": "876", "prefix": "C:/Users/Public/Desktop",        "name": "pic"  },'  ||
                     '{"bin_id": "374", "prefix": "C:/Users/Default/Music",         "name": "drum" }'   ||