
Commit 9903b89

docs(tutorials): add redshift datatype examples
1 parent 1a3d820 commit 9903b89

3 files changed: +238 −41 lines changed


README.rst (+1)

@@ -56,6 +56,7 @@ Tutorials
 - `001 - Connecting to Amazon Redshift <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/001%20-%20Connecting%20to%20Amazon%20Redshift.ipynb>`_
 - `002 - Data Science Library Integrations <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/002%20-%20Data%20Science%20Library%20Integrations.ipynb>`_
 - `003 - Amazon Redshift Feature Support <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/003%20-%20Amazon%20Redshift%20Feature%20Support.ipynb>`_
+- `004 - Amazon Redshift Datatypes <https://github.com/aws/amazon-redshift-python-driver/blob/master/tutorials/004%20-%20Amazon%20Redshift%20Datatypes.ipynb>`_

 We are working to add more documentation and would love your feedback. Please reach out to the team by `opening an issue <https://github.com/aws/amazon-redshift-python-driver/issues/new/choose>`__ or `starting a discussion <https://github.com/aws/amazon-redshift-python-driver/discussions/new>`_ to help us fill in the gaps in our documentation.

tutorials/003 - Amazon Redshift Feature Support.ipynb (+23 −41)
@@ -13,26 +13,27 @@
   },
   {
    "cell_type": "markdown",
+   "metadata": {},
    "source": [
     "# Overview\n",
     "`redshift_connector` aims to support the latest and greatest features provided by Amazon Redshift so you can get the most out of your data."
-   ],
-   "metadata": {
-    "collapsed": false
-   }
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {},
    "source": [
     "## COPY and UNLOAD Support - Amazon S3\n",
     "`redshift_connector` provides the ability to `COPY` and `UNLOAD` data from an Amazon S3 bucket. Shown below is a sample workflow which copies and unloads data from an Amazon S3 bucket"
-   ],
-   "metadata": {
-    "collapsed": false
-   }
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
    "source": [
     "1. Upload the following text file to an Amazon S3 bucket and name it `category_csv.txt`\n",
     "\n",
@@ -42,17 +43,16 @@
     " 14,Shows,Opera,\"All opera, light, and \"\"rock\"\" opera\"\n",
     " 15,Concerts,Classical,\"All symphony, concerto, and choir concerts\"\n",
     "```"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%% md\n"
-    }
-   }
+   ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
    "outputs": [],
    "source": [
     "import redshift_connector\n",
@@ -70,53 +70,35 @@
     "        print(cursor.fetchall())\n",
     "        cursor.execute(\"unload ('select * from category') to 's3://testing/unloaded_category_csv.txt' iam_role 'arn:aws:iam::123:role/RedshiftCopyUnload' csv;\")\n",
     "        print('done')\n"
-   ],
-   "metadata": {
-    "collapsed": false,
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   }
+   ]
   },
   {
    "cell_type": "markdown",
+   "metadata": {},
    "source": [
     "After executing the above code block, we can see the requested data was unloaded into the following file, `unloaded_category_csv.text0000_part00`, in the specified Amazon S3 bucket\n"
-   ],
-   "metadata": {
-    "collapsed": false
-   }
-  },
-  {
-   "cell_type": "markdown",
-   "source": [
-    "## Datatype Support\n",
-    "`redshift_connector` supports Amazon Redshift specific datatypes in order to provide users integration of their data into Python projects. Please see the projects [README](https://github.com/aws/amazon-redshift-python-driver/blob/master/README.rst) for a list of supported datatypes."
-   ],
-   "metadata": {
-    "collapsed": false
-   }
+   ]
   }
  ],
 "metadata": {
  "kernelspec": {
-  "display_name": "Python 3",
+  "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
-   "version": 2
+   "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
-  "pygments_lexer": "ipython2",
-  "version": "2.7.6"
+  "pygments_lexer": "ipython3",
+  "version": "3.9.7"
  }
 },
 "nbformat": 4,
-"nbformat_minor": 0
+"nbformat_minor": 1
 }
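
For context, the full COPY and UNLOAD workflow these cells describe looks roughly like the following. This is a minimal sketch, not the tutorial's exact code: the connection values are placeholders, and the `category` table definition is an assumption inferred from the `category_csv.txt` sample; only the bucket path and IAM role strings are taken from the cell above.

import redshift_connector

# Placeholder connection details (assumption) - replace with your cluster's values.
with redshift_connector.connect(
    host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
    database='dev',
    user='awsuser',
    password='my_password'
) as conn:
    with conn.cursor() as cursor:
        # Table shape implied by the category_csv.txt sample (assumption).
        cursor.execute("create table category (catid int, catgroup varchar, catname varchar, catdesc varchar);")
        # COPY the uploaded file from Amazon S3 into the table.
        cursor.execute("copy category from 's3://testing/category_csv.txt' iam_role 'arn:aws:iam::123:role/RedshiftCopyUnload' csv;")
        cursor.execute("select * from category;")
        print(cursor.fetchall())
        # UNLOAD the rows back to Amazon S3 as CSV.
        cursor.execute("unload ('select * from category') to 's3://testing/unloaded_category_csv.txt' iam_role 'arn:aws:iam::123:role/RedshiftCopyUnload' csv;")
        print('done')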
tutorials/004 - Amazon Redshift Datatypes.ipynb (new file, +214)

@@ -0,0 +1,214 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": true
+   },
+   "source": [
+    "## Datatype Support\n",
+    "`redshift_connector` supports Amazon Redshift specific datatypes so that users can integrate their data into Python projects. Please see the project's [README](https://github.com/aws/amazon-redshift-python-driver/blob/master/README.rst) for a list of supported datatypes."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Examples\n",
+    "The following sections provide basic examples showing how to work with Amazon Redshift datatypes.\n",
+    "\n",
+    "#### Geometry\n",
+    "- **Send**: A string holding geometry data in WKB (well-known binary) format.\n",
+    "- **Receive**: A string holding geometry data in WKB format.\n",
+    "\n",
+    "**Note**: Geometry data can be sent and received in formats other than WKB if Amazon Redshift spatial functions are applied. Please see the [Amazon Redshift documentation for a list of spatial functions](https://docs.aws.amazon.com/redshift/latest/dg/geospatial-functions.html).\n",
+    "\n",
+    "[Geometry](https://docs.aws.amazon.com/redshift/latest/dg/GeometryType-function.html)\n",
+    "\n",
+    "Sending data in WKB format:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import redshift_connector\n",
+    "\n",
+    "with redshift_connector.connect(...) as conn:\n",
+    "    with conn.cursor() as cursor:\n",
+    "        cursor.execute(\"create table datatype_test (c1 geometry);\")\n",
+    "        cursor.execute(\n",
+    "            \"insert into datatype_test (c1) values (%s);\",\n",
+    "            (\n",
+    "                '0103000020E61000000100000005000000000000000000000000000000000000000000000000000000000000000000F03F000000000000F03F000000000000F03F000000000000F03F000000000000000000000000000000000000000000000000',\n",
+    "                # using WKB format\n",
+    "            )\n",
+    "        )\n",
+    "        cursor.execute(\"select c1 from datatype_test;\")\n",
+    "        result = cursor.fetchone()\n",
+    "        print(\"c1={}\".format(result[0]))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Sending data in WKT (well-known text) format:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import redshift_connector\n",
+    "\n",
+    "with redshift_connector.connect(...) as conn:\n",
+    "    with conn.cursor() as cursor:\n",
+    "        cursor.execute(\"create table datatype_test (c1 geometry);\")\n",
+    "        cursor.execute(\n",
+    "            \"insert into datatype_test (c1) values (ST_GeomFromText(%s));\",\n",
+    "            (\n",
+    "                'LINESTRING(1 2,3 4,5 6,7 8,9 10,11 12,13 14,15 16,17 18,19 20)',  # using WKT format\n",
+    "            )\n",
+    "        )\n",
+    "        cursor.execute(\"select c1 from datatype_test;\")\n",
+    "        result = cursor.fetchone()\n",
+    "        print(\"c1={}\".format(result[0]))"
+   ]
+  },
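
Because `GEOMETRY` values are received as hex-encoded WKB strings, they can also be decoded client-side. A minimal sketch, assuming the optional third-party shapely package (not used by the tutorial itself); the hex string is the one inserted in the cell above:

from shapely import wkb  # assumption: shapely is installed (pip install shapely)

# The hex-encoded WKB string inserted in the example above.
hex_wkb = '0103000020E61000000100000005000000000000000000000000000000000000000000000000000000000000000000F03F000000000000F03F000000000000F03F000000000000F03F000000000000000000000000000000000000000000000000'
geom = wkb.loads(bytes.fromhex(hex_wkb))  # EWKB; the embedded SRID (4326) is preserved
print(geom.wkt)  # POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))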
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Super\n",
+    "- **Send**: A string containing JSON data.\n",
+    "- **Receive**: A string containing JSON data.\n",
+    "\n",
+    "[Super](https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html)\n",
+    "[json_parse](https://docs.aws.amazon.com/redshift/latest/dg/JSON_PARSE.html)\n",
+    "[Unnesting SUPER arrays](https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#unnest)\n",
+    "[Querying semistructured data](https://docs.aws.amazon.com/redshift/latest/dg/query-super.html)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import redshift_connector\n",
+    "\n",
+    "with redshift_connector.connect(...) as conn:\n",
+    "    with conn.cursor() as cursor:\n",
+    "        cursor.execute(\n",
+    "            \"CREATE TABLE foo AS SELECT json_parse(%s) AS multi_level_array;\",\n",
+    "            ('[[1.1, 1.2], [2.1, 2.2], [3.1, 3.2]]',)\n",
+    "        )\n",
+    "        cursor.execute(\"SELECT array, element FROM foo AS f, f.multi_level_array AS array, array AS element;\")\n",
+    "        result = cursor.fetchall()\n",
+    "        print(result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Retrieving array elements from a JSON array stored in a SUPER column:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import redshift_connector\n",
+    "import json\n",
+    "\n",
+    "with redshift_connector.connect(...) as conn:\n",
+    "    with conn.cursor() as cursor:\n",
+    "        cursor.execute(\n",
+    "            \"CREATE TABLE foo AS SELECT json_parse(%s) AS vals;\",\n",
+    "            (json.dumps({\"x\": [1, 2, 3, 4], \"y\": [5, 6, 7, 8], \"z\": [9, 10, 11, 12]}),)\n",
+    "        )\n",
+    "        cursor.execute(\"SELECT vals.x FROM foo;\")\n",
+    "        result = cursor.fetchall()\n",
+    "        print(result)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import redshift_connector\n",
+    "import json\n",
+    "\n",
+    "with redshift_connector.connect(...) as conn:\n",
+    "    with conn.cursor() as cursor:\n",
+    "        cursor.execute(\"create table t (s super);\")\n",
+    "        cursor.execute(\"insert into t values (json_parse(%s));\", ('[10001,10002,\"abc\"]',))\n",
+    "        cursor.execute(\"insert into t values (json_parse(%s));\", (json.dumps({\"x\": [1, 2, 3, 4]}),))\n",
+    "        cursor.execute(\"select s from t;\")\n",
+    "        result = cursor.fetchall()\n",
+    "        print(result)"
+   ]
+  },
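
Since `SUPER` values are received as JSON text, the standard-library json module is enough to turn fetched values into Python objects. A minimal sketch using the value shape produced by the cells above:

import json

# Shape of a value fetched from the SUPER column above.
value = '[10001,10002,"abc"]'
parsed = json.loads(value)
print(parsed[0] + parsed[1])  # 20003
print(parsed[2].upper())      # ABC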
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Varbyte\n",
+    "- **Send**: A string or bytes.\n",
+    "- **Receive**: A string containing data in hexadecimal format.\n",
+    "\n",
+    "[Varbyte](https://docs.aws.amazon.com/redshift/latest/dg/r_VARBYTE_type.html)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import redshift_connector\n",
+    "\n",
+    "with redshift_connector.connect(...) as conn:\n",
+    "    with conn.cursor() as cursor:\n",
+    "        cursor.execute(\"create table t (v varbyte);\")\n",
+    "        cursor.execute(\"insert into t values (%s), (%s);\", ('aa', 'abc'))\n",
+    "        cursor.execute(\"insert into t values (%s), (%s);\", (b'aa', b'abc'))\n",
+    "        cursor.execute(\"insert into t values (%s), (%s);\", (b'\\x00\\x01\\x02\\x03', b'\\x00\\x0a\\x0b\\x0c'))\n",
+    "        cursor.execute(\"select v from t;\")\n",
+    "        result = cursor.fetchall()\n",
+    "        print(result)"
+   ]
+  }
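
`VARBYTE` values are received as hexadecimal text, so the original bytes can be recovered with the standard library. A minimal sketch using the kind of value the cell above produces:

# 'abc' stored as varbyte is returned as its hex encoding.
value = '616263'
raw = bytes.fromhex(value)
print(raw)           # b'abc'
print(raw.decode())  # abc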
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
