|
1 | 1 | # Databases-and-SQL-for-Data-Science-with-Python
|
2 |
| -Databases and SQL for Data Science with Python |
| 2 | + |
| 3 | +A database is a repository of data that provides functionality for adding, modifying, and querying the data. (A database is a logically coherent collection of data with some inherent meaning). SQL is a language used to query or retrieve data from a relational database. |
| 4 | + |
| 5 | +The Relational Model is the most used data model for databases because it allows for data independence. |
| 6 | + |
| 7 | +The primary key of a relational table uniquely identifies each tuple or row, preventing duplication of data and providing a way of defining relationships between tables. |
| 8 | + |
| 9 | +SQL statements fall into two different categories: Data Definition Language (DDL) statements and Data Manipulation Language (DML) statements. |
| 10 | + |
| 11 | + |
| 12 | +In this lab, you will use MySQL. MySQL is a Relational Database Management System (RDBMS) designed to efficiently store, manipulate, and retrieve data |
| 13 | + |
| 14 | +You can use the WHERE clause to refine your query results. |
| 15 | + |
| 16 | +You can use the wildcard character (%) as a substitute for unknown characters in a pattern. |
| 17 | + |
| 18 | +You can use BETWEEN ... AND ... to specify a range of numbers. |
| 19 | + |
| 20 | +You can sort query results into ascending or descending order, using the ORDER BY clause to specify the column to sort on. |
| 21 | + |
| 22 | +You can group query results by using the GROUP BY clause. |
| 23 | + |
| 24 | +Functions, Multiple Tables, and Sub-queries |
| 25 | + |
| 26 | +Most databases come with built-in functions that you can use in SQL statements to perform operations on data within the database itself. |
| 27 | + |
| 28 | +When you work with large datasets, you may save time by using built-in functions rather than first retrieving the data into your application and then executing functions on the retrieved data. |
| 29 | + |
| 30 | +You can use sub-queries to form more powerful queries than otherwise. |
| 31 | + |
| 32 | +You can use a sub-select expression to evaluate some built-in aggregate functions like the average function. |
| 33 | + |
| 34 | +Derived tables or table expressions are sub-queries where the outer query uses the results of the sub-query as a data source. |
| 35 | + |
| 36 | +You can access a database from a language like Python by using the appropriate API. Examples include ibm_db API for IBM DB2, psycopg2 for ProstgreSQL, and dblib API for SQL Server. |
| 37 | + |
| 38 | +DB-API is Python's standard API for accessing relational databases. It allows you to write a single program that works with multiple kinds of relational databases instead of writing a separate program for each one. |
| 39 | + |
| 40 | +The DB_API connect constructor creates a connection to the database and returns a Connection Object, which is then used by the various connection methods. |
| 41 | + |
| 42 | +The connection methods are: |
| 43 | +The cursor() method, which returns a new cursor object using the connection. |
| 44 | +The commit() method, which is used to commit any pending transaction to the database. |
| 45 | +The rollback() method, which causes the database to roll-back to the start of any pending transaction. |
| 46 | +The close() method, which is used to close a database connection. |
| 47 | + |
| 48 | +You can use SQL Magic commands to execute queries more easily from Jupyter Notebooks. |
| 49 | +Magic commands have the general format %sql select * from tablename. |
| 50 | +Cell magics start with a double %% (percent) sign and apply to the entire cell. |
| 51 | +Line magics start with a single % (percent) sign and apply to a particular line in a cell. |
0 commit comments