SQL is the Structured Query Language. It was created specifically to manage and access data held in relational databases. The first commercial venture to use SQL was Oracle, which still has its SQL*Plus query language as the basis of all its relational database management system (RDMS).
Today, MySQL, SQL Server, SAP, and Sybase all use SQL for database access. In this guide, we will give you the standard version of SQL commands, paying particular attention to the Oracle implementation in SQL*Plus.
EDITOR’S CHOICE: We recommend SolarWinds infrastructure management tools to monitor all of your Microsoft tools, including SQL Server. You need a suite of monitoring utilities that can interact in order to cover all of the facilities that support your relational database.
Relational databases
SQL manages relational databases. A relational database contains tables that are separated to reduce the repetition of data. The process of deciding on the proper groupings of data categories is called “normalization.” In some instances, performance demands may dictate that tables work better if not completely normalized. This process is called “denormalization” and it is a rare occurrence.
Each table needs a column that uniquely identifies each row. This is called the “primary key.” The other tables that can join to that entity must contain a reference to the primary key in each record. That reference is called a “foreign key.” You join tables together by matching the foreign key in one table with the primary key in another. This action is performed with SQL. The Entity-Relationship Diagram is a useful reference when you are deciding on how to structure your SQL statement.
In order to speed up access, it is normal to place an index on each primary key in the database. It is also possible to store frequently-used joins to create virtual tables, which are called “views.”
One more database object is a trigger. This is a piece of SQL that launches whenever a value is added to a column or is changed. A trigger will update values in a column in a different table or other columns in the same table. This happens if a value is added or changed in the column on which the trigger is placed. It can also restrict the values that can be entered into a column.
DQL, DDL, DCL, and DML
SQL includes four command subsets. These are Data Query Language (DQL), Data Description Language (DDL), Data Control Language (DCL), and Data Manipulation Language (DML).
On most projects, the Database Administrator (DBA) is the only team member that has the privileges to use DDL. This command group creates, changes, and drops tables, indexes, triggers, and views.
DML is used to add, change, or remove records from tables in the database. Access to these SQL functions is usually restricted to support staff or senior, responsible members of the user community.
DQL is use to generate reports from the database. This command subset is used by programmers and can also be made available to end users so they can perform their own queries.
The management of user privileges is stored in the same database as the data. However, access to these tables is usually restricted to the DBA. The management of user access is all implemented through DCL.
There is one useful SQL command that doesn’t fall into the classifications of either DQL, DDL, DCL, or DML. This is “describe” which is usually implemented as DESC. This shows the structure of an object in the database. In the case of a table, it will show a list of columns and the datatype and size of each. It will show the SQL that created a view or a trigger and it will show the columns that are involved in a given index.
Relational operators
The theory behind relational databases was defined by E. F. Codd in 1970. Within Codd’s definition lies a requirement of a query tool that must implement eight essential relational operations. These are:
- SELECT
- PROJECT
- JOIN
- PRODUCT
- UNION
- INTERSECT
- DIFFERENCE
- DIVIDE
You need to master each of these operations in order to effectively access data from a relational database. These are explained below.
SELECT
SELECT was originally called RESTRICT. It is the ability to extract one record or a group of matching records from a table in the database. As well as being a theoretical operation, SELECT is the name of the main command in any SQL implementation. In fact, all data retrieval is executed with a SELECT. The command is so powerful that it can be modified in many, many ways. Most of the SQL you will need to learn are the “clauses” on a SELECT statement.
PROJECT
PROJECT is the ability to isolate a column (or “attribute”) in a table. This is always implemented with a SELECT statement.
JOIN
JOIN links together tables on a common attribute. You use the foreign key in one table to match to primary key values in another table. The act effectively unifies two tables temporarily. This is implemented as a clause in a SELECT statement. A join between two tables will automatically leave out any records that do not appear in both supporting tables. An “outer join” will display records that don’t have a match in the other table, leaving the fields that should have been supplied by the other table left blank.
PRODUCT
PRODUCT is also known as “times.” This is implemented as a SELECT statement that draws data from two tables, but doesn’t specify the join between them. In real life, there are very few practical reasons why anyone would ever perform a times on two tables. It is usually a warning sign that you have written a JOIN clause badly. The result of a PRODUCT is that every qualifying record in one table matches with every qualifying record in the other table. So, if you create a SELECT statement from two tables, each of which have ten records, the results will be a total of 100 records.
UNION
UNION is a common operator in set theory. If you have two sets, the union combines all of the members of both sets. Duplication is removed from the results. In SQL, this is implemented by the OR Boolean operator.
INTERSECT
INTERSECT is the other main operator of set theory. If you have two sets, the intersect describes those subjects that are members of both sets. Any subject that is a member of only one of the two sets is excluded from the results. In SQL, this is implemented by the AND Boolean operator.
DIFFERENCE
DIFFERENCE is also known as “minus” and in notation is represented by the minus sign (“-“). There are several ways to implement a DIFFERENCE. It can be created by including a Boolean NOT in the selection criteria and it can also be implemented by a MINUS operator followed by a subquery.
DIVIDE
DIVIDE is an unusual operation that, like PRODUCT, has very few practical applications. Codd regarded DIVIDE as the acid test of any relational system and its query tool, so every SQL implementation includes this capability just to qualify for the title “relational.” In reality, you will never use it. Implementing a DIVIDE require a series of subqueries feeding into a MINUS in Oracle SQL*Plus and an EXCEPT operator in some other SQL implementations, such as SQL Server and PostgreSQL.
Boolean operators
George Boole was a British mathematician who died in 1864. He invented Boolean algebra, which had no practical application until the fields of Informatics and relational databases emerged in the late 20th century. Boolean algebra is central to the operations of SQL — you could only write the most basic SQL query without using Boolean logic. Boolean logic also underpins all search engines. The key Boolean operators in SQL are:
- AND
- OR
- NOT
Another quirky Victorian Brit that you need to know about is Augustus de Morgan. He came up with an observation on transformation in Boolean set theory, of which every SQL programmer needs to be aware. This is that a NOT can alter how an AND and an OR work. The short version is that you need to be careful how you use brackets in your statements to avoid unintended consequences. So “and” and “or” should also be considered to be important Boolean operators in SQL.
AND
AND implements the INTERSECT in set theory. If you want to pull records out of a database that only match with two requirements and not just one of that composite criteria, then you need to use an AND. For example, you may run a seed business and your database contains a table that stores all of your seeds with their attributes. A customer calls in and wants only yellow begonias. So, you would search your database for species = “begonia” AND color = “yellow” to get the right seed catalog.
OR
OR gives you all of the members of two sets – it is the UNION in set theory. So, in your seed database, if you ran a query: species = “begonia” OR color = “yellow”, you would get results that included yellow roses, yellow begonia, pink begonias, and yellow daffodils. In this example, yellow begonias qualify twice. In relational theory, only one record should be returned. Fortunately, SQL includes the qualifier DISTINCT, so by entering SELECT DISTINCT * FROM flowers WHERE species = “begonia” OR color = “yellow”, that yellow begonia record would appear only once.
NOT
NOT is easy to understand. It excludes the results that the following statement would produce: AND NOT, OR NOT. In practice, you will almost never use “OR NOT” in a query. If your seed customer wanted any color of begonia other than yellow, you would use species = “begonia” AND NOT color = “yellow” to select products to offer.
Brackets
According to De Morgan’s Law:
“the negation of a disjunction is the conjunction of the negations; and the negation of a conjunction is the disjunction of the negations”.
In plain English, use brackets carefully in your SQL statement when also including a NOT to prevent your ANDs from transforming into ORs and vice versa. Bracketing conditions together can also shorten your selection criteria. For example, NOT (species = “begonia” AND color = “yellow”) is the same as NOT species = “begonia” OR NOT color = “yellow”.
Extracting records
Different SQL implementations use different syntax, but in this guide, we will follow Oracle’s SQL*Plus. The basic structure of an SQL data selection is:
SELECT