New Document
Computer Science
Computer Catlog
Oracle Catlog

History of Oracle
Introduction to Terms
Oracle Configurations
Roles of Database Administrator
Oracle Architecture
A Brief History of SQL
Dr. Codd's 12 Rules
An Overview of SQL
The SELECT Statement
Expressions, Conditions, and Operators
Functions
Clauses in SQL
Joining Tables
Sub Query
Manipulating Data
Building a Database
Views and Indexes
Controlling Transactions
Database Security
Advanced SQL Topics
Cursors
Stored Procedures
Triggers
Embedded SQL
SQL Tuning
Using Views in Data Dictionary
Using SQL to Generate SQL Statements

Joins


    Joins will enable you to gather and manipulate data across several tables. You will understand and be able to do the following:

Perform an outer join

Perform a left join

Perform a right join

Perform an equi-join

Perform a non-equi-join

Join a table to itself


    One of the most powerful features of SQL is its capability to gather and manipulate data from across several tables. Without this feature you would have to store all the data elements necessary for each application in one table. Without common tables you would need to store the same data in several tables. Imagine having to redesign, rebuild, and repopulate your tables and databases every time your user needed a query with a new piece of information. The JOIN statement of SQL enables you to design smaller, more specific tables that are easier to maintain than larger tables.

Multiple Tables in a Single SELECT Statement


    Like Dorothy in The Wizard of Oz, you have had the power to join tables, "Introduction to the Query: The SELECT Statement," when you learned about SELECT and FROM. Unlike Dorothy, you don't have to click you heels together three times to perform a join. Use the following two tables, named, cleverly enough, TABLE1 and TABLE2.

NOTE:The queries in examples were produced using Borland's ISQL tool. You will notice some differences between these queries and the ones that we used earlier in the book. For example, these queries do not begin with an SQL prompt. Another difference is that ISQL does not require a semicolon at the end of the statement. (The semicolon is optional in ISQL.) But the SQL basics are still the same.


INPUT:

SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION
FROM ORDERS O, PART P

ANALYSIS:

    The preceding code is just a portion of the result set. The actual set is 14 (number of rows in ORDERS) x 6 (number of rows in PART), or 84 rows. It is similar to the result from joining TABLE1 and TABLE2 earlier today, and it is still one statement shy of being useful. Before we reveal that statement, we need to regress a little and talk about another use for the alias.

Finding the Correct Column


    When you joined TABLE1 and TABLE2, you used SELECT *, which returned all the columns in both tables. In joining ORDERS to PART, the SELECT statement is a bit more complicated:

SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION

    SQL is smart enough to know that ORDEREDON and NAME exist only in ORDERS and that DESCRIPTION exists only in PART, but what about PARTNUM, which exists in both? If you have a column that has the same name in two tables, you must use an alias in your SELECT clause to specify which column you want to display. A common technique is to assign a single character to each table, as you did in the FROM clause:

FROM ORDERS O, PART P

    You use that character with each column name, as you did in the preceding SELECT clause. The SELECT clause could also be written like this:

SELECT ORDEREDON, NAME, O.PARTNUM, P.PARTNUM, DESCRIPTION

    But remember, someday you might have to come back and maintain this query. It doesn't hurt to make it more readable. Now back to the missing statement.

Equi-Joins


INPUT:

SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION
FROM ORDERS O, PART P WHERE O.PARTNUM = P.PARTNUM

ANALYSIS:


    Using the column PARTNUM that exists in both of the preceding tables, you have just combined the information you had stored in the ORDERS table with information from the PART table to show a description of the parts the bike shops have ordered from you. The join that was used is called an equi-join because the goal is to match the values of a column in one table to the corresponding values in the second table.

    You can further qualify this query by adding more conditions in the WHERE clause. For example:

INPUT:

SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION
FROM ORDERS O, PART P WHERE O.PARTNUM = P.PARTNUM AND O.PARTNUM = 76

    The number 76 is not very descriptive, and you wouldn't want your sales people to have to memorize a part number. (We have had the misfortune to see many data information systems in the field that require the end user to know some obscure code for something that had a perfectly good name. Please don't write one of those!) Here's another way to write the query:

TIP: When you join two tables without the use of a WHERE clause, you are performing a Cartesian join. This join combines all rows from all the tables in the FROM clause. If each table has 200 rows, then you will end up with 40,000 rows in your results (200 x 200). Always join your tables in the WHERE clause unless you have a real need to join all the rows of all the selected tables.

NOTE:Notice the consistent use of table and column aliases in the SQL statement examples. You will save many, many keystrokes by using aliases. They also help to make your statement more readable.


    Can you join more than one table? For example, to generate information to send out an invoice, you could type this statement:

INPUT:

SELECT C.NAME, C.ADDRESS, (O.QUANTITY * P.PRICE) TOTAL
FROM ORDER O, PART P, CUSTOMER C
WHERE O.PARTNUM = P.PARTNUM AND O.NAME = C.NAME

NOTE:Notice that when joining the three tables (ORDERS, PART, and CUSTOMER) that the ORDERS table was used in two joins and the other tables were used only once. Tables that will return the fewest rows with the given conditions are commonly referred to as driving tables, or base tables. Tables other than the base table in a query are usually joined to the base table for more efficient data retrieval. Consequently, the ORDERS table is the base table in this example. In most databases a few base tables join (either directly or indirectly) all the other tables. (See Day 15, "Streamlining SQL Statements for Improved Performance," for more on base tables.)

NOTE:In the example, SQL grouped TABLE1 and TABLE2 to create a new table with X (rows in TABLE1) x Y (rows in TABLE2) number of rows. A physical table is not created by the join, but rather in a virtual sense. The join between the two tables produces a new set that meets all conditions in the WHERE clause, including the join itself. The SELECT statement has reduced the number of rows displayed, but to evaluate the WHERE clause SQL still creates all the possible rows. The sample tables in today's examples have only a handful of rows. Your actual data may have thousands of rows. If you are working on a platform with lots of horsepower, using a multiple-table join might not visibly affect performance. However, if you are working in a slower environment, joins could cause a significant slowdown.

We aren't telling you not to use joins, because you have seen the advantages to be gained from a relational design. Just be aware of the platform you are using and your customer's requirements for speed versus reliability.


Non-Equi-Joins


    Because SQL supports an equi-join, you might assume that SQL also has a non-equi-join. You would be right! Whereas the equi-join uses an = sign in the WHERE statement, the non-equi-join uses everything but an = sign. For example:

INPUT:

SELECT O.NAME, O.PARTNUM, P.PARTNUM, O.QUANTITY * P.PRICE TOTAL
FROM ORDERS O, PART P WHERE O.PARTNUM > P.PARTNUM

ANALYSIS:

    This listing goes on to describe all the rows in the join WHERE O.PARTNUM > P.PARTNUM. In the context of your bicycle shop, this information doesn't have much meaning, and in the real world the equi-join is far more common than the non-equi-join. However, you may encounter an application in which a non-equi-join produces the perfect result.

Outer Joins versus Inner Joins


    Just as the non-equi-join balances the equi-join, an outer join complements the inner join. An inner join is where the rows of the tables are combined with each other, producing a number of new rows equal to the product of the number of rows in each table. Also, the inner join uses these rows to determine the result of the WHERE clause. An outer join groups the two tables in a slightly different way. Using the PART and ORDERS tables from the previous examples, perform the following inner join:

INPUT:

SELECT P.PARTNUM, P.DESCRIPTION, P.PRICE, O.NAME, O.PARTNUM
FROM PART P JOIN ORDERS O ON ORDERS.PARTNUM = 54

ANALYSIS:

    The result is that all the rows in PART are spliced on to specific rows in ORDERS where the column PARTNUM is 54. Here's a RIGHT OUTER JOIN statement:

NOTE:The syntax you used to get this join--JOIN ON--is not ANSI standard. The implementation you used for this example has additional syntax. You are using it here to specify an inner and an outer join. Most implementations of SQL have similar extensions. Notice the absence of the WHERE clause in this type of join.


RIGHT OUTER JOIN


INPUT:

SELECT P.PARTNUM, P.DESCRIPTION, P.PRICE, O.NAME, O.PARTNUM
FROM PART P RIGHT OUTER JOIN ORDERS O ON ORDERS.PARTNUM = 54

ANALYSIS:

    This type of query is new. First you specified a RIGHT OUTER JOIN, which caused SQL to return a full set of the right table, ORDERS, and to place nulls in the fields where ORDERS.PARTNUM <>54. Following is a LEFT OUTER JOIN statement:

LEFT OUTER JOIN


INPUT:

SELECT P.PARTNUM, P.DESCRIPTION,P.PRICE, O.NAME, O.PARTNUM
FROM PART P LEFT OUTER JOIN ORDERS O ON ORDERS.PARTNUM = 54

ANALYSIS:

    You get the same six rows as the INNER JOIN. Because you specified LEFT (the LEFT table), PART determined the number of rows you would return. Because PART is smaller than ORDERS, SQL saw no need to pad those other fields with blanks.

    Don't worry too much about inner and outer joins. Most SQL products determine the optimum JOIN for your query. In fact, if you are placing your query into a stored procedure (or using it inside a program, you should not specify a join type even if your SQL implementation provides the proper syntax. If you do specify a join type, the optimizer chooses your way instead of the optimum way.

    Some implementations of SQL use the + sign instead of an OUTER JOIN statement. The + simply means "Show me everything even if something is missing." Here's the syntax:

SYNTAX:

SELECT e.name, e.employee-id, ep.salary, ep.marital-status
FROM employee-tbl e, employee-pay-tbl ep
WHERE e.employee-id = ep.employee-id(+) AND e.name like '%MITH';

ANALYSIS:

    This statement is joining the two tables. The + sign on the ep.employee-id column will return all rows even if they are empty.

Joining a Table to Itself (Self Join)


    Today's final topic is the often-used technique of joining a table to itself. The syntax of this operation is similar to joining two tables. For example, to join table TABLE1 to itself:

INPUT:

SELECT F.PARTNUM, F.DESCRIPTION, S.PARTNUM,S.DESCRIPTION
FROM PART F, PART S
WHERE F.PARTNUM = S.PARTNUM AND F.DESCRIPTION <> S.DESCRIPTION

ANALYSIS:

    Now you are a hero until someone asks why the table has only two entries. You, remembering what you have learned about JOINs, retain your hero status by explaining how the join produced two rows that satisfied the condition WHERE F.PARTNUM = S.PARTNUM AND F.DESCRIPTION <>S.DESCRIPTION. Of course, at some point, the row of data containing the duplicate PARTNUM would have to be corrected.

Summary


    A Join combines all possible combinations of rows present in the selected tables. These new rows are then available for selection based on the information that you want.