Better to know some
... than all
Dr. Codd's 12 Rules for a Relational Database Model
The most popular data storage model is the relational database, which grew from the seminal paper "A Relational Model of Data for Large Shared Data Banks," written by Dr. E. F. Codd in 1970. SQL evolved to service the concepts of the relational database model. Dr. Codd defined 13 rules, oddly enough referred to as Codd's 12 Rules, for the relational model:
0. A relational DBMS must be able to manage databases entirely through its relational capabilities.
1. Information rule-- All information in a relational database (including table and column names) is represented explicitly as values in tables.
2. Guaranteed access--Every value in a relational database is guaranteed to be accessible by using a combination of the table name, primary key value, and column name.
3. Systematic null value support--The DBMS provides systematic support for the treatment of null values (unknown or inapplicable data), distinct from default values, and independent of any domain.
4. Active, online relational catalog--The description of the database and its contents is represented at the logical level as tables and can therefore be queried using the database language.
5. Comprehensive data sublanguage--At least one supported language must have a well-defined syntax and be comprehensive. It must support data definition, manipulation, integrity rules, authorization, and transactions.
6. View updating rule--All views that are theoretically updatable can be updated through the system.
7. Set-level insertion, update, and deletion--The DBMS supports not only set-level retrievals but also set-level inserts, updates, and deletes.
8. Physical data independence--Application programs and ad hoc programs are logically unaffected when physical access methods or storage structures are altered.
9. Logical data independence--Application programs and ad hoc programs are logically unaffected, to the extent possible, when changes are made to the table structures.
10. Integrity independence--The database language must be capable of defining integrity rules. They must be stored in the online catalog, and they cannot be bypassed.
11. Distribution independence--Application programs and ad hoc requests are logically unaffected when data is first distributed or when it is redistributed.
12. Nonsubversion--It must not be possible to bypass the integrity rules defined through the database language by using lower-level languages. Most databases have had a "parent/child" relationship; that is, a parent node would contain file pointers to its children.
Codd's relational database management system
This method has several advantages and many disadvantages. In its favor is the fact that the physical structure of data on a disk becomes unimportant. The programmer simply stores pointers to the next location, so data can be accessed in this manner. Also, data can be added and deleted easily. However, different groups of information could not be easily joined to form new information. The format of the data on the disk could not be arbitrarily changed after the database was created. Doing so would require the creation of a new database structure.
Codd's idea for an RDBMS uses the mathematical concepts of relational algebra to break down data into sets and related common subsets.
Because information can naturally be grouped into distinct sets, Dr. Codd organized his database system around this concept. Under the relational model, data is separated into sets that resemble a table structure. This table structure consists of individual data elements called columns or fields. A single set of a group of fields is known as a record or row.
Designing the Database Structure
The most important decision for a database designer, after the hardware platform and the RDBMS have been chosen, is the structure of the tables. Decisions made at this stage of the design can affect performance and programming later during the development process. The process of separating data into distinct, unique sets is called normalization.
Today's Database Landscape
Computing technology has made a permanent change in the ways businesses work around the world. Information that was at one time stored in warehouses full of filing cabinets can now be accessed instantaneously at the click of a mouse button. Orders placed by customers in foreign countries can now be instantly processed on the floor of a manufacturing facility. Although 20 years ago much of this information had been transported onto corporate mainframe databases, offices still operated in a batch-processing environment. If a query needed to be performed, someone notified the management information systems (MIS) department; the requested data was delivered as soon as possible (though often not soon enough).
In addition to the development of the relational database model, two technologies led to the rapid growth of what are now called client/server database systems. The first important technology was the personal computer. Inexpensive, easy-to-use applications such as Lotus 1-2-3 and Word Perfect enabled employees (and home computer users) to create documents and manage data quickly and accurately. Users became accustomed to continually upgrading systems because the rate of change was so rapid, even as the price of the more advanced systems continued to fall.
The second important technology was the local area network (LAN) and its integration into offices across the world. Although users were accustomed to terminal connections to a corporate mainframe, now word processing files could be stored locally within an office and accessed from any computer attached to the network. After the Apple Macintosh introduced a friendly graphical user interface, computers were not only inexpensive and powerful but also easy to use. In addition, they could be accessed from remote sites, and large amounts of data could be off-loaded to departmental data servers.
During this time of rapid change and advancement, a new type of system appeared. Called client/server development because processing is split between client computers and a database server, this new breed of application was a radical change from mainframe-based application programming. Among the many advantages of this type of architecture are
* Reduced maintenance costs
* Reduced network load (processing occurs on database server or client computer)
* Multiple operating systems that can interoperate as long as they share a common network protocol
* Improved data integrity owing to centralized data location
Client/server computing is a processing model in which a single application is partitioned between multiple processors (front-end and back-end) and the processors cooperate (transparent to the end user) to complete the processing as a single unified task. Implementing Client/Server Computing A client/server bond product ties the processors together to provide a single system image (illusion). Shareable resources are positioned as requestor clients that access authorized services. The architecture is endlessly recursive; in turn, servers can become clients and request services of other servers on the network, and so on and so on.
This type of application development requires an entirely new set of programming skills. User interface programming is now written for graphical user interfaces, whether it be MS Windows, IBM OS/2, Apple Macintosh, or the UNIX X-Window system. Using SQL and a network connection, the application can interface to a database residing on a remote server. The increased power of personal computer hardware enables critical database information to be stored on a relatively inexpensive standalone server. In addition, this server can be replaced later with little or no change to the client applications.
A Cross-Product Language
You can apply the basic concepts introduced in this book in many environments--for example, Microsoft Access running on a single-user Windows application or SQL Server running with 100 user connections. One of SQL's greatest benefits is that it is truly a cross-platform language and a cross-product language. Because it is also what programmers refer to as a high-level or fourth-generation language (4GL), a large amount of work can be donehigher-level language 4GL (fourth-generation) language fourth-generation (4GL) language in fewer lines of code.
Oracle Corporation released the first commercial RDBMS that used SQL. Although the original versions were developed for VAX/VMS systems, Oracle was one of the first vendors to release a DOS version of its RDBMS. (Oracle is now available on more than 70 platforms.) In the mid-1980s Sybase released its RDBMS, SQL Server. With client libraries for database access, support for stored procedures (discussed on Day 14, "Dynamic Uses of SQL"), and interoperability with various networks, SQL Server became a successful product, particularly in client/server environments. One of the strongest points for both of theseSQL Server powerful database systems is their scalability across platforms. C language code (combined with SQL) written for Oracle on a PC is virtually identical to its counterpart written for an Oracle database running on a VAX system.
SQL and Client/Server Application Development
The common thread that runs throughout client/server application development is the use client/server computing of SQL and relational databases. Also, using this database technology in a single-user business application positions the application for future growth.