Success Stories of Software Engineer: DataBase Design :Part1

Suppose you have a large collection of compact discs and you want to create a database to track them. The first step is to determine what the data that you are going to store is about. One good way to start is to think about why you want to store the data in the first place. In our case, we most likely want to be able to look up CDs by artist, title, and song. Since we want to look up those items, we know they must be included in the database. In addition, it is often useful to simply list items that should be tracked. One possible list might include: CD title, record label, band name, song title. As a starting point, we will store the data in the table shown in Table 2-1.

Table 2-1. A CD Database Made Up of a Single Table

Band Name	CD Title	Record Label	Songs
Stevie Wonder	Talking Book	Motown	You Are the Sunshine of My Life, Maybe Your Baby, Superstition, . . .
Miles Davis Quintet	Miles Smiles	Columbia	Orbits, Circle, . . .
Wayne Shorter	Speak No Evil	Blue Note	Witch Hunt, Fee-Fi-Fo-Fum
Herbie Hancock	Headhunters	Columbia	Man, . . .
Herbie Hancock	Maiden Voyage	Blue Note	Maiden Voyage

There are basically three components to design a database: Conceptual, logical and physical. Logical database design involves modeling your business requirements and data using database components, such as tables and constraints, without regard for how or where the data will be physically stored. Physical database design involves mapping the logical design onto physical media, taking advantage of the hardware and software features available, which allows the data to be physically accessed and maintained as quickly as possible, and indexing.

It is important to correctly design the database to model your business requirements, and to take advantage of hardware and software features early in the development cycle of a database application, because it is difficult to make changes to these components later.

Logical database Design:

A logical schema won't exist in your database. A logical schema is a design-centric database structure built to meet your business requirements. It is a model that exists on a white board or in a diagraming tool. It is like the architect's drawings of your database.

Basic database logical model objects (DBMS specific) are: Schema, Tables, Columns, Data Types, Constraints: primary key, foreign key, uniqueness, referential integrity, default, range

Logical Schema
- is made out of actually what you plan (your initial data model) and it's objects are:
- tablespaces
- tables
- views
- procedures
- packages etc...

Logical design involves defining business entities, attributes for each entity, and relationships among entities. The latter are sometimes referred to as cardinality and optionality rules. For example, a grocery store application might include entities for store, employee, customer, item and sales transaction. Each store can have multiple employees, but each employee might only be allowed to be a full-time employee in a single store. Similarly each sales transaction must have a single customer, and can have one or more items sold.

Logical data models (LDMs) are often produced in the form of an Entity Relationship Diagram (ERD), which uses notation for expressing entities as boxes and relationships as lines. Attributes either appear in the boxes or are listed for each entity. Relationships are lines between the boxes, with the cardinality and optionality expressed as a circle for zero (or optional), a vertical line for "one" and a crow's foot for "many".

Store -||---employs---0|< Employee

(a Store employs zero one or more Employees and an employee is employed by exactly one Store).

Customer -||-- participates in -|< Sales Transaction

(a Customer participates in one or more Sales Transactions and a Sales Transaction is participated in by exactly one Customer)

For Logical data modeling:

1. Identify and model the entities.

2. Identify and model the relationships between the entities.

3. Identify and model the attributes.

4. Identify unique identifiers for each entity.

5. Normalize.

Logical modeling is not specific to the database engine or other technical platform. It is organizationally independent as well. Logical models are generally highly normalized. Logical models are often used to derive or generate physical models via a physical design process.

Note that some methodologies (such as ORM) refer to a technology-independent data models as Conceptual Data Models (CDM) and a Logical Data Model is constrained by technology constraints. It is important if reading or discussing CDMs and LDMs to understand the context of how these terms are used.

Physical Database Design:-

The I/O subsystem (storage engine) is a key component of any relational database. A successful database implementation usually requires careful planning at the early stages of your project. The storage engine of a relational database requires much of this planning, which includes determining:

What type of disk hardware to use, such as RAID (redundant array of independent disks) devices.
How to place your data onto the disks. For more information, see Data Placement Using File groups.
Which index design to use to improve query performance in accessing data.
How to set all configuration parameters appropriately for the database to perform well.

Physical design involves generating the schema for tables, indexes, default and check constraints, and views for a specific physical structure such as a database, file, or XML document. A physical model is generally specific to the database engine and version and is often optimized for a specific application usage of the data. Physical models include data types for each attribute and can be normalized or de-normalized. Physical models can change after the application is deployed in the production environment. For example, indexes might be altered to tune the performance. New tables, constraints, defaults and rules might be added to enhance the application's feature set and enforce new set of business rules.

Designing additional physical level structures are Files: storage files, partitions, etc. Performance: indexes, views, etc. Functional: scripts (stored procedures), functions, triggers, transactions, etc. Security: users, roles, permissions, etc.

Physical Schema
-here is where you have your data files, redo logs, control files etc, physical objects that reside on a O.S. and are in row state. These objects are used by logical structures to make our data make sense and readable.

Now the questions comes into your mind! What was the point in creating the logical data model? E.g.You want to create a database to store data about CDs. The data model is only an intermediate step along the way. Ultimately, you would like to end up with a MySQL or mSQL database where you can store data. How do you get there? Physical database design translates your logical data model into a set of SQL statements that define your MySQL or mSQL database.

Since MySQL and mSQL are relational database systems, it is relatively easy to translate from a logical data model, such as the one we described earlier, into a physical MySQL or mSQL database. Here are the rules for translation:

1. Entities become tables in the physical database.

2. Attributes become columns in the physical database. You have to choose an appropriate datatype for each of the columns.

3. Unique identifiers become columns that are not allowed to have NULLs. These are called primary keys in the physical database. You may also choose to create a unique index on the identifiers to enforce uniqueness. For your purposes, mSQL does not have a concept of a primary key. It simply has unique indices. This issue does not apply to MySQL.

4. Relationships are modeled as foreign keys. We will cover this later.

If we apply these rules to our data model -- minus the Record Label address information -- we will end up with the physical database described in Table 2-1.

Table 2-1. Physical Table Definitions for the CD Database

Table	Column	Datatype	Notes
CD	CDId	INT	primary key
	CDTitle	TEXT(50)
Artist	ArtistId	INT	primary key
	ArtistName	TEXT(50)
Song	SongId	INT	primary key
	SongName	TEXT(50)
RecordLabel	RecordLabelId	INT	primary key
	RecordLabelName	TEXT(50)	primary key

The first thing you may notice is that all of the spaces are gone from the entity names in our physical schema. This is because these names need to translate into SQL calls to create these tables. Table names should thus conform to SQL naming rules. Another thing to notice is that we made all primary keys of type INT. Because these attributes are complete inventions on our part, they can be of any indexible datatype. The fact that they are of type INT here is almost purely arbitrary. It is almost arbitrary because it is actually faster to search on numeric fields in many database engines and hence numeric fields make good primary keys. However, we could have chosen CHAR as the type for the primary key fields and everything would work just fine. The bottom line is that this choice should be driven by your criteria for choosing identifiers.

The rest of the columns are set to be of type TEXT with a length of 50. This definition works for both MySQL and mSQL. For MySQL, however, VARCHAR would be a better choice but not important to this example. Picking the right datatype for columns is very important, but we will not dwell on it here since we have not yet covered the datatypes for MySQL and mSQL.

We now have a starting point for a physical schema. We haven't yet translated the relationships into the physical data model. As we discussed earlier, once you have refined your data model, you should have mostly 1-to-1 and 1-to-M relationships -- the M-to-M relationships were resolved via junction tables. We model relationships by adding a foreign key to one of the tables involved in the relationship. A foreign key is the unique identifier or primary key of the table on the other side of the relationship.

The most common relationship is the 1-to-M relationship. This relationship is mapped by placing the primary key on the "one" side of the relationship into the table on the "many" side. In our example, this rule means that we need to do the following:

· Place a RecordLabelId column in the CD table.

· Place a CDId column in the Song table.

· Place an ArtistId column in the Song table.

Table 2-2 shows the new schema.

Table 2-2. The Physical Data Model for the CD Database

Table	Column	Datatype	Notes
CD	CdId	INT	primary key
	CDTitle	TEXT(50)
	RecordLabelId	INT	foreign key
Artist	ArtistId	INT	primary key
	ArtistName	TEXT(50)
Song	SongId	INT	primary key
	SongName	TEXT(50)
	CdId	INT	foreign key
	ArtistId	INT	foreign key
RecordLabel	RecordLabelId	INT	primary key
	RecordLabelName	TEXT(50)

We do not have any 1-to-1 relationships in this data model. If we did have such a relationship, it should be mapped by picking one of the tables and giving it a foreign key column that matches the primary key from the other table. In theory, it does not matter which table you choose, but practical considerations may dictate which column makes the most sense as a foreign key.

We now have a complete physical database schema ready to go. The last remaining task is to translate that schema into SQL. For each table in the schema, you write one CREATE TABLE statement. Typically, you will choose to create unique indices on the primary keys to enforce uniqueness.

We are, in a sense, jumping ahead at this point. You may not be familiar with SQL yet, and it is not the purpose of this chapter to introduce the MySQL and mSQL variants of SQL. Nevertheless, here are two sample scripts to create the CD database. The first script, Example 2-1 is for MySQL. Example 2-2 is for mSQL.

Example 2-1. An Example Script for Creating the CD Database in MySQL

CREATE TABLE CD (CD_ID INT NOT NULL,

RECORD_LABEL_I INT,

CD_TITLE TEXT,

PRIMARY KEY (CD_ID))

CREATE TABLE Artist (ARTIST_ID INT NOT NULL,

ARTIST_NAME TEXT,

PRIMARY KEY (ARTIST_ID))

CREATE TABLE Song (SONG_ID INT NOT NULL,

CD_ID INT,

SONG_NAME TEXT,

PRIMARY KEY (SONG_ID))

CREATE TABLE RecorLabel (RECORD_LABEL_ID INT NOT NULL,

RECORD_LABEL_NAME TEXT,

PRIMARY KEY(RECORD_LABEL_ID))

Example 2-2. An Example Script for Creating the CD Database in mSQL

CREATE TABLE CD (CD_ID INT NOT NULL,

RECORD_LABEL_ID INT,

CD_TITLE TEXT(50))

CREATE UNIQUE INDEX CD_IDX ON CD (CD_ID)

CREATE TABLE Artist (ARTIST_ID INT NO NULL,

ARTIST_NAME TEXT(50))

CREATE UNIQUE INDEX Artist_IDX ON Artist (ARTIST_ID)

CREATE TABLE Song (SONG_ID INT NOT NULL,

CD_ID INT,

SONG_NAME TEXT(50))

CREATE UNIQUE INDEX Song_IDX ON Song (SONG_ID)

CREATE TABLE RecordLabel (RECORD_LABEL_ID INT NOT NULL,

RECORD_LABEL_NAME TEXT(50))

CREATE UNIQUE INDEX RecordLabel_IDX

ON RecordLabel(RECORD_LABEL_ID)

Data models are meant to be database independent.

Success Stories of Software Engineer

HTML Dropdown

Thursday, 8 October 2015

DataBase Design :Part1

Table 2-1. A CD Database Made Up of a Single Table

Table 2-1. Physical Table Definitions for the CD Database

Table 2-2. The Physical Data Model for the CD Database

No comments:

Post a Comment