Click here to Skip to main content
13,733,345 members
Click here to Skip to main content
Add your own
alternative version

Tagged as

Stats

2.4K views
38 downloads
4 bookmarked
Posted 18 Jul 2018
Licenced CPOL

Database Operations, Entity Relationships And Normalization - Part 2

, 27 Jul 2018
Rate this:
Please Sign up or sign in to vote.
In this article, we'll explore different entity relationships, types of relationships and how we can normalize our database. We'll import Chinook database and start working with it and we'll take a look on CRUD operations in the database

Download Chinook.zip

Introduction

Here we'll explore SQL Server and how we create the entities and define the relationship among entities. How we can normalize our database upto 3rd level. This is how we remove the redundancy and inconsistency of data with the help of Normalization.  In this article, we'll discuss the different types of database keys working inside.

Roadmap

SQL Server

SQL Server is the database management system which has 2 parts.

  • Database Engine
  • Management Studio

It is the GUI which helps you to talk with Database Engine.

Sometimes when we’re working with SQL Server, then on SQL Server authentication we find some error on login.

Actually it is because SQL Services are off. Just click on Windows Start button or Search “Services” in Windows 10. And Start Sql Server Service.

And now you can easily login in your SQL Server. Here we’ve few options.

The server name is important because it is the server where you’re developing the database. If you’re working locally then by default here you’ll see your server name which is installed in your machine and if you’re working on an online server. Then you just need to provide the IP address of server name and then the authentication type will be SQL Server Authentication and provide its username and password.

Now let’s suppose depends upon your requirements. You’re developing the application and you have not deployed your application yet on any server then you can make 1 machine a remote server and you can connect the other laptops with this machine. And here you’ll use this machine server name username password. But you don’t know what is its username and password because these kinds of things we set on SQL Server installation.

Just connect the machine with SQL Server from Windows authentication. And here you can set your new server username and password.

Security > Logins > sa > This popup will appear. Make sure you uncheck this checkbox. And then click on Status menu tab and make sure that your status is Grant and enabled.

Now your sa username password has been set. Now you can log in through these credentials as well.

So here we’re working with Chinook database. I’ve uploaded the file here. Just download the file and open it in SSMS and execute the queries by clicking on Execute button or press F5 just.

And now the database has been successfully created.

Database Diagram

Now we want to see the database diagrams that how the entities are related to each other. Just right-click on Database Diagrams and add a new database diagram. Now select all the tables which you want to see how they’re related to each other.

So this is our Database Diagram. The vital part is designing any database is the ability to create relationships between your tables. Here we have an example of relationships in our database diagram. Every table has a primary key, the fields which you’re seeing in each table with a key is the primary key of that table. It is automatically generating the unique number for each item. One of the benefits of having that primary key is we can use it elsewhere, this primary key will become the foreign key in the next table.

ER Modelling

There are many methods for designing the database and probably more methods to represent a graphical representation of the design. Almost all modeling methods use some variations of ER Modelling. We’ll use ER Modelling here.

Benefits & Risks of ER Modelling

There are good reasons for the popularity of ER Modelling when used the communication with laymen.

  • It is a lot easier to understand than some other diagramming techniques.
  • It is extremely simple to convert a finished ER Model into a logical database design. And it in fact, very impossible to start with the database that was already built and to reverse engineer that into ER Diagram that you can then use as a starting point to understand the database design, find and repair flaws and extend the functionality.

However, there are also risks involved when we’re using ER Modelling.

  • None of the diagramming techniques can capture all the details and analyst may discover you in your work and details that are not captured in Model often get lost and have to discover here again.

One to Many

It is the most common. The relationship which you’re seeing in the above database diagram with this symbol. Here we create 2 tables where 1 record is related to multiple records in 2nd table.

Many to Many

It is not as common as one to many but it still quite common now. You often have to think about this one, because these business situations can often feel like one to many.

Let’s say we’ve author and title entity. One author might have written many books. This is one to many relationship.

So here we add the foreign key field which has the value of the primary key of the author. One more important thing is, always keep in mind we make the foreign key attribute in the dependent table. Here we’ve title entity is dependent upon author entity. So we create a foreign key in title entity. So here we add a new column for the title table.

But here is the issue, let’s suppose 1 book is written by many authors. Then would we do here. Few people follow the rule of creating foreign key attribute which I’ve mentioned above and created one more column field in the title entity. (Rule is: Always create the foreign key attribute in Dependent Table)

However, adding new columns to your tables and particularly these kinds of scenarios where repeating groups or repeating columns is a really bad idea. And it is the very bad technique in database design.

 

So we get rid off that idea. Some people think that I’m a cheap a little bit and I do something quick and dirty and I put comma separated value in Author field with the help of json.

So this is not the suggested approach as well. In fact, we’re going to solve this problem by getting rid of the author column entirely and we go back to the 2 completely detached tables and what we do to fix this is we add another table. This is the junction table.

The only reason of this table to exist is to join author title together. So, in fact, the name of the table would be authorbook.

In any relational database, we can’t express the many to many relationship directly. You can only do with one to many relationship. This is how we’re expressing the entities with many to many relationship. In our Chinook database, we have an example of many to many relationship.

One to One

It is not so common relationship. We can handle it by creating one more field in the same table or by creating a new table and reference the first table with 2nd table with primary key and foreign key utility.

Recursive Relationship

Let’s take an example of employee. Let’s suppose in an organization different kind of employees are working like Manager, Employees, CEO etc. All these objects are actually working in a company so they are all lie in Employee table. One or More employees can have a boss, and the boss itself is an employee. This boss reports to the Manager of his department. This manager also an employee. So this is how employees are inter-related to each other. That’s why here we’ve Recursive relation. We just create the foreign key of its own primary key. This is how we create the recursive relation of a table.

Now you might be thinking that recursive relationship certainly will be one to one but it is an uncommon scenario. Recursive relation can be one to many and many to many.

Higher Arity

So far all we’ve seen relationships represent classes or facts that associate 2 entity occurrences. But what about facts that associated 3 or even more occurrences. Let’s assume University of the People wants to track what kind of devices are used by their members to what specific costs. So their data model would, of course, has entity types for Members, Course, and Device. Here is the big picture.

  • Binary (arity 2) relationships: between 2 entity types
  • Ternary (arity 3) relationship: between 3 entity types

The number of entity types participating in a relationship is called arity or order of the relationship. Higher arity is possible in theory but in real-world scenario, it is the very rare case when you’re dealing with 4 arity number tables. And even most of the time, we don’t see higher arity relationships in our practical life. So the solution of the above problem with higher arity is,

Note:

When you design your database then always make the same name of your Foreign Keys in one table and the primary key in another table. It is the standardized nature to design the database.

Database Creation

Now let’s suppose you want to create your own database.

Right Click on Databases > New Database > Name your Database

When we create new database, here 2 files are also created along with database.

  • DatabaseName_Log

Every information relevant to the log of database like insertion, deletion, modification inserted into the log file.

  • DatabaseName

It contains your actual data. This is your mdf file.

And here you can change the location of these files and if you want these files then you can get these files from this location.

Table Creation

As we create the database, similarly the same way we create the table.

Just Right click on tables > Add New Table > And Define your table with attributes and datatypes. And if you want to make them required then uncheck Null checkbox. And if you want to make it optional then check Null checkbox. And when you save the definition of your table, it will automatically save with dbo schema. Dbo stands for database object.

This dbo is the schema of your tables. SQL Server makes your tables and objects separate with the help of schema. Everything we create like tables, views, stored procedures lies into dbo by default. But if we want to separate out our tables or objects from this by default convention like you want to create the tables for HR then you want to create the tables in HR schema, some tables we create for finance so we want fin dbo.

So for this purpose you need to create the schema first and when you’ll create the tables then we specify schema of the tables. And so we can use different schemas for different tables.

Keep in mind, databases doesn’t understand the GUI. Everything we’re doing with GUI related to database, our each action is generating the sql script behind the scene.

Data Types In SQL

In beginnings, we’ve a system to represent the character, ANSI standardized. It just have 55 character options but if you want to write the small English alphabets, Urdu language, special symbols, counting numbers then we use Unicode. In Unicode, we store each character in 2 bytes. And in Non-Unicode, we just store only 1 character into 1 byte. Now after using Unicode, we have the option to use Chinese, Urdu, Persian any language we want to show.

Let me tell you 2 things when you’re studying the data types.

  • Precision

The total number of digits is its precision. Like 2,147,483,648.12345 its precision is 15.

  • Scale

And the scale of the above digit is 5. Total number comes after the fraction.

Here we’ve some important links of data types documentation and tutorialspoint. Here you’ll get an idea about data types very well.

Query Designer

One of the Nice Feature in SSMS, the ability to go something called Query Builder to Construct Query Very Quickly

  • Open New Query Tab
  • Right Click on the Blank Query Tab and Select Design Query In Editor
  • Select the tables from where you need the data to show.
  • The Query will be constructed for you in the panel down in the form of Join because we are accessing the data from multiple tables.
  • Now Tick Mark the Required columns from the table windows to show into the new Select Statement.

Select Statement

Let me tell you one important thing, Select the different fields from the table is quite a common task but remember that data insertion, updating, deletion are easy but to show the data is quite difficult. And we perform these kinds of operations normally in our daily routine.

SELECT * FROM Customer

This is used to show all the fields of the table with data. Now we want few fields to show on the screen.

SELECT CustomerId,
       FirstName,
       LastName,
       Email
FROM Customer

Now let’s suppose we want to show the fields with something meaningful heading like FirstName should display as Name or First name (with space) and Email with (username) so we use alias here.

SELECT CustomerId,
       FirstName AS ‘Name’,
       LastName,
       Email AS ‘User Email’
FROM Customer

Insertion

Now we want to insert the data into our table.

Look this is our Customer table design. And if we provide the required fields, data will be successfully inserted into the table.

INSERT INTO Customer (FirstName,
                      LastName,
                      Email)
VALUES ('Usama',
        'Shahid',
        'learnbeginners@gmail.com')

As we’ve many other fields in a table that’s why we’re specifying the field names to insert the values in it with the help of the pattern. And if you don’t provide the parameter names

INSERT INTO Customer
VALUES ('Usama',
        'Shahid',
        'learnbeginners@gmail.com')

Then we’ll see the error because provided values are not matching with table definition. This is how we insert the data. Here we have the primary key is identity as well which automatically become unique on a new entry. That’s why we’re not providing the CustomerId in the parameter because it automatically updated on a new entry in a table. But let’s suppose, you’ve set the field as primary key in a table and it is not identity and you want to uniquely identify the record then you’ve to provide your own the values as well.

Let’s suppose Course entity has information related to courses. Now we want to place the unique courseId number in a table then obviously we’ll make this field as primary key but we’ll not apply the identity on it.

Update

When we need to update our record then the Primary Id should be in our mind.

UPDATE Customer
SET FirstName = 'Uwaish',
        Email = 'usama@gmail.com'
WHERE CustomerId = 60

If we miss the WHERE Statement here, then it will lead to disaster. Our all the records in the table will be changed with this new information. So we need to be really careful when we need to update the data. Same the case with the Delete, so be careful about the WHERE clause.

DELETE

DELETE FROM Customer
WHERE CustomerId = 60

Drop

If you want to drop your full table with data and schema then you drop the table with sql query in this way,

DROP TABLE table_name

We can’t recover the DROP table.

Truncate

Basically, it deletes all the data from the table. Now you might be thinking that DELETE statement and Truncate statement both are doing the same task. But the only difference is we can use WHERE clause with DELETE but we can’t use WHERE with TRUNCATE.

We use TRUNCATE only when we want to refresh or reset the database as well on removing the address.

Important Note When You’re Working With Multiple Queries

When you’re working with SQL queries and you have a lot of queries in the same .sql file then always comment your update and delete statements. Because if you execute the complete sql file by mistake or you might press F5 during working on databases then your all the records will modify/delete in the table. So the best practice is always commenting your update and delete queries.

Just put (double dashes) before your query to comment your query.

-- Delete From Customer

Drop vs Truncate

Drop is used to removing existing entities. And Truncate statement is used to remove all the rows from a table without logging the individual row deletions.

DDL vs DML

We can divide the database language into 2 different categories broadly DDL and DML.

DDL stands for Data Definition Language. It has a few different commands

  • CREATE              (for creation)
  • ALTER                (for modification)
  • DROP                  (for deletion)

DML stands for Data Manipulation Language. Here we manipulate or play our database.

  • SELECT
  • INSERT
  • UPDATE
  • DELETE
  • BULK INSERT    (it imports a data file into database table or views in a user-specified format)

Sometimes when we’re working with database, we have a lot of data in notepad or excel sheet. Then how we insert it? Instead of writing queries again and again of Insert, we can import all the data with BULK INSERT utility of SQL.

  • MERGE

It performs UPDATE, INSERT, DELETE operations on the table in 1 transaction. So here we’ve 2 tables source table and target table. And we’ll merge them. We’ll see merge statement with an example later on.

3 Pillars of Database

There are 3 major components of database.

  • Database Designing

Here we’ll deal with different tables, how many tables we’ll use in our system. What are their attributes? How we create the relationship among tables? These kinds of things deal with database designing.

  • Database Development

Here we develop our database with SSMS GUI or with Query

  • Database Administration

Suppose Database is increasingly enlarging in size. And let’s suppose the machine where you’ve installed your database server if it is crashed then what we’ll do. So here we take the backup and restore it and make sure the fine tuning of the database.

So designing is the most important part where you learn the different concepts of the database. If we know the database anomalies and how they come in the database then obviously we make the solution of these kinds of problems.

Database Anomalies

When we talk about the database, we meant to say about data storage and data retrieval from the database.

So let’s suppose you’re writing something on the notebook and you’re writing something again and again. This is what we say data redundancy. And let’s suppose we’re talking about one person and we put some data of him like NIC, now somewhere else I again put his NIC but different from the previous one. This is what we say data inconsistency. Now might be 1st NIC is correct or the 2nd one is correct or both are incorrect. These kinds of problems we tackle in the database.

These kind of logical mistakes are database anomalies which we handle with the help of Normalization. We’ll explore Normalization in later articles.

First, we explore the problems which can occur in the database.

Roll No.

Name

Class

NIC

City

Course

1

Zeeshan

9th

12345

Burewala

Math

1

Zeeshan

9th

12345

Burewala

Islamiyat

1

Zeeshan

9th

12345

Burewala

Physics

2

Waqar

10th

 

Arifwala

Arts

2

Waqar

10th

 

Arifwala

Chemistry

2

Waqar

10th

 

Arifwala

Science

Insert Anomalies

Suppose 1 student is enrolled in a subject and if he enrolled again and again in different subjects then obviously we replicate the data of student again and again this is redundancy.

Now let’s suppose our institute launches one more subject like Social Studies. But there is no student is enrolled in this subject yet. Then we insert the information of subject in this table.

Roll No.

Name

Class

NIC

City

Course

 

 

 

 

 

Social Studies

We just add the name of the subject doesn’t enter the name of a student. Right? It means remaining column containing null values. This is what we say insert anomaly.

Delete Anomaly

Now let’s suppose Waqar has left the institute and now he’s gone somewhere else. Now he’s not the part here. So if I delete the Waqar records from the table then obviously the subject information will also be deleted. This is delete anomaly.

It means when we want to delete some kind of information then it also deletes some kind of other information as well. This is delete anomaly.

Update Anomaly

Here we’ve Zeeshan is enrolled in 3 subjects. Now we want to change the name of Zeeshan, so on the updation of name of Zeeshan in the table. If I update 2 records and left the 3rd one by mistake. Then we’ve 2 places in the table where the data is correct and the 3rd one is actually was our mistake. This is what we say data inconsistency.

And let’s suppose we want to update the name of the subject. From Math to Mathematics, and if we update the record in one place and change it with mathematics and left the 2nd place with just math name. This is what we say data inconsistency. Now we can’t realize the correct entry.

Our Dream

So our dream is we want to do something like if we apply the CRUD operations on the database then 1 thing should not affect the other one. It should be redundant free and data should be consistent. These kinds of things we handle with Normalization.

This is why normalization is important. Most ofthe time developers just working on the things and try to learn them but even they don’t know why they are doing that. So the precious recommendation is first of all try to find the problem, actually why we’re doing that why we need it. Then if we come on to the solution of this problem. We can learn more.

Database Keys

Every entity has its own unique attribute like Book has unique ISBN number, every person has a different name, every electronic device has its own Mac Address, every phone number has its own unique identification number. So these are the candidate keys through we can identify our entity uniquely.

Different fields or attributes which we can use to identify that entity record as unique is/are candidate keys. Again in simple words, we see in the table how much column exist in the table which can identify the record unique independently. So 1 table can have many candidate keys like roll number, NIC, mobile number etc. And if we combine 2 or more candidate keys to make the record unique, is what we say composite key. Unique key is the column where the values can’t repeat like a student has its own NIC number which can’t assign to anyone else. So we can say that this column value are unique but might be it can have null value in this column. So this constraint is unique but nullable.

Now let’s talk about the primary key, let’s suppose we’ve 4 or 5 candidate keys in a table. So we make a strong field as primary key which can’t be null and which can identify the record uniquely in every condition. Now we use this primary key in multiple tables which becomes our foreign key.

Normalization

We’ve already seen that there are anomalies in the database and we resolve them with normalization. Suppose you’ve existing database or you’re creating a new database. And you know that if you add something here then it might occur a logical redundancy in the table or some type of inconsistency occur. So you design your database by taking care of such kind of things.

Roll No.

Name

Class

NIC

City

Courses

1

Zeeshan

9th

12345

Burewala

Islamiyat, Math,

Physics

 

1st Normal Form

Each cell should have unit value.

Any field doesn’t contain more than 1 value. So it becomes,

Roll No.

Name

Class

NIC

City

Course

1

Zeeshan

9th

12345

Burewala

Math

1

Zeeshan

9th

12345

Burewala

Islamiyat

1

Zeeshan

9th

12345

Burewala

Physic

Now it is in the 1st normal form.

2nd Normal Form

And most of the time we see tables which are already in 1st normal form. So before talking about 2nd normal form, let’s understand the concept of Functional Dependency.

Functional Dependency is totally the concept of mathematics come from functions. And dependency means if our values dependent upon other values or not. Means that where the roll number 1 comes, 9th class also comes with roll number records. Same as it is when we put roll number 1, CNIC also dependent upon this specific student. All these fields in the table are dependent upon the roll number 1. So we can say that all these columns are functional dependent upon roll number 1. But as we can say Course information varies in each record of this roll number 1. It means Courses are not dependent upon roll number.

So we need to identify the attributes which are not fully functional dependent upon the primary key. And we make them separate in 2 different tables. Now just think about the relationship between student and course entities. Obviously, 1 student can enroll in multiple courses and in 1 course many students can enroll. So the relationship is many to many between 2 entities. And we know that when we’re dealing with many to many, we need to make an extra 3rd table.

Roll No.

Name

Class

NIC

City

1

Zeeshan

9th

12345

Burewala

2

Waqas

9th

12346

Arifwala

 

Student Table

Serial Number

Course Name

1

Physics

2

Chemistry

3

Math

Course Table

And now the 3rd table comes inthe to picture,

Roll Number

Serial Number

1

1

1

3

2

2

StudentCourses mapping table

Now the entities aren’t dependent upon each other. Let’s suppose if we need to add one more course in the table then their will be no effect on Student entity. And if we add one more student in the table and if he doesn’t enroll in any course then there will be no impact on the Course entity.

So the 2nd normal form is dependent column should be rely on the primary key of its record.

Important Point

If you move back to the 1st normal form and if we want to extract the data from the table then it really it would be very easy and fast to extract the data from single table but there will be a redundancy in the table. But after applying 2nd normal form, our 1 table has broken into 3 parts. Now data retrieval is not so easy, it takes time because now we can’t extract the data directly. Now there are relationship among tables.

Let’s suppose you want to know the courses enrolled by Zeeshan, then you move to the Student table and pick the Student_id and then come on to the mapping table and extract the Serial Numbers of Courses and then move to Course table. This is how it works. That’s why we use joins here for data retrieval after applying normalization on the tables. This is the disadvantage of Normalization. That’s why sometimes we keep our data denormalize. We’ll discuss it later on.

3rd Normal Form

Now let’s suppose, here is a table

Roll No.

Name

Class

Birth City

Province

Country

1

Waqas

9th

Arifwala

Punjab

Pakistan

2

Zeeshan

9th

Lahore

Punjab

Pakistan

Now if you focus on this table, Province and Country is dependent upon the non-primary key attribute (Birth City) because Province and Country are dependent upon the Birth City. This is what we say Transitive Dependency among attributes. Here similar information is repeating again and again. So we again separate this information from the base table.

Now 1 student belongs to only 1 city. So it is one to one relationship.

Roll No.

Name

Class

CityId

1

Waqas

9th

1

2

Zeeshan

9th

2

 

Student Table

And now our City Table is,

CityId

Birth City

Province

Country

1

Arifwala

Punjab

Pakistan

2

Lahore

Punjab

Pakistan

Note

When you design your database then always make the same name of your Foreign Keys in one table and the primary key in another table. It is the standardized nature to design the database.

Conclusion

Before learning a new thing, we should know and try to find the answer of this question why we're doing that. Actually why we need this new approach. Before database we're already working with file-based system, everything we store on registers. Then why we need to database. Why you begin start to find the answer of why then you'll learn alot.

Before Normalization, there is a great redundancy and data inconsistency occur in some places which increase our lots of time and effort and most of the time, we confuse about the records. So we make our data consistent and accurate by applying Normalization in the database. Here we've explored the relationships among entities and how it works. This is the first step we design our database. If you make your database design better, you'll make your life a lot easier. 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author


You may also be interested in...

Comments and Discussions

 
QuestionMessage Closed Pin
31-Jul-18 2:42
member31-Jul-18 2:42 
GeneralMy vote of 3 Pin
Klaus Luedenscheidt25-Jul-18 18:39
memberKlaus Luedenscheidt25-Jul-18 18:39 
GeneralRe: My vote of 3 Pin
Muhammad Usama Khan27-Jul-18 9:31
professionalMuhammad Usama Khan27-Jul-18 9:31 
GeneralThanks Pin
18-Jul-18 23:46
member18-Jul-18 23:46 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Cookies | Terms of Use | Mobile
Web04-2016 | 2.8.180920.1 | Last Updated 27 Jul 2018
Article Copyright 2018 by Muhammad Usama Khan
Everything else Copyright © CodeProject, 1999-2018
Layout: fixed | fluid