> For the complete documentation index, see [llms.txt](https://dailyjournal.gitbook.io/notes/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://dailyjournal.gitbook.io/notes/databases/database/normalization.md).

# Normalization

Normalization is a schema/database design process that minimizes data duplication and enforces data integrity.

A simple definition for practical purposes is:

1. Separate each *entity* into its own table.
2. Separate each discrete *attribute* into its own column.
3. Uniquely identify each entity instance (row) using a *primary key*.
4. Use *foreign key* columns to link related entities.

{% embed url="<https://www.itprotoday.com/sql-server/sql-design-why-you-need-database-normalization>" %}

## First Normal Form (1NF)

* In the *first normal form*, only single values are permitted at the intersection of each row and column; hence, there are no repeating groups.
* To normalize a relation that contains a repeating group, remove the repeating group and form two new relations.
* The PK of the new relation is a combination of the PK of the original relation plus an attribute from the newly created relation for unique identification.

#### Process for 1NF

**`Student_Grade_Report`**` ``(StudentNo, StudentName, Major, CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation, Grade)`

* In the `Student_Grade_Report` table, the repeating group is the course information. A student can take many courses.
* Remove the repeating group. In this case, it’s the course information for each student.
* Identify the PK for your new table.
* The PK must uniquely identify the attribute value (StudentNo and CourseNo).
* After removing all the attributes related to the course and student, you are left with the student course table (**StudentCourse**).
* The Student table (**Student**) is now in first normal form with the repeating group removed.

**`Student`**` ``(StudentNo, StudentName, Major)`

**`StudentCourse`**` ``(StudentNo, CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation, Grade)`

#### How to update 1NF anomalies

* To add a new course, we need a student.
* When course information needs to be updated, we may have inconsistencies.
* To delete a *student,* we might also delete critical information about a course.

## Second Normal Form (2NF)

* For the *second normal form*, the relation must first be in 1NF. The relation is automatically in 2NF if, and only if, the PK comprises a single attribute.
* If the relation has a composite PK, then each non-key attribute must be fully dependent on the entire PK and not on a subset of the PK (i.e., there must be no partial dependency or augmentation).

#### Process for 2NF

To move to 2NF, a table must first be in 1NF.

* The Student table is already in 2NF because it has a single-column PK.

**`StudentCourse`**` ``(StudentNo, CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation, Grade)`

* When examining the `Student_Course` table, we see that not all the attributes are fully dependent on the PK; specifically, all course information. The only attribute that is fully dependent is grade.
* Identify the new table that contains the course information.
* Identify the PK for the new table.

**`CourseGrade`**` ``(StudentNo, CourseNo, Grade)`

**`CourseInstructor`**` ``(CourseNo, CourseName, InstructorNo, InstructorName, InstructorLocation)`

#### How to update 2NF anomalies

* When adding a new instructor, we need a course.
* Updating course information could lead to inconsistencies for instructor information.
* Deleting a course may also delete instructor information.

## Third Normal Form (3NF)

* To be in *third normal form*, the relation must be in second normal form. Also all transitive dependencies must be removed; a non-key attribute may not be functionally dependent on another non-key attribute.

#### Process for 3NF

* Eliminate all dependent attributes in transitive relationship(s) from each of the tables that have a transitive relationship.
* Create new table(s) with removed dependency.
* Check new table(s) as well as table(s) modified to make sure that each table has a determinant and that no table contains inappropriate dependencies.

**`Course`**` ``(CourseNo, CourseName, InstructorNo)`

**`Instructor`**` ``(InstructorNo, InstructorName, InstructorLocation)`

At this stage, there should be no anomalies in third normal form.

## Boyce-Codd Normal Form (BCNF)

Resources

{% embed url="<https://www.youtube.com/watch?v=ABwD8IYByfk>" %}

{% embed url="<https://opentextbc.ca/dbdesign01/chapter/chapter-12-normalization/>" %}


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://dailyjournal.gitbook.io/notes/databases/database/normalization.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
