11 Database Design
A good database ensures there is no redundancy or anomalies.
| ID | Name | CourseID | CourseName | CourseCredits |
|---|---|---|---|---|
| 198 | Ahmed | 212 | DBMS | 3 |
| 199 | Jameel | 212 | DBMS | 3 |
| 200 | Azra | 212 | DBMS | 3 |
| 201 | Habi | 212 | DBMS | 3 |
| 202 | Azhar | 213 | OOPS | 3 |
Bad Practices¶
Increase time and space complexity of database operations.
Redundancy¶
The same data is present in multiple places.
Note: It may be intentional for data backup.
Anomaly¶
Updation¶
If courseName and courseCredits change for 212, all the records have to be changed.
Insertion¶
If a new course comes up, but there are no students, then courseID cannot be entered into the bad table, as primary key (ID) will be null.
Deletion¶
If I delete ID 202, then I’ll lose details about the courseId 213
Keys¶
| Superkey | Primary Key | Candidate Key |
|---|---|---|
| Attribute(s) that can uniquely identify all attributes in a table. | Primary key is a minimal super key. | Key that can be a primary key. |
Decomposition¶
- Atomize every table wrt an entity
- Connect those tables using relational tables with foreign key
Prevents
- Redundancy
- Anomaly
Normalization¶
It is the process of structuring a database, usually a relational database, in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity.
Sometimes, it’s not feasible to re-create an entire database design. In those cases, we will have to normalize tables into a more appropriate design.
| Prime | Non-Prime |
|---|---|
| attributes in the candidate key | other attribute |
Candidate Key¶
A key/combination of keys that can help either directly/indirectly derive all attributes of a table.
Functional Dependency¶
Gives a unique tuple as the output
idk what this means: A key can be a functional dependancy, but the vice-versa does not hold
Partial Dependency¶
Consider a key combination as (name, age)
Query possible with just name
Subset of candidate key can derive non-prime attributes
Transitive Dependency¶
Non prime attributes gives non-prime
Full Dependency¶
Query possible only with (name, age) combination
Subset of candidate key cannot derive non-prime attributes
Normal Forms¶
| 1NF | 2NF | 3NF | BCNF | 4NF | 5NF | |
|---|---|---|---|---|---|---|
| No Multi-Valued Attributes | âś… | âś… | âś… | âś… | âś… | âś… |
| No Partial Dependency | âś… | âś… | âś… | âś… | âś… | |
| No Transitive Dependency | âś… | âś… | âś… | âś… | ||
| LHS = Candidate/Super Key | âś… | âś… | âś… | |||
| No Multi-Attribute Dependency | âś… | âś… | ||||
| Lossless Decomposition | âś… |
1NF¶
Given¶
| sid | sname | course |
|---|---|---|
| 01 | A | CC, CP, OOP |
| 02 | B | CP, DB |
| 03 | C | DB |
Normalized¶
We can turn into 2 tables
| sid | sname |
|---|---|
| 01 | A |
| 02 | B |
| 03 | C |
| sid | course |
|---|---|
| 01 | CP |
| 01 | CC |
| 01 | OOP |
| 02 | CP |
| 02 | DB |
| 03 | DB |
Finding Candidate Key¶
-
Find the keys with no incoming these compulsorily have to be in the combination, because there is no other way to reach them
-
Find the transitive closures of all
-
attributes
-
combination of failure keys
-
-
List out all keys
-
Candidate keys are only the keys that are
3NF¶
every functional dependency \(A \to B\) contains
- superkey \(A\)
- prime attribute \(B\)
Armstrong’s Inference Rules¶
\(\to\) means derives
| Rule | Condition | Inference |
|---|---|---|
| Reflexive | \(y \subset x\) | \(x \to y\) |
| Augmentation | \(x \to y\) | \(xz \to yz\) |
| Transitive | \(x \to y, y \to z\) | \(x \to z\) |
| Decomposition | \(x \to yz\) | \(x \to y, x \to z\) |
| Union | \(x \to y, x \to z\) | \(x \to yz\) |
| Psuedotransitivity | \(x \to y, wy \to z\) | \(wx \to z\) |
Canonical Minimal Cover¶
Removing one/more functional dependencies when a set of functional dependencies are given, ensuring that 5NF is still maintained.