Glossary/Database Normalisation — 1NF, 2NF, 3NF, BCNF
Database Management Systems

Database Normalisation — 1NF, 2NF, 3NF, BCNF

Eliminating redundancy and anomalies by decomposing tables into well-structured forms.


Definition

Normalisation organises a relational database to reduce data redundancy and prevent update anomalies. It applies a series of Normal Forms guided by Functional Dependencies (FDs). 1NF eliminates repeating groups. 2NF eliminates partial dependencies. 3NF eliminates transitive dependencies. BCNF is stricter than 3NF. Normalisation is the highest-weightage GATE DBMS topic — numerical questions on FDs and NF appear almost every year.

Real-life analogy: The messy spreadsheet

Consider: (StudentID, StudentName, CourseID, CourseName, Instructor, InstructorPhone). Problems: (1) Changing InstructorPhone requires updating every row with that instructor — one missed row creates inconsistency (update anomaly). (2) Deleting a student removes course info (delete anomaly). (3) Cannot store a course with no students (insert anomaly). Normalisation splits this into clean, anomaly-free tables.

Functional Dependencies and Armstrong axioms

FD X→Y: knowing the value of X uniquely determines Y. If two tuples have the same X values, they must have the same Y values. Example: StudentID→StudentName, CourseID→CourseName.

Armstrong axioms: (1) Reflexivity: Y⊆X implies X→Y. (2) Augmentation: X→Y implies XZ→YZ. (3) Transitivity: X→Y and Y→Z implies X→Z. Use these to compute X+ (closure of X — all attributes determined by X).

Computing attribute closure and finding candidate keys

from itertools import combinations

def closure(X: set, FDs: list) -> set:
    result = set(X)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs) in FDs:
            if lhs.issubset(result) and not rhs.issubset(result):
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs: set, FDs: list) -> list:
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            s = set(combo)
            if closure(s, FDs) == attrs:
                if not any(set(k) < s for k in keys):
                    keys.append(frozenset(s))
    return [set(k) for k in keys]

# Example: R(A,B,C,D,E), FDs: A→B, BC→D, D→E
attrs = {'A','B','C','D','E'}
FDs   = [({'A'},{'B'}), ({'B','C'},{'D'}), ({'D'},{'E'})]

print("Closure of {A,C}:", closure({'A','C'}, FDs))  # {A,B,C,D,E}
print("Candidate keys:",   candidate_keys(attrs, FDs)) # [{A,C}]

Normal Forms — 1NF through BCNF

Normal FormRequirementViolationFix
1NFAll attributes atomic, no repeating groupsHobbies = "Cricket,Chess" multi-valuedSeparate table StudentHobbies(StudentID, Hobby)
2NF1NF + no partial dependency (no non-key attr depends on part of composite PK)R(SID,CID,SName,Grade). SName depends only on SID.Split: Student(SID,SName), Enrollment(SID,CID,Grade)
3NF2NF + no transitive dependencyEmployee(EmpID,DeptID,DeptName). DeptID→DeptName.Split: Employee(EmpID,DeptID), Department(DeptID,DeptName)
BCNFFor every non-trivial FD X→Y, X must be a superkeyR(Course,Teacher,Student). Teacher→Course, Teacher is not a superkey.Split: TeacherCourse(Teacher,Course), Enrollment(Student,Teacher)

BCNF vs 3NF trade-off — GATE favourite

BCNF eliminates all redundancy but may lose some FDs (lossless but not always dependency-preserving). 3NF always preserves all FDs but allows slight redundancy. GATE pattern: "Is this in BCNF? In 3NF? Decompose it." Check: for every FD X→Y, is X a superkey (BCNF)? Or is X a superkey OR Y is prime (3NF)?

Practice questions (GATE-style)

  1. R(A,B,C,D). FDs: A→B, B→C, C→D, D→A. Find all candidate keys. (Answer: A→B→C→D→A is a cycle. Each single attribute determines all others. Candidate keys: {A}, {B}, {C}, {D}.)
  2. R(SID,CID,InstructorID,InstructorPhone). PK=(SID,CID). FD: InstructorID→InstructorPhone. Which NF is violated? (Answer: 3NF violated — InstructorPhone transitively depends on PK via InstructorID (non-key→non-key).)
  3. FD X→Y where X is not a superkey. R is definitely NOT in: (Answer: BCNF — BCNF requires every non-trivial FD LHS to be a superkey.)
  4. What is a lossless join decomposition? (Answer: Decomposing R into R1, R2 such that R1 ⋈ R2 = R exactly. Guaranteed lossless if R1∩R2 → R1 or R1∩R2 → R2.)
  5. Relation R(A,B,C). FDs: A→B, B→C, C→A. Is R in BCNF? (Answer: Yes — candidate keys are {A},{B},{C}. Every FD has a candidate key on the LHS, which is a superkey.)

On LumiChats

LumiChats can check normal forms, find candidate keys from FDs, and decompose relations into BCNF or 3NF step-by-step. Paste your relation schema and FDs and ask: 'Is this in BCNF? If not, decompose it.'

Try it free

Try LumiChats for ₹69

39+ AI models. Study Mode with page-locked answers. Agent Mode with code execution. Pay only on days you use it.

Get Started — ₹69/day

Related Terms

3 terms