OLAP enables multidimensional analysis of business data. A data cube is an n-dimensional structure where each dimension represents a business entity (time, location, product) and each cell contains measures (revenue, units sold). OLAP operations — roll-up, drill-down, slice, dice, pivot — let analysts navigate data at different granularities. Concept hierarchies define levels within each dimension (Day→Month→Quarter→Year). Measures are categorised as additive, semi-additive, or non-additive. Critical GATE DS&AI topic.
Real-life analogy: The adjustable telescope
A data cube is like a telescope you can point at any dimension: zoom out (roll-up) to see annual revenue by country; zoom in (drill-down) to daily revenue by city; slice off one year (slice) to focus on 2024; constrain multiple dimensions (dice) to see Q3 2024 electronics in North India; rotate the telescope (pivot) to swap rows and columns.
OLAP operations — roll-up, drill-down, slice, dice, pivot
| Operation | What it does | SQL equivalent | Example |
|---|---|---|---|
| Roll-up | Aggregate to higher level in hierarchy | GROUP BY higher-level attr | Day→Month: monthly totals from daily data |
| Drill-down | Go to lower level of detail | Add more specific GROUP BY | Country→State→City: revenue by city |
| Slice | Fix ONE dimension to a single value | WHERE dimension = value | Show all data for Year = 2024 only |
| Dice | Fix MULTIPLE dimensions to ranges | WHERE dim1 IN (...) AND dim2 = ... | Q3-Q4 2024, Electronics, North India |
| Pivot (Rotate) | Swap row and column dimensions | PIVOT / CASE-based cross-tab | Rows=Product → Cols=Product |
All five OLAP operations in SQL
-- Base: fact_sales with dim_date, dim_product, dim_location
-- ROLL-UP: daily → monthly (using ROLLUP for subtotals)
SELECT
COALESCE(CAST(d.Year AS TEXT), 'Grand Total') AS Year,
COALESCE(CAST(d.Month AS TEXT), 'Month Total') AS Month,
SUM(f.Revenue) AS Revenue
FROM fact_sales f JOIN dim_date d ON f.DateKey = d.DateKey
GROUP BY ROLLUP(d.Year, d.Month)
ORDER BY d.Year, d.Month;
-- DRILL-DOWN: Category → Subcategory → Product
SELECT p.Category, p.Subcategory, p.ProductName, SUM(f.Revenue)
FROM fact_sales f JOIN dim_product p ON f.ProductKey = p.ProductKey
GROUP BY p.Category, p.Subcategory, p.ProductName
ORDER BY p.Category, p.Subcategory, SUM(f.Revenue) DESC;
-- SLICE: fix Year = 2024 (one dimension fixed)
SELECT p.Category, SUM(f.Revenue)
FROM fact_sales f
JOIN dim_date d ON f.DateKey = d.DateKey
JOIN dim_product p ON f.ProductKey = p.ProductKey
WHERE d.Year = 2024
GROUP BY p.Category;
-- DICE: multiple dimension constraints (sub-cube)
SELECT d.Quarter, l.Region, p.Category, SUM(f.Revenue)
FROM fact_sales f
JOIN dim_date d ON f.DateKey = d.DateKey
JOIN dim_location l ON f.LocationKey = l.LocationKey
JOIN dim_product p ON f.ProductKey = p.ProductKey
WHERE d.Year = 2024 AND d.Quarter IN (3, 4)
AND l.Region = 'North India' AND p.Category = 'Electronics'
GROUP BY d.Quarter, l.Region, p.Category;
-- PIVOT: Rows=Category, Columns=Quarter
SELECT
p.Category,
SUM(CASE WHEN d.Quarter = 1 THEN f.Revenue ELSE 0 END) AS Q1,
SUM(CASE WHEN d.Quarter = 2 THEN f.Revenue ELSE 0 END) AS Q2,
SUM(CASE WHEN d.Quarter = 3 THEN f.Revenue ELSE 0 END) AS Q3,
SUM(CASE WHEN d.Quarter = 4 THEN f.Revenue ELSE 0 END) AS Q4,
SUM(f.Revenue) AS Annual
FROM fact_sales f
JOIN dim_date d ON f.DateKey = d.DateKey
JOIN dim_product p ON f.ProductKey = p.ProductKey
WHERE d.Year = 2024
GROUP BY p.Category ORDER BY Annual DESC;Concept hierarchies and measure categorisation
- Time hierarchy: Second → Minute → Hour → Day → Week → Month → Quarter → Year
- Location hierarchy: Street → City → District → State → Region → Country → Continent
- Product hierarchy: SKU → Product → SubCategory → Category → Department → Brand
- Employee hierarchy: Employee → Team → Department → Division → Company
| Measure type | Example | Sum over time? | Sum over region? | Use |
|---|---|---|---|---|
| Additive | Revenue, Units Sold, Order Count | Yes | Yes | SUM across all dimensions |
| Semi-additive | Account Balance, Inventory Level | No (use AVG or snapshot) | Yes | SUM across regions, AVG/LAST over time |
| Non-additive | Price, Profit Margin %, Temperature | No | No | AVG, MIN, MAX — never SUM |
Why you cannot SUM account balances over time
Account balance is semi-additive. If Account A has ₹1000 on Jan 1 and ₹1200 on Feb 1, the "annual balance" is NOT ₹2200. You want the balance at a specific snapshot. Sum balances ACROSS accounts at the same point in time — but use AVG or last snapshot when aggregating OVER time. GATE tests this distinction every year.
Additive vs semi-additive measure aggregation
-- ADDITIVE measure: Revenue — SUM across all dimensions freely
SELECT d.Year, l.Region, SUM(f.Revenue) AS TotalRevenue
FROM fact_sales f
JOIN dim_date d ON f.DateKey = d.DateKey
JOIN dim_location l ON f.LocationKey = l.LocationKey
GROUP BY d.Year, l.Region; -- Safe to SUM across both time AND region
-- SEMI-ADDITIVE measure: Account Balance
-- WRONG: SUM over time gives meaningless total
SELECT d.Month, SUM(s.Balance) AS WrongBalance FROM fact_account_snapshot s
JOIN dim_date d ON s.DateKey = d.DateKey GROUP BY d.Month;
-- CORRECT: Use LAST balance of each month (snapshot approach)
SELECT d.Month, SUM(s.Balance) AS CorrectMonthEndBalance
FROM fact_account_snapshot s
JOIN dim_date d ON s.DateKey = d.DateKey
WHERE d.IsMonthEnd = TRUE -- Only last day of each month
GROUP BY d.Month;
-- Or use AVG balance per account per period
SELECT a.AccountID, d.Month, AVG(s.Balance) AS AvgMonthlyBalance
FROM fact_account_snapshot s
JOIN dim_date d ON s.DateKey = d.DateKey
JOIN dim_account a ON s.AccountKey = a.AccountKey
GROUP BY a.AccountID, d.Month;Practice questions
- What OLAP operation takes you from monthly sales by country to quarterly sales by continent? (Answer: Roll-up — moving from finer granularity (Month, Country) to coarser (Quarter, Continent) in both Time and Location hierarchies simultaneously.)
- Identify the measure type: number of website visits per day. (Answer: Additive — visit counts can be summed across all dimensions: SUM across days gives total visits, SUM across regions gives regional total.)
- A data cube has 3 dimensions: Time (12 months), Product (500 SKUs), Region (50 regions). Total cells? (Answer: 12 × 500 × 50 = 300,000 cells. Many cells are empty (sparse cube) — not every product sold in every region every month. Sparse representation is used to save space.)
- Difference between slice and dice? (Answer: Slice: one cut along ONE dimension (fix Year=2024 → 2D cross-section). Dice: constraints on MULTIPLE dimensions (Year=2024 AND Region=North AND Category=Electronics → smaller sub-cube).)
- Why is profit margin % non-additive? (Answer: Percentage/ratio cannot be summed. 35% + 40% does not equal 75% total margin. Must compute from raw additive components: (total_profit / total_revenue) × 100 at the aggregated level.)
On LumiChats
LumiChats can implement any OLAP operation in SQL, design OLAP cubes for specific business domains, explain concept hierarchies, and generate complete PIVOT/ROLLUP/CUBE queries. Great for data engineering and BI development projects.
Try it free