Glossary/OLAP Operations — Data Cube, Concept Hierarchies & Measures
Data Warehousing

OLAP Operations — Data Cube, Concept Hierarchies & Measures

Slicing, dicing, rolling up, and drilling down through multidimensional data.


Definition

OLAP enables multidimensional analysis of business data. A data cube is an n-dimensional structure where each dimension represents a business entity (time, location, product) and each cell contains measures (revenue, units sold). OLAP operations — roll-up, drill-down, slice, dice, pivot — let analysts navigate data at different granularities. Concept hierarchies define levels within each dimension (Day→Month→Quarter→Year). Measures are categorised as additive, semi-additive, or non-additive. Critical GATE DS&AI topic.

Real-life analogy: The adjustable telescope

A data cube is like a telescope you can point at any dimension: zoom out (roll-up) to see annual revenue by country; zoom in (drill-down) to daily revenue by city; slice off one year (slice) to focus on 2024; constrain multiple dimensions (dice) to see Q3 2024 electronics in North India; rotate the telescope (pivot) to swap rows and columns.

OLAP operations — roll-up, drill-down, slice, dice, pivot

OperationWhat it doesSQL equivalentExample
Roll-upAggregate to higher level in hierarchyGROUP BY higher-level attrDay→Month: monthly totals from daily data
Drill-downGo to lower level of detailAdd more specific GROUP BYCountry→State→City: revenue by city
SliceFix ONE dimension to a single valueWHERE dimension = valueShow all data for Year = 2024 only
DiceFix MULTIPLE dimensions to rangesWHERE dim1 IN (...) AND dim2 = ...Q3-Q4 2024, Electronics, North India
Pivot (Rotate)Swap row and column dimensionsPIVOT / CASE-based cross-tabRows=Product → Cols=Product

All five OLAP operations in SQL

-- Base: fact_sales with dim_date, dim_product, dim_location

-- ROLL-UP: daily → monthly (using ROLLUP for subtotals)
SELECT
    COALESCE(CAST(d.Year AS TEXT), 'Grand Total') AS Year,
    COALESCE(CAST(d.Month AS TEXT), 'Month Total') AS Month,
    SUM(f.Revenue) AS Revenue
FROM fact_sales f JOIN dim_date d ON f.DateKey = d.DateKey
GROUP BY ROLLUP(d.Year, d.Month)
ORDER BY d.Year, d.Month;

-- DRILL-DOWN: Category → Subcategory → Product
SELECT p.Category, p.Subcategory, p.ProductName, SUM(f.Revenue)
FROM fact_sales f JOIN dim_product p ON f.ProductKey = p.ProductKey
GROUP BY p.Category, p.Subcategory, p.ProductName
ORDER BY p.Category, p.Subcategory, SUM(f.Revenue) DESC;

-- SLICE: fix Year = 2024 (one dimension fixed)
SELECT p.Category, SUM(f.Revenue)
FROM fact_sales f
JOIN dim_date d    ON f.DateKey    = d.DateKey
JOIN dim_product p ON f.ProductKey = p.ProductKey
WHERE d.Year = 2024
GROUP BY p.Category;

-- DICE: multiple dimension constraints (sub-cube)
SELECT d.Quarter, l.Region, p.Category, SUM(f.Revenue)
FROM fact_sales f
JOIN dim_date     d ON f.DateKey     = d.DateKey
JOIN dim_location l ON f.LocationKey = l.LocationKey
JOIN dim_product  p ON f.ProductKey  = p.ProductKey
WHERE d.Year = 2024 AND d.Quarter IN (3, 4)
  AND l.Region = 'North India' AND p.Category = 'Electronics'
GROUP BY d.Quarter, l.Region, p.Category;

-- PIVOT: Rows=Category, Columns=Quarter
SELECT
    p.Category,
    SUM(CASE WHEN d.Quarter = 1 THEN f.Revenue ELSE 0 END) AS Q1,
    SUM(CASE WHEN d.Quarter = 2 THEN f.Revenue ELSE 0 END) AS Q2,
    SUM(CASE WHEN d.Quarter = 3 THEN f.Revenue ELSE 0 END) AS Q3,
    SUM(CASE WHEN d.Quarter = 4 THEN f.Revenue ELSE 0 END) AS Q4,
    SUM(f.Revenue) AS Annual
FROM fact_sales f
JOIN dim_date d    ON f.DateKey    = d.DateKey
JOIN dim_product p ON f.ProductKey = p.ProductKey
WHERE d.Year = 2024
GROUP BY p.Category ORDER BY Annual DESC;

Concept hierarchies and measure categorisation

  • Time hierarchy: Second → Minute → Hour → Day → Week → Month → Quarter → Year
  • Location hierarchy: Street → City → District → State → Region → Country → Continent
  • Product hierarchy: SKU → Product → SubCategory → Category → Department → Brand
  • Employee hierarchy: Employee → Team → Department → Division → Company
Measure typeExampleSum over time?Sum over region?Use
AdditiveRevenue, Units Sold, Order CountYesYesSUM across all dimensions
Semi-additiveAccount Balance, Inventory LevelNo (use AVG or snapshot)YesSUM across regions, AVG/LAST over time
Non-additivePrice, Profit Margin %, TemperatureNoNoAVG, MIN, MAX — never SUM

Why you cannot SUM account balances over time

Account balance is semi-additive. If Account A has ₹1000 on Jan 1 and ₹1200 on Feb 1, the "annual balance" is NOT ₹2200. You want the balance at a specific snapshot. Sum balances ACROSS accounts at the same point in time — but use AVG or last snapshot when aggregating OVER time. GATE tests this distinction every year.

Additive vs semi-additive measure aggregation

-- ADDITIVE measure: Revenue — SUM across all dimensions freely
SELECT d.Year, l.Region, SUM(f.Revenue) AS TotalRevenue
FROM fact_sales f
JOIN dim_date d     ON f.DateKey     = d.DateKey
JOIN dim_location l ON f.LocationKey = l.LocationKey
GROUP BY d.Year, l.Region;   -- Safe to SUM across both time AND region

-- SEMI-ADDITIVE measure: Account Balance
-- WRONG: SUM over time gives meaningless total
SELECT d.Month, SUM(s.Balance) AS WrongBalance FROM fact_account_snapshot s
JOIN dim_date d ON s.DateKey = d.DateKey GROUP BY d.Month;

-- CORRECT: Use LAST balance of each month (snapshot approach)
SELECT d.Month, SUM(s.Balance) AS CorrectMonthEndBalance
FROM fact_account_snapshot s
JOIN dim_date d ON s.DateKey = d.DateKey
WHERE d.IsMonthEnd = TRUE   -- Only last day of each month
GROUP BY d.Month;

-- Or use AVG balance per account per period
SELECT a.AccountID, d.Month, AVG(s.Balance) AS AvgMonthlyBalance
FROM fact_account_snapshot s
JOIN dim_date    d ON s.DateKey    = d.DateKey
JOIN dim_account a ON s.AccountKey = a.AccountKey
GROUP BY a.AccountID, d.Month;

Practice questions

  1. What OLAP operation takes you from monthly sales by country to quarterly sales by continent? (Answer: Roll-up — moving from finer granularity (Month, Country) to coarser (Quarter, Continent) in both Time and Location hierarchies simultaneously.)
  2. Identify the measure type: number of website visits per day. (Answer: Additive — visit counts can be summed across all dimensions: SUM across days gives total visits, SUM across regions gives regional total.)
  3. A data cube has 3 dimensions: Time (12 months), Product (500 SKUs), Region (50 regions). Total cells? (Answer: 12 × 500 × 50 = 300,000 cells. Many cells are empty (sparse cube) — not every product sold in every region every month. Sparse representation is used to save space.)
  4. Difference between slice and dice? (Answer: Slice: one cut along ONE dimension (fix Year=2024 → 2D cross-section). Dice: constraints on MULTIPLE dimensions (Year=2024 AND Region=North AND Category=Electronics → smaller sub-cube).)
  5. Why is profit margin % non-additive? (Answer: Percentage/ratio cannot be summed. 35% + 40% does not equal 75% total margin. Must compute from raw additive components: (total_profit / total_revenue) × 100 at the aggregated level.)

On LumiChats

LumiChats can implement any OLAP operation in SQL, design OLAP cubes for specific business domains, explain concept hierarchies, and generate complete PIVOT/ROLLUP/CUBE queries. Great for data engineering and BI development projects.

Try it free

Try LumiChats for ₹69

39+ AI models. Study Mode with page-locked answers. Agent Mode with code execution. Pay only on days you use it.

Get Started — ₹69/day

Related Terms

4 terms