A star schema is a common way of organizing data in a data warehouse or a data mart. It consists of a central fact table that contains the measures or metrics of interest, and several dimension tables that describe the context of those measures. The fact table holds a foreign key to each dimension table, so a diagram of the schema looks like a star with the fact table at its center.
What is a fact table?
A fact table is a table that stores the quantitative data that is analyzed by the users or the organization. For example, in a sales data mart, the fact table might store information such as sales revenue, units sold, profit margin, etc. Each record in the fact table represents a specific event or transaction, such as a sale or an order.
What is a dimension table?
A dimension table is a table that stores the descriptive data that is used to filter, group, or label the facts. For example, in a sales data mart, the dimension tables might store information such as product name, product category, customer name, customer location, order date, etc. Each record in a dimension table represents one member of that dimension, such as a single product, customer, or calendar date.
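The two kinds of table can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (dim_product, dim_customer, dim_date, fact_sales and their columns) are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Dimension tables: one row per product, customer, or date.
CREATE TABLE dim_product (
    product_key      INTEGER PRIMARY KEY,
    product_name     TEXT NOT NULL,
    product_category TEXT NOT NULL
);
CREATE TABLE dim_customer (
    customer_key      INTEGER PRIMARY KEY,
    customer_name     TEXT NOT NULL,
    customer_location TEXT NOT NULL
);
CREATE TABLE dim_date (
    date_key   INTEGER PRIMARY KEY,
    order_date TEXT NOT NULL
);
-- Fact table: one row per sale, with a foreign key to every dimension
-- plus the numeric measures.
CREATE TABLE fact_sales (
    product_key   INTEGER NOT NULL REFERENCES dim_product(product_key),
    customer_key  INTEGER NOT NULL REFERENCES dim_customer(customer_key),
    date_key      INTEGER NOT NULL REFERENCES dim_date(date_key),
    units_sold    INTEGER NOT NULL,
    sales_revenue REAL NOT NULL
);
""")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)

# A fact row that points at dimension records that do not exist is refused.
try:
    conn.execute("INSERT INTO fact_sales VALUES (999, 999, 999, 1, 10.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Keeping the descriptive columns in the dimension tables means the fact table stays narrow: it carries only keys and numbers, which matters because it is usually by far the largest table.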
In a typical star schema, each dimension record is related to many fact records, often thousands or more. For example, in a sales data mart, each product can be linked to many sales transactions, each customer to many orders, and each order date to many sales events.
The relationship between a dimension table and the fact table is usually one-to-many (1:M): one dimension record can have many corresponding fact records, but each fact record refers to exactly one record in each dimension. Because every fact belongs to exactly one member of each dimension, facts can be aggregated or summarized by any combination of dimensions without being counted twice.
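The one-to-many relationship is what makes grouping by dimension attributes safe. The sketch below assumes a cut-down schema with hypothetical names (dim_product, dim_customer, fact_sales) and a handful of made-up rows, then summarizes the facts by product category and customer location.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    product_category TEXT
);
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    customer_name TEXT,
    customer_location TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    units_sold INTEGER,
    sales_revenue REAL
);
""")

# Illustrative data only.
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Laptop", "Electronics"),
    (2, "Phone", "Electronics"),
    (3, "Desk", "Furniture"),
])
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", [
    (1, "Ana", "Lisbon"),
    (2, "Ben", "Oslo"),
])
# Each fact row points at exactly one product and exactly one customer.
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (1, 1, 2, 2000.0),
    (1, 2, 1, 1000.0),
    (2, 1, 3, 1500.0),
    (3, 2, 1, 300.0),
    (3, 2, 2, 600.0),
])

# Summarize the facts by a combination of attributes from two dimensions.
summary = conn.execute("""
    SELECT p.product_category,
           c.customer_location,
           SUM(f.units_sold)    AS units,
           SUM(f.sales_revenue) AS revenue
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_key  = f.product_key
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY p.product_category, c.customer_location
    ORDER BY p.product_category, c.customer_location
""").fetchall()
print(summary)
# [('Electronics', 'Lisbon', 5, 3500.0), ('Electronics', 'Oslo', 1, 1000.0), ('Furniture', 'Oslo', 3, 900.0)]

# The grouped totals add up to the plain fact total: no row is counted twice.
grouped_total = sum(row[3] for row in summary)
fact_total = conn.execute("SELECT SUM(sales_revenue) FROM fact_sales").fetchone()[0]
print(grouped_total == fact_total)  # True
```

If a fact row could match two records in the same dimension, the join would duplicate it and the grouped totals would exceed the fact total.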
Why use a star schema?
A star schema is a simple and intuitive way of modeling data for analytical purposes. It has several benefits, such as:
- It is easy to understand and query. Users can see at a glance how the facts relate to the dimensions, and can write queries using simple joins.
- It supports fast queries and efficient use of database resources. Compared with a fully normalized model, a star schema needs fewer tables and joins to answer a typical query, and the foreign key columns in the fact table are natural candidates for indexes. Some databases offer further optimizations for this shape, such as bitmap join indexes in Oracle.
- It is flexible and scalable. New measures can be added to the fact table, and new attributes to the dimension tables, usually without breaking existing queries, as long as each fact row still represents the same kind of event.
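The extensibility point in the list above can be illustrated with a small sketch, again using SQLite and hypothetical names. The added columns (discount_amount, brand) are assumptions chosen for the example. A query that names its columns explicitly returns the same result before and after the schema is extended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    product_category TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    units_sold INTEGER,
    sales_revenue REAL
);
INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO fact_sales VALUES (1, 2, 2000.0), (2, 1, 300.0), (1, 1, 1000.0);
""")

# An existing report: revenue per product category.
REVENUE_BY_CATEGORY = """
    SELECT p.product_category, SUM(f.sales_revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.product_category
    ORDER BY p.product_category
"""
before = conn.execute(REVENUE_BY_CATEGORY).fetchall()

# Extend the schema: a new measure on the fact table and a new attribute
# on the dimension table. Existing rows receive the default values.
conn.execute("ALTER TABLE fact_sales ADD COLUMN discount_amount REAL NOT NULL DEFAULT 0.0")
conn.execute("ALTER TABLE dim_product ADD COLUMN brand TEXT DEFAULT 'Unknown'")

after = conn.execute(REVENUE_BY_CATEGORY).fetchall()
print(before)           # [('Electronics', 3000.0), ('Furniture', 300.0)]
print(before == after)  # True

fact_columns = [row[1] for row in conn.execute("PRAGMA table_info(fact_sales)")]
print(fact_columns)
```

Queries that rely on column position, such as SELECT * feeding a fixed-width consumer or an INSERT without a column list, can still be affected by added columns, so the guarantee is weaker than the bullet might suggest.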