Surrogate Key

November 27, 2013

A surrogate key may also be called as:

synthetic key
entity identifier
system-generated key
database sequence number
fact less key
technical key
arbitrary unique identifier

-It is the system generated key which cannot be edited by the user.
-It is the primary key of the dimension table in warehouse.
-It is nothing but the sequence which generates the numbers for primary key column.

Surrogate keys are system generated keys. They are integers.

Surrogate keys are extremely useful when having type 2 data (i.e. storing historical information)
For ex: Consider one has a table in which a person and his location are stored. Now when his location is changed and we want to keep a historical record of the same it is stored with a surrogate key that will help us to uniquely identify the record. This is also a reason that OLTP keys are not used in the warehouse and a separate dimension or surrogate key is maintained.

What is the data type of the surrogate key?

Surrogate keys are the system generated keys and used to eliminate the redundancy in the primary keys. So they are always integer numbers.

Generate surrogate key

Goal: Fill in a data warehouse dimension table with data which comes from different source systems and assign a unique record identifier (surrogate key) to each record.

Scenario overview and details : To illustrate this example, we will use two made up sources of information to provide data about customer’s dimension. Each extract contains customer records with a business key (natural key) assigned to it.

In order to isolate the data warehouse from source systems, we will introduce a technical surrogate key instead of re-using the source system's natural (business) key.
A unique and common surrogate key is a one-field numeric key which is shorter, easier to maintain and understand and independent from changes in source system than using a business key. Also, if a surrogate key generation process is implemented correctly, adding a new source system to the data warehouse processing will not require major efforts.

Surrogate key generation mechanism may vary depending on the requirements; however the inputs and outputs usually fit into the design shown below:

Inputs:
- an input represented by an extract from the source system
- data warehouse table reference for identifying the existing records
- maximum key lookup

Outputs:
- output table or file with newly assigned surrogate keys
- new maximum key
- updated reference table with new records

How to test the surrogate key?

usually test with the following way

1. total count = max(sgk) - min(sgk) +1
2. select sgk from table group by sgk having count(*)>1;

Comments

AnonymousFebruary 28, 2017 at 7:56 AM
Hi mahesh , can you explain what is the above two syntax do -

1. total count = max(sgk) - min(sgk) +1
2. select sgk from table group by sgk having count(*)>1;
ReplyDelete
Replies

Add comment

Search This Blog

Testing

Surrogate Key

Comments

Post a Comment

Popular posts from this blog

Data Migration Testing Startegy

Informatica DVO

Informatica DVO installation steps