
Natural Keys vs. Surrogate Keys

Daniel Wanjiru

Certified Sr. SAS Programmer (SP) | Statistical Programmer Enthusiast | Living, Learning & Growing | SASensei #1 AFRICA

Published: December 14, 2023

A natural key is a set of data that uniquely identifies an entity and distinguishes it from any other row in the table. The advantage of natural keys is that they exist already, so there’s no need to introduce a new value to the data schema. However, one of the difficulties in choosing a natural key is that just about any natural key one can think of has the potential to change. Because they have business meaning, natural keys are effectively coupled to the business, and they may need to be reworked when business requirements change. For example, the addition of a position or location that becomes a key in a new study, but which was not collected in previous studies, could require reworking the natural key in clinical trials data.
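To make the trade-off concrete, here is a minimal Python sketch that checks whether a candidate natural key is still unique once a new study collects a location. The records, the choice of variables (USUBJID, VSTESTCD, VISITNUM, VSLOC), and the helper name `is_unique_key` are illustrative assumptions, not taken from any particular study or standard tool.

```python
from collections import Counter


def is_unique_key(rows, key_vars):
    """Return True if the combination of key_vars identifies each row uniquely."""
    counts = Counter(tuple(row.get(var) for var in key_vars) for row in rows)
    return all(n == 1 for n in counts.values())


# Hypothetical vital-signs records from an earlier study: location was not collected.
old_study = [
    {"USUBJID": "S1-001", "VSTESTCD": "SYSBP", "VISITNUM": 1},
    {"USUBJID": "S1-001", "VSTESTCD": "SYSBP", "VISITNUM": 2},
]

# Hypothetical new study: the same test is taken on both arms at one visit.
new_study = [
    {"USUBJID": "S2-001", "VSTESTCD": "SYSBP", "VISITNUM": 1, "VSLOC": "LEFT ARM"},
    {"USUBJID": "S2-001", "VSTESTCD": "SYSBP", "VISITNUM": 1, "VSLOC": "RIGHT ARM"},
]

natural_key = ["USUBJID", "VSTESTCD", "VISITNUM"]

old_ok = is_unique_key(old_study, natural_key)
new_ok = is_unique_key(new_study, natural_key)
reworked_ok = is_unique_key(new_study, natural_key + ["VSLOC"])
```

Under these assumptions the original key holds for the older study, fails for the new one, and becomes unique again only after the location is added to it.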

A surrogate key is a single-part, artificially established identifier for a record. Surrogate key assignment is a special case of derived data, one where a portion of the primary key is derived. A surrogate key is immune to changes in business needs. In addition, the key depends on only one field, so it is compact. A common way of deriving surrogate key values is to assign integer values sequentially. The --SEQ variable in the SDTM datasets is an example of a surrogate key for most datasets; in some instances, however, --SEQ might be a part of a natural key as a replacement for what might have been a key (e.g., a repeat sequence number) in the sponsor’s database.
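As a rough illustration of sequential assignment, the Python sketch below numbers records 1, 2, 3, ... within each subject, in the spirit of a --SEQ variable. The adverse-event records, the sort order (start date, then term), and the helper name `assign_seq` are assumptions made for this example; a sponsor's actual derivation rules may differ.

```python
from itertools import groupby
from operator import itemgetter


def assign_seq(rows, subject_var="USUBJID",
               sort_vars=("AESTDTC", "AETERM"), seq_var="AESEQ"):
    """Return copies of rows with a sequential integer key assigned per subject."""
    ordered = sorted(
        rows,
        key=lambda r: (r[subject_var],) + tuple(r[v] for v in sort_vars),
    )
    result = []
    for _, group in groupby(ordered, key=itemgetter(subject_var)):
        # Numbering restarts at 1 for every subject.
        for seq, row in enumerate(group, start=1):
            result.append({**row, seq_var: seq})
    return result


# Hypothetical adverse-event records, deliberately out of order.
ae = [
    {"USUBJID": "S1-002", "AETERM": "NAUSEA", "AESTDTC": "2023-02-01"},
    {"USUBJID": "S1-001", "AETERM": "HEADACHE", "AESTDTC": "2023-01-10"},
    {"USUBJID": "S1-001", "AETERM": "DIZZINESS", "AESTDTC": "2023-01-05"},
]

ae_with_seq = assign_seq(ae)
```

Because the numbers come from record order rather than from any business meaning, they stay compact and independent of requirements changes, but they have to be derived again whenever records are added or removed.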

#CDISC
#ClinicalDataInterchangeStandardsConsortium
#SDTM
#SDTMImplementationGuide
#ClinicalTrials
#ClinicalDataManagement
#DataStandardization
#DataQuality
#DataIntegrity
#NaturalKeys
#SurrogateKeys

Mazi Ntintelo, Sunil Gupta, Athenkosi Nkonyeni, CDISC, SAS

Allan B.

SAS App Migration, Modernisation, and Manifestation

1 year ago

UUIDs make better surrogate keys when the table is large and the loads are mainly inserts rather than updates (no need to calculate or maintain the highest integer). Surrogate keys are also invaluable when your natural keys contain null values.


Jozef Aerts

Passionate about standards in clinical research and healthcare, and their implementation in IT systems.

1 year ago

--SEQ should essentially not be in SDTM. It needs to be recalculated (post-execution) each time the number of records is increased or decreased. It is also used as a "lazy way" to connect records in different datasets with each other (--SPID is a better way), and especially to connect with the "non-standard variable" records in the corresponding SUPPxx. The latter should also not exist. They are only there because (the tools of) many reviewers are not capable of distinguishing between a standard variable and a non-standard variable. And then there is also the 200-character limitation of XPT ... But as long as reviewers are incapable of using natural keys (though defined in the define.xml), I am afraid we will need surrogate keys in SDTM, SEND and ADaM...


