Repost: Cardinality (SQL statements), the best explanation

Posted by lfree on 2008-01-31

http://www.itpub.net/thread-934862-1-1.html

In SQL, the term cardinality refers to the uniqueness of data values contained in a particular column (attribute) of a database table.
When dealing with columnar value sets, there are 3 types of cardinality: high-cardinality, normal-cardinality, and low-cardinality.
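One rough way to put a number on "uniqueness" is the ratio of distinct values to total rows. The sketch below is only an illustration: the function name distinct_ratio and the sample data are assumptions, and the text does not define any numeric cut-off between the three types.

```python
def distinct_ratio(values):
    """Return the number of distinct values divided by the number of rows."""
    values = list(values)
    if not values:
        raise ValueError("column is empty")
    return len(set(values)) / len(values)


# Hypothetical sample columns, six rows each.
user_ids = [1, 2, 3, 4, 5, 6]                      # every value is unique
last_names = ["Smith", "Smith", "Smith", "Nguyen", "Okafor", "Smith"]
new_customer = ["Y", "N", "N", "Y", "N", "N"]      # only two distinct values

for name, column in [("USER_ID", user_ids),
                     ("LAST_NAME", last_names),
                     ("NEW_CUSTOMER", new_customer)]:
    print(name, round(distinct_ratio(column), 2))
```

A ratio near 1 suggests high cardinality, and a ratio near 0 suggests low cardinality. On a table this small the numbers are only indicative, since a two-value flag column would drift much closer to 0 as the row count grows.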
High-Cardinality
High-cardinality refers to data table column values that are very uncommon or unique. High-cardinality column values are typically identification numbers, email addresses, or user names. An example of a data table column with high-cardinality would be a USERS table with a column named USER_ID. This column would contain unique values from 1 to n. Each time a new user is created in the USERS table, a new number would be created in the USER_ID column to identify that user uniquely. Since the values held in the USER_ID column are unique, this column's cardinality type would be referred to as high-cardinality.
Normal-Cardinality
Normal-cardinality refers to data table column values that are somewhat uncommon. Normal-cardinality column values are typically names, street addresses, or vehicle types. An example of a data table column with normal-cardinality would be a CUSTOMER table with a column named LAST_NAME. This column would contain the last names of customers. While some people have common last names, such as Smith, others have uncommon last names. Therefore, an examination of all of the values held in the LAST_NAME column, in sorted order, would show "clumps" of names in some places (e.g. many Smiths) surrounded on both sides by a long series of unique values. Since there is a variety of possible values held in this column, its cardinality type would be referred to as normal-cardinality.
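The "clumps" described above can be made visible by counting how often each value occurs. This is a minimal sketch using a hypothetical list of last names. It relies only on Python's standard collections.Counter.

```python
from collections import Counter

# Hypothetical LAST_NAME values: one common name among several rare ones.
last_names = ["Abbott", "Nguyen", "Smith", "Smith", "Smith", "Tanaka", "Zhou"]

counts = Counter(last_names)

# Names that occur more than once form the "clumps".
clumps = {name: n for name, n in counts.items() if n > 1}
# Names that occur exactly once are the unique values around them.
singletons = sorted(name for name, n in counts.items() if n == 1)

print("clumps:", clumps)
print("unique names:", singletons)
```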
Low-Cardinality
Low-cardinality refers to data table column values that are very common, meaning that the column holds only a few distinct values that repeat across many rows. Low-cardinality column values are typically status flags, boolean values, or major classifications such as gender. An example of a data table column with low-cardinality would be a CUSTOMER table with a column named NEW_CUSTOMER. This column would contain only 2 distinct values: Y or N, denoting whether the customer was new or not. Since there are only 2 possible values held in this column, its cardinality type would be referred to as low-cardinality.
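In a database, the same comparison can be made with COUNT(DISTINCT ...). The sketch below uses Python's built-in sqlite3 module and an in-memory database. The CUSTOMER_ID column and all sample rows are assumptions added for illustration, while LAST_NAME and NEW_CUSTOMER follow the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMER ("
    "CUSTOMER_ID INTEGER PRIMARY KEY, "  # hypothetical unique identifier
    "LAST_NAME TEXT, "
    "NEW_CUSTOMER TEXT)"
)

# Hypothetical sample rows.
rows = [
    (1, "Smith", "Y"),
    (2, "Smith", "N"),
    (3, "Nguyen", "N"),
    (4, "Okafor", "Y"),
    (5, "Smith", "N"),
    (6, "Petrov", "N"),
]
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?, ?)", rows)

# One pass over the table: total rows and distinct values per column.
total, ids, names, flags = conn.execute(
    "SELECT COUNT(*), "
    "COUNT(DISTINCT CUSTOMER_ID), "
    "COUNT(DISTINCT LAST_NAME), "
    "COUNT(DISTINCT NEW_CUSTOMER) "
    "FROM CUSTOMER"
).fetchone()

print("rows:", total)
print("distinct CUSTOMER_ID:", ids)
print("distinct LAST_NAME:", names)
print("distinct NEW_CUSTOMER:", flags)
conn.close()
```

Here the identifier column has as many distinct values as there are rows, the name column falls in between, and the flag column has only two, which matches the high, normal, and low cases in turn.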


From the ITPUB blog, link: http://blog.itpub.net/267265/viewspace-998928/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
