Data limitation

Definition
Data limitation "involves manipulations that restrict the number of variables, the number of values for responses, or the number of cases that are made available to researchers. The purpose of data limitation is to reduce the number of unique values in a dataset (reducing the risk of identification) or to reduce the certainty of identification of a specific respondent by a secondary user."

Overview
A very simple approach sometimes taken with public-use data is to release only a fraction of the cases originally collected, withholding half or more of them. This makes it difficult, even impossible, for a secondary user who knows that an individual is in the sample to be sure of having identified the right person, because the target individual may be among the cases withheld from the public dataset.
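The subsampling idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name release_subsample, the fraction parameter, and the list-of-dicts record layout are hypothetical and not taken from any particular data steward's procedure.

```python
import random


def release_subsample(records, fraction=0.5, seed=None):
    """Return a simple random sample of the cases for public release.

    Hypothetical helper: `fraction` is the share of cases kept, so
    fraction=0.4 withholds 60 percent of the original cases.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)
    keep = round(len(records) * fraction)
    # Sampling without replacement: no case is released twice.
    return rng.sample(records, keep)


# Made-up cases, used only to show the call.
cases = [{"case_id": i, "age": 20 + i} for i in range(10)]
public = release_subsample(cases, fraction=0.4, seed=2024)
print(len(public))  # 4 of the 10 cases are released
```

The seed is exposed here only to make the example repeatable. Which cases were kept is itself information a steward would presumably not publish, since the protection rests on the secondary user not knowing whether the target was withheld.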

For tabular data, as well as some microdata, one data limitation approach is cell suppression. The data steward blanks out cells with small counts in tabular data, or blanks out the values of identifiers or sensitive attributes in microdata.
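A minimal sketch of both forms of suppression follows. The names suppress_small_cells and mask_fields, the threshold of 5, and the use of None as the blank marker are assumptions chosen for illustration; the text does not specify a threshold or a marker.

```python
def suppress_small_cells(table, threshold=5, marker=None):
    """Blank out counts below `threshold` in a dict-of-dicts frequency table.

    The threshold of 5 is an assumed default, not a value from the text.
    Returns a new table; the input is left unchanged.
    """
    return {
        row: {
            col: (count if count >= threshold else marker)
            for col, count in cols.items()
        }
        for row, cols in table.items()
    }


def mask_fields(records, fields, marker=None):
    """Blank out the named identifier or sensitive fields in each record."""
    hidden = set(fields)
    return [
        {key: (marker if key in hidden else value) for key, value in rec.items()}
        for rec in records
    ]


# Made-up counts and records, used only to show the calls.
counts = {"north": {"employed": 12, "unemployed": 3},
          "south": {"employed": 0, "unemployed": 40}}
print(suppress_small_cells(counts))

people = [{"name": "A. Example", "zip": "00000", "income": 52000}]
print(mask_fields(people, ["name", "zip"]))
```

This sketch performs only the blanking step described above. If row or column totals are published alongside the table, a blanked cell may be recoverable by subtraction, which the sketch does not address.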