How to mask sensitive SQL Server data with external data sources

Applies to

ApexSQL Mask

Summary

This article explains how to mask sensitive SQL Server data with external data sources.

Description

ApexSQL Mask is a tool used to mask sensitive SQL Server data by offering two types of masks, the substitution, and the redaction type.

When the substitution type of mask is chosen, several different masking generators can be used to mask sensitive SQL Server data, among them an external data source generator as well.

External data sources generator

When the user has already predefined saved external data, that data could be used to mask sensitive SQL Server data by choosing the external data source generator.

To use external data sources for mask sensitive data, first, click on the Manage button in the Home tab:

The Manage button in the Home tab

The Manage masks window will be opened, where, by default, the Substitution type of masking is selected. In the Manage mask window, masks can be created, edited, or deleted. By clicking on the New button, the process of creating the new substitution mask will be started:

The Manage masks window in the process of mask sensitive SQL Server data

The Create substitution mask window will be opened, where, in the Type drop-down list, External source generator can be selected for masking sensitive data:

Choosing a External source generator form the Type drop-down list in the Create substitution mask window

When the External source is chosen for masking sensitive data, below the Type drop-down list will appear the File combo box. In that combo box, the user can type the address of the file destination or with click on the Browse for folder button chooses the external data sources using which will mask data:

The File combo-box

Click on the Browse for folder button, will show the Open window where can be selected any external data source to mask sensitive SQL Server data:

The open window

The supported external data file formats for masking sensitive data are CSV, Excel, JSON, XML.

Depending on the selected external data file format, additional options will be shown in the area below the File combo box.

Randomize data by option group

The Randomize data by option group is contained in every external data source used to mask sensitive SQL Server data. The options which can be adjusted in the Randomize data by the group are:

The Randomize data by option group

  • Seed – randomized data using the seed value. The seed value providing the foundation for the selection of random values from the provided range of values
  • Timestamp – data is randomized by using a timestamp value. The timestamp is made up of a ten-digit number which represents the number of seconds that have passed since 1st January, UTC
  • Shuffle – data used for masking is shuffled
  • Loop – the loop is selected when it is a need to create the value to fill the requested number of data that needs to be masked

In addition to these standard options, each external data source format has its options that need to be configured.

Mask sensitive data with CSV external data source

Besides Randomize data by option group when masking sensitive data with CSV external data, the next options can be checked as well:

  • Separator – select which delimiter should be used for masking
  • Column index – select which column number will be used for masking
  • Skip first row as header – the first row is not masked but instead, it is skipped and treated like header

The Option group for CSV external data source masking

Masked sensitive data with Excel external data source

When excel external data source is used, additional options are:

  • Spreadsheet index – number of the spreadsheet that will be used to mask sensitive SQL Server data
  • Column index – select a number of columns in the Excel document using for masking
  • Skip first row as a header – the first row is not masked but instead, it is skipped and treated like header

The Option group for Excel external data source masking

Masked sensitive data with JSON external data source

In order to mask sensitive data with JSON external data source, it is necessary to find JSON value for masking. The following options are used for detecting that value in the JSON data:

  • Specific path – or JSONPath is a query language that lets extracting just the parts of a JSON document that ApexSQL Mask needs for masking data
  • Wildcard name – that refers to a name or character that can be substituted for zero or more characters in the string

The Option group for JSON external data source masking

Masked sensitive data with XML external data source

Masking sensitive data with an XML file is similar to masking data with JSON data. Except that in an XML file, an XML element used for masking confidential data is searched by the following options:

  • Specific path – in this case, that is XPath. Xpath is a syntax or language for finding XML elements in the XML file using XML path expression
  • Wildcard name – name or the character that can be alternated for one or more character

The Option group for XML external data source masking

When the selection and adjustment of the external data sources are finished, click the Apply button in the Create substitution mask window:

The Apply button in the Create substitution mask window

The Apply to columns window will appear, where the columns for masking with the external data source can be chosen. When the columns are selected click on the OK button:

The Apply to columns window

Conclusion

Using ApexSQL Mask, any type of data can be masked in described ways, based on the chosen external data sources.