Click here to Skip to main content
12,701,539 members (34,685 online)
Click here to Skip to main content
Add your own
alternative version

Stats

5.5K views
16 bookmarked
Posted

Masking sensitive data with SQL Server 2016

, 15 Aug 2016 CPOL
Rate this:
Please Sign up or sign in to vote.
We'll see how sensitive data in SQL Server 2016 can be masked from outer world using Dynamic Data Masking.

Introduction

Data Masking

At the time of writing this article, it’s been more than 2 months since the most secure and powerful SQL Server yet released for public. It has many features which you can use to make your data and application more secure, more maintained and easily retrieved as per your requirements. It has introduced both client side security such as Always Encrypted as well as server side security such as Data masking etc. In this post we’ll examine how Dynamic Data Masking can help secure our data.

Earlier we were masking the sensitive data by applying logic at application end or by replacing characters at SQL Server end and then returning the masked data to the application. In the former case the risk was at
                1. When application fails to mask data
                2. Sensitive data transferred over network
                3. All the users in SQL Server with SELECT permission still can access the unmasked data

In the later case, the first two risks are eliminated but still there is a chance that the sensitive data can be revealed by the SQL Server user through which the application retrieves data.
                1. When application fails to mask data
                2. Sensitive data transferred over network
                3. All the users in SQL Server with SELECT permission still can access the unmasked data

So, let’s quickly jump in to the next section to understand what Dynamic Data masking is and how it helps to eliminate all these chances of unmasked data getting revealed.

Data Masking

If you search for definition of Data Masking over the web, you’ll end up with varied descriptions. The simplest and straight forward definition can be –

“Data masking is the process of hiding original data with random characters or data.”
                                                                                                                   --wikipedia

If we look further, there are various data masking techniques available such as

  • Shuffling: Shuffling order of the characters in the value.
    Ex: 12345 -----> 35312
  • Nulling: Substituting characters in the value with null (hash) symbol.
    Ex: 12345 -----> ###45
  • Substitution: Substituting value with another value from a substitution table.
    Ex: Suvendu Giri -----> John Ptak
  • Masking Out: Masks the complete data or a selective part of the data.
    Ex: suvendu@mydomain.com -----> suvendu@xxxx.xxx
    etc.

Considering many techniques of data masking they can be categorized into 3 types-

  • Static Data Masking: Original data in production changed to masked data.
  • On-the-fly Data masking: Copies data from one source to another source and masking done on the later.
  • Dynamic data masking: Masking done dynamically at runtime.

Dynamic Data Masking

“Dynamic data masking limits sensitive data exposure by masking it to non-privileged users. Dynamic data masking helps prevent unauthorized access to sensitive data by enabling customers to designate how much of the sensitive data to reveal with minimal impact on the application layer. It’s a data protection feature that hides the sensitive data in the result set of a query over designated database fields, while the data in the database is not changed.”
--- MSDN

So, dynamic data masking alters the result for non-privileged users while streaming and not with data in the production database.

Data Masking in MySQL:

As far I know, till the recent version of MySQL there is no inbuilt mechanism to provide dynamic data masking capabilities. The possible options for MySQL users seems to be

  1. Use 3rd party tools
  2. Use custom scripts/functions

Examples:

SELECT '****' AS PhoneNumber
FROM `Employees`
--Input: 1234567890        Output: ****
SELECT REPEAT('*', CHAR_LENGTH(PhoneNumber) - 6) AS PhoneNumber
FROM `Employees`
--Input: 1234567890                        Output: **********
SELECT CONCAT(SUBSTR(PhoneNumber, 1, 4), REPEAT('*', CHAR_LENGTH(PhoneNumber) - 4)) AS PhoneNumber
FROM `Employees`
--Input: 1234567890                        Output: 1234******

Data Masking in PostgreSQL:

Same as MySQL. It seems they are still lacking this feature. However, we can make use of some custom scripts or user defined functions.

Azure SQL Database too supports the dynamic data masking similar to SQL Server and I have heard from my friends that Oracle provides some kind of mechanism for data masking, although I haven’t ever used it.

Dynamic Data Masking in SQL Server 2016:

There are four masking functions available in SQL Server 2016 to provide separate ways to mask your data when they are returned by a query. These are-

Default:

When you want all the characters to be masked, you can use Default masking function. The characters for masking and no. of masked characters to be returned by a query is determined on the following logic-

  • XXXX – four Xs if data length is more than or equals to four.
  • 0 in case of numeric datatype
  • 0 in case of binary data type
  • 01/01/1900 for datetime datatypes

Syntax:

MASKED WITH (FUNCTION = 'default()')

Creating table -

CREATE TABLE [dbo].[Employee](
[SecretCode] VARCHAR(10) MASKED WITH (FUNCTION = 'default()'),
-- other fields

Modifying column –

ALTER TABLE [dbo].[Employees]
ALTER COLUMN [SecretCode] ADD MASKED WITH (FUNCTION = 'default()');

Example:

Let’s check these functionalities using Table variable.

DECLARE @Employees AS TABLE
(
     SecretCode VARCHAR(10) MASKED WITH (FUNCTION = 'default()'),
     NetSalary MONEY MASKED WITH (FUNCTION = 'default()'),
     DateOfBirth DATETIME MASKED WITH (FUNCTION = 'default()')               
)

INSERT INTO @Employees
SELECT 'ABCDEFGHIJ', 80000,'1990-05-21'

SELECT * FROM @Employees

Result:

SecretCode        NetSalary            DateOfBirth
xxxx              0.00                 1900-01-01 00:00:00.000

NOTE: You need check this with a non-admin user as admin users are granted to see all the data. So, if you will execute this using an admin user then you’ll see no difference. Let’s say your non-admin user is ‘demouser’, then add following line at the top of the query.

EXECUTE AS USER='DemoUser'

Email:

It masks partially the characters of an email id. The interesting thing to notice is, it masks characters as well as the length of the email id making it impossible to predict the email id from the few displayed characters.

Syntax:

MASKED WITH (FUNCTION = 'email()')

Example:

Let’s modify the same example to add an emailed column as follows-

DECLARE @Employees AS TABLE
(
     SecretCode VARCHAR(10) MASKED WITH (FUNCTION = 'default()'),
     NetSalary MONEY MASKED WITH (FUNCTION = 'default()'),
     DateOfBirth DATETIME MASKED WITH (FUNCTION = 'default()'),
     EmailID VARCHAR(50) MASKED WITH (FUNCTION = 'email()')
)
INSERT INTO @Employees
SELECT 'ABCDEFGHIJ', 80000,'1990-05-21', 'suvendugiri@mydomain.in'

SELECT * FROM @Employees

Result:

SecretCode    NetSalary     DateOfBirth                EmailID
xxxx          0.00          1900-01-01 00:00:00.000    sXXX@XXXX.com

Partial:

It masks the data based on custom inputs to the function. It masks the data partially. 3 parameters need to be passed to this masked function such as

  1. Prefix: Number of characters that has to be shown from the start.
  2. Padding: Characters that you need to show in between prefix and suffix
  3. Suffix: Number of characters that has to be shown from the end.

Syntax:

MASKED WITH (FUNCTION = 'partial(prefix,padding,suffix)')

Example:

Further adding a new column to the same example as-

DECLARE @Employees AS TABLE
(
      SecretCode VARCHAR(10) MASKED WITH (FUNCTION = 'default()'),
      NetSalary MONEY MASKED WITH (FUNCTION = 'default()'),
      DateOfBirth DATETIME MASKED WITH (FUNCTION = 'default()'),
      EmailID VARCHAR(50) MASKED WITH (FUNCTION = 'email()'),
      FullName VARCHAR(50) MASKED WITH (FUNCTION = 'partial(2,"***",2)')
)

INSERT INTO @Employees
SELECT 'ABCDEFGHIJ', 80000,'1990-05-21', 'suvendugiri@mydomain.in','Suvendu Shekhar Giri'

SELECT * FROM @Employees

Result:

SecretCode   NetSalary   DateOfBirth              EmailID         FullName
xxxx         0.00        1900-01-01 00:00:00.000  sXXX@XXXX.com   Su***ri

Random:

Masks any numeric value with a random value based on the provided upper and lower boundary values as parameters to the masked function.

Syntax:

MASKED WITH (FUNCTION = ‘random(lower_bound,upper_bound)

Example:

DECLARE @Employees AS TABLE
(
     SecretCode VARCHAR(10) MASKED WITH (FUNCTION = 'default()'),
     NetSalary MONEY MASKED WITH (FUNCTION = 'default()'),
     DateOfBirth DATETIME MASKED WITH (FUNCTION = 'default()'),
     EmailID VARCHAR(50) MASKED WITH (FUNCTION = 'email()'),
     FullName VARCHAR(50) MASKED WITH (FUNCTION = 'partial(2,"***",2)'),
     Age INT MASKED WITH (FUNCTION = 'random(90,100)')
)

INSERT INTO @Employees
SELECT 'ABCDEFGHIJ', 80000,'1990-05-21', 'suvendugiri@mydomain.in','Suvendu Shekhar Giri',32

SELECT * FROM @Employees

Result:

SecretCode  NetSalary  DateOfBirth              EmailID        FullName  Age
xxxx        0.00       1900-01-01 00:00:00.000  sXXX@XXXX.com  Su***ri   93

If you want a user to see the values stored in database same as admin, you can grant access to the user using following command-

GRANT UNMASK TO DemoUser

You can again apply the restriction to see masked the data by following command

REVOKE UNMASK TO DemoUser

To remove the masked function from a column you can drop it similar to other constraints like-

ALTER TABLE Employees
ALTER COLUMN FullName DROP MASKED

Conclusion

Dynamic Data Masking in SQL Server 2016 is really a promising feature and will help avoiding writing so many custom scripts to mask the sensitive data. The role/access based viewing data makes perfect sense and very useful. Currently, the access for an user can be given on database. Would be great if there can be a mechanism to give these access based on table or schema.

Would like to know about your feedback, comments or suggestions.

Thanks for reading :)

History

12th August 2016: First Publish

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Suvendu Shekhar Giri
Software Developer
India India
With more than 4 years of experience in Microsoft Technologies like C#, ASP.Net, SQL Server and various web technologies like HTML, CSS, Javascript, JQuery etc.

I have a Master’s Degree in Computer Science and also a Microsoft Certified Professional in “Programming in HTML5 with JavaScript and CSS3”.

I always encourage learning and sharing of ideas because I believe “An idea is of no worth if it dies inside you”. Active member in various community sites and activities.

You may also be interested in...

Pro
Pro

Comments and Discussions

 
GeneralMy vote of 5 Pin
Renju Vinod11-Sep-16 21:17
professionalRenju Vinod11-Sep-16 21:17 
GeneralRe: My vote of 5 Pin
Suvendu Shekhar Giri11-Sep-16 22:20
mvpSuvendu Shekhar Giri11-Sep-16 22:20 
GeneralMy vote of 5 Pin
Surya(10980329)8-Sep-16 23:55
memberSurya(10980329)8-Sep-16 23:55 
GeneralRe: My vote of 5 Pin
Suvendu Shekhar Giri9-Sep-16 1:58
mvpSuvendu Shekhar Giri9-Sep-16 1:58 
SuggestionUpdate a paragraph related to Oracle database technology Pin
Jesus Carroll22-Aug-16 8:04
professionalJesus Carroll22-Aug-16 8:04 
PraiseRe: Update a paragraph related to Oracle database technology Pin
Suvendu Shekhar Giri22-Aug-16 8:12
mvpSuvendu Shekhar Giri22-Aug-16 8:12 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.170118.1 | Last Updated 16 Aug 2016
Article Copyright 2016 by Suvendu Shekhar Giri
Everything else Copyright © CodeProject, 1999-2017
Layout: fixed | fluid