Intel® Tamper Protection Toolkit Helps Protect the Scrypt Encryption Utility against Reverse Engineering

Android on Intel

5.00/5 (1 vote)

May 2, 2016

CPOL

13 min read

8477

Intel® Developer Zone offers tools and how-to information for cross-platform app development, platform and technology information, code samples, and peer expertise to help developers innovate and succeed. Join our communities for Android, Internet of Things, Intel® RealSense™ Technology, and Windows to download tools, access dev kits, share ideas with like-minded developers, and participate in hackathon’s, contests, roadshows, and local events.

Introduction

This article describes how the Intel® Tamper Protection Toolkit can help protect critical code and valuable data in a password-based encryption utility (Scrypt Encryption Utility) [3] against static and dynamic reverse-engineering and tampering. Scrypt [4] is a modern secure password-based key derivation function that is widely used in security-conscious software. There is a potential threat to scrypt described in [2] when an attacker can force generation of weak keys by forcing use of specific parameters. Intel® Tamper Protection Toolkit can be used to help mitigate this threat. We explain how to refactor relevant code and apply tamper protection to the utility.

In this article we discuss the following components of the Intel Tamper Protection Toolkit:

Iprot. An obfuscation tool that creates self-modifying and self-encrypted code
crypto library. A library that provides iprot-compatible implementations of basic crypto operations: cryptographic hash function, keyed-hash message authentication code (HMAC), and symmetric ciphers.

You can download the Intel Tamper Protection Toolkit at https://software.intel.com/en-us/tamper-protection.

Scrypt Encryption Utility Migration to Windows

Since the Scrypt Encryption Utility is targeted at Linux* and we want to show how to use the Intel Tamper Protection Toolkit on Windows*, our first task is to port the Scrypt Encryption Utility to Windows. Platform-dependent code will be framed with the following conditional directive:

#if defined(WIN_TP)
// Windows-specific code
#else
// Linux-specific code
#endif  // defined(WIN_TP)

Example 1: Basic structure of a conditional directive

The WIN_TP preprocessing symbol localizes Windows-specific code. WIN_TP should be defined for a Windows build, otherwise reference code is chosen for the build.

We use Microsoft Visual Studio* 2013 for building and debugging the utility. There are differences between Windows and Linux in various categories, such as process, thread, memory, file management, infrastructure services, and user interfaces. We had to address these differences for the migration, described in detail below.

The utility uses getopt() to handle command-line arguments. See a list of the program arguments in the Scrypt Encryption Utility section in [2]. The function getopt() is accessed from the unitstd.h POSIX OS header file. We used the get_opt() implementation from an open source project getopt_port [1]. Two new files, getopt.h and getopt.c, taken from this project were added into our source code tree.
Another function, gettimeofday(), present in the POSIX API helps the utility measure salsa opps, a number of salsa20/8 operations per second performed on the user’s platform. The utility needs the metric salsa opps to pick a secure configuration N, r, and p for input parameters so that the Scrypt algorithm executes at least the desired minimal number of salsa20/8 operations to avoid brute force attacks. We added the gettimeofday() implementation [5] to the scryptenc_cpuperf.c file.
Before the utility starts configuring the algorithm it asks the OS about the amount of available RAM allowed to be occupied for the derivation by calling the POSIX system function getrlimit(RLIMIT_DATA, …). For Windows, both soft and hard limits for the maximum size of process’s data segment (initialized data, uninitialized data, and heap) are established to be equal to 4 GB:
```
/* ... RLIMIT_DATA... */
#if defined(WIN_TP)
rl.rlim_cur = 0xFFFFFFFF;
rl.rlim_max = 0xFFFFFFFF;
if((uint64_t)rl.rlim_cur < memrlimit) {
	memrlimit = rl.rlim_cur;
}
#else
if (getrlimit(RLIMIT_DATA, &rl))
	return (1);
if ((rl.rlim_cur != RLIM_INFINITY) &&
     ((uint64_t)rl.rlim_cur < memrlimit))
	memrlimit = rl.rlim_cur;
#endif  // defined(WIN_TP)
```
Example 2: RLIMIT data limiting the process to 4GB
Additionally, the MSVS compiler directive to inline functions in sysendian.h is added:
```
#if defined(WIN_TP)
static __inline uint32_t
#else
static inline uint32_t
#endif  // WIN_TP
be32dec(const void *pp);
```
Example 3: Adding sysendian.h inline functions

We migrated the tarsnap_readpass(…) function, which handles and masks retrieving passwords through a terminal. The function turns off echoing and masks the password with blanks in the terminal. The password is stored in memory buffer and sent to the next functions:

/* If we're reading from a terminal, try to disable echo. */
#if defined(WIN_TP)
if ((usingtty = _isatty(_fileno(readfrom))) != 0) {
	GetConsoleMode(hStdin, &mode);
	if (usingtty)
		mode &= ~ENABLE_ECHO_INPUT;
	else
		mode |= ENABLE_ECHO_INPUT;
	SetConsoleMode(hStdin, mode);
}
#else
if ((usingtty = isatty(fileno(readfrom))) != 0) {
	if (tcgetattr(fileno(readfrom), &term_old)) {
		warn("Cannot read terminal settings");
		goto err1;
	}
	memcpy(&term, &term_old, sizeof(struct termios));
	term.c_lflag = (term.c_lflag & ~ECHO) | ECHONL;
	if (tcsetattr(fileno(readfrom), TCSANOW, &term)) {
		warn("Cannot set terminal settings");
		goto err1;
	}
}
#endif  // defined(WIN_TP)

Example 4: Password control via terminal

In the original getsalt() a salt is built from pseudorandom numbers read from the Linux special file /dev/urandom. On Windows we suggest using the rdrand() instruction to read from a hardware random number generator available on Intel® Xeon® and Intel® Core™ processor families starting from Ivy Bridge microarchitecture. The C standard pseudorandom generator is not used, as getsalt() is incompatible with the Intel Tamper Protection Toolkit obfuscation tool. The function getsalt() should be protected with the obfuscator against static and dynamic tampering and reverse-engineering since a salt produced by this function is categorized as sensitive in the Scrypt Encryption Utility section in [2]. The example below shows both original and ported codes of random number generation to fill a salt:
```
#if defined(WIN_TP)
	uint8_t i = 0;

	for (i = 0; i < buflen; i++, buf++)
	{
		_rdrand32_step(buf);
	}
#else
	/* Open /dev/urandom. */
	if ((fd = open("/dev/urandom", O_RDONLY)) == -1)
		goto err0;
	/* Read bytes until we have filled the buffer. */
	while (buflen > 0) {
		if ((lenread = read(fd, buf, buflen)) == -1)
			goto err1;
		/* The random device should never EOF. */
		if (lenread == 0)
			goto err1;
		/* We're partly done. */
		buf += lenread;
		buflen -= lenread;
	}
	/* Close the device. */
	while (close(fd) == -1) {
		if (errno != EINTR)
			goto err0;
	}
#endif defined(WIN_TP)
```
Example 5: Original and ported random number generation code

Utility Protection with the Intel® Tamper Protection Toolkit

Now we will make changes in the utility design and code to help protect sensitive data identified in the threat model in the Password-Based Key Derivation section in [2]. The protection of the sensitive data is achieved by code obfuscation using iprot, the obfuscating compiler included in the Intel Tamper Protection Toolkit. It is reasonable to obfuscate only those functions that create, handle, and use sensitive data.

From the Code Obfuscation section in [2] we know that iprot takes as input a dynamic library (.dll) and produces a binary with only obfuscated export functions specified in the command line. So we put all functions working with sensitive data into a dynamic library to be obfuscated, leaving others, like command-line parsing and password reading, in the main executable.

Figure 1 shows the new design for the protected utility. The utility is split into two parts: the main executable and a dynamic library to be obfuscated. The main executable is responsible for parsing a command line, and reading a passphrase and input file into a memory buffer. The dynamic library includes export functions such as scryptenc_file and scryptdec_file that work with sensitive data (N, r, p, salt).

The key data structure used by the dynamic library is the Scrypt context, which stores HMAC digested information about the Scrypt parameters N, r, p and salt. The HMAC digest in the context is used to determine whether the latest changes in the context are done by trusted functions such as scrypt_ctx_enc_init, scrypt_ctx_dec_init, scryptenc_file, and scryptdec_file, which have an HMAC key to resign and to verify the context. These trusted functions will be resistant to modifications since we intend to obfuscate them by the obfuscation tool. Two new functions, scrypt_ctx_enc_init and scrypt_ctx_dec_init, appear to initialize the Scrypt context for both encryption and decryption modes.

Figure 1: Design for protected Scrypt Encryption Utility.

Encryption Flow

The utility uses getopt() to handle command-line arguments. See a list of the program arguments in the Password-Based Key Derivation Function section in [2].
Input file for encryption and a passphrase are read into the memory buffer.
The main executable calls scrypt_ctx_enc_init to initialize the Scrypt context for computing secure Scrypt parameters (N, r, p and salt) for specified CPU time and RAM size to the key derivation through command-line options like maxmem, maxmemfrac, and maxtime. At the end of this call the initialization function creates an HMAC digest, including the newly updated state, to prevent tampering when the function returns. The initialization function will also return the amount of memory the application must allocate to proceed with encryption.
The utility in the main executable dynamically allocates memory based on the size returned by the initialization function.
The executable calls scrypt_ctx_enc_init a second time. The function verifies integrity of the Scrypt context using Hash MAC digest. If integrity verification passes, the function sets the buffer location in the context with the allocated location and updates HMAC. File reading and dynamic memory allocation are done in the executable to avoid iprot incompatible code in the dynamic library. Code containing system calls and C standard functions generate indirect jumps and relocations that are not supported by the obfuscator.
The executable calls scryptenc_file to encrypt the file using the user-supplied passphrase. The function verifies integrity of the Scrypt context with parameters (N, r, p, and salt) used for the key derivation. If verification passes it calls the Scrypt algorithm to derivate a key. The derived key is then used for encryption. The export function forms the same output as the original Scrypt utility. This means the output has similar hash values used for integrity verification of encrypted data and correctness of passphrase during decryption.

Decryption Flow

The utility uses getopt() to handle command-line arguments. See a list of the program arguments in the Password-Based Key Derivation section in [2].
Input file for decryption and a passphrase are read into memory buffer.
The main executable calls scrypt_ctx_dec_init to check whether the provided parameters in the encrypted file data are valid and whether the key derivation function can be computed within the allowed memory and CPU time.
The utility in the main executable dynamically allocates memory based on the size returned by the initialization function.
The executable calls scrypt_ctx_dec_init a second time. The function does the same as in the encryption case.
The executable calls scryptdec_file to decrypt file using password. The function verifies integrity of the Scrypt context with parameters (N, r, p, and salt) used for the key derivation. If verification passes it calls the Scrypt algorithm to derive a key. Using hash values in encrypted data the function verifies correctness of password and integrity of encrypted data.

In the protected utility we replace the OpenSSL* implementation of the Advanced Encryption Standard in CTR mode cipher and keyed hash function with Intel Tamper Protection Toolkit crypto library one. Unlike OpenSSL*, the crypto library satisfies all code restrictions to be obfuscated by iprot and can be used from within obfuscated code without further modification. The AES cipher is called inside the scryptenc_file and scryptdec_file to encrypt/decrypt the input file using a key derived from a password. The keyed hash function is called by the export functions (scrypt_ctx_enc_init, scrypt_ctx_dec_init, scryptenc_file and scryptdec_file) to verify the data integrity of a Scrypt context before using it. In the protected utility all the exported functions of the dynamic library are obfuscated with iprot. The Intel Tamper Protection Toolkit helps us achieve the goal to mitigate threats defined in the Password-Based Key Derivation section in [2].

Our solution is a redesigned utility with an iprot obfuscated dynamic library. This is resistant to attacks determined above, and it can be proved that the Scrypt context can be updated only by the export functions because they have the HMAC private key to recalculate the HMAC digest in the context. Also, these functions and the HMAC key are protected against tampering and reverse engineering by the obfuscator. In addition, other sensitive data such as the key produced by Scrypt is protected since it is derived inside obfuscated exported functions scryptenc_file and scryptdec_file. The obfuscation compiler produces code that is self-encrypted at runtime and protected against tampering and debugging.

Let us consider the code about how scrypt_ctx_enc_init protects the Scrypt context. The main executable signals buf_p through a pointer at the same time scrypt_ctx_enc_init is called. If the pointer is equal to null, the function is called for the first time; otherwise it is called the second time. During the first call of the initialization it picks Scrypt parameters, calculates HMAC digest, and returns the amount of memory required for Scrypt computation as shown below:

// Execute for the first call when it returns memory size required by scrypt
	if (buf_p == NULL) {  		
// Pick parameters for scrypt and initialize the scrypt context
		// <...> 

		// Compute HMAC
		itp_res = itpHMACSHA256Message((unsigned char *)ctx_p, sizeof(scrypt_ctx)-								sizeof(ctx_p->hmac),
							hmac_key, sizeof(hmac_key),
							ctx_p->hmac, sizeof(ctx_p->hmac));

		*buf_size_p = (r << 7) * (p + (uint32_t)N) + (r << 8) + 253;
	}

Example 6: The first call of code protecting the Scrypt context

During the second call, the buf_p point to allocated memory is transmitted to the scrypt_ctx_enc_init function. Using an HMAC digest in the context, the function verifies integrity of the context and makes sure that no one has changed it between the first and the second calls. After that it initializes address inside the context with buf_p and recomputes the HMAC digest since the context has changed as shown below:

// Execute for the second call when memory for scrypt is allocated
	if (buf_p != NULL) {
		// Verify HMAC
		itp_res = itpHMACSHA256Message(
(unsigned char *)ctx_p, sizeof(scrypt_ctx)-sizeof(ctx_p->hmac),
			hmac_key, sizeof(hmac_key),
			hmac_value, sizeof(hmac_value));
		if (memcmp(hmac_value, ctx_p->hmac, sizeof(hmac_value)) != 0) {
			return -1;
		}

		// Initialize pointers to buffers for scrypt computation:
// ctx_p->addrs.B0 = …

		// Recompute HMAC
		itp_res = itpHMACSHA256Message(
			(unsigned char *)ctx_p, sizeof(scrypt_ctx)-sizeof(ctx_p->hmac),
			hmac_key, sizeof(hmac_key),
			ctx_p->hmac, sizeof(ctx_p->hmac));
	}

Example 7: Second call of code protecting the Scrypt context

From [2] we know that iprot imposes some restrictions on input code for it to be obfuscatable. It demands no relocations and no indirect jumps. Coding constructs in C with global variables, system functions, and C standard function calls can generate relocations and indirect jumps. The code in Example 7 calls one C standard function memcmp, which causes code incompatibility with iprot. For this reason we implement some of our own C standard functions such as memcmp, memset, and memmove used by the utility. Also, all global variables in the dynamic library are transformed into local variables and take care of the data initialized on the stack.

In addition, we encountered a problem with obfuscation of code with double values that is not covered by tutorials and is not documented in the Intel Tamper Protection Toolkit user guide. As shown below, in pickparams function salsa20/8 the core operation limit has double type and equals 32768. This value is not initialized on the stack, and a compiler puts the value into a data segment of the binary that generates relocation in the code.

	double opslimit;
#if defined(WIN_TP)
	// unsigned char d_32768[] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xE0, 0x40};
	unsigned char d_32768[sizeof(double)];
	d_32768[0] = 0x00;
	d_32768[1] = 0x00;
	d_32768[2] = 0x00;
	d_32768[3] = 0x00;
	d_32768[4] = 0x00;
	d_32768[5] = 0x00;
	d_32768[6] = 0xE0;
	d_32768[7] = 0x40;
	double *var_32768_p = (double *) d_32768;
#endif

	/* Allow a minimum of 2^15 salsa20/8 cores. */
#if defined(WIN_TP)
	if (opslimit < *var_32768_p)
		opslimit = *var_32768_p;
#else
	if (opslimit < 32768)
		opslimit = 32768;
#endif

Example 8: Code for iprot-compatible double variable

We solved this problem by initializing a byte sequence on the stack with a hex dump that matches the hex representation of this double value in memory and creates a double pointer to this sequence.

To obfuscate the dynamic library with iprot, we use the following command:

iprot scrypt-dll.dll scryptenc_file scryptdec_file scrypt_ctx_enc_init scrypt_ctx_dec_init -c 512 -d 2600 -o scrypt_obf.dll

The interface of the protected utility is not changed. Let us compare the unobfuscated code with the obfuscated version. The following shows the disassembled code with significant difference between the two versions.

# non-obfuscated code
scrypt_ctx_enc_init PROC NEAR
        push    ebp                              ; 10030350 _ 55
        mov     ebp, esp                         ; 10030351 _ 8B. EC
        sub     esp, 100                         ; 10030353 _ 83. EC, 64
        mov     dword ptr [ebp-4H], 0  ; 10030356 _ C7. 45, FC, 00000000
        mov     eax, 1                           ; 1003035D _ B8, 00000001
        imul    ecx, eax, 0                      ; 10030362 _ 6B. C8, 00
        mov     byte ptr [ebp+ecx-1CH], 1 ; 10030365 _ C6. 44 0D, E4, 01
        mov     edx, 1                           ; 1003036A _ BA, 00000001
        shl     edx, 0                           ; 1003036F _ C1. E2, 00
        mov     byte ptr [ebp+edx-1CH], 2 ; 10030372 _ C6. 44 15, E4, 02
        mov     eax, 1                           ; 10030377 _ B8, 00000001
        shl     eax, 1                           ; 1003037C _ D1. E0
        mov     byte ptr [ebp+eax-1CH], 3 ; 1003037E _ C6. 44 05, E4, 03
        mov     ecx, 1                           ; 10030383 _ B9, 00000001
<…>

# obfuscated code with default parameters
scrypt_ctx_enc_init PROC NEAR
        mov     ebp, esp                     ; 1000100E _ 8B. EC
        sub     esp, 100                     ; 10001010 _ 83. EC, 64
        mov     dword ptr [ebp-4H], 0        ; 10001013 _ C7. 45, FC, 00000000
        mov     eax, 1                       ; 1000101A _ B8, 00000001
        imul    ecx, eax, 0                  ; 1000101F _ 6B. C8, 00
        mov     byte ptr [ebp+ecx-1CH], 1    ; 10001022 _ C6. 44 0D, E4, 01
        push    eax                          ; 10001027 _ 50
        pop     eax                          ; 1000102D _ 58
        lea     eax, [eax+3FFFD3H]           ; 1000102E _ 8D. 80, 003FFFD3
        mov     dword ptr [eax], 608469404   ; 10001034 _ C7. 00, 2444819C
        mov     dword ptr [eax+4H], -124000508 ; 1000103A _ C7. 40, 04, F89BE704
        mov     dword ptr [eax+8H], -443981569 ; 10001041 _ C7. 40, 08, E58960FF
        mov     dword ptr [eax+0CH], 1633409 ; 10001048 _ C7. 40, 0C, 0018EC81
        mov     dword ptr [eax+10H], -477560832 ; 1000104F _ C7. 40, 10, E3890000
<…>

Example 9: Disassembled codes for non-obfuscated and obfuscated versions

Obfuscation degrades performance, and dynamic library size is significantly increased. The obfuscator allows developers to balance security versus performance using cell size and mutation distance. The current obfuscation uses 512-byte cell size and 2600-byte mutation distance. A cell is instruction subsequence from original binary. A cell in obfuscated code is encrypted until the instruction pointer is about to enter it. A decrypted cell gets encrypted back when it is fully executed.

The source code for the utility that the Intel Tamper Protection Toolkit helps protect will soon be available at GitHub.

Acknowledgments

We thank Raghudeep Kannavara for originating the idea about to apply Intel Tamper Protection Toolkit to the scrypt encryption utility and Andrey Somsikov for many helpful discussions.

References

K. Grasman. getopt_port on GitHub https://github.com/kimgr/getopt_port/
R. Kazantsev, D. Katerinskiy, and L. Thaddeus. Understanding Intel® Tamper Protection Toolkit and Scrypt Encryption Utility, Intel Developer Zone, 2016.
C. Percival. The Scrypt Encryption Utility. http://www.tarsnap.com/scrypt/scrypt-1.1.6.tgz
C. Percival and S. Josefsson (2012-09-17). The Scrypt Password-Based Key Derivation Function. IETF.
W. Shawn. Freebsd sources on GitHub https://github.com/lattera/freebsd

About the Authors

Roman Kazantsev works in the Software & Services Group at Intel Corporation. Roman has 7+ year of professional experience in software engineering. His professional interests are focused on cryptography, software security, and computer science. Currently he occupies a position of Software Engineer where his ongoing mission is to deliver cryptographic solutions and expertise for content protection across all Intel platforms. He received his Bachelor and Masters in Computer Science with honors at Nizhny Novgorod State University, Russia.

Denis Katerinskiy works in the Software & Service Group at Intel Corporation. He has 2 years of experience in software development. His main interests are programming, performance optimization, algorithm development, mathematics, and cryptography. In his current role as a Software Developer Engineer Denis develops software simulators for Intel architecture. Denis Katerinskiy currently pursues Bachelor in Computer Science at Tomsk State University.

Thaddeus Letnes works in the Software & Services Group at Intel Corporation. He has 15+ year of professional experience in software development. His main interests are low level systems, languages, and engineering practices. In his current role as a Software Engineer developing software development tools Thaddeus works closely with software developers, architects, and project managers to produce high quality development tools. Thaddeus holds a Bachelor’s degree in Computer Science from Knox College.