Optimizations that You Have to Wring from WebAssembly

Shao Voon Wong

Aug 11, 2020

CPOL

Optimizations that you have to make to wring the performance out of WebAssembly

Introduction

First of all, a disclaimer: I am not a WebAssembly expert. All five tips in this article come from my C++ OpenGL slideshow application. I have to admit I have no working knowledge of Blazor, a framework that runs C# and .NET on WebAssembly, and I am not sure whether the tips are relevant to it; most likely they do not apply. Let us get started.

Inline Shader Code with C++11 string Literals

In a typical OpenGL project, shader code is stored in files separate from the C++ source code. For the uninitiated, the WebGL 1.0 standard is based on OpenGL ES 2.0, and the two are similar enough that OpenGL ES 2.0 calls translate to WebGL 1.0 calls almost one to one. OpenGL ES 2.0 and its corresponding WebGL 1.0 have two types of shaders: the vertex shader and the fragment shader. In Direct3D terminology, the fragment shader's counterpart is the pixel shader. That name is not exactly right, because the shader operates on fragments, the candidate pixels produced by rasterization, rather than on final screen pixels, but most people still prefer it to fragment shader. Let me show you a simple vertex shader followed by the fragment shader.

uniform mat4 WorldViewProjection;
attribute vec3 a_position;
attribute vec2 a_texCoord;
varying vec2 v_texCoord;
void main()
{
    gl_Position = WorldViewProjection * vec4(a_position, 1.0);
    v_texCoord = a_texCoord;
}

This is a simple fragment shader with a float uniform called s_alpha that controls the transparency of the texel.

varying vec2 v_texCoord;
uniform sampler2D s_texture;
uniform float s_alpha;
void main()
{
    vec4 color = texture2D( s_texture, v_texCoord );
    color.a = color.a * s_alpha;
    gl_FragColor = color;
}

Here are the same shaders stored in the vert_shader and frag_shader variables as classic C++ string literals. Notice that every line is enclosed in quotes and ends with a newline escape!

const char* vert_shader = 
"uniform mat4 WorldViewProjection;                                     \n"
"attribute vec3 a_position;                                            \n"
"attribute vec2 a_texCoord;                                            \n"
"varying vec2 v_texCoord;                                              \n"
"void main()                                                           \n"
"{                                                                     \n"
"    gl_Position = WorldViewProjection * vec4(a_position, 1.0);        \n"
"    v_texCoord = a_texCoord;                                          \n"
"}                                                                     \n";

const char* frag_shader = 
"varying vec2 v_texCoord;                                              \n"
"uniform sampler2D s_texture;                                          \n"
"uniform float s_alpha;                                                \n"
"void main()                                                           \n"
"{                                                                     \n"
"   vec4 color = texture2D( s_texture, v_texCoord );                   \n"
"   color.a = color.a * s_alpha;                                       \n"
"   gl_FragColor = color;                                              \n"
"}                                                                     \n";

Here are the same two shaders stored in the vert_shader and frag_shader variables, this time as C++11 raw string literals. You can see the code is cleaner. With the shader code inlined, the application does not have to download and handle the shader files separately: you save two downloads for every OpenGL object, which can add up. In my application, I am saving 50 downloads. Why inline the shader into the C++ code? More often than not, when you change the shader code, you also have to modify the C++ code that interacts with it. The only downside I can see is that when there are many lines of shader code and a compilation error occurs, the developer may have a hard time working out which line is the offending one.

const char* vert_shader = 
R"vert(uniform mat4 WorldViewProjection;
attribute vec3 a_position;
attribute vec2 a_texCoord;
varying vec2 v_texCoord;
void main()
{
    gl_Position = WorldViewProjection * vec4(a_position, 1.0);
    v_texCoord = a_texCoord;
}
)vert";

const char* frag_shader = 
R"frag(varying vec2 v_texCoord;
uniform sampler2D s_texture;
uniform float s_alpha;
void main()
{
    vec4 color = texture2D( s_texture, v_texCoord );
    color.a = color.a * s_alpha;
    gl_FragColor = color;
}
)frag";
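
One way to soften the line-number downside mentioned above is to log the inlined source with line numbers next to the driver's compile log. The following is a minimal sketch; number_lines is a hypothetical helper, not part of the original application, and it deliberately avoids the stream headers.

```cpp
#include <string>

// Hypothetical helper (not from the article's application): returns a copy
// of the shader source with "N: " prefixed to every line, so it can be
// logged next to the driver's compile log.
std::string number_lines(const char* source)
{
    std::string out;
    int line = 1;
    bool at_line_start = true;
    for (const char* p = source; *p != '\0'; ++p)
    {
        if (at_line_start)
        {
            out += std::to_string(line++);
            out += ": ";
            at_line_start = false;
        }
        out += *p;
        if (*p == '\n')
            at_line_start = true;
    }
    return out;
}
```

A typical use would be to print number_lines(vert_shader).c_str() with printf when glGetShaderiv reports a failed GL_COMPILE_STATUS, alongside the text from glGetShaderInfoLog. How drivers count lines in their logs can vary, so treat the numbers as a guide.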

Run on GPU Whenever Possible

Whenever possible, write code to run on the GPU instead of the CPU. My animated sinewave is calculated on the GPU. At the time of writing, asm.js was strictly single-threaded, whereas even low-end GPUs have many simple threads to spread the floating-point calculations across. Consider the number of bytes generated each second: when the sinewave movement is calculated on the CPU, a total of 1,152,000 bytes has to be sent to the GPU every second.

1 float = 4 bytes
1 vertex = 3 floats = 12 bytes
1 quad = 4 vertices = 12 floats = 48 bytes
1 quad = 48 bytes / 2 = 24 bytes
800 quads = 800 * 24 = 19,200 bytes
60 frames per second = 60 * 19,200 = 1,152,000 bytes

Video Link of Sinewave

Compression

Compress your assets and wasm file with gzip. Do not compress files that are already compressed, such as JPEG and PNG images. Images can be concatenated into one big file and loaded into memory, provided the image library supports loading from memory at the correct offsets.
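
The idea of one big image file with offsets can be sketched as follows. AssetEntry, AssetPack, append_asset and asset_data are hypothetical names chosen for illustration; the layout is an assumption, not the format my application uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout: one entry per image inside the combined blob.
struct AssetEntry
{
    std::size_t offset; // where the image starts inside the blob
    std::size_t size;   // image length in bytes
};

struct AssetPack
{
    std::vector<std::uint8_t> blob;    // all images, back to back
    std::vector<AssetEntry>   entries; // offset table
};

// Appends one image's bytes to the pack and returns its index.
std::size_t append_asset(AssetPack& pack, const std::uint8_t* data,
                         std::size_t size)
{
    pack.entries.push_back({ pack.blob.size(), size });
    pack.blob.insert(pack.blob.end(), data, data + size);
    return pack.entries.size() - 1;
}

// Returns a pointer to the image's first byte. Pass it, together with
// entries[index].size, to the image library's load-from-memory function.
const std::uint8_t* asset_data(const AssetPack& pack, std::size_t index)
{
    return pack.blob.data() + pack.entries[index].offset;
}
```

The pointer and size pair can then be handed to a decoder that reads from memory; stb_image's stbi_load_from_memory is one example of such a function. In a real pack, the offset table would be stored alongside the blob so that a single download carries both.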

In-situ Decryption

To protect the assets from being stolen, use an encryption algorithm that can decrypt files quickly and in place, instead of decrypting into a new destination file. Note that simple encryption, though fast, only keeps the casual end user at bay. It does not stop determined hackers from reverse engineering your code to find the encryption key and steal your assets.
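
As an illustration of what "in place" means, here is a toy XOR cipher that overwrites the buffer it is given. It is not secure and it is not the algorithm my application uses; xor_in_place is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Toy XOR cipher for illustration only: it is NOT secure and is not the
// algorithm used by the article's application. Applying it twice with the
// same key restores the original bytes.
void xor_in_place(std::uint8_t* data, std::size_t size,
                  const std::uint8_t* key, std::size_t key_size)
{
    if (key_size == 0)
        return; // nothing to do without a key
    for (std::size_t i = 0; i < size; ++i)
        data[i] ^= key[i % key_size]; // overwrite each byte, no second buffer
}
```

Because the function writes over its input, the downloaded asset can be decrypted inside the buffer it arrived in, with no second allocation or copy.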

Not Using STL Streams

Do not include headers like iostream and sstream. They bloat your executable size by 100KB in wasm and 400KB in asm.js. Replace calls to std::cout and std::ostringstream with printf and sprintf respectively.
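
Here is a small sketch of formatting a string without std::ostringstream. describe_image is a hypothetical helper, and it uses snprintf rather than plain sprintf so that the output can never overrun the buffer.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds "<name>: <width>x<height>" without
// std::ostringstream, using snprintf into a fixed stack buffer.
std::string describe_image(const char* name, int width, int height)
{
    char buffer[128];
    // snprintf never writes past the buffer; longer output is truncated.
    std::snprintf(buffer, sizeof(buffer), "%s: %dx%d", name, width, height);
    return std::string(buffer);
}
```

In place of std::cout, the result can be printed with std::printf("%s\n", describe_image("photo.jpg", 800, 600).c_str()).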

History

11th August, 2020: Initial version