
Introduction to Web Assembly with C/C++

8 Nov 2019 (CPOL)
An introduction to web assembly using the C/C++ language, part 1. In this part, I introduce web assembly, walk you through setting up the development tools, and go through a couple of introductory programs.


I've been taking advantage of Web Assembly lately. It is supported by all the major browsers, allows one to reuse useful code that was written for other environments, and provides some performance benefits over JavaScript. Web Assembly has a lot of potential and support, and I'd like to introduce other developers to it. I'm going to be using C++ in this post, but it is by no means the only language that can be used with Web Assembly. In this post, I talk about why someone might want to consider Web Assembly and how to set up a development environment.

What is Web Assembly?

Web Assembly is a specification for a stack-based virtual machine that runs in the browser. Compared to the highly dynamic JavaScript, Web Assembly can achieve much higher performance. Contrary to a popular misconception, though, Web Assembly doesn't completely replace JavaScript. You will probably use the two together. Web Assembly is not tied to a single language. Toolchains such as Emscripten are built on LLVM (originally short for Low Level Virtual Machine), a compiler infrastructure with an intermediate representation that compilers can target. If someone wanted to make a new programming language, they could have the compiler for their language produce LLVM code and then use an already existing tool chain to compile it to platform-specific code. A person building a compiler for a new language wouldn't need to make completely separate systems for different CPU architectures. Because LLVM can emit Web Assembly, code written in a variety of languages can run on it. At the time of writing, there isn't support for garbage collection, which restricts the languages that can target it. C/C++, C#, and Rust are a few languages that can be used with Web Assembly today, with more expected in the future.

What Other Languages Can I Use?

  • C/C++ - I'll be using that language in this article
  • C#/.NET - I've got interest in this one and will write about it in the future.
  • Elixir
  • Go
  • Java
  • Python
  • Rust - This is a newer language

Why Use Web Assembly?

I suggest Web Assembly primarily for the performance benefits in computationally expensive operations. The binary format it uses is much more strict than JavaScript and it is more suitable for computationally intensive operations. There is also a lot of existing and tested code for work such as cryptography or video decoders that exist in C/C++ that one might want to use in a page. Despite all its flexibility, interpreted JavaScript code doesn't run as fast as a native binary. For some types of applications, this difference in performance isn't important (such as in a word processor). For other applications, differences in performance translate into differences in experiences.

While the demand for performance is a motivation to make a native binary, there are also security considerations. Native binaries may have access to more system resources than a web implemented solution. There may be more concern with ensuring that a program (especially if it is from a third party) doesn't do anything malicious or access resources without permission. Web Assembly helps bridge the gap between these two needs; it provides a higher performance execution environment within a sandbox.


C++? Can't I Cause a Buffer Overflow With That?

Sure. But only within the confines of the sandbox in which the code will run. It could crash your program, but it can't cause arbitrary execution of code outside the sandbox. Also note that presently Web Assembly doesn't have any bindings to host APIs. When you target Web Assembly, you don't have an environment that allows you to bypass the security restrictions under which JavaScript code runs. There's no direct access to the file system and no access to memory outside of your program, and you are still restricted to communicating through WebSockets and HTTP requests that don't violate CORS restrictions.

How Do I Set Up a Developer Environment?

There are different versions of instructions on the Internet for installing the Web Assembly tools. If you are running Windows 10, you may come across a set of instructions that start by telling you to install the Windows Subsystem for Linux. Don't use those instructions; I personally think they are unnecessarily complex. While I have the Windows Subsystem for Linux installed and running for other purposes, that's not where I like to compile my Web Assembly code.

Using your operating system of choice (Windows 10/8/7, macOS, Linux) clone the Emscripten git repository, run a few scripts from it, and you are ready to go. Here are the commands to use. If you are on Windows, omit the ./ at the beginning of the commands.

git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
git pull
./emsdk install latest
./emsdk activate latest

With the tools installed, you will also want to set some environment variables. There is a script for doing this. On Windows 10, run:

emsdk_env.bat

For the other operating systems, run:

source emsdk_env.sh

The changes that this script makes to the environment variables aren't persistent; it will need to be run again in each new terminal session (and therefore after a reboot). For an editor, I suggest using Visual Studio Code. I'll be compiling from the command line in this article. Feel free to use the editor of your choice.

Web Assembly Explorer

I don't use this tool within this article, but Web Assembly Explorer is available as an online tool for compiling C++ into Web Assembly and is an option if you don't have the tools installed. https://mbebenita.github.io/WasmExplorer/

Hello World

Now that we have the tools installed, we can compile and run something. We will do a hello world program. Type the following source code and save it in hello.cpp.

#include <stdio.h>

int main(int argc, char** argv)
{
    // printf output is routed to the page by the Emscripten runtime
    printf("Hello World!\n");
    return 0;
}

To compile the code from the command line, type the following:

emcc hello.cpp -o hello.html

After the compiler runs, you will have three new files:

  • hello.wasm - the compiled version of your program
  • hello.html - an HTML page for hosting your web assembly
  • hello.js - JavaScript for loading your web assembly into the page
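
Because hello.cpp is ordinary C++, the same source can also be built with a native compiler, which is handy for testing logic outside the browser. The following is a minimal sketch, assuming only that emcc defines the __EMSCRIPTEN__ preprocessor macro. The names buildTarget and printGreeting are hypothetical helpers used for illustration; they are not part of the article's code.

```cpp
#include <cstdio>

// Reports which toolchain compiled this translation unit.
// emcc defines __EMSCRIPTEN__; a native compiler does not.
// buildTarget is a hypothetical helper name used only for illustration.
const char* buildTarget()
{
#ifdef __EMSCRIPTEN__
    return "WebAssembly (Emscripten)";
#else
    return "native";
#endif
}

// Prints the greeting along with the detected build target.
void printGreeting()
{
    std::printf("Hello World from a %s build!\n", buildTarget());
}
```

Calling printGreeting() from main gives a quick confirmation of which compiler produced the binary that is running.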

If you try to open the HTML file directly, your code probably will not run. Instead, the page will have to be served through an HTTP server. If you have node installed, use the node http-server. You can install the http-server with:

npm install  http-server -g

Then, start the server from the directory with your hello.html:

http-server . -p 81

Here, I've instructed the http-server to run on port 81. You can use the port of your choice here provided nothing else is using it. Remember to substitute the port that you chose throughout the rest of these instructions.

Open a browser and navigate to http://localhost:81/hello.html. You'll see your code run. If you view the source for the page, there is a lot of "noise" in the file. Much of that noise is from the displayed images being embedded within the HTML. That's fine for playing around. But you will want to have something customized to your own needs.

We can provide a shell or template file for the compiler to use. Emscripten has a minimal file available at https://github.com/emscripten-core/emscripten/blob/master/src/shell_minimal.html. Download that file. It will be used as our starting point. It is convenient for the sake of distribution for everything to be in one file. But I don't like the CSS and JavaScript being embedded within the file. The CSS here isn't needed and is being deleted. I'm moving the JavaScript to its own file and adding a script reference to it in my HTML. There are several items within the HTML and the script that are not necessarily needed. Let's look at the script first and start making this minimal file even more minimalist.

At the top of the script, there are three variables that refer to page elements used to indicate download status and progress. Those are not absolutely necessary. I'm deleting them. I need to delete references to them too. Lower in the JavaScript is a method named setStatus. I'm replacing its body with a call to console.log() to print the text that is passed to it. The first set of programs that I'm going to write won't use a canvas. The element isn't needed for now; I'm commenting it out instead of deleting it so that I can use it later. Having deleted the first three lines of this file and the code that references them, I'm returning to the HTML. Most of it is being deleted. I've commented out the canvas reference. There is a line in the HTML file with the text {{{ SCRIPT }}}. The compiler will take this file as a template and replace {{{ SCRIPT }}} with the reference to the script specific to our Web Assembly file.

<!doctype html>
<html lang="en-us">
  <head>
    <meta charset="utf-8">
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Emscripten-Generated Code</title>
    <link rel="stylesheet" href="./styles/emscripten.css" />
    <script src="scripts/emscripten.js"></script>
  </head>
  <body>
    <!--
    <div class="emscripten_border">
      <canvas class="emscripten" id="canvas"
      oncontextmenu="event.preventDefault()"></canvas>
    </div>
    -->
    <textarea class="emscripten" id="output" rows="8"></textarea>
    {{{ SCRIPT }}}
  </body>
</html>

When the Web Assembly program executes a printf(), the text will be written to the textarea element. I place my hello.cpp file among these files and then compile it with the following command.

emcc hello.cpp --shell-file shell_minimal.html -o hello.html

The --shell-file argument indicates what file to use as a template. The -o parameter gives the name of the HTML file to write to. If you look at hello.html, you can see it is almost identical to the input template. Run the site now and you'll see the same result, but with a much cleaner interface.

Binding Functions

I mentioned earlier that Web Assembly doesn't have any bindings to any operating system functions. It also doesn't have bindings to the browser. Nor does it have access to the DOM. It is up to the page that loads the Web Assembly to expose functions to it. In emscripten.js, the Module object defines a number of functions that are going to be made available to the Web Assembly. When the C/C++ code calls printf, it will be passed through the JavaScript function defined here of the same name. It isn't a requirement that the names be the same, but it is easier to keep track of function associations if they are.

Calling C/C++ From JavaScript

But what if you have your own functions that you wish to bind so that your JavaScript code can call the C++ code? The Module object has a function named ccall that can be used to call C/C++ code from JavaScript and another function named cwrap to build a function object that we can hold onto for repeated calls to the same function. To use these functions, some additional compile flags will be needed.

To demonstrate the use of both of these methods of calling C/C++ code from JavaScript, I'm going to declare three new functions in the C++ code.

  • void testCall() - accepts no parameters and returns no value. This method only prints a string so that we know that our call to it was successful.
  • void printNumber(int num) - accepts an integer argument and prints it. This lets us know that our value was successfully passed.
  • int square(int c) - accepts an integer and returns the square of that integer. This lets us see that a value can be returned back from the code.

The C++ language performs what is called name mangling; the names of the functions in the compiled code are different from the names in the source code. For the functions that we want to use from outside the C++ code, we need to wrap the declarations for the functions in an extern "C" block. If our code were being written in C instead of C++, this wouldn't be necessary. I still prefer C++ because of some of the features that the language offers. Normally, I would have a declaration such as this in a header file. But for now, my C++ program is in a single file. Close to the top of the program, I make the following declarations:

extern "C" {
    void testCall();
    void printNumber(int f);
    int square(int c);
}

The implementation for the functions is what you would expect.

void testCall() 
{
    printf("function was called!\n");
}

void printNumber(int f) {
    printf("Printing the number %d\n", f);
}

int square(int c)
{
    return c*c;
}
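
The declarations and definitions above can also be combined into one extern "C" block. The sketch below does that and additionally marks each function with EMSCRIPTEN_KEEPALIVE, a macro from emscripten.h that keeps a function from being removed as dead code and exports it. Note that this is a different technique from the command-line export list that this article uses; it is shown only as an alternative. WASM_EXPORT is a hypothetical macro name introduced here so that the file also builds with a native compiler.

```cpp
#include <cstdio>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
// EMSCRIPTEN_KEEPALIVE keeps the function alive and exported in the Wasm build.
#define WASM_EXPORT EMSCRIPTEN_KEEPALIVE
#else
// Native builds need no annotation. WASM_EXPORT is a hypothetical name.
#define WASM_EXPORT
#endif

// extern "C" disables C++ name mangling for everything inside the block.
extern "C" {

WASM_EXPORT void testCall()
{
    std::printf("function was called!\n");
}

WASM_EXPORT void printNumber(int f)
{
    std::printf("Printing the number %d\n", f);
}

WASM_EXPORT int square(int c)
{
    return c * c;
}

}
```

Whichever export mechanism is used, the functions are still ordinary C++ and can be exercised natively before they are called from a page.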

There's a change to my main method too. I've had to include a new header file, emscripten.h, because I am about to use one of the macros that it provides. In main, I added the following line.

EM_ASM ( InitWrappers());

It will result in a JavaScript function named InitWrappers() being called. I will talk about how EM_ASM works in a following section. I'm adding a third <script /> element to my HTML file. The first element contains code that was provided by Emscripten. The second is the one that is inserted where {{{ SCRIPT }}} exists within the template. The third script tag follows and references the JavaScript that contains the InitWrappers function.

var testCall;
var printNumber;
var square;

function InitWrappers() {
  testCall = Module.cwrap('testCall', 'undefined');
  printNumber = Module.cwrap('printNumber', 'undefined', ['number']);
  square= Module.cwrap('square', 'number', ['number']);      
}

I've declared three variables that will be used to hold the function objects. They are populated by the return values of the cwrap calls. In the first cwrap call, the arguments are the name of the C/C++ function to call and the return type. This function doesn't return a value, which is why its return type is set to 'undefined' (the Emscripten documentation uses the JavaScript value null for this case). In the second call, an additional argument is passed: a list with the type of each of the function's arguments. This function only takes one argument, so the list has only one element. In the third call, the argument for the return type is set to 'number' since this function returns a numerical value. To call the functions, I'm adding some JavaScript to the onclick events.
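
Besides 'number', cwrap also accepts 'string' as an argument type, in which case the runtime copies the JavaScript string into a temporary C string for the duration of the call. The following is a minimal sketch of a C++ function suited to that; countVowels is a hypothetical function that is not part of the article's code. Assuming it were added to the export list as '_countVowels', it could be wrapped with Module.cwrap('countVowels', 'number', ['string']).

```cpp
#include <cassert>
#include <cctype>

extern "C" {

// Counts the vowels in a NUL-terminated string.
// countVowels is a hypothetical example function.
int countVowels(const char* text)
{
    assert(text != nullptr);
    int count = 0;
    for (const char* p = text; *p != '\0'; ++p) {
        // Cast to unsigned char before calling tolower to avoid undefined behavior.
        switch (std::tolower(static_cast<unsigned char>(*p))) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                ++count;
                break;
            default:
                break;
        }
    }
    return count;
}

}
```

The function only reads the string during the call and does not keep the pointer, which matters because the temporary copy is not guaranteed to outlive the call.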

The compile statement is different for this code. A few of these changes are optional. But I will explain all of them.

emcc hello.cpp --std=c++11 --shell-file shell_minimal.html --emrun -o hello.html -s NO_EXIT_RUNTIME=1 -s EXPORTED_FUNCTIONS="['_testCall', '_printNumber','_square','_main']" -s EXTRA_EXPORTED_RUNTIME_METHODS="['cwrap','ccall']" -s WASM=1

  • --std=c++11 - I'm using this argument from here on to enable C++11 language features
  • --shell-file shell_minimal.html - the name of the shell HTML file to use
  • --emrun - makes the generated output compatible with the emrun command-line tool, which can capture the program's output when it runs the page
  • -o hello.html - the name of the output HTML file to produce
  • -s NO_EXIT_RUNTIME=1 - prevents the runtime from shutting down when the main function exits
  • -s EXPORTED_FUNCTIONS="['_testCall', '_printNumber', '_square', '_main']" - the names of the functions from our code that will be added to the Module object. Each name is the C function name prefixed with an underscore.
  • -s EXTRA_EXPORTED_RUNTIME_METHODS="['cwrap','ccall']" - the names of the runtime methods that will be added to the Module object
  • -s WASM=1 - emit Web Assembly. Setting this to 0 will cause asm.js to be emitted instead (something not discussed here).
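
One practical consequence of keeping the runtime alive is that global objects are not torn down when main returns, so state held in the module remains valid across later calls from JavaScript. The following is a minimal sketch of that idea. The names g_history, recordValue, and sumOfRecorded are hypothetical; to call them from a page they would also need to be added to the export list.

```cpp
#include <vector>

// Module-level state. It stays valid for as long as the runtime is alive.
// All names in this block are hypothetical examples.
static std::vector<int> g_history;

extern "C" {

// Stores a value and returns how many values have been recorded so far.
int recordValue(int value)
{
    g_history.push_back(value);
    return static_cast<int>(g_history.size());
}

// Returns the sum of every value recorded so far.
int sumOfRecorded()
{
    int total = 0;
    for (int v : g_history) {
        total += v;
    }
    return total;
}

}
```

Each call builds on the state left behind by the previous one, which is the behavior a page expects when it invokes wrapped functions from event handlers long after main has finished.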

Calling JavaScript from C/C++

We've already been calling JavaScript from C/C++ implicitly. But let's look at how to explicitly call JavaScript from C/C++. There are two ways of doing this: you can embed JavaScript code directly within your C/C++ code, or you can use the function emscripten_run_script(). If you've ever embedded assembly language in C++ code, then the first of these two methods will not look completely foreign to you.

If there is a block of JavaScript code that you want to repeatedly use within your C++ code, you can write a function in JavaScript using EM_JS.

#include <emscripten.h>

// Defines a C-callable function named myAlert whose body is JavaScript.
EM_JS(void, myAlert, (), {
    alert('hey, I am alerting you!');
    console.log('you have been alerted.');
});

int main() {
    myAlert();
    return 0;
}

A new function named myAlert() is made available because of this call. If JavaScript code is being defined to only be used once, it can be written inline using EM_ASM:

#include <emscripten.h>

int main() {
    // The JavaScript inside EM_ASM runs once, at this point in the program.
    EM_ASM(
        alert('hey, I am alerting you!');
        console.log('you have been alerted.');
    );
    return 0;
}

I would advise against embedding a lot of code within your C/C++. It may be better at most to embed a JavaScript function call; if code needs to be updated, it will be easier to update the JavaScript function than to make the change in the C/C++ code and rebuild.
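
EM_JS functions can also take typed parameters, which keeps the embedded JavaScript down to a small, reusable bridge. The following is a minimal sketch under that assumption. The names logSquare and squareAndLog are hypothetical, and the native fallback exists only so that the same file can be compiled and tested without Emscripten.

```cpp
#ifdef __EMSCRIPTEN__
#include <emscripten.h>

// Under Emscripten, logSquare is implemented in JavaScript.
// The parameter names are visible inside the JavaScript body.
EM_JS(void, logSquare, (int value, int squared), {
    console.log('The square of ' + value + ' is ' + squared);
});
#else
#include <cstdio>

// Native fallback with the same signature (hypothetical, for testing only).
void logSquare(int value, int squared)
{
    std::printf("The square of %d is %d\n", value, squared);
}
#endif

// Computes the square in C++ and reports it through logSquare.
int squareAndLog(int value)
{
    const int squared = value * value;
    logSquare(value, squared);
    return squared;
}
```

The computation stays in C++ and only the reporting crosses into JavaScript, in keeping with the advice above to embed as little script as possible.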

Sun Position in C++

I wanted to show an example that was doing something non-trivial before closing Part 1 of this article. I've got an interest in astronomical calculations. I've decided to take a C++ routine for calculating the sun position and use it in a web page. After a quick Google search, I found this:

I've got to make some changes to use it, but not a lot. The original routine gathered input directly in main. I don't need to do much of anything in main. I also don't want to use the cin object; it results in the input dialog displaying. Instead, I want the parameters to be passed in via a routine. I will leave the cout calls in place; they

The main function will only initialize the wrappers in my modification of the code. I've made a new main function that calls the JavaScript function to perform initialization.

int main(void){
    EM_ASM ( InitWrappers());
    return 0;
}

What had been the main function is being renamed to getSunInformation. I'm passing in the latitude, longitude, and time zone information and am deleting the previous usage of cin to prompt the user for this information.

void getSunInformation(double latit, double longit, double tzone);

I also need to get information out of this call. While there is more than one way to do this, I'm going to take an easy option for now; I'll have the C++ code call JavaScript code, passing the values as parameters. I can use EM_ASM to do this. In the earlier use of this macro, I was only invoking functions. Now I need to pass data. The JavaScript declared within EM_ASM is in a different scope than the C++. It has no visibility of the variables within the C++ code. Any information that we want passed to the JavaScript can be passed as parameters after the JavaScript code. This information is available in the JavaScript through variables that start with a dollar sign followed by a number for the parameter's position. The first parameter is $0, the next $1, the third $2, and so on.

// The values after the JavaScript are numbered in order: year is $0 and altit is $12.
EM_ASM ({
    sunParameters($0, $1, $2, $3, $4, $5, $6, $7, $8);
    sunNoonParams($9, $10);
    sunCurrentPosition($11, $12);
    }, year, m, day, jd, latit, longit, tzone, delta*degs, daylen,
    noont, altmax,
    azim*degs, altit
);

I am using three functions we haven't seen yet. The functions sunParameters, sunNoonParams, and sunCurrentPosition haven't been defined yet. I made a new JavaScript file that will contain these. The Emscripten-generated JavaScript file is named azimalt.js. My JavaScript file will be appended to this one; I've named it azimaltPost.js. In this file, I define the InitWrappers function and the three sun functions that were mentioned earlier. For now, the sun functions will write the parameters that they receive to the console. The first two values passed to getSunInformation are a latitude and longitude for the Atlanta, Georgia, USA area; the third is the time zone offset. If you run the code yourself, you may want to change these.

var getSunInformation;

function InitWrappers() { 
    getSunInformation = Module.cwrap('getSunInformation', 
                        'undefined', ['number','number','number']);
    getSunInformation(34, -84, -5); // latitude, longitude, time zone offset
}

function sunParameters(year, month, day, julianDate, latitude, longitude, 
                       timeZone, delta, dayLength) {
    console.log(`${year}-${month}-${day}`);
    console.log(`Julian Date:${julianDate}`);
    console.log(`Latitude:${latitude}, Longitude:${longitude}`);
    console.log(`Time Zone: ${timeZone}`);
    console.log(`Delta: ${delta}`);
    console.log(`Daylength: ${dayLength}`);
}

function sunNoonParams(noont,altmax) {
    console.log(`Noont: ${noont}, Altitude Max:${altmax}`);
}

function sunCurrentPosition(azim, alt) {
   console.log(`Azimuth: ${azim} Altitude:${alt}`);
}
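
To isolate the positional parameter mechanism that EM_ASM uses, here is a minimal sketch that forwards just two values to the sunCurrentPosition function defined above. The wrapper name reportSunPosition is hypothetical, and the native branch is included only so that the code can be compiled and tested without Emscripten.

```cpp
#include <cstdio>

#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

// Forwards two angles (in degrees) to the page and returns how many values were sent.
// reportSunPosition is a hypothetical wrapper name.
int reportSunPosition(double azimuthDeg, double altitudeDeg)
{
#ifdef __EMSCRIPTEN__
    // $0 receives azimuthDeg and $1 receives altitudeDeg, in the order listed.
    EM_ASM({
        sunCurrentPosition($0, $1);
    }, azimuthDeg, altitudeDeg);
#else
    // Native fallback: print what would have been passed to JavaScript.
    std::printf("Azimuth: %f Altitude: %f\n", azimuthDeg, altitudeDeg);
#endif
    return 2;
}
```

Keeping each EM_ASM call this small makes it easier to check that every $n index lines up with the argument it is meant to receive.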

I'm introducing a new compile parameter here, --post-js. This parameter names a JavaScript file whose content is to be run after the emscripten generated code. I'll be passing my JavaScript file as the value for this argument. The full command line that I'm using to compile this follows:

emcc azimalt.cpp -o azimalt.html --post-js azimaltPost.js -s NO_EXIT_RUNTIME=1 -s EXPORTED_FUNCTIONS="['_getSunInformation', '_main']" -s EXTRA_EXPORTED_RUNTIME_METHODS="['cwrap','ccall']" -s WASM=1

Open the HTML file (make sure it is being served from a web server) and look at the output console. You should see the information on sun rise and sun set for your area.

Customizing the Sunrise and Sunset Presentation

The program works, but let's add something graphical to it. I want to add a 24-hour analog clock that shows at a glance the sunrise, the sunset, and the current position of the sun with respect to the two. I also want a compass display to show the azimuth and a graphic to show the altitude angle. I'll use SVG for the graphics. Many of the elements can be defined declaratively. I came up with the following UI.

SunriseClock

Before integrating it with the Web Assembly, I added a few range sliders and JavaScript to make sure that the elements moved the way that I expected them to. Once it was working, I copied the WASM scripts into my new HTML file and it graphically showed me the sunrise and sunset times for today. If you would like to see it yourself, it is available at https://j2i.net/apps/sunrise/. When I tried hosting it on a web server, it initially failed because the server software didn't recognize the .wasm extension. If you encounter such a problem, register the MIME type for the extension. It is application/wasm.

Other Supported Libraries

By either running emcc --show-ports or looking at https://github.com/emscripten-ports, you can see some of the other libraries that the Emscripten compiler supports. At the time of writing, this is the output from running the command in a terminal.

c:\shares\sdks\emsdk>emcc --show-ports
Available ports:
Boost headers v1.70.0 (USE_BOOST_HEADERS=1; Boost license)
icu (USE_ICU=1; Unicode License)
zlib (USE_ZLIB=1; zlib license)
bzip2 (USE_BZIP2=1; BSD license)
libjpeg (USE_LIBJPEG=1; BSD license)
libpng (USE_LIBPNG=1; zlib license)
SDL2 (USE_SDL=2; zlib license)
SDL2_image (USE_SDL_IMAGE=2; zlib license)
SDL2_gfx (zlib license)
ogg (USE_OGG=1; zlib license)
vorbis (USE_VORBIS=1; zlib license)
SDL2_mixer (USE_SDL_MIXER=2; zlib license)
bullet (USE_BULLET=1; zlib license)
freetype (USE_FREETYPE=1; freetype license)
harfbuzz (USE_HARFBUZZ=1; MIT license)
SDL2_ttf (USE_SDL_TTF=2; zlib license)
SDL2_net (zlib license)
cocos2d
regal (USE_REGAL=1; Regal license)

c:\shares\sdks\emsdk>

When you use one of these libraries, the Emscripten compiler will retrieve the library, build it locally, and link it to your project.

What's in Part 2

I have some ideas about what to write about in part 2 of this series such as how to deal with complex data types and binding to classes. But I want to hear from you. What would you like to see? Share your ideas and questions in the comments; I'd like to respond to some of them in the next post in this series.

History

  • 8th November, 2019 - Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Joel Ivory Johnson
Software Developer
United States United States
I attended Southern Polytechnic State University and earned a Bachelors of Science in Computer Science and later returned to earn a Masters of Science in Software Engineering. I've largely developed solutions that are based on a mix of Microsoft technologies with open source technologies mixed in. I've got an interest in astronomy and you'll see that interest overflow into some of my code project articles from time to time.



Twitter:@j2inet

Instagram: j2inet

