Before you start
Although it is not required, I recommend that you first read the article I wrote about Workflow 4 persistence and tracking (WF4Extensions.aspx). It will let you compare self-hosting (discussed in that article) with AppFabric hosting (discussed in this article).
Also, in order to take full advantage of this article, you have to download and set up the attached projects on your machine (VS2010 is needed). The article content relies on you having the projects up and running.
Basic knowledge of Windows Workflow Foundation and Windows Communication Foundation is assumed. Also, basic familiarity with IIS 7 and WAS is required.
You should also be comfortable dealing with Visual Studio and setting up projects and solutions.
AppFabric installation and configuration is an easy wizard-process, and will not be covered here. The focus of this article is seeing AppFabric in action and discovering its bits and pieces in running examples.
All examples discussed in this article require VS2010 and Windows Server AppFabric.
If you have ever done composite web applications development, then you are definitely familiar with the associated problems. Consider this scenario: you want to build an ASP.NET application calling into a business process built using WF services (a WF business process published as a WCF Service). Now, let's check the challenges and requirements that you will usually face with the WF service:
- Hosting: in WCF, you have various hosting options. Granted, building a robust and reliable host is a non-trivial task, so you will usually rely on IIS and WAS. However, while WAS provides an enhanced hosting experience for your services, you still want more control than WAS can provide.
- Management: Managing your service via configuration files is common but error prone. The ability to manage a service in a UI on top of IIS and WAS is a very appealing requirement.
- Monitoring: Your WF service is a business process, and you would definitely like to be able to step in and monitor the progress and business flow of a certain instance.
- Tracking: Tracking instance variables at various stages of the business process gives you insight into the ongoing progress.
- Persistence: When designing long running processes, persistence becomes quite a common requirement. At certain stages of a business process, you want to persist (serialize and store in a medium) the state in SQL Server, for example. This way, you can unload the process - which is waiting for an input, for example - and reload it again from the same medium.
- Scaling: Persistence gives the ability to scale your process. An instance running on machine A can be persisted in SQL Server, only later to be reloaded again on machine B, and continue execution from the last persistence point.
- Caching: Caching in web applications has a common problem: scalability. A cache is held in the memory of a single machine. So if a load balancer bounces a user request from machine A to machine B, any data cached in memory on machine A is not available on machine B. So far, solving this scalability issue has been possible only via third-party solutions.
Windows Server AppFabric provides a set of Windows Server extensions that act as an infrastructure for building composite web applications. In more practical terms, these extensions help tackle all the above requirements.
The remainder of the article will show AppFabric in action and, through demos, will tackle and elaborate on each of these requirements.
Presenting the First Demo: Hello WF Service
The first demo (attached as HelloWFService.zip) includes three projects:
- HelloService: A simple WCF Service that exposes a single operation which accepts a string parameter and returns a string value.
- HelloWFService: This is the workflow service containing the business process. It is built using the WCF Workflow Service Application project template.
- TestClient: Console application that calls into the workflow service.
Let's examine the business process from the HelloWFService project:
The TestClient console application calls into the service operation SubmitOrder, passing a string parameter. SubmitOrder returns a message to the console application and its job ends there. The workflow service then continues execution and calls into the activity ProcessOrder, which calls the WCF service (HelloService), passing in the same parameter submitted by the console application.
The ProcessOrder activity is created automatically in the VS Toolbox when you add a service reference to the WCF service and build the project.
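To make the shape of HelloService concrete, here is a minimal sketch of a WCF service with a single string-in, string-out operation. The names `IHelloService` and `ProcessOrder` are assumptions (the operation name is guessed from the generated activity name), and the returned text is purely illustrative; the attached project may differ.

```csharp
using System.ServiceModel;

// Hypothetical contract: names are assumptions, not taken from the attached project.
[ServiceContract]
public interface IHelloService
{
    // A single operation that accepts a string and returns a string.
    [OperationContract]
    string ProcessOrder(string order);
}

public class HelloService : IHelloService
{
    public string ProcessOrder(string order)
    {
        // Illustrative body only; the real service may do something else.
        return "Processed order: " + order;
    }
}
```

Adding a service reference to a contract like this from a workflow project is what produces an activity named after the operation in the Toolbox.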
AppFabric First Look
When you deploy the above solution into your machine, you will end up with two WCF services hosted on IIS; the HelloService WCF Service and the HelloWFService WCF Workflow Service.
Open IIS manager (alternatively, with AppFabric installed, you can go to All Programs -> Windows Server AppFabric -> IIS Manager). You will notice a new AppFabric tab as follows:
The tab contains three icons:
- AppFabric Dashboard: the place to monitor all your service activities...you will get to know this place well
- Endpoints: a place to configure the WCF endpoints exposed in your services
- Services: a place to configure the service files of your WCF projects
If you click on "Default Web Site" as shown in the figure, then the data you see in the AppFabric tab will correspond to all the services hosted under the default website. If, however, you click on a single service, then the data corresponds only to that particular service.
Now, let's start configuring the services. Right click HelloWFService and select "Manage WCF and WF Services -> Configure".
Select the Monitoring tab. Here you specify the level of monitoring you want for your service and the database in which to store the monitoring events. In the image below, I have selected the database created in the AppFabric configuration wizard and the Health Monitoring level, which is enough for this example. Be careful in this step: the higher the monitoring level, the more overhead you add to your application. Select the Troubleshooting level during development and testing. Health Monitoring or even Errors Only should be enough for production environments.
In the Workflow Persistence tab, select Custom or None. We will cover persistence later.
No more configuration is required for the sake of this discussion; some other configuration options will be covered as we go on.
Now let's run the example: build the solution and fire the console application "TestClient", and wait until you get the message "Your order is under process". Let's quickly recap what has just happened: the console application called into the workflow service and got the message back; behind the scenes - something which is not visible to the console application client - the workflow service must have called the WCF Service.
Open the AppFabric dashboard and examine the stats:
So in total, we have two WCF calls and one WF Service call. The two WCF calls are actually one for the HelloService WCF Service and one for the HelloWFService workflow service. The WF service call is the HelloWFService (the WF Instance History is a subset of the WCF Call History).
AppFabric has tracked the service call activity for us. You can also see whether there are any errors, and compare completed with non-completed service calls. You can also drill into more details; right click the WF Service in the dashboard and select "Tracked WF Instances" as shown below:
In the result screen, you will see more details about the WF instance as shown below:
What you have seen so far is great reporting, but recall that we have configured our service for monitoring also (using the Health Monitoring level). To see that in action, right click "Service1" and select "View Tracked Events" (you can also access monitoring from the first dashboard page). You will now get the screen below:
As you can see, here you get to monitor the flow of the business process by examining the names of the shapes and their order of execution.
Persistence is a very important concept when building long running business processes. At certain points in your service, you want to persist (serialize and store into a medium - usually database) the instance state so that in case of failure, you can resume the instance from the last persistence point.
In this example, we will see persistence in action. Another variation of the same concept is "Unloading". While Persistence "alone" means persist but keep the instance in memory, Unload means persist and remove the instance from memory. This, of course, opens the possibility of scaling your service because an instance can be flushed out of memory on one machine only to pick up execution on another machine. Unloading will be covered in a later example.
So in order to see persistence in action, go back to Visual Studio, open "Service1.xamlx", and make this change: check the "PersistBeforeSend" property of the "SendResponse" shape. This simply means that just before sending the response (to the console application), the instance state will be persisted; however, as just explained, the instance itself will keep executing.
The other change is in IIS: select the HelloService application and click "Stop Application" in the "Manage WCF and WF Services" section.
Finally, we need to configure our WF Service for persistence. From IIS, right click "HelloWFService" and select "Manage WCF and WF Services -> Configure". From the Persistence tab, select the default persistence database which you have configured using the AppFabric Configuration Wizard, as shown below:
Hint: when you are in the development/testing phase, you might find that stats piling up in AppFabric make it difficult to focus on a certain scenario. If you need to (as I always do in development), you can clean up the AppFabric databases in order to start fresh. This post (http://thedotnethub.blogspot.com/2010/05/clean-appfabric-databases.html) on my blog shows how to do so.
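For readers who want to see what selecting a persistence store amounts to in configuration terms, the sketch below uses the standard WF 4 `sqlWorkflowInstanceStore` service behavior. The connection string name `DefaultPersistenceStore` is a placeholder, and the exact markup AppFabric writes to your web.config may differ from this.

```xml
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <behavior>
        <!-- "DefaultPersistenceStore" is a placeholder; use the name of the
             connection string that points to your persistence database. -->
        <sqlWorkflowInstanceStore connectionStringName="DefaultPersistenceStore" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```

With a store configured, the "PersistBeforeSend" setting on the workflow decides when state is actually written to that database.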
With everything set, build the solution and run the console application again.
What will happen in this case? The console client calls the WF Service which, just before sending a response, persists the state and continues execution. Next, the WF Service tries to call the WCF Service, which has been stopped, so an error should be thrown. Now, let's examine the AppFabric dashboard and see what's going on:
The failed call to the WCF Service is logged in the Failures section of the WF Instance History, and is set in the "Not Recovered" state. The other important thing to notice is the Persisted WF Instances section; the WF Service instance is persisted. Right click the suspended instance and select "Persisted WF Instances", as shown below:
Next, you will get to see the persisted WF instance in the "Suspended" state. The great thing about this is that you can right click on this instance and select "Resume", as shown below:
However, just before doing that, restart the WCF application. From IIS, select the HelloService application and click "Start Application".
Now resume the WF instance. Wait for a couple of seconds (until the Windows Service kicks in) and refresh to see that the instance has resumed execution and finished successfully. Since the last persistence point (the only one, actually) was just before sending the response to the console application, after which the call to the WCF service failed, the instance picks up from there and tries to call the WCF service again. Since we have just restarted the WCF application, this time the call succeeds and the WF instance completes successfully. If you look at the WF Instance History section, you will see that the instance has moved from the "Not Recovered" to the "Recovered" state.
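Whether a failed instance ends up suspended (and therefore resumable) or is simply terminated is governed in WF 4 by the `workflowUnhandledException` service behavior. The fragment below is a sketch of that setting, on the assumption that the demo relies on the `AbandonAndSuspend` action; the article's projects may get this from defaults rather than from explicit configuration.

```xml
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <behavior>
        <!-- On an unhandled exception, abandon the in-memory instance and mark the
             persisted copy as suspended, so that it can be resumed later from the
             last persistence point. -->
        <workflowUnhandledException action="AbandonAndSuspend" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```

The other actions WF 4 accepts here are `Abandon`, `Cancel`, and `Terminate`; none of those leaves a suspended instance to resume.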
Presenting the Second Demo: Sale Service
The second demo (attached as SaleService.zip) was originally provided as part of the AppFabric Beta 2 Samples, but I tweaked it a little bit for the sake of this article. It contains three projects:
- SaleService: The WF service containing the business process built using the WCF Workflow Service Application project template.
- TestClient: A Console client that consumes the WF Service (in order to display catalog info - the complete process will be explained next).
- TestClient2: A Console client that also consumes the WF Service (to book or cancel the catalog order).
The business process is shown below (it's just too large to expand it all and view in one shot, but you can expand each section by double clicking on it):
The process starts when a client (TestClient console application) asks to browse through a set of catalog information (no database here, the information is just hardcoded in the process itself). After that, the process waits for a minute; during this time, another client (TestClient2 console application) has to reply to confirm the purchase. If the one minute passes by without any invocation from TestClient2, then the business process terminates.
As I said at the start, workflow development is not covered here, and basic knowledge is required, so I am not covering the details of the business process shapes. However, the description I just gave, together with going over the process and viewing the shapes, should be enough for you to understand what is going on in detail.
Build and deploy the solution; you will get an IIS application by the name of "SaleService" as configured via the Web tab in Visual Studio project properties.
Just like we did in the first demo, configure the project for Health Monitoring. As for persistence, we will do something new here. In the first example, we just configured the persistence store through AppFabric and used the WF designer (by setting the PersistBeforeSend property). Here, we will additionally use AppFabric to set an unloading time for our WF instance. First set the persistence store as shown below:
Then set the unloading time as shown below:
We have just instructed our workflow service to unload itself after 20 seconds of inactivity. Now the following will happen: the first console client will issue a request to browse the catalog. The WF process is configured to wait for one minute until it receives a second request from the second console to confirm the order. If no second request is issued, the process will terminate itself, which is what we will let happen in this example.
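In configuration terms, an unload timeout like this corresponds to the standard WF 4 `workflowIdle` service behavior. The fragment below is a sketch of the 20-second setting; the connection string name is a placeholder, and the markup AppFabric actually writes may look different.

```xml
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <behavior>
        <!-- Placeholder connection string name for the persistence database -->
        <sqlWorkflowInstanceStore connectionStringName="DefaultPersistenceStore" />
        <!-- Persist and remove the instance from memory after 20 idle seconds -->
        <workflowIdle timeToUnload="00:00:20" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```

`workflowIdle` also accepts a `timeToPersist` attribute, which saves state after the given idle time without removing the instance from memory.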
Run the first console application, and you will see a list of products as shown below:
Now switch to the AppFabric dashboard and you should see your requests logged and the process in the "In Progress" state. Moreover, wait for 20 seconds and you will see the persisted instance also logged in the dashboard. Why? Because we have configured our service to unload itself after 20 idle seconds. This is shown below:
If you click the WF instance, you will see that it's in the idle state:
Now wait for another 40 seconds (for the 1 minute to pass by) and refresh the dashboard; you will see that the persisted instance has disappeared and the WF instance has completed execution and terminated itself:
Understanding Persistence vs. Unloading
We have so far experienced one demo showing persistence and another one showing unloading. I have already described the difference between the two from a technical perspective. But what are the business cases where you should use persistence vs. those where you should use unloading?
Consider a scenario where a WF business process accepts purchase requests from clients; the process must check the client bank credit via a WCF call and reply in a real-time fashion to the client. The scenario itself is not long running, and will finish in a matter of seconds. However, what if the WCF bank service is down? This is something that you cannot predict but must take precautions against nonetheless. In such a scenario, it makes sense to persist your business process just before calling the WCF bank service; this way, if you detect that the bank service is down and an exception is thrown, you always have a persisted point to go back to, and from there, you can try to resend the message to the WCF bank service... until it is up again. Well, this is analogous to our first demo.
Now consider a second scenario where a purchase order WF business process accepts requests from clients to browse the product catalog. However, clients can take their time deciding if they want to carry on with the order; they can, for example, take a day to decide. In this case, you do not want to keep the WF process in memory; rather, you want to unload it and wake it up again when clients send their decisions. Well, this was exactly the scenario shown in the second demo.
So in short, you use persistence when you want to be on the safe side and have some point to go back to in case of failure. You use unloading, on the other hand, when you want to free up resources and remove your process from memory, typically in long running processes. Finally, design your persistence (and unloading) carefully because serializing and storing an instance state in the database comes with a performance hit.
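The same distinction is visible in the WF 4 self-hosting API, where a `WorkflowApplication` is told what to do each time an instance becomes idle and can be persisted. The sketch below is not part of the attached demos (which are hosted in IIS); the `IdlePolicy` class and `freeMemory` flag are hypothetical names used only for illustration.

```csharp
using System.Activities;

// Hypothetical helper: chooses between "persist" and "unload" for a self-hosted workflow.
static class IdlePolicy
{
    public static void Apply(WorkflowApplication app, bool freeMemory)
    {
        // Persist: save the state but keep the instance in memory (a safety point).
        // Unload:  save the state and remove the instance from memory (frees resources).
        // Either action requires app.InstanceStore to be set before the workflow runs.
        app.PersistableIdle = e =>
            freeMemory ? PersistableIdleAction.Unload : PersistableIdleAction.Persist;
    }
}
```

With AppFabric hosting you do not write this code yourself; the idle settings you choose in IIS Manager play the same role.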
Tracking is the ability to step inside a running workflow service instance and peek into variable values during the instance lifetime.
Before configuring tracking from AppFabric, there are some concepts that you need to know. In a nutshell, when you deal with WF 4.0 tracking, you have to understand three concepts:
- Tracking records: these are the information records emitted by your workflow. There are four main types of tracking records, and each one is selected in a tracking profile by a corresponding query class:
WorkflowInstanceQuery: events about the workflow instance state. For example: started, suspended, unloaded, etc...
ActivityStateQuery: events about activities inside the workflow.
CustomTrackingQuery: any custom information you want to track.
BookmarkResumptionQuery: bookmark name you want to track whenever this bookmark is resumed (bookmarks are discussed in my WF article mentioned at the start).
- Tracking profiles: act like filters over tracking records in order to track only required information.
- Tracking participants: the medium to which tracking information will be written. WF 4.0 comes with a participant which writes to ETW (Event Tracing for Windows). Custom participants can be easily developed to write tracking data into SQL Server, for example.
From IIS, click on the "SaleService" application and double click the "Services" icon from the AppFabric section. Right click the service name and select Configure, as shown below:
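To show how the three concepts fit together in code, here is a minimal sketch of a custom tracking participant that writes to the console instead of ETW. The class name `ConsoleTrackingParticipant` is hypothetical and this participant is not part of the attached demos; it only illustrates the `TrackingParticipant` extension point in `System.Activities.Tracking`.

```csharp
using System;
using System.Activities.Tracking;

// Hypothetical participant: receives the records that pass the tracking profile's filters.
public class ConsoleTrackingParticipant : TrackingParticipant
{
    protected override void Track(TrackingRecord record, TimeSpan timeout)
    {
        // Records about an activity changing state (for example, Closed).
        var activityRecord = record as ActivityStateRecord;
        if (activityRecord != null)
        {
            Console.WriteLine("{0} -> {1}", activityRecord.Activity.Name, activityRecord.State);

            // Variables requested by the profile's activity state query.
            foreach (var variable in activityRecord.Variables)
            {
                Console.WriteLine("  {0} = {1}", variable.Key, variable.Value);
            }
            return;
        }

        // Records about the workflow instance itself (Started, Unloaded, and so on).
        var instanceRecord = record as WorkflowInstanceRecord;
        if (instanceRecord != null)
        {
            Console.WriteLine("Instance {0}: {1}", instanceRecord.InstanceId, instanceRecord.State);
        }
    }
}
```

A participant like this only sees what its tracking profile lets through, which is why the profile is configured separately.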
From the resulting screen, select "Monitoring" and click "Configure" as shown below:
From the drop down list Tracking profile, select "Sale Service Order Tracking", as shown above.
Now, let's switch to the solution and link all this stuff together. Open the web.config file of the SaleService project and locate the "Tracking" section. This section defines a tracking profile with the name of "Sale Service Order Tracking"; the same name we just configured inside AppFabric.
The tracking profile defines the tracking records we want to observe. For example, the tracking profile states that we want to track all the states of the workflow instance. The corresponding section is:
<state name="*" />
It also states that we want to track the variable "StatusText" when the activity (i.e., workflow shape) called "Assign Catalog Expired Status" enters the "Closed" state. The corresponding section is:
<activityStateQuery activityName="Assign Catalog Expired Status">
  <states>
    <state name="Closed" />
  </states>
  <variables>
    <variable name="StatusText" />
  </variables>
</activityStateQuery>
Similarly, the variables "StatusText", "NewPurchaseOrder", "PurchaseTotal", and "OrderId" will be tracked when the activity called "Process New Order" enters the "Closed" state. The corresponding section is:
<activityStateQuery activityName="Process New Order">
  <states>
    <state name="Closed" />
  </states>
  <variables>
    <variable name="StatusText" />
    <variable name="NewPurchaseOrder" />
    <variable name="PurchaseTotal" />
    <variable name="OrderId" />
  </variables>
</activityStateQuery>
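For orientation, here is a sketch of how fragments like these sit inside a complete WF 4 tracking profile, together with the `etwTracking` behavior that selects the profile by name. It is assembled from the pieces described above using the standard WF 4 configuration schema; the actual web.config in the SaleService project may contain more queries or be laid out differently.

```xml
<system.serviceModel>
  <tracking>
    <profiles>
      <trackingProfile name="Sale Service Order Tracking">
        <workflow activityDefinitionId="*">
          <!-- Track every state change of the workflow instance -->
          <workflowInstanceQueries>
            <workflowInstanceQuery>
              <states>
                <state name="*" />
              </states>
            </workflowInstanceQuery>
          </workflowInstanceQueries>
          <!-- Track selected variables when specific activities close -->
          <activityStateQueries>
            <activityStateQuery activityName="Assign Catalog Expired Status">
              <states>
                <state name="Closed" />
              </states>
              <variables>
                <variable name="StatusText" />
              </variables>
            </activityStateQuery>
            <activityStateQuery activityName="Process New Order">
              <states>
                <state name="Closed" />
              </states>
              <variables>
                <variable name="StatusText" />
                <variable name="NewPurchaseOrder" />
                <variable name="PurchaseTotal" />
                <variable name="OrderId" />
              </variables>
            </activityStateQuery>
          </activityStateQueries>
        </workflow>
      </trackingProfile>
    </profiles>
  </tracking>
  <behaviors>
    <serviceBehaviors>
      <behavior>
        <!-- Emit tracking records to ETW, filtered by the profile named above -->
        <etwTracking profileName="Sale Service Order Tracking" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```

The profile name is the link between this file and the "Tracking profile" drop down in the AppFabric monitoring configuration.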
Time to run the program again; this time, however, we will bring the second console client application into play.
Run "TestClient" and copy the GUID that appears on the console; this is the ID assigned to the current order. Now, before 1 minute passes by (the time configured before the process terminates itself), run "TestClient2" and paste the GUID. This will confirm the order as shown below:
Now switch back to the AppFabric dashboard; the view should be familiar to you by now. Two WCF calls and a WF instance call have been recorded.
Select to track the events of the WF instance as shown below:
You will again be taken to the events monitored through AppFabric; however, now you can also track the variables as configured in the web.config file. To do so, scroll until you reach the "Process New Order" activity, and you will notice that the four variables are recorded at that particular time, as shown below:
You can even now search through your workflow instances based on the variable values. Use the Query Summary table to set your search query as shown below:
Correlation is not specific to AppFabric, but it is still worth mentioning for the sake of completeness, since it is a particularly important concept in workflow services development.
Let's revisit the demo 2 scenario: a client sends a request to the business process to view the catalog; later it sends another message confirming the initial catalog request. Well, what if we have client A and client B issuing catalog requests? Now we have two process instances waiting (persisted and saved in the DB) to get the confirmation messages. What will happen when client B, for example, sends the confirmation request? How will the engine know which waiting instance the message should be routed to?
The concept used to solve such cases is called correlation; the act of relating both messages (the catalog request and the request confirmation) via a single unique ID - called the correlation ID. In our example, the GUID you copied from the first console to the second console is the correlation ID; what you did was assign a GUID to a particular client so that both requests (simulated by two different console applications) are correlated using this unique ID.
Correlation is configured within the WF designer in the "Correlations" section of the relevant WCF shapes (again, workflow development is not covered here).
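The routing idea behind correlation can be sketched in a few lines of plain C#. This is a conceptual model only: it does not use the WF runtime or its correlation API, and the names `BrowseCatalog`, `ConfirmOrder`, and `waiting` are hypothetical stand-ins for the two service operations and the persisted instances.

```csharp
using System;
using System.Collections.Generic;

// Conceptual model only: not the WF runtime, just the routing idea.
class CorrelationDemo
{
    // Waiting "instances", keyed by correlation ID.
    static readonly Dictionary<Guid, string> waiting = new Dictionary<Guid, string>();

    // First message: start an instance and hand the correlation ID back to the client.
    static Guid BrowseCatalog(string client)
    {
        Guid catalogId = Guid.NewGuid();
        waiting[catalogId] = client;
        return catalogId;
    }

    // Second message: route it to the instance whose correlation ID matches.
    static string ConfirmOrder(Guid catalogId)
    {
        string client;
        if (!waiting.TryGetValue(catalogId, out client))
        {
            return "No waiting instance for " + catalogId;
        }
        waiting.Remove(catalogId);
        return "Order confirmed for " + client;
    }

    static void Main()
    {
        Guid a = BrowseCatalog("Client A");
        Guid b = BrowseCatalog("Client B");

        // Client B confirms first; the ID decides which instance receives the message.
        Console.WriteLine(ConfirmOrder(b)); // Order confirmed for Client B
        Console.WriteLine(ConfirmOrder(a)); // Order confirmed for Client A
    }
}
```

In the real service, the workflow runtime performs this lookup for you, using the correlation settings on the messaging shapes to extract the ID from each incoming message.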
Note: Caching is a fairly large topic by itself, so a full discussion here is not possible. What follows just scratches the surface of AppFabric caching. A dedicated post about caching will hopefully follow.
Caching speeds up applications by storing frequently accessed information in memory and thus reducing database access time. Scaling cached data, however, is a common problem. Having data stored in memory makes it machine specific, and having all cached data in a single machine quickly creates an application bottleneck.
Distributed caching allows spreading data across multiple machines, and this is now part of AppFabric. The technology was in development well before AppFabric, under the project name "Velocity".
A full discussion of distributed caching needs a complete article by itself; in the "More Resources" section, I will point you to some resources that do just that. However, in summary, distributed caching works as follows: you have a cache client application (for example, your ASP.NET application, or in our case, the WF business process) that accesses a cache cluster configured on multiple machines. All machines joined in a cache cluster can have data spread and replicated across them, which provides highly available cached data.
The API is straightforward, and in this section, we will see how to use it to store and retrieve data.
Recall that in the last example, you had to copy the GUID from the first to the second client; here, we will utilize AppFabric caching and its API to store and retrieve the GUID instead.
Open the "Program.cs" file of the TestClient console application. Uncomment the following lines:
The above methods set up the cache configuration and store the GUID in the cache.
Next, open the "Program.cs" file of the TestClient2 console application. Comment out the following line:
string catalogId = Console.ReadLine();
And uncomment the following line:
The above changes instruct the second client to get the GUID from the cache (which was set by the first client) instead of getting it from the console interface.
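As a rough illustration of the kind of code involved, the sketch below stores and retrieves a GUID with the AppFabric caching client API as it shipped in the release version. It is not the code from the attached projects (which were built against Beta 2 and may differ); the host `localhost`, port 22233, and the key `"catalogId"` are assumptions, and the project needs references to the AppFabric caching client assemblies.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

class CacheSketch
{
    // Connects to a cache host and returns the cluster's default cache.
    static DataCache OpenDefaultCache()
    {
        // Assumed cache host: local machine on the default cache port.
        var servers = new List<DataCacheServerEndpoint>
        {
            new DataCacheServerEndpoint("localhost", 22233)
        };

        var configuration = new DataCacheFactoryConfiguration { Servers = servers };
        var factory = new DataCacheFactory(configuration);
        return factory.GetDefaultCache();
    }

    static void Main()
    {
        DataCache cache = OpenDefaultCache();

        // What the first client would do: store the order GUID under a known key.
        string catalogId = Guid.NewGuid().ToString();
        cache.Put("catalogId", catalogId);

        // What the second client would do: read it back instead of Console.ReadLine().
        string fromCache = (string)cache.Get("catalogId");
        Console.WriteLine(fromCache);
    }
}
```

Creating a `DataCacheFactory` is relatively expensive, so a real application would typically create it once and reuse it rather than building one per call.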