Tuesday, May 27, 2008

How to consume REST services with WCF

As you are probably aware by now, Windows Communication Foundation (WCF) 3.5 introduced a new binding called WebHttpBinding for creating and consuming REST-based services. If you are new to the WCF Web Programming Model, see the MSDN documentation for more details.

There have been many articles and blogs on how to host a RESTful service. However, there doesn't seem to be much written on how to consume these services, so I thought I'd write a few lines on the topic.

The new WebHttpBinding is used to configure endpoints that are exposed through HTTP requests instead of SOAP messages. So you can simply call into a service by using a URI. The URI usually includes segments that are converted into parameters for the service operation.

So the client of a service of this type needs two abilities: (1) sending an HTTP request, and (2) parsing the response. The default response message format supported out of the box by the WebHttpBinding is "plain old XML" (POX); JSON and raw binary data are also supported through the WebMessageEncodingBindingElement.

One way of consuming these services is by manually creating an HTTP request. The following example consumes the flickr.interestingness.getList operation from Flickr:

// requires System.Net and System.Xml.Serialization
WebRequest request = WebRequest.Create(
    "http://api.flickr.com/services/rest/?method=flickr.interestingness.getList&api_key=*&extras=");
WebResponse ws = request.GetResponse();

// deserialize the POX response into a typed object (PhotoCollection is defined below)
XmlSerializer s = new XmlSerializer(typeof(PhotoCollection));
PhotoCollection photos = (PhotoCollection)s.Deserialize(ws.GetResponseStream());

The idea is simple:

- Send the HTTP request with all the parameters encoded in the URI

- Get the response, which is in XML format

- Either parse it or deserialize it into an object

The above code works but it is not elegant: We are not using the unified programming model offered by WCF and the URL is hacked together using string concatenation. The response is also manually deserialized into an object. With WCF and the WebHttpBinding we can automate most of this.

The first step is to define our service contract:

[ServiceContract]
[XmlSerializerFormat]
public interface IFlickrApi
{
    [OperationContract]
    [WebGet(
        BodyStyle = WebMessageBodyStyle.Bare,
        ResponseFormat = WebMessageFormat.Xml,
        UriTemplate = "?method=flickr.interestingness.getList&api_key={apiKey}&extras={extras}")]
    PhotoCollection ListInteresting(string apiKey, string extras);
}
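Incidentally, the ResponseFormat property is also where you would opt into JSON instead of POX. A hypothetical variant, for illustration only; note that JSON goes through the DataContract serializers, so such an operation would not be paired with the [XmlSerializerFormat] attribute used above:

// hypothetical operation and URI template: same pattern, JSON response instead of POX
[OperationContract]
[WebGet(
    BodyStyle = WebMessageBodyStyle.Bare,
    ResponseFormat = WebMessageFormat.Json,
    UriTemplate = "photos/{id}")]
Photo GetPhoto(string id);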

As you can see, I am specifically instructing WCF to use the XML Serializer Formatter for this. The next step is to set up the client endpoint. I decided to do this inside the config file, along these lines (the endpoint name "FlickrREST" is referenced by the client code below, and the webHttp behavior is what enables the web programming model on the endpoint):

<system.serviceModel>
  <client>
    <endpoint name="FlickrREST"
              address="http://api.flickr.com/services/rest/"
              binding="webHttpBinding"
              behaviorConfiguration="flickrBehavior"
              contract="IFlickrApi" />
  </client>
  <behaviors>
    <endpointBehaviors>
      <behavior name="flickrBehavior">
        <webHttp />
      </behavior>
    </endpointBehaviors>
  </behaviors>
</system.serviceModel>
In order to be able to use the XML Serializer Formatter, I need XML Serializable types:

[XmlRoot("photos")]
public class PhotoCollection
{
    [XmlAttribute("page")]
    public int Page { get; set; }

    ...

    [XmlElement("photo")]
    public Photo[] Photos { get; set; }
}

public class Photo
{
    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlAttribute("title")]
    public string Title { get; set; }

    ...
}

The final step is to create an instance of the client proxy:

ChannelFactory<IFlickrApi> factory =
    new ChannelFactory<IFlickrApi>("FlickrREST");
IFlickrApi proxy = factory.CreateChannel();
var response = proxy.ListInteresting("xxxx", "yyyy");
((IDisposable)proxy).Dispose();
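A caveat worth mentioning: Dispose on a channel proxy calls Close under the covers, which can itself throw if the channel has faulted. A common defensive pattern is to Close on success and Abort on failure; a minimal sketch:

IFlickrApi proxy = factory.CreateChannel();
try
{
    var response = proxy.ListInteresting("xxxx", "yyyy");
    // close the channel cleanly on success
    ((IClientChannel)proxy).Close();
}
catch
{
    // if the channel is faulted, Close would throw again; Abort won't
    ((IClientChannel)proxy).Abort();
    throw;
}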

If you don’t like using ChannelFactory directly then you can create your proxy by deriving from ClientBase<>:

public partial class FlickrClient :
    ClientBase<IFlickrApi>, IFlickrApi
{
    public FlickrClient()
    {
    }

    public FlickrClient(string endpointConfigurationName) :
        base(endpointConfigurationName)
    {
    }

    public FlickrClient(
        string endpointConfigurationName,
        string remoteAddress) :
        base(endpointConfigurationName, remoteAddress)
    {
    }

    public FlickrClient(string endpointConfigurationName,
        EndpointAddress remoteAddress) :
        base(endpointConfigurationName, remoteAddress)
    {
    }

    public FlickrClient(Binding binding,
        EndpointAddress remoteAddress) :
        base(binding, remoteAddress)
    {
    }

    public PhotoCollection ListInteresting(string apiKey, string extras)
    {
        return base.Channel.ListInteresting(apiKey, extras);
    }
}

Now the client code will look similar to the following:

FlickrClient proxy = new FlickrClient();
var response = proxy.ListInteresting("xxxxxx", "yyyyyy");
((IDisposable)proxy).Dispose();

Monday, May 26, 2008

Creating Multiple Versions of Workflows
In the recent past, a question that has surfaced quite often concerns the versioning of workflows and the errors that show up when attempting it. It is quite natural to expect a high level of dynamism and out-of-the-box features from Microsoft, but one needs to take a pragmatic view of WF.
There is a clear distinction between a Workflow Definition and a Workflow Instance. A Workflow Definition is a template, and a Workflow Instance is a runtime representation of that template. A Workflow Definition and a Workflow Instance relate closely to a class and an object, respectively.
A Workflow can be authored programmatically in C#, exclusively in XAML, or using a combination of both. When a workflow instance is created and persisted, it is persisted with its state and also with the type information of the workflow and the activities used. During this cycle of instance passivation, if the definition of the workflow (or of an activity used in the workflow) is changed and the assembly with the old definition is removed, the workflow runtime throws an error. This is expected behavior, since binary serialization is type-dependent.
Workflow Foundation does have a solution to this problem in the form of dynamic workflow updates using the WorkflowChanges API. The following is one possible way of mitigating the problem of workflow versioning:
1. The assembly with the new workflow definition (or activity definition) is added to the GAC (or wherever you reference your assemblies from), where it sits beside the assembly containing the older definition. The assembly with the older definition(s) MUST remain present in that location as well.
2. When the workflow is to be rehydrated, identify the changes between the older and newer definitions by getting hold of the corresponding definitions, and apply the changes to the workflow using the WorkflowChanges API (a minimal sketch follows this list). Once the changes are applied, the workflow can be resumed. There are two ways in which this can be done:
- It can be done on demand: the changes are applied when the workflow is rehydrated by the user or application, and the workflow is resumed immediately.

- It can be a background process: the workflow instances corresponding to this workflow are loaded into memory, the changes are applied, and the instances are dehydrated back to the persistence store. When a workflow instance is next invoked by the application, it is loaded and resumed directly. This can be accomplished using a Windows service; however, synchronization must be taken care of here.
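As a minimal sketch of the WorkflowChanges step (the runtime and instance id are assumed to be at hand, and the DelayActivity change is purely hypothetical):

// using System; using System.Workflow.Activities;
// using System.Workflow.ComponentModel;
// using System.Workflow.ComponentModel.Compiler; using System.Workflow.Runtime;

WorkflowInstance instance = workflowRuntime.GetWorkflow(instanceId);

// clone the current definition into a transient copy that can be edited
WorkflowChanges changes = new WorkflowChanges(instance.GetWorkflowDefinition());

// hypothetical change: append an extra delay to the root activity
DelayActivity extraDelay = new DelayActivity("extraDelay");
extraDelay.TimeoutDuration = TimeSpan.FromMinutes(5);
changes.TransientWorkflow.Activities.Add(extraDelay);

// validate the modified definition, then apply it to this instance only
ValidationErrorCollection errors = changes.Validate();
if (errors.Count == 0)
{
    instance.ApplyWorkflowChanges(changes);
}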
But the question definitely remains: why isn't there a mechanism that changes the instances when the definition of the workflow changes? Here is one possible reason:
A definition is a "stateless" entity, while an instance has "state". A workflow definition is either a .NET assembly or a XAML file from which instances are created. Any change to the definition is a change to the stateless entity, and it happens at design time; reflecting it onto a run-time entity requires a program to do so. Moreover, the Workflow Persistence Service can be customized, which makes this an even greater challenge.
For a small set of workflows, the issue of versioning may not really be a show-stopper. However, when planning a large-scale use of WF, it is necessary to plan for the versioning that will be required and move ahead. It is important to set expectations right before dealing with situations like these.

MARS - Transactions and Debugging

Multiple Active Result Sets (MARS) is a new SQL Server 2005 feature that allows the user to run more than one SQL batch on an open connection at the same time. In my previous article about MARS I explained what MARS is and how to use it. In this article I'll discuss how transactions in MARS work and how you can debug MARS connections.
Transactions in MARS
Before MARS, transactions were a pretty straightforward thing: you executed a command that was associated with a transaction, and that was it. This is not so simple anymore.
A good example is transaction savepoints: points within a transaction to which you can partially roll back.
For example:

BEGIN TRAN
-- create a table
CREATE TABLE t1 (id INT, title VARCHAR(20) )
-- insert some data
INSERT INTO t1
SELECT 1, 'name 1' UNION ALL
SELECT 2, 'name 2' UNION ALL
SELECT 3, 'name 3'
SELECT * FROM t1
-- save transaction to a savepoint
SAVE TRAN savepoint1
-- insert some more data
INSERT INTO t1
SELECT 5, 'name 5'
SELECT * FROM t1
-- whoops, we don't want that nasty 5 in there,
-- roll it back to the savepoint
ROLLBACK TRAN savepoint1
-- insert a nice 4
INSERT INTO t1
SELECT 4, 'name 4'
SELECT * FROM t1
COMMIT
Under MARS, setting savepoints, rolling back to savepoints, and committing the transaction aren't allowed while more than one request is actively running under that transaction. Let's see why with some code. Note that both requests run under the same transaction.

string connString = @"server=MyServer; database=testDB;
trusted_connection=yes;
MultipleActiveResultSets=true";
using (SqlConnection conn = new SqlConnection(connString))
{
// Command 1 represents the First Request/Batch
SqlCommand cmd1 = new SqlCommand();
cmd1.Connection = conn;
cmd1.CommandText = @"INSERT INTO t1
SELECT 1, 'name 1' UNION ALL
SELECT 2, 'name 2' UNION ALL
SELECT 3, 'name 3';
Select * from t1;";
// Command 2 represents the Second Request/Batch
SqlCommand cmd2 = new SqlCommand();
cmd2.Connection = conn;
cmd2.CommandText = "UPDATE t1 SET title = 'other name 2' WHERE id = 2";

conn.Open();

// Start the transaction
// Both request run under the same transaction
SqlTransaction tran = conn.BeginTransaction("mainTran");
cmd1.Transaction = tran;
cmd2.Transaction = tran;

try
{
// Time T1 – run the insert and the select
SqlDataReader rdr = cmd1.ExecuteReader();
while (rdr.Read())
{
// Time T2
// The execution will fail at this point because Transaction Savepoints
// aren't allowed in a MARS environment
tran.Save("savepoint1");
cmd2.ExecuteNonQuery();
}
// Time T3 - executes in the first batch
cmd1.CommandText = "INSERT INTO t1 SELECT 4, 'name 4';";
// Time T4 - this will fail.
cmd2.CommandText = "UPDATE t1 SET id = 'other name 5' WHERE id = 5;";
// run the statements
cmd1.ExecuteNonQuery();
cmd2.ExecuteNonQuery();

tran.Commit();
}
catch (Exception ex)
{
// this is the error message we get when trying to set the Transaction Savepoint:
// The transaction operation cannot be performed because there
// are pending requests working on this transaction.
Console.WriteLine(ex.Message);
}
}
At first glance this code looks OK, but let's examine it closely. Everything is fine until the rollback to savepoint1 in the second request. Three statements execute after the savepoint is set: first the update to the table in request 2, then the insert into the table in request 1, and finally the second update in request 2. When the second update fails and rolls back to the savepoint, the insert in request 1 is rolled back along with it, which is unwanted behaviour.
These kinds of problems are hard to find and debug, and they are the reason why savepoints and commits aren't allowed under MARS when more than one request runs under a transaction.
In .NET, only one transaction can be set per connection. This means that this kind of code isn't possible:

private void MarsConcurrentTransactions()
{
string connString = @"server=MyServer;
database=testDB;
trusted_connection=yes;
MultipleActiveResultSets=true";
using (SqlConnection conn = new SqlConnection(connString))
{
// Command 1 represents the First Request/Batch
SqlCommand cmd1 = new SqlCommand();
cmd1.Connection = conn;
cmd1.CommandText = "SELECT * FROM t1 WHERE id IS NULL";
// Command 2 represents the Second Request/Batch
SqlCommand cmd2 = new SqlCommand();
cmd2.Connection = conn;
cmd2.CommandText = " SELECT * FROM t1 WHERE id IS NOT NULL";
conn.Open();
// Start the transactions
SqlTransaction tran1 = conn.BeginTransaction("tran1");
// this fails - we can't have 2 concurrent transactions on the same connection
SqlTransaction tran2 = conn.BeginTransaction("tran2");
cmd1.Transaction = tran1;
cmd2.Transaction = tran2;
// ... more code ...
}
}
Nor is this one:

string connString = @"server=MyServer;
database=testDB;
trusted_connection=yes;
MultipleActiveResultSets=true";
using (SqlConnection conn = new SqlConnection(connString))
{
// Command 1 represents the First Request/Batch
SqlCommand cmd1 = new SqlCommand();
cmd1.Connection = conn;
cmd1.CommandText = "SELECT title FROM t1";
// Command 2 represents the Second Request/Batch
SqlCommand cmd2 = new SqlCommand();
cmd2.Connection = conn;
cmd2.CommandText = "UPDATE t1 SET id = id + 5 WHERE title = @title";
cmd2.Parameters.Add(new SqlParameter("@title", SqlDbType.VarChar, 20));
conn.Open();
// Start the transactions
SqlTransaction tran1 = conn.BeginTransaction("tran1");
cmd1.Transaction = tran1;
using (SqlDataReader rdr1 = cmd1.ExecuteReader())
{
while (rdr1.Read())
{
cmd2.Parameters[0].Value = rdr1["title"].ToString();
// this will FAIL because we can't mix a SqlTransaction with the
// implicit transaction in which the update runs by default
cmd2.ExecuteNonQuery();
}
}
tran1.Rollback();
}
That's because we still have two transactions: one explicit (the SqlTransaction) and one implicit (SQL Server's own, in which the update runs).
What we can and should do is put all SqlCommands under the same SqlTransaction, while not setting savepoints:

string connString = @"server=MyServer;
database=testDB;
trusted_connection=yes;
MultipleActiveResultSets=true";
using (SqlConnection conn = new SqlConnection(connString))
{
// Command 1 represents the First Request/Batch
SqlCommand cmd1 = new SqlCommand();
cmd1.Connection = conn;
cmd1.CommandText = "SELECT title FROM t1";
// Command 2 represents the Second Request/Batch
SqlCommand cmd2 = new SqlCommand();
cmd2.Connection = conn;
cmd2.CommandText = "UPDATE t1 SET id = id + 5 WHERE title = @title";
cmd2.Parameters.Add(new SqlParameter("@title", SqlDbType.VarChar, 20));
conn.Open();
// Start the transactions
SqlTransaction tran1 = conn.BeginTransaction("tran1");

cmd1.Transaction = tran1;
cmd2.Transaction = tran1;
using (SqlDataReader rdr1 = cmd1.ExecuteReader())
{
while (rdr1.Read())
{
cmd2.Parameters[0].Value = rdr1["title"].ToString();
cmd2.ExecuteNonQuery();
}
}
tran1.Commit();
}
I've shown four ways a developer might try to use transactions with MARS. However, only the one in which all SqlCommands run under a single transaction is correct, and only as long as you don't set any transaction savepoints. To truly understand MARS execution, a developer must have a good understanding of what it can and cannot do.
Debugging and monitoring MARS
With MARS, the "old-school" type of monitoring isn't adequate anymore. Why? Because in SQL Server 2000 we could simply say, or at least assume, that a SPID (SQL Server Process ID) identified a request (a batch). That way you could simply get the executing SQL statement for the SPID of your choice. sysprocesses, with its SPIDs and accompanying execution statistics, has helped with debugging more than once.
All of this has changed. sysprocesses still shows process information, but with the introduction of Dynamic Management Views it has effectively been replaced by a few of them. These new DMVs are sys.dm_exec_sessions, sys.dm_exec_connections and sys.dm_exec_requests.
sys.dm_exec_sessions
Returns one row per authenticated session on Microsoft SQL Server. A SPID is equal to the session_id column. Interesting columns are last_request_start_time and last_request_end_time, which show the start time of the last request in a session (including the currently running request) and the completion time of the last request.
sys.dm_exec_connections
Returns information about the connections established to this instance of SQL Server and the details of each connection. Here we get into new waters: this view shows us the physical and logical connections to SQL Server. A SPID is again equal to the session_id column. Logical connections are virtual connections, inside a physical connection, in which MARS requests run; for logical connections, parent_connection_id is not null and identifies the primary physical connection that the MARS requests are using.
sys.dm_exec_requests
Returns information about each request that is executing within SQL Server. A SPID is again equal to session_id. Each session can have multiple MARS requests, and each of these requests has an id that is unique within its session, in the request_id column. connection_id identifies the physical connection on which the MARS request runs.

What a SPID alone used to identify is now identified by the (session_id, request_id) pair: two MARS requests on a single connection share the same SPID but have two different request_ids. SQL Server 2005 also has a new function, current_request_id(), which returns the id of the request currently executing under a session.
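For instance, a quick way to observe this from the client (the connection string is a placeholder, as in the earlier examples):

string connString = @"server=MyServer; database=testDB;
                      trusted_connection=yes; MultipleActiveResultSets=true";
using (SqlConnection conn = new SqlConnection(connString))
{
    conn.Open();
    // @@SPID stays the same for the whole session; current_request_id()
    // distinguishes the MARS requests running within it
    SqlCommand cmd = new SqlCommand(
        "SELECT @@SPID AS spid, current_request_id() AS request_id", conn);
    using (SqlDataReader rdr = cmd.ExecuteReader())
    {
        while (rdr.Read())
        {
            Console.WriteLine("SPID: {0}, request: {1}",
                rdr["spid"], rdr["request_id"]);
        }
    }
}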
When debugging, this query might come in handy:

SELECT r.session_id, r.request_id,
c.connection_id, c.parent_connection_id, c.connect_time, c.net_transport,
s.HOST_NAME, s.program_name, s.nt_domain, s.login_name,
s.last_request_start_time, s.last_request_end_time, s.transaction_isolation_level
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id
JOIN sys.dm_exec_connections c ON s.session_id = c.session_id
It shows us the needed information for each request in a session: the connection it belongs to, the start and end times, the transaction isolation level, who ran it, etc.

MARS

SQL Server 2005 has so many new features that, in my opinion, if you read only BOL for a year you'd find something new every day. One of those is Multiple Active Result Sets, or MARS: a new SQL Server 2005 feature that, putting it simply, allows the user to run more than one SQL batch on an open connection at the same time.
Pre-SQL 2005 era
In SQL Server versions prior to 2005, you could only run one batch per connection. This simply means that you could only do this:

private void MARS_Off()
{
SqlConnection conn = new SqlConnection(
"Server=serverName;Database=adventureworks;Trusted_Connection=yes;");
string sql1 = "SELECT * FROM [Person].[Address]";
string sql2 = "SELECT * FROM [Production].[TransactionHistory]";
SqlCommand cmd1 = new SqlCommand(sql1, conn);
SqlCommand cmd2 = new SqlCommand(sql2, conn);
cmd1.CommandTimeout = 500;
cmd2.CommandTimeout = 500;
conn.Open();
SqlDataReader dr1 = cmd1.ExecuteReader();
// do stuff with dr1 data
conn.Close();
conn.Open();
SqlDataReader dr2 = cmd2.ExecuteReader();
// do stuff with dr2 data
conn.Close();
}
And the accompanying profiler trace:
This example shows that you could use the same connection with the second SqlDataReader only after you had finished using it with the first one. The connection must be closed and reopened, as shown by the Audit Login and Audit Logout events. Opening and closing a connection is an expensive operation, so this can hurt performance even if your connection comes from the connection pool.
If, for instance, you wanted to process the data in your data reader and update the processed data back to the database, you had to use another connection object, which again hurts performance. There was no easy way to use the same open connection for more than one batch at a time. There are, of course, server-side cursors, but they have drawbacks such as performance and the ability to operate on only a single SELECT statement at a time.
SQL 2005 era
The SQL Server 2005 team recognized the above-mentioned drawback and introduced MARS, so it is now possible to use a single open connection for more than one batch. A simple way of demonstrating MARS in action is with this code:

private void MARS_On()
{
SqlConnection conn = new SqlConnection(
"Server=serverName;Database=adventureworks;Trusted_Connection=yes;MultipleActiveResultSets=true;");

string sql1 = "SELECT * FROM [Person].[Address]";
string sql2 = "SELECT * FROM [Production].[TransactionHistory]";
SqlCommand cmd1 = new SqlCommand(sql1, conn);
SqlCommand cmd2 = new SqlCommand(sql2, conn);
cmd1.CommandTimeout = 500;
cmd2.CommandTimeout = 500;
conn.Open();
SqlDataReader dr1 = cmd1.ExecuteReader();
SqlDataReader dr2 = cmd2.ExecuteReader();
conn.Close();
}
And the accompanying profiler trace:
MARS is disabled by default on the Connection object. You have to enable it by adding MultipleActiveResultSets=true to your connection string.
How MARS Works
The MARS architecture is based on multiplexing, or interleaving. Because I'm from an electrical engineering background, I'll use the term multiplexing, as it is closer to my heart. Simply put, multiplexing means that the input signals are processed one by one, not in parallel, based on a clock count. The same applies to MARS connections, only the clock count is replaced with well-defined points. Most statements must run atomically in a batch, meaning they must fully complete before another statement can run. Statements that don't need to run atomically can be multiplexed before they finish, enabling another MARS statement to execute.
These multiplexing-enabled statements are:

- SELECT

- FETCH

- RECEIVE

- READTEXT

- BULK INSERT / BCP

- ASYNC CURSOR POPULATION
The best way to describe this is with an example. Say you are retrieving 1 million rows, and in the middle of the retrieval an INSERT statement comes in via MARS. Because a SELECT can be multiplexed, the retrieval is paused and the INSERT is performed. Because an INSERT can't be multiplexed, it must fully complete; when it does, the SELECT resumes. A little later in the retrieval an UPDATE statement comes in, so again the SELECT is paused and the UPDATE is fully executed, since it also can't be multiplexed. After the UPDATE fully completes, the SELECT is able to finish.
However, if we are updating 1 million rows first and a SELECT comes in via MARS, the UPDATE will fully finish before the SELECT can start.
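To see this interleaving from the client side, here is a minimal sketch. It assumes the testDB and t1 table from the earlier examples, plus a hypothetical t1_archive table with the same schema:

// requires System.Data and System.Data.SqlClient
string connString = @"server=MyServer; database=testDB;
                      trusted_connection=yes; MultipleActiveResultSets=true";
using (SqlConnection conn = new SqlConnection(connString))
{
    conn.Open();
    SqlCommand select = new SqlCommand("SELECT id, title FROM t1", conn);
    SqlCommand insert = new SqlCommand(
        "INSERT INTO t1_archive (id, title) VALUES (@id, @title)", conn);
    insert.Parameters.Add("@id", SqlDbType.Int);
    insert.Parameters.Add("@title", SqlDbType.VarChar, 20);

    using (SqlDataReader rdr = select.ExecuteReader())
    {
        while (rdr.Read())
        {
            // the SELECT is suspended at a well-defined point while
            // each INSERT runs to completion, then the SELECT resumes
            insert.Parameters["@id"].Value = (int)rdr["id"];
            insert.Parameters["@title"].Value = (string)rdr["title"];
            insert.ExecuteNonQuery();
        }
    }
}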
There are a few exceptions worth mentioning among the multiplexing-enabled statements listed above:

- RECEIVE can be multiplexed once rows begin to arrive and if SET XACT_ABORT ON is specified for the batch; while a RECEIVE is in the waiting state it can't be multiplexed.

- BULK INSERT / BCP can be multiplexed if SET XACT_ABORT ON is specified for the batch and triggers on the target table have been disabled.

- Managed code (stored procedures, functions, triggers) can't be multiplexed.
Is MARS really that great?
Like any technology, it can benefit you, or you can shoot yourself in the foot with it. MARS uses "firehose mode" to retrieve data. Firehose mode means that the server produces data as fast as possible, which also means that your client application must consume the inbound data at the same speed it comes in. If it doesn't, the data storage buffers on the server fill up and processing stops until those buffers empty.
So what, you may ask? Well, as long as processing is stopped, the resources on the SQL Server side are in use and tied up: the worker thread, schema and data locks, memory, and so on. So it is crucial that your client application consumes the inbound results as quickly as they arrive.
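One way to keep the firehose drained, as a hedged sketch assuming a cmd built like the ones above: buffer the rows quickly and do the slow per-row work only after the reader is closed (ProcessSlowly is a hypothetical placeholder):

// requires System.Collections.Generic and System.Data.SqlClient
List<string> titles = new List<string>();
using (SqlDataReader rdr = cmd.ExecuteReader())
{
    // drain the reader as fast as possible so the server can release
    // the worker thread, locks and buffers held by this request
    while (rdr.Read())
    {
        titles.Add((string)rdr["title"]);
    }
}
// now do the expensive work without holding server resources
foreach (string title in titles)
{
    ProcessSlowly(title);  // hypothetical slow per-row processing
}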
The other important thing to realize is that multiplexed execution DOES NOT mean parallel execution.