SQL Server 2005 has so many new features that in my opinion if you read only BOL for a year you'd find something new every day. One of those is Multiple Active Result Sets or MARS. Multiple Active Result Sets is a new SQL Server 2005 feature that, putting it simply, allows the user to run more than one SQL batch on an open connection at the same time. -->
Pre-SQL 2005 era
In SQL Server's prior to 2005, you could only run one batch per connection. This means simply that you could only do this:private void MARS_Off()
{
SqlConnection conn = new SqlConnection("Server=serverName;
Database=adventureworks;Trusted_Connection=yes;");
string sql1 = "SELECT * FROM [Person].[Address]";
string sql2 = "SELECT * FROM [Production].[TransactionHistory]";
SqlCommand cmd1 = new SqlCommand(sql1, conn);
SqlCommand cmd2 = new SqlCommand(sql2, conn);
cmd1.CommandTimeout = 500;
cmd2.CommandTimeout = 500;
conn.Open();
SqlDataReader dr1 = cmd1.ExecuteReader();
// do stuff with dr1 data
conn.Close();
conn.Open();
SqlDataReader dr2 = cmd2.ExecuteReader();
// do stuff with dr2 data
conn.Close();
}
And the accompanying profiler trace:
This example shows that you could use the same connection with the second SqlDataReader only when you finished using the connection with first one. The connection must be closed and reopened as it is shown with Audit Login and Audit Logout events. Opening and closing a connection is an expensive operation so this can hurt performance, even if your connection is stored in the connection pool.
If you for instance wanted to do some processing of the data in your data reader and updating the processed data back to the database you had to use another connection object which again hurts performance. There was no way to use the same opened connection easily for more than one batch at the time. There are of course server side cursors but they have drawbacks like performance and ability to operate only on a single select statement at the time.
SQL 2005 era
SQL Server 2005 team recognized the above mentioned drawback and introduced MARS. So now it is possible to use a single opened connection for more than one batch. A simple way of demonstrating MARS in action is with this code:private void MARS_On()
{
SqlConnection conn = new SqlConnection("Server= serverName;Database=adventureworks;
Trusted_Connection=yes;MultipleActiveResultSets=true;");
string sql1 = "SELECT * FROM [Person].[Address]";
string sql2 = "SELECT * FROM [Production].[TransactionHistory]";
SqlCommand cmd1 = new SqlCommand(sql1, conn);
SqlCommand cmd2 = new SqlCommand(sql2, conn);
cmd1.CommandTimeout = 500;
cmd2.CommandTimeout = 500;
conn.Open();
SqlDataReader dr1 = cmd1.ExecuteReader();
SqlDataReader dr2 = cmd2.ExecuteReader();
conn.Close();
}
And the accompanying profiler trace:
MARS is disabled by default on the Connection object. You have to enable it with the addition of MultipleActiveResultSets=true in your connection string.
How MARS Works
The MARS architecture is based on multiplexing or interleaving. Because I'm from an electrical engineering background I'll be using the term multiplexing as it is closer to my heart. Simply put multiplexing means that the input signals are processed one by one and not in parallel based on a clock count. Same applies to MARS connections, only the clock count is replaced with well defined points. Most statements must be run atomically in a batch, which means that they must fully complete before another statement can be run. Statements that don't need to run atomically can be multiplexed before they finish thus enabling another MARS statement to execute.
These multiplexing-enabled statements are:
SELECT
FETCH
RECEIVE
READTEXT
BULK INSERT / BCP
ASYNC CURSOR POPULATION
The best way to describe is is with an example: Say you are retrieving 1 million rows. In the middle of the retrieval an INSERT statement comes in via MARS. Because a SELECT can be multiplexed the retrieval is stopped and the INSERT is performed. Because an INSERT can't be multiplexed, it must fully complete. When it does the SELECT is resumed. A little later in the retrieval an UPDATE statement comes in. So again the SELECT is stopped and the UPDATE is fully executed since it also can't be multiplexed. After the UPDATE fully completes the SELECT is able to finish.
However if we are updating 1 million rows first and in comes a SELECT via MARS, the UPDATE will fully finish before the SELECT can be started.
There are a few exceptions worth mentions in the multiplexing-enabled statements in the list above:
RECEIVE can be multiplexed when the rows begin to arrive and if the SET XACT ABORT ON is specified for the batch. If the RECEIVE is in the waiting state then it can't be multiplexed.
BULK INSERT / BCP can be multiplexed if the SET XACT ABORT ON is specified for the batch and the execute triggers on the target table has been disabled
Managed code (stored procedures, functions, triggers) can't be multiplexed.
Is MARS really that great?
Like any technology it can benefit you or you can shoot yourself in the foot with it. MARS uses "firehose mode" to retrieve data. Firehose mode means that the server will produce data as fast as possible. This also means that your client application must receive inbound data at the same speed as it comes in. If it doesn't the data storage buffers on the server will fill up and the processing will stop until those buffers empty.
So what? You may ask... But as long as the processing is stopped the resources on the SQL server are in use and are tied up. This includes the worker thread, schema and data locks, memory, etc. So it is crucial that your client application consumes the inbound results as quickly as they arrive.
Also the important thing to realize is that multiplexed execution DOES NOT mean parallel execution.
Google Search
Monday, May 26, 2008
MARS
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment