Saturday, April 28, 2012

Logical Query Processing Phases in Brief


         Observing true experts in different fields, you find a common practice that they all share—mastering the basics. One way or another, all professions deal with problem solving. All solutions to problems, complex as they may be, involve applying a mix of fundamental techniques. If you want to master a profession, you need to build your knowledge upon strong foundations. Put a lot of effort into perfecting your techniques, master the basics and you’ll be able to solve any problem.

          So I can’t think of a better way to start writing my bogs without the fundamentals of logical query processing in SQL Server, Not just because it covers the essentials of query processing but also because SQL programming is conceptually very different than any other sort of programming.

          SQL programming has many unique aspects, such as thinking in sets, the logical processing order of query elements, and three-valued logic. Trying to program in SQL without this knowledge is a straight path to lengthy, poor-performing code that is difficult to maintain.

Logical Query Processing

            Following list contains a general form of a query, along with step numbers assigned according to the order in which the different clauses are logically processed.

(5) SELECT (5-2) DISTINCT (5-3) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate>
| (1-A) <left_table> <apply_type> APPLY <right_table_expression> AS <alias>
| (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias>
| (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias>
(2) WHERE <where_predicate>
(3) GROUP BY <group_by_specification>
(4) HAVING <having_predicate>
(6) ORDER BY <order_by_list>;


Flow diagram logical query processing

(1)  FROM : The FROM phase identifies the query’s source tables and processes table operators. Each table operator applies a series of sub phases. For example, the phases involved in a join are (1-J1) Cartesian product, (1-J2) ON Filter, (1-J3) Add Outer Rows. The FROM phase generates virtual table VT1.

(1-J1) Cartesian Product : This phase performs a Cartesian product (cross join) between the two tables involved in the table operator, generating VT1-J1.

(1-J2) ON Filter : This phase filters the rows from VT1-J1 based on the predicate that appears in the ON clause (<on_predicate>). Only rows for which the predicate evaluates to TRUE are inserted into VT1-J2.

(1-J3) Add Outer Rows : If OUTER JOIN is specified (as opposed to CROSS JOIN or INNER JOIN), rows from the preserved table or tables for which a match was not found are added to the rows from VT1-J2 as outer rows, generating VT1-J3.

 (2) WHERE : This phase filters the rows from VT1 based on the predicate that appears in the WHERE clause (<where_predicate>). Only rows for which the predicate evaluates to TRUE are inserted into VT2.

 (3) GROUP BY: This phase arranges the rows from VT2 in groups based on the column list specified in the GROUP BY clause, generating VT3. Ultimately, there will be one result row per group.

 (4) HAVING : This phase filters the groups from VT3 based on the predicate that appears in the HAVING clause (<having_predicate>). Only groups for which the predicate evaluates to TRUE are inserted into VT4.

(5) SELECT : This phase processes the elements in the SELECT clause, generating VT5.

(5-1) Evaluate Expressions : This phase evaluates the expressions in the SELECT list, generating VT5-1.

(5-2) DISTINCT : This phase removes duplicate rows from VT5-1, generating VT5-2.

(5-3) TOP : This phase filters the specified top number or percentage of rows from VT5-2 based on the logical ordering defined by the ORDER BY clause, generating the table VT5-3.

(6) ORDER BY : This phase sorts the rows from VT5-3 according to the column list specified in the ORDER BY clause, generating the cursor VC6.