Flashcards in Chapter 14 Using Tools to Analyze Performance Deck (43)
What is the relational engine?
The relational engine is an internal component that works on a logical level. It parses the query which is then processed by the Query Optimizer. The Query Optimizer generates the execution plan.
What is the storage engine?
The storage engine carries out the physical operations and performs the actual execution. This is where processes such as locking, index maintenance, and transactions occur.
What are the 4 query execution phases?
(1) Parsing, (2) Binding, (3) Optimization, (4) Execution.
What occurs during the parsing phase?
During the parsing phase, the relational engine checks whether your query is syntactically correct. The result of this phase, if the query passed the syntax check, is a tree of logical operators known as a parse tree. The parse tree represents the logical steps necessary to execute the query that has been requested.
What occurs during the binding phase?
In the binding phase, the relational engine resolves the object names (tables, columns, and types) in the query and binds them to logical operators. The result of this phase is the algebrized tree, which is a tree of logical operators bound to actual objects. The algebrized tree is passed on to the query optimizer.
What occurs during the optimization phase?
In the optimization phase, using the algebrized tree and the statistics about the data, the query optimizer (in the relational engine) works out what it thinks will be the optimal way to execute the query - that is, it generates an execution plan. The optimizer figures out how best to implement the request represented by the T-SQL query submitted. It decides if the data can be accessed through indexes, what types of joins to use and much more. The decisions made by the optimizer are based on what it calculates to be the cost of a given execution plan - in terms of the required CPU processing and I/O. This is known as a cost-based plan. The SQL Server Query Optimizer is known as a cost-based optimizer.
The optimizer will generate and evaluate many plans (unless there is already a cached plan) and generally speaking, it will choose the lowest-cost plan.
The result of this phase is the actual execution plan which is a single tree with physical operators.
What occurs during the execution phase?
The actual execution of the selected plan is performed by the storage engine which carries out the physical operations.
The result of the execution phase is the desired result set. In addition, the result of the execution phase might also be a cached plan.
How does the SQL Server optimizer calculate the cost?
SQL Server calculates the cost of an operation by determining the algorithm used by a physical operator and by estimating the number of rows that have to be processed. The estimation of the number of rows is called "cardinality estimation". The estimation of the number of rows processed by each physical operator comes from "optimizer statistics".
The cost expresses usage of physical resources such as the amount of disk I/O, CPU time, and memory needed for execution. After the optimizer gets the cost for all operators in a plan, it can calculate the cost of the whole plan.
A higher cost means a more complex plan, and a more complex plan, means a slower query.
What is the plan cache?
Caching the selected execution plan in the "plan cache" can speed up the next execution of the same query or an equivalent query from the execution perspective. SQL Server can cache an execution plan in order to have it ready for the next execution, thus avoiding doing the optimization.
SQL Server actually tries to parameterize your queries in order to have one plan for multiple equivalent queries. Equivalent queries are queries that can be executed in the same way.
SELECT col1 FROM TableA WHERE col2=3;
SELECT col1 FROM TableA WHERE col2 = 5;
SQL Server caches the execution plan separately from the actual value (the execution context). This way, SQL Server can reuse the same execution plan multiple times.
Why might using a cached plan not always be the best solution?
The number of rows in a table might grow substantially. All plans that include scans of that table, which might be fast enough for a small table, could suddenly become suboptimal.
Plans in cache can also become obsolete because metadata changes in a database. For example, an index could be added to a table or a constraint could be altered.
What is parameter sniffing?
The Query Optimizer sometimes has to guess the cardinality estimation because it cannot detect for sure what it is from your parameters. This problem is known as a parameter sniffing problem. Parameter sniffing is a process where SQL Server tries to guess (or sniff) the current parameter value during compilation and passes it to the Query Optimizer.
What are optimizer statistics?
SQL Server maintains statistics
about the total number of rows and distribution of the number of rows over key values of
an index for each index. It's "data about the data" which provides the information necessary for the optimizer to make it's cost calculations.
What are the reasons a suboptimal plan might be selected?
-- The selected plan is not the best because the search space of the execution plans was too big.
-- Statistical information is not present or update which leads to wrong cardinality estimation.
-- A cached plan is suboptimal for the current parameter value.
-- Parameter sniffing leads to inaccurate cardinality estimation.
-- The Query Optimizer underestimates or overestimates the cost of an algorithm implemented in a physical operator.
-- Hardware changes could better accommodate a different plan. For example, someone could add CPUs to a box and a plan that uses more CPU time could be more appropriate.
What are Extended Events?
SQL Server Extended Events is a very lightweight performance monitoring system where you can correlate data from SQL Server with data from the operating system and application.
An Extended Events package is a container for all extended events objects. These objects include:
-- Events - These are your points of interest for monitoring. You can use events for monitoring or to trigger actions.
-- Targets - These are event consumers. You can use targets that write to a file, store event data in a memory buffer, or aggregate event data.
-- Actions - These are responses to an event. They are bound to an event. Actions can capture a stack dump and inspect data, store information in a local variable, aggregate event data, or even append data to event data, e.g. the execution plan detection action detects execution plans.
-- Predicates - These are sets of logical rules to filter captured events. In order to minimize the impact of a monitoring session, it is important that you capture only events you need.
-- Types - These help interpret the data collected. The data is actually a collection of bytes, and types give this data context. A type is provided for events, actions, targets, predicates, and types themselves.
-- Maps - These are SQL Server internal tables that map internal numeric values to meaningful strings.
What is SQL Trace?
SQL Trace is an internal SQL Server mechanism for capturing events. SQL Trace is deprecated in future versions. This means that it will still be available in the life cycle of SQL Server 2012 and the next version of SQL Server; however, after the next version, SQL Trace might be discontinued.
You can create traces through a set of SQL Server system stored procedures. You can create traces manually or through the SQL Server Profiler UI. You trace SQL Server events. A source for a trace event can be a TSQL batch or some other SQL Server event, such as a deadlock. After an event occurs, the trace gathers the event information. Event information is then passed to a queue. Before passing to the queue, events are filtered according to your filters. From the queue, the trace information can go to a file or a SQL Server table, or it can be used by applications, such as SQL Server Profiler.
What is SQL Server Profiler?
SQL Server Profiler is a rich application that serves as a UI for SQL Trace. With SQL Server Profiler, you can create and manage traces, and you can analyze results of your traces. You can replay events from a saved trace step by step. To start a server-side trace, you can script a trace you created through the SQL Server Profiler UI, and then execute the script directly on your SQL Server instance.
What are some drawbacks to using SQL Server Profiler?
-- You increase the monitoring impact on your SQL Server instance compared to when using SQL Trace only, due to the overhead the SQL Server Profiler UI is producing.
-- When you use the SQL Server Profiler UI on a computer with the SQL Server instance you are monitoring, SQL Server Profiler competes for the same resources.
-- When you use SQL Server Profiler remotely, all events must travel over a network which slows down other network operations.
-- SQL Server Profiler shows events in a grid which can consume a lot of memory when you capture many events.
-- You or someone else might inadvertently close the Profiler and stop the trace when you need to capture the events for a longer time.
What are the objects used by SQL Trace and SQL Server Profiler?
-- Event - An event is an action within SQL Server. For example, an action can be a logon
failure, T-SQL batch, start of a stored procedure, and more.
-- Event Class - This is the type of an event. The event class defines the data that an event
-- Event Category - Event categories define groupings of events.
-- Data Column - A data column is an attribute of an event. If you save a trace to a table,
an event is represented by a row in the table, and attributes of events are columns in
-- Template - A template is a saved trace definition. SQL Server Profiler comes with a
couple of predefined templates that can speed up the creation of a trace.
-- Filter - Filters limit the events traced. You can put a filter on any event column. In
order to minimize the impact of monitoring on your SQL Server instance, you should
filter out any event you do not need in your current trace.
-- Trace - A trace is a collection of events, columns, filters, and data returned.
Why should you use Extended Events over SQL Trace and SQL Server Profiler?
Because Extended Events is more lightweight and SQL Trace and SQL Server Profiler are deprecated in future versions of SQL Server.
What are the actions of the optimization phase of query execution?
Generation of candidate plans and selection of the best candidate plan.
In which phase of query execution does SQL Server check whether the objects referred to by the query exist?
SQL Server resolves object names and binds them to logical operators in the binding phase.
What are pages?
SQL Server stores data on pages. A page is a physical unit on disk inside a SQL Server database. The size of a pages is fixed to 8 KB. A page belongs to a single object only such as a single table, index, or indexed view.
What are extents?
Pages are further grouped into logical groups of 8 pages called extents. An extent can be "mixed", if pages on this extent belong to multiple objects or "uniform" when all pages from this extent belong to a single object only.
What is one of the goals of optimization?
To lower disk I/O and lower the number of pages SQL Server has to read.
What is STATISTICS IO?
STATISTICS IO gives you information about the number of pages per table accessed by queries.
You can turn on STATISTICS IO by using the SET STATISTICS IO T-SQL command.
STATISTICS IO is a session level option.
SET STATISTICS IO ON;
SELECT * FROM Sales.Customers;
What is a "session level option"?
A session level option means that the option stays unchanged for the complete session, until you disconnect from SQL Server, unless you turn it off.
What type of information does STATISTICS IO return?
(1) Scan count - The number of index or table scans performed.
(2) Logical reads - The number of pages read from the data cache. When you read a whole table, this number gives you an estimate about the table size.
This gives you a rough estimate of the effectiveness of a query. However, you should not use this information without thinking and additional knowledge.
SQL Server counts every touch of a table, even if pages are already cached. However, when pages are in the cache, touching them is not expensive.
(3) Physical reads - The number of pages read from disk. This number is lower than the actual number of pages because many pages are cached.
(4) Read-ahead reads - The number of pages SQL Server reads ahead.
(5) Lob logical reads - The number of large object (LOB) pages read from the data cache.
(6) Lob physical reads - The number of large object-type pages read from disk.
(7) Lob read-ahead reads - The number of large object-type pages SQL Server reads ahead.
What are LOBs?
LOBs (or large object) are columns of type VARCHAR(MAX), NVARCHAR(MAX)m VARBINARY(MAX), TEXT, NTEXT, IMAGE, XML, etc.
What does DBCC DROPCLEANBUFFERS do?
DBCC DROPCLEANBUFFERS clears data from the cache. SQL Server caches data besides query and procedure plans. In order to show the IO statistics, it is good to have no data in the cache.
SQL Server caches data in order to speed up queries; because a piece of data is cached, the next time it is needed, SQL Server can retrieve it from memory and not from disk, and thus execute a query that needs data much faster.