SQL performance tuning is a huge topic, and you need to know many detailed info of behavior with different version of OS and SQL. In these posts I will share my general approach of such work.
Before Tuning
The first thing you should keep in mind is end users don’t care SQL, they care the overall application response time. You have to know your application. Sometime we need to tune how application call database. One example is you should cache your blog post from dynamical page(query from database every time) to statics page.
The second thing is database performance is related to application behavior(queries). And application behavior could change from time to time. So tuning is not one time action. You should create your performance baseline and keep on monitoring your environment. The usage of performance baseline is to tell you how SQL server run when the application response time is acceptable and to shorten your time to find out what’s abnormal. Unlike other product(AD, Exchange), SQL and IIS are platform to run your application and each SQL has its own performance baseline. You can’t simply relay on one general baseline, you should create for each of the critical application. You could tune SQL without baseline but it will take you longer time to find out what’s abnormal.
The baseline should cover
1. OS level performance counter like CPU, memory, disk and network etc.
2. SQL instance level performance counter like buffer pool, access methods etc.
3. SQL instance level DMV info like top n database, top n wait stats and top n query.
4. (optional) Application level key workload count.
Load test before RTM is very important, you could build the initial baseline here. Although it’s hard to simulate all the end user query pattern, it’s useful to find out what’s different between load test and real production loading later on. We will discuss performance counter meaning in 3rd post.
The third thing is set your tuning target and prioritize the findings. My idea is no need to change code immediately if the overall application response time is acceptable. You could put the finding into potential issue list. But you need to keep on monitoring the performance data and plan your human resource to fix these potential issues before they become real problem. One example is you find out full scan number is high but database size is small and physical memory is sufficient. Most of the read is logical read than physical read. Then there is no performance issue at this moment. But you should know that logical read will increase when table size growing, and logical read will become physical read once database size being larger than memory buffer pool size.
Tuning Workflow
Typical tuning workflow is like:
1. Data collection. There are multiple information could be captured by different tools like msinfo32, eventvwr, SQL error log, perfmon and profiler. I will share the tips in 2nd post.
2. Perfmon analysis – OS. We will find out if any hardware resource is bottleneck here.
3. Perfmon analysis – SQL. We will find out how SQL Server consume hardware resource here.
4. Top N databases by bottleneck resource. Using profiler output or DMV to find out how databases consume resource here. Move them to different instance or tune their Top N queries.
5. Top N queries by bottleneck resource. Using profiler output or DMV to find out top n queries, then tune the single execution cost or reduce the execution number.
6. Wait stats. SQL Query Duration = CPU Time + Wait Time. Both duration and CPU time could be found from profiler output and DMV, then you can calc the wait time. If wait time is high, no doubt that you should dig into wait stat.
7. Check top n queries’ execution plans, tune them case by case. There are 2 performance tuning scenarios, 1st is single query slow, 2nd is concurrency related. This step is for single query slow scenario. The latter is more difficult to tune. I plan to cover wait stat and concurrency in 4th post of this tuning serial.