I recently built a new SQL server for one of my client with
72 cores, It was an SQL 2014 enterprise edition and was an upgrade from Amazon Web
services to Physical infrastructure with higher processing power, Memory and
faster disk. The SQL Version was the same (SQL 2014 SP1 on AWS and Physical
Infrastructure) but surprising the physical infrastructure was slower than AWS
in some instances. Some queries were taking long time in the new server and
waits pointed to CPU Schedulers.
Thanks to our Performance Auditors(Mike Walsh from Linchpin
People) who found out the following message from the SQL error log,
“SQL Server detected 2 sockets with 18 cores per socket and
36 logical processors per socket, 72 total logical processors; using 40 logical
processors based on SQL Server licensing. This is an informational message; no
user action is required.”
We found the client was licenced SQL for only 40 cores even
though the hardware had 72 cores SQL can use only the licenced 40 cores. This was
still an non-issue as this was still more processing power to AWS and that’s when
Mike pointed me to Glenn Berry's Blog http://www.sqlskills.com/blogs/glenn/balancing-your-available-sql-server-core-licenses-evenly-across-numa-nodes/
As mention in the above blog post, I ran
-- SQL Server NUMA
node information
SELECT node_id,
node_state_desc, memory_node_id, processor_group,
online_scheduler_count
FROM sys.dm_os_nodes WITH (NOLOCK)
WHERE node_state_desc <> N'ONLINE DAC'
OPTION (RECOMPILE);
node_id
|
node_state_desc
|
memory_node_id
|
processor_group
|
online_scheduler_count
|
0
|
ONLINE
|
0
|
0
|
36
|
1
|
ONLINE
|
1
|
1
|
4
|
As you can see in the above result, 36 Schedulers are in use on the first NUMA node with the first 36
logical processors and then 4 are in use on the second NUMA node. This is not an
optimal configuration If we have had SQL licensed for all the 72 cores we would
have had equal balance with the schedulers on the NUMA nodes. This might be a performance bottleneck.
I followed Glenn's blog again and ran
-- SQL Server NUMA
node and cpu_id information
SELECT parent_node_id, scheduler_id,
cpu_id
FROM sys.dm_os_schedulers WITH (NOLOCK)
WHERE [status] =
N'VISIBLE ONLINE';
From the
above query result I found the cpu id’s in use were from 0 to 19 and 64 to 83
after which I ran the below ALTER query as suggested in the blog with right CPU id.
ALTER SERVER CONFIGURATION
SET PROCESS AFFINITY CPU = 0 TO 19, 64 TO 83;
Now I ran first query again to find NUMA node information,
-- SQL Server NUMA
node information
SELECT node_id,
node_state_desc, memory_node_id, processor_group,
online_scheduler_count
FROM sys.dm_os_nodes WITH (NOLOCK)
WHERE node_state_desc <> N'ONLINE DAC'
OPTION (RECOMPILE);
node_id
|
node_state_desc
|
memory_node_id
|
processor_group
|
online_scheduler_count
|
active_worker_count
|
0
|
ONLINE
|
0
|
0
|
20
|
41
|
1
|
ONLINE
|
1
|
1
|
20
|
56
|
As you can see now we reached a balance now with CPU
schedulers and the NUMA nodes. This change resolved our slowness issue and the
server is now much faster than our lower hardware system.