欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

SQL Server 列存储索引性能总结(10)——行组的大小影响

程序员文章站 2022-07-15 10:37:28
...

接上文SQL Server 列存储索引性能总结(9)——重建和重组聚集列存储索引所需的内存我们知道,为了更好的性能,行组(row group)的大小最好是1045678行,或者小于10万行。如果没有办法达到最佳大小,在读取大量数据的时候,就很难用到列存储的优点。

前言

   在列存储索引中,最重要的概念就是行组和片段,它们分别代表了数据存储在行存储和列存储中。在片段中,不管你存了1行还是100万行数,读取的时候都是每个页或者区来读取,所以如果行数太少,是挺浪费的。
   如果对未排序的列使用筛选条件,那可能会调用很多额外的片段,因为片段最好还好是已排序。未排序的数据可能会分不到多个片段中。我们知道页越多,最终的性能就越差。对于小型表(百万/千万级别),当然很难只读一个片段,也不可能只读一个页/区,不过这种规模并不是非常影响。但是如果是数十亿行的表,不必要的片段将会成为性能杀手。

环境搭建

   接下来继续用ContosoRetailDW来做演示,并把兼容级别设置到150也就是使用SQL Server 2019的特性。:


USE [master]
GO
ALTER DATABASE [ContosoRetailDW] SET COMPATIBILITY_LEVEL = 150
GO

-- 创建聚集列存储索引:
create clustered columnstore Index CCI  on dbo.FactOnlineSales;


select * into dbo.FactOnlineSales_SmallGroups_Test from dbo.FactOnlineSales;

   接下来的技巧要注意了,我把SQL Server的Max Server Memory降低,比如300MB(只能在你自己的实验环境下测试,毕竟300MB内存在任何企业环境下都会导致系统缓慢甚至无法响应),用来强制只用少量的行创建行组:

EXEC sys.sp_configure N'show advanced options', N'1'  RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'max server memory (MB)', N'300'
GO
RECONFIGURE WITH OVERRIDE
GO
EXEC sys.sp_configure N'show advanced options', N'0'  RECONFIGURE WITH OVERRIDE
GO

   接下来创建聚集列存储索引到测试表上,由于内存原因,需要跑一段时间,大概3分钟左右:

create clustered columnstore index CCI on dbo.FactOnlineSales_SmallGroups_test;

   然后对比一下空间大小:

exec sp_spaceused '[dbo].[FactOnlineSales]';
exec sp_spaceused '[dbo].[FactOnlineSales_SmallGroups_test]';

SQL Server 列存储索引性能总结(10)——行组的大小影响

   两者有所差距,但是大小不是非常明显,源表占了163MB的空间,测试表有189MB。但是一旦行组的数量非常多的时候,这个差异将会非常明显。我们来细化一下两个表的行组信息:

SELECT object_name(i.object_id) as TableName, count(*) as RowGroupsCount
	FROM sys.indexes AS i
	INNEr JOIN sys.column_store_row_groups AS rg with(nolock)
		ON i.object_id = rg.object_id
	AND i.index_id = rg.index_id 
	WHERE object_name(i.object_id) in ( 'FactOnlineSales','FactOnlineSales_SmallGroups_test')
	group by object_name(i.object_id)
	ORDER BY object_name(i.object_id);

SQL Server 列存储索引性能总结(10)——行组的大小影响
   可以看出行组的数量差异很大。测试表有79个行组但是源表只有15个,差了快6倍。接下来看看查询(打开实际执行计划)的效果:

dbcc freeproccache;
dbcc dropcleanbuffers;
set statistics io on
set statistics time on

select prod.ProductName, sum(sales.SalesAmount)
	from dbo.FactOnlineSales sales
		inner join dbo.DimProduct prod
			on sales.ProductKey = prod.ProductKey
		inner join dbo.DimCurrency cur
			on sales.CurrencyKey = cur.CurrencyKey
		inner join dbo.DimPromotion prom
			on sales.PromotionKey = prom.PromotionKey
	where cur.CurrencyName = 'USD' and prom.EndDate >= '2004-01-01' 
	group by prod.ProductName;
--清空缓存以免受影响
dbcc freeproccache;
dbcc dropcleanbuffers;

select prod.ProductName, sum(sales.SalesAmount)
	from dbo.FactOnlineSales_SmallGroups_test sales
		inner join dbo.DimProduct prod
			on sales.ProductKey = prod.ProductKey
		inner join dbo.DimCurrency cur
			on sales.CurrencyKey = cur.CurrencyKey
		inner join dbo.DimPromotion prom
			on sales.PromotionKey = prom.PromotionKey
	where cur.CurrencyName = 'USD' and prom.EndDate >= '2004-01-01' 
	group by prod.ProductName;

   执行计划看上去没有明显差异,均占据开销50%。
   从这些信息来看,第一个执行比第二个要慢,从CPU Time(CPU时间源表小于测试表)和Escaped Time(源表大于测试表)可以看出。如果查看Statistics IO的结果,可以看到总逻辑读还是有点差异的(源表:19,640,测试表:22,808)。另外从执行时间来看:源表15个行组1371 ms,测试表29个行组 1253 ms,没有非常大的差异。



(2516 行受影响)
Table 'FactOnlineSales'. Scan count 4, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 6420, lob physical reads 33, lob page server reads 0, lob read-ahead reads 13220, lob page server read-ahead reads 0.
Table 'FactOnlineSales'. Segment reads 15, segment skipped 0.
Table 'DimProduct'. Scan count 5, logical reads 370, physical reads 1, page server reads 0, read-ahead reads 123, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimPromotion'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimCurrency'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 行受影响)

 SQL Server Execution Times:
   CPU time = 756 ms,  elapsed time = 618 ms.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.


(2516 行受影响)
Table 'FactOnlineSales_SmallGroups_Test'. Scan count 4, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 8609, lob physical reads 158, lob page server reads 0, lob read-ahead reads 14199, lob page server read-ahead reads 0.
Table 'FactOnlineSales_SmallGroups_Test'. Segment reads 75, segment skipped 0.
Table 'DimProduct'. Scan count 5, logical reads 370, physical reads 1, page server reads 0, read-ahead reads 126, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimCurrency'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'DimPromotion'. Scan count 5, logical reads 4, physical reads 1, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 行受影响)

 SQL Server Execution Times:
   CPU time = 931 ms,  elapsed time = 322 ms.

   接下来用下面的命令再做进一步分析:

SELECT i.name, object_name(p.object_id) tablename, p.index_id, i.type_desc 
   	,sum(p.rows)/count(seg.segment_id) as 'rows'
	,sum(seg.on_disk_size) as 'size in Bytes'
	,cast( sum(seg.on_disk_size) / 1024. / 1024. / 1024 as decimal(8,3)) as 'size in GB'
	,count(distinct seg.segment_id) as 'Segments'
	,count(distinct p.partition_id) as 'Partitions'
	FROM sys.column_store_segments AS seg 
		INNER JOIN sys.partitions AS p 
			ON seg.hobt_id = p.hobt_id 
		INNER JOIN sys.indexes AS i 
			ON p.object_id = i.object_id
	WHERE i.type in (5, 6)
	GROUP BY i.name, p.object_id, p.index_id, i.type_desc;

   结果如下,片段的多少并不是非常影响整体体积,毕竟是使用了高效的列式压缩。
SQL Server 列存储索引性能总结(10)——行组的大小影响
   还有字典的情况:

select 
	OBJECT_NAME(t.object_id) as 'Table Name',
	sum(dict.on_disk_size)/1024./1024 as DictionarySizeMB
	from sys.column_store_dictionaries dict
	inner join sys.partitions as p 
		ON dict.partition_id = p.partition_id
	inner join sys.tables t
		ON t.object_id = p.object_id
	inner join sys.indexes i
		ON i.object_id = t.object_id
	where i.type in (5,6) -- Clustered 和 Nonclustered Columnstore
	group by t.object_id

   在字典层面,测试表占了更大的字典大小。另外如果检查每个列的字典数量和类型,可以看到下面结果:
SQL Server 列存储索引性能总结(10)——行组的大小影响

select t.name as 'Table Name'
	,dict.column_id
	,col.name
	,tp.name
	,case dict.dictionary_id
		when 0 then 'Global Dictionary'
		else 'Local Dictionary'
	end as 'Dictionary Type'
	,count(dict.type) as 'Count'
	,sum(dict.on_disk_size) as 'Size in Bytes'
	,cast(sum(dict.on_disk_size) / 1024.0 / 1024 as Decimal(16,3)) as 'Size in MBytes'
	from sys.column_store_dictionaries dict
	inner join sys.partitions as p 
		ON dict.partition_id = p.partition_id
	inner join sys.tables t
		ON t.object_id = p.object_id
	inner join sys.all_columns col
		on col.column_id = dict.column_id and col.object_id = t.object_id
	inner join sys.types tp 
		ON col.system_type_id = tp.system_type_id AND col.user_type_id = tp.user_type_id   
	where t.[is_ms_shipped] = 0 
		and col.name in ('SalesAmount','ProductKey','CurrencyKey','PromotionKey')
	group by t.name,
			 case dict.dictionary_id
				when 0 then 'Global Dictionary'
				else 'Local Dictionary'
			 end, 
			 col.name,
			 tp.name,
			 dict.column_id
	order by dict.column_id, t.name;

SQL Server 列存储索引性能总结(10)——行组的大小影响

   对比Size的话,实际上两者差距还是挺大的。特别是Local Dictionary,接近10倍的差距。

总结

   从上面的结果看出,小型行组跟大型行组在某些指标上各有优势,所以我们不能一概而论,还是那句话:具体问题具体分析。
   对于这种行组数量差异,只要对聚集列存储索引rebuild一下即可。可以看到其实微软还是希望你使用大型行组的。毕竟rebuild是经常需要用到的维护操作,一旦rebuild成功,行组就会恢复差不多的水平。
SQL Server 列存储索引性能总结(10)——行组的大小影响
   最后记得把Max Server Memory调回去。
   下一文: