在oracle的优化中,可能大家有接触过sql tuning advisor(sql调优顾问,sta),类似的mogdb/opengauss的索引推荐(index-advisor)功能也可以对你的查询进行分析,并提出合理的创建索引的建议。oracle的sta输出是以一种意见或者建议的形式,以及对每一项建议和期望效益的理由。该建议涉及对象的统计收集,新索引的创建,sql语句的重组,或sql概要的创建。你可以选择该建议来完成sql语句的调优。mogdb/opengauss的索引推荐(index-advisor)在这也是比较类似,但可能结果不如oracle的sta的优化报告详尽。
如下为我对mogdb/opengauss的索引推荐(index-advisor)功能的使用测试,包括单条sql查询索引推荐、workload级别索引推荐(针对一批sql语句的索引推荐)等。
一、测试数据导入
postgres=# create database ysla;
create database
postgres=# \c ysla
non-ssl connection (ssl connection is recommended when requiring high-security)
you are now connected to database "ysla" as user "omm".
ysla=# create table tab_ysl_1 (col1 int, col2 int, col3 text);
create table
ysla=# insert into tab_ysl_1 values(generate_series(1, 3000),generate_series(1, 3000),repeat( chr(int4(random()*26) 65),4));
insert 0 3000
ysla=# analyze tab_ysl_1;
analyze
ysla=# create table tab_ysl_2 (col1 int, col2 int);
create table
ysla=# insert into tab_ysl_2 values(generate_series(1, 1000),generate_series(1, 1000));
insert 0 1000
ysla=# analyze tab_ysl_2;
analyze
二、单条sql查询索引推荐
如下面所示,用gs_index_advise函数即可使用索引推荐,结果中包含表和可以创建索引的列。
1.测试where
ysla=# select * from gs_index_advise('select * from tab_ysl_1 where col1 = 10');
table | column
----------- --------
tab_ysl_1 | (col1)
(1 row)
2.测试join
ysla=# select * from gs_index_advise('select * from tab_ysl_1 join tab_ysl_2 on tab_ysl_1.col1 = tab_ysl_2.col1');
table | column
----------- --------
tab_ysl_1 | (col1)
tab_ysl_2 |
(2 rows)
3.测试多表
ysla=# select * from gs_index_advise('select count(*), tab_ysl_2.col1 from tab_ysl_1 join tab_ysl_2 on tab_ysl_1.col2 = tab_ysl_2.col2 where tab_ysl_2.col2 > 2 group by tab_ysl_2.col1 order by tab_ysl_2.col1');
table | column
----------- --------
tab_ysl_1 | (col2)
tab_ysl_2 | (col1)
(2 rows)
4.测试order by
ysla=# select * from gs_index_advise('select *, col2 from tab_ysl_1 order by 1, 3');
table | column
----------- --------
tab_ysl_1 |
(1 row)
ysla=# select * from gs_index_advise('select * from tab_ysl_1 where col1 > 10 order by 1,col2');
table | column
----------- --------
tab_ysl_1 |
(1 row)
5.测试过长字符串
ysla=# select * from gs_index_advise('select * from tab_ysl_1 where col3 in (''aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'',''bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb'',''ccccccccccccccccccccccccccccccccccccccc'',''ddddddddddddddddddddddddddddddddddddddd'',''ffffffffffffffffffffffffffffffffffffffff'',''ggggggggggggggggggggggggggggggggggggggggggggggggggg'',''ttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt'',''vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv'',''ggmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm'')');
error: index_advisor.cpp : 983 : the parameter destmax is too small or parameter count is larger than macro parameter securec_string_max_len. the second case only occures in functions strncat_s/strncpy_s.
三、workload级别索引推荐
这种方式可以针对多条sql,可以将待优化的sql写到文件里,通过脚本获得推荐索引。
脚本目录在安装目录的bin/dbmind/index_advisor下边,我的目录为
/opt/gaussdb/app/bin/dbmind/index_advisor/index_advisor_workload.py
将待优化的sql放到文件里
[omm@node1 index_advisor]$ cat 1.sql
select * from tab_ysl_1 where col1 = 10;
select count(*), tab_ysl_2.col1 from tab_ysl_1 join tab_ysl_2 on tab_ysl_1.col2 = tab_ysl_2.col2 where tab_ysl_2.col2 > 2 group by tab_ysl_2.col1 order by tab_ysl_2.col1;
select * from tab_ysl_1 join tab_ysl_2 on tab_ysl_1.col1 = tab_ysl_2.col1;
使用如下方式调用脚本,可以批量获取推荐索引,26000为我的数据库端口,ysla为我的数据库名,1.sql为我待优化的sql存放的文件
[omm@node1 index_advisor]$ pwd
/opt/gaussdb/app/bin/dbmind/index_advisor
[omm@node1 index_advisor]$ python3 ./index_advisor_workload.py 26000 ysla 1.sql
###### ############################################################## generate candidate indexes
table: tab_ysl_1 columns: col1
table: tab_ysl_1 columns: col2
table: tab_ysl_2 columns: col1
###### ############################################################### determine optimal indexes
create index ind0 on tab_ysl_1(col1);
四、索引效率查看
这里验证下索引推荐给我们推荐的索引究竟是否起到优化作用。
[omm@node1 index_advisor]$ cat 1.sql
select * from tab_ysl_1 where col1 = 10;
[omm@node1 index_advisor]$ time gsql -d ysla -p 26000 -f 1.sql
col1 | col2 | col3
------ ------ ------
10 | 10 | ssss
(1 row)
total time: 35 ms
real 0m0.050s
user 0m0.007s
sys 0m0.002s
可以看到上边未优化的sql执行时间为0m0.050s
[omm@node1 index_advisor]$ python3 ./index_advisor_workload.py 26000 ysla 1.sql
###### ############################################################## generate candidate indexes
table: tab_ysl_1 columns: col1
###### ############################################################### determine optimal indexes
create index ind0 on tab_ysl_1(col1);
通过index-advisor获取推荐索引。并创建索引
ysla=# create index ind0 on tab_ysl_1(col1);
create index
可以看到查询的时间明显减少。
[omm@node1 index_advisor]$ time gsql -d ysla -p 26000 -f 1.sql
col1 | col2 | col3
------ ------ ------
10 | 10 | ssss
(1 row)
total time: 0 ms
real 0m0.016s
user 0m0.009s
sys 0m0.000s
一般在加索引时,会堵塞dml(不过pg支持并发加索引,不堵塞dml) 。只有索引真正能起到优化作用,我们建立索引才是有意义的。虚拟索引是一个很有用的东西,没有副作用,只是虚拟的索引,建立虚拟索引后,可以通过explain来查看加索引后的成本估算,判断是否加索引cost会降低。
可以用虚拟索引检验索引的效果,根据效果可选择是否创建真实的索引优化查询。
#测试建立虚拟索引(hypopg_create_index)
ysla=# select * from hypopg_create_index('create index on tab_ysl_1(col1)');
indexrelid | indexname
------------ -----------------------------
41453 | <41453>btree_tab_ysl_1_col1
(1 row)
#显示所有创建的虚拟索引信息(enable_hypo_index)
ysla=# select * from hypopg_display_index();
indexname | indexrelid | table | column
----------------------------- ------------ ----------- --------
<41454>btree_tab_ysl_1_col1 | 41454 | tab_ysl_1 | (col1)
(1 row)
ysla=# set enable_hypo_index = on;explain select * from tab_ysl_1 where col1 = 100;
set
query plan
----------------------------------------------------------------------------------------------
index scan using <41453>btree_tab_ysl_1_col1 on tab_ysl_1 (cost=0.00..8.27 rows=1 width=13)
index cond: (col1 = 100)
(2 rows)
#测试删除指定虚拟索引(hypopg_display_index)
使用函数hypopg_drop_index删除指定oid的虚拟索引
ysla=# select * from hypopg_drop_index(41454);
hypopg_drop_index
-------------------
t
(1 row)
#使用函数hypopg_reset_index一次性清除所有创建的虚拟索引
ysla=# select * from hypopg_reset_index();
hypopg_reset_index
--------------------