一、情形描述
注:这是最近生产遇到的一个特殊场景,因备库空间紧张,需要将备库的一个pdb停了并删除数据文件来释放空间。
如有更好办法的可以一起交流,限于技术水平,难免有纰漏之处,还望多多指正。
有一套19c rac两节点生产库,数据库版本为oracle 19.15,每个节点上有一个实例,该实例对应两个pdb。因最近业务需要,需将新增的多个语种业务导入到其中一个pdb1,另一个pdb2为业务过度库,这期间插入的数据在转移到pdb1后,还会删除部分数据。
这套生产主库在同城另一个机房存在一个单节点备机,备机采用的是本地ssd盘,目前服务器上所有盘插槽已用完,无法再扩展ssd盘,备库和主库设置了三天延迟应用主库日志。
目前由于业务持续不断的往主库插入数据,并产生大量归档,导致备库asm存储空间紧张,将三天延迟改为实时同步,并保留一天归档日志,但没多久空间又产生告警。
在和业务沟通后,业务反馈可以将pdb2不进行dg同步,可在备库将其删除,只要在主库定期对pdb2下的相关业务表进行备份,备库删除pdb2,这样可以为备库腾出将近1.9t的asm存储空间,并希望在备库能在较短的时间恢复pdb2这一pdb。
为此决定在备库采用disable recovery来停pdb2,并将对应数据文件删除,恢复采用restore pluggable database,以下是在生产上的测试过程。
二、问题处理
2.1 停备库pdb删数据文件
以下操作均在备库执行。
-- 登陆备库数据库
sqlplus / as sysdba
-- 查询备库同步模式
sql> select distinct recovery_mode from v$archive_dest_status;
recovery_mode
----------------------------------
idle
managed real time apply
-- 停止备库应用日志
sql> alter database recover managed standby database cancel;
database altered.
-- 关闭备库
sql> shutdown immediate;
database closed.
database dismounted.
oracle instance shut down.
-- 启动到mount状态
sql> startup mount;
oracle instance started.
total system global area 3.2212e 10 bytes
fixed size 37471152 bytes
variable size 4026531840 bytes
database buffers 2.8119e 10 bytes
redo buffers 29634560 bytes
database mounted.
-- 查看当前备库各pdb状态
sql> show pdbs
2 pdb$seed mounted
3 pdb1 mounted
4 pdb2 mounted
-- 切换到pdb2
sql> alter session set container=pdb2;
session altered.
-- 将pdb2执行disable
sql> alter pluggable database disable recovery;
pluggable database altered.
sql> show pdbs
4 mlantrans mounted
sql> conn / as sysdba
connected.
-- 重新打开备库
sql> alter database open;
database altered.
sql> show pdbs
2 pdb$seed read only no
3 pdb1 mounted
4 pdb2 mounted
-- 启动pdb1
sql> alter session set container=pdb1;
session altered.
sql> alter database open;
database altered.
sql> conn / as sysdba
connected.
-- 备库开启实时同步
sql> alter database recover managed standby database using current logfile disconnect from session;
database altered.
sql> show pdbs
2 pdb$seed read only no
3 pdb1 read only no
4 pdb2 mounted
sql> set line 500
sql> col source_db_unique_name for a20
sql> col name for a30
sql> col value for a20
sql> col unit for a30
sql> select * from v$dataguard_stats;
0 transport lag day(2) to second(0) interval 03/06/2023 13:09:00 0
0 apply lag day(2) to second(0) interval 03/06/2023 13:09:00 0
0 apply finish time day(2) to second(3) interval 03/06/2023 13:09:00 0
0 estimated startup time 22 second 03/06/2023 13:09:00 0
sql> select con_id, name, open_mode, recovery_status from v$pdbs order by con_id, name;
select con_id, name from v$datafile order by con_id; 2 pdb$seed read only enabled
3 pdb1 read only enabled
4 pdb2 mounted disabled
sql> alter session set container=pdb2;
session altered.
sql> set lines 120
sql> set pages 9999
sql> col name for a260
-- 查询pdb2所含的数据文件
sql> select name,status from v$datafile;
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/system.2141.1102524951 sysoff
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/sysaux.2143.1102524967 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/undotbs1.2145.1102524983 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/undo_2.2146.1102524983 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/users.2147.1102525001 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/xxx_core_2005.2149.1102525009 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/t_xxxqb.2152.1102525017 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/t_xxx_idx.2169.1102525185 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/t_xxx.2171.1102525221 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/ogg.2184.1102525807 recover
dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/xxxx_tbs.2358.1102529703 recover
11 rows selected.
此时无法使用sql语句删除pdb2下的表空间,如使用sql语句删除pdb下的表空间会报如下错误:
sql> drop tablespace system including contents and datafiles;
drop tablespace system including contents and datafiles
*
error at line 1:
ora-01109: database not open
此时只能借助物理删除,使用grid用户登录到asm,然后cd 到 dg/xxxcdb/d824f08209b54437e0532310a8c08619/ 目录下,执行:
asmcmd> cd dg/xxxcdb/d824f08209b54437e0532310a8c08619 rm -rf datafile rm -rf tempfile
此时,可查看到备库asm存储剩余空间增长了将近1.9t,此时主备同步正常,并可以在主库测试创建一个表插入数据,在备库查询数据是否同步。
2.2 快速恢复备库pdb
恢复备库pdb的操作也可以在备库端进行,前提是需要在备库oracle用户下$oracle_home/network/admin下tnsnames.ora里配置了连接主库的tns连接串信息,如配置了连接串,可以通过rman restore pluggable database的方式来恢复备库pdb。
-- 备库连接主库tns执行rman 恢复
[oracle@xxx-xxx admin]$ rman target /
recovery manager: release 19.0.0.0.0 - production on mon mar 6 13:21:39 2023
version 19.14.0.0.0
米乐app官网下载 copyright (c) 1982, 2019, oracle and/or its affiliates. all rights reserved.
connected to target database: xxxcdb (dbid=xxxx)
--xxxcdb16是备库tnsnames.ora里配置的连接主库的tns串信息
rman> run{
2> restore pluggable database pdb2 from service xxxcdb16 ;
3> }
starting restore at 2023:03:0613:21:53
using target database control file instead of recovery catalog
allocated channel: ora_disk_1
channel ora_disk_1: sid=1519 device type=disk
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00014 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/system.2141.1102524951
channel ora_disk_1: restore complete, elapsed time: 00:00:25
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00015 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/sysaux.2143.1102524967
channel ora_disk_1: restore complete, elapsed time: 00:02:46
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00016 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/undotbs1.2145.1102524983
channel ora_disk_1: restore complete, elapsed time: 00:10:35
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00017 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/undo_2.2146.1102524983
channel ora_disk_1: restore complete, elapsed time: 00:12:45
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00018 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/users.2147.1102525001
channel ora_disk_1: restore complete, elapsed time: 00:00:07
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00021 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/xxx_core_2005.2149.1102525009
channel ora_disk_1: restore complete, elapsed time: 00:11:55
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00022 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/t_xxxqb.2152.1102525017
channel ora_disk_1: restore complete, elapsed time: 00:02:36
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00023 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/t_xxx_idx.2169.1102525185
channel ora_disk_1: restore complete, elapsed time: 00:03:25
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00024 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/t_xxx.2171.1102525221
channel ora_disk_1: restore complete, elapsed time: 02:15:17
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00029 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/ogg.2184.1102525807
channel ora_disk_1: restore complete, elapsed time: 00:00:07
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: using network backup set from service xxxcdb16
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00053 to dg/xxxcdb/d824f08209b54437e0532310a8c08619/datafile/xxxx_tbs.2358.1102529703
channel ora_disk_1: restore complete, elapsed time: 00:00:02
finished restore at 2023:03:0616:22:05
rman>
-- 备库恢复使用enable recovery
sql> show pdbs
con_id con_name open mode restricted
---------- ------------------------------ ---------- ----------
2 pdb$seed read only no
3 pdb1 read only no
4 pdb2 mounted
sql> alter session set container=cdb$root;
session altered.
sql> shutdown immediate;
database closed.
database dismounted.
oracle instance shut down.
sql> startup mount;
oracle instance started.
total system global area 3.2212e 10 bytes
fixed size 37471152 bytes
variable size 4026531840 bytes
database buffers 2.8119e 10 bytes
redo buffers 29634560 bytes
database mounted.
sql> alter session set container=pdb2;
session altered.
sql> alter pluggable database enable recovery;
pluggable database altered.
sql> alter session set container=cdb$root;
session altered.
sql> alter database recover managed standby database disconnect from session;
database altered.
sql> alter database recover managed standby database cancel;
database altered.
sql> show pdbs
con_id con_name open mode restricted
---------- ------------------------------ ---------- ----------
2 pdb$seed mounted
3 pdb1 mounted
4 pdb2 mounted
sql> alter database open;
database altered.
sql> show pdbs
con_id con_name open mode restricted
---------- ------------------------------ ---------- ----------
2 pdb$seed read only no
3 pdb1 mounted
4 pdb2 mounted
sql> alter session set container=pdb1;
session altered.
sql> alter database open;
database altered.
sql> show pdbs
con_id con_name open mode restricted
---------- ------------------------------ ---------- ----------
3 pdb1 read only no
sql> conn / as sysdba
connected.
sql> alter session set container=pdb2;
session altered.
sql> alter database open;
database altered.
sql> show pdbs
con_id con_name open mode restricted
---------- ------------------------------ ---------- ----------
4 pdb2 read only no
sql> conn / as sysdba
connected.
sql> alter database recover managed standby database disconnect from session;
database altered.
sql> show pdbs
con_id con_name open mode restricted
---------- ------------------------------ ---------- ----------
2 pdb$seed read only no
3 pdb1 read only no
4 pdb2 read only no
sql> alter session set container=pdb2;
session altered.
-- 视数据库大小,如果归档日志较多,需要一段时间恢复
-- 然后一段时间后查询数据文件状态,可以看到pdb2下的数据文件状态都为 online
sql> select name,status from v$datafile;
name
--------------------------------------------------------------------------------
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/system.12488.1130764925 system
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/sysaux.14822.1130764951 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/undotbs1.2929.1130765115 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/undo_2.10835.1130765751 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/users.2202.1130766517 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/mic_core_2005.10290.1130766523 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/t_micqb.14906.1130767239 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/t_pdb1_idx.10384.1130767395 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/t_pdb1.7809.1130767601 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/ogg.7057.1130775717 online
dg/pdb1dg/d824f08209b54437e0532310a8c08619/datafile/xstream_tbs.13789.1130775725 online
11 rows selected.
此时实时同步主备库,使用rman 恢复还可以开启多个channel来加快恢复的速度。
注意,在使用rman 恢复备库pdb时,如果备库tnsnames.ora里配置的tns串带有(.) 符号会导致报错,报错信息如下:
rman> run {
2> restore pluggable database pdb2 from service xxxcdb.host16;
rman-00571: ===========================================================
rman-00569: =============== error message stack follows ===============
rman-00571: ===========================================================
rman-00558: error encountered while parsing input commands
rman-01009: syntax error: found "dot": expecting one of: "allforeign, application, archivelog, asdecrypted, asencrypted, backupset, channel, check, controlfile, database, database root, datafile, device, dump, farsync, file_name_convert, force, foreign, from, frompreplugin, from service, high, pluggable, preview, primary, section, skip preplugin, skip readonly, spfile, standby, tablespace, to restore point, until restore point, until, using, validate, (, ;"
rman-01007: at line 2 column 58 file: standard input
此时可将tns连接串在tnsnames.ora里修改为xxxcdbhost这种不带(.)的才可以。
三、总结
本场景所描述的是一特例,搭建dg还是要初期为主备库规划好相应存储空间,另外对于备库pdb库的恢复,还依赖于备库保留主库的归档日志时间,通过时间太长,归档日志不存在,会导致产生gap,就无法使用文中的恢复方法了,而是要选择归档日志采用不完全恢复了。