如何在含有recover 状态的数据文件环境中,做异机恢复?
背景:
我们在一些恢复测试案例中,会经常遇到一些奇怪的问题,其中有的是源端数据文件不规范而导致恢复过程出错,比较常见的错误有:
- 数据文件名称重复(如:/oradata1/user01.dbf 和 /oradata2/user01.dbf),这种情况下,我们在做恢复测试时候,如果往同一块盘中恢复数据文件,就会报错,需要我们重新rename 重名的datafile。
- 数据文件状态为recover,非online,通常发生在客户使用rac环境,增加数据文件时候,错误的将datafile加到了本地磁盘,而发现错误后,又执行了offline datafile操作。
本文,主要测试,在数据库中,含有recover 状态的数据文件,如果做恢复测试?
[toc]
sql> create tablespace test datafile '/u01/app/oracle/oradata/test/test01.dbf' size 10m autoextend on extent management local;
tablespace created.
sql> alter tablespace test add datafile '/u01/app/oracle/oradata/test/test02.dbf' size 10m;
tablespace altered.
sql> create table sxc.test1 tablespace test as select * from dba_objects;
table created.
sql> select count(*) from sxc.test1 ;
count(*)
----------
78918
3.1 执行offline datafile
sql> set line222
sql> col name for a80
sql> select file#,name,status from v$datafile;
file# name status
---------- -------------------------------------------------------------------------------- --------------
1 /u01/app/oracle/oradata/test/system01.dbf system
2 /u01/app/oracle/oradata/test/sysaux01.dbf online
3 /u01/app/oracle/oradata/test/undotbs01.dbf online
4 /u01/app/oracle/oradata/test/users01.dbf online
5 /u01/app/oracle/oradata/test/example01.dbf online
6 /u01/app/oracle/oradata/test/test01.dbf online
7 /u01/app/oracle/oradata/test/test02.dbf online
sql> alter database datafile 7 offline;
database altered.
sql> select file#,name,status from v$datafile;
file# name status
---------- -------------------------------------------------------------------------------- --------------
1 /u01/app/oracle/oradata/test/system01.dbf system
2 /u01/app/oracle/oradata/test/sysaux01.dbf online
3 /u01/app/oracle/oradata/test/undotbs01.dbf online
4 /u01/app/oracle/oradata/test/users01.dbf online
5 /u01/app/oracle/oradata/test/example01.dbf online
6 /u01/app/oracle/oradata/test/test01.dbf online
7 /u01/app/oracle/oradata/test/test02.dbf recover
7 rows selected.
此时,我们可以看到datafile 7 的状态已经变为recover。
3.2 再次查询,报错
sql> select count(*) from sxc.test1 ;
select count(*) from sxc.test1
*
error at line 1:
ora-00376: file 7 cannot be read at this time
ora-01110: data file 7: '/u01/app/oracle/oradata/test/test02.dbf'
3.3 继续创建测试表
sql> create table sxc.test2 tablespace test as select * from dba_objects;
table created.
sql> select count(*) from sxc.test2 ;
count(*)
----------
78919
3.4 手动删除归档日志
sql> alter system switch logfile ;
system altered.
sql> /
system altered.
sql> alter system checkpoint;
system altered.
-- 手动删除归档日志
[oracle@ora11204 ~]$ rm -rf 1_*
3.5 继续插入数据,切换日志
sql> insert into sxc.test2 select * from dba_objects;
78919 rows created.
sql> select count(*) from sxc.test2 ;
count(*)
----------
157838
sql> alter system switch logfile ;
system altered.
sql> /
system altered.
sql> alter system checkpoint;
system altered.
结论:虽然datafile 7 的状态变成了recover ,但是受影响的只是原来的表test1,而新表test2不受影响,可以继续写入。 其实也就是说,如果offline的datafile中不含有对象,那么对数据的使用,没有影响。
3.6 尝试recover datafile 7
sql> recover datafile 7;
ora-00279: change 4592370 generated at 07/09/2022 12:38:37 needed for thread 1
ora-00289: suggestion : /home/oracle/1_37_1104664055.dbf
ora-00280: change 4592370 for thread 1 is in sequence #37
specify log: {=suggested | filename | auto | cancel}
auto
ora-00308: cannot open archived log '/home/oracle/1_37_1104664055.dbf'
ora-27037: unable to obtain file status
linux-x86_64 error: 2: no such file or directory
additional information: 3
ora-00308: cannot open archived log '/home/oracle/1_37_1104664055.dbf'
ora-27037: unable to obtain file status
linux-x86_64 error: 2: no such file or directory
additional information: 3
结论:这个时候,如果我们想去recover datafile 7,恢复正常的online状态,已经不行了,因为归档日志都已经被手动删除了。
所以正确的做法应该是: 在3.1手动执行offline datafile之后,立即做recover datafile,这样不管到什么时候,都可以online datafile,即使归档日志已经被删除。
此时我们的数据库中就包含了一个recover 状态的数据文件,那么我们接下来看,如何备份恢复?
4.1 rman备份
-- 备份前,先删掉expired archivelog
[oracle@ora11204 backup]$ rman target /
recovery manager: release 11.2.0.4.0 - production on sat jul 9 12:47:26 2022
米乐app官网下载 copyright (c) 1982, 2011, oracle and/or its affiliates. all rights reserved.
connected to target database: lhr11g (dbid=2007947551)
rman> crosscheck archivelog all;
rman> delete expired archivelog all;
do you really want to delete the above objects (enter yes or no)? yes
deleted archived log
archived log file name=/home/oracle/1_18_1104664055.dbf recid=30 stamp=1107010872
deleted archived log
archived log file name=/home/oracle/1_19_1104664055.dbf recid=31 stamp=1107084082
deleted archived log
archived log file name=/home/oracle/1_20_1104664055.dbf recid=32 stamp=1107084317
deleted archived log
archived log file name=/home/oracle/1_21_1104664055.dbf recid=33 stamp=1107248395
deleted archived log
archived log file name=/home/oracle/1_23_1104664055.dbf recid=35 stamp=1107248458
deleted archived log
archived log file name=/home/oracle/1_24_1104664055.dbf recid=36 stamp=1107248477
deleted archived log
archived log file name=/home/oracle/1_25_1104664055.dbf recid=37 stamp=1107248494
deleted archived log
archived log file name=/home/oracle/1_26_1104664055.dbf recid=38 stamp=1107248708
deleted archived log
archived log file name=/home/oracle/1_27_1104664055.dbf recid=39 stamp=1107249210
deleted archived log
archived log file name=/home/oracle/1_28_1104664055.dbf recid=40 stamp=1107249384
deleted archived log
archived log file name=/home/oracle/1_29_1104664055.dbf recid=41 stamp=1107594979
deleted archived log
archived log file name=/home/oracle/1_30_1104664055.dbf recid=42 stamp=1107965313
deleted archived log
archived log file name=/home/oracle/1_31_1104664055.dbf recid=43 stamp=1109430200
deleted archived log
archived log file name=/home/oracle/1_32_1104664055.dbf recid=44 stamp=1109400986
deleted archived log
archived log file name=/home/oracle/1_33_1104664055.dbf recid=45 stamp=1109514747
deleted archived log
archived log file name=/home/oracle/1_34_1104664055.dbf recid=46 stamp=1109515327
deleted archived log
archived log file name=/home/oracle/1_35_1104664055.dbf recid=47 stamp=1109593810
deleted archived log
archived log file name=/home/oracle/1_36_1104664055.dbf recid=48 stamp=1109593862
deleted archived log
archived log file name=/home/oracle/1_37_1104664055.dbf recid=49 stamp=1109594665
deleted archived log
archived log file name=/home/oracle/1_38_1104664055.dbf recid=50 stamp=1109594666
deleted 20 expired objects
-- 正式备份
rman> run{
2> allocate channel c1 device type disk;
3> backup database filesperset 3 format '/home/oracle/backup/db_%d_%t_%u';
4> backup archivelog all format '/home/oracle/backup/arc_%t_%s';
5> backup current controlfile format '/home/oracle/backup/cntrl_%s_%p_%s';
6> release channel c1;
7> }
released channel: ora_disk_1
allocated channel: c1
channel c1: sid=33 device type=disk
starting backup at 2022-07-09 12:48:34
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/app/oracle/oradata/test/system01.dbf
channel c1: starting piece 1 at 2022-07-09 12:48:34
channel c1: finished piece 1 at 2022-07-09 12:48:41
piece handle=/home/oracle/backup/db_lhr11g_20220709_4k1264p2_1_1 tag=tag20220709t124834 comment=none
channel c1: backup set complete, elapsed time: 00:00:07
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00003 name=/u01/app/oracle/oradata/test/undotbs01.dbf
input datafile file number=00005 name=/u01/app/oracle/oradata/test/example01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/test/test02.dbf
channel c1: starting piece 1 at 2022-07-09 12:48:41
channel c1: finished piece 1 at 2022-07-09 12:48:44
piece handle=/home/oracle/backup/db_lhr11g_20220709_4l1264p9_1_1 tag=tag20220709t124834 comment=none
channel c1: backup set complete, elapsed time: 00:00:03
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00002 name=/u01/app/oracle/oradata/test/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/test/users01.dbf
input datafile file number=00006 name=/u01/app/oracle/oradata/test/test01.dbf
channel c1: starting piece 1 at 2022-07-09 12:48:44
channel c1: finished piece 1 at 2022-07-09 12:48:47
piece handle=/home/oracle/backup/db_lhr11g_20220709_4m1264pc_1_1 tag=tag20220709t124834 comment=none
channel c1: backup set complete, elapsed time: 00:00:03
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
including current control file in backup set
including current spfile in backup set
channel c1: starting piece 1 at 2022-07-09 12:48:48
channel c1: finished piece 1 at 2022-07-09 12:48:49
piece handle=/home/oracle/backup/db_lhr11g_20220709_4n1264pf_1_1 tag=tag20220709t124834 comment=none
channel c1: backup set complete, elapsed time: 00:00:01
finished backup at 2022-07-09 12:48:49
starting backup at 2022-07-09 12:48:50
current log archived
channel c1: starting archived log backup set
channel c1: specifying archived log(s) in backup set
input archived log thread=1 sequence=39 recid=51 stamp=1109594808
input archived log thread=1 sequence=40 recid=52 stamp=1109594809
input archived log thread=1 sequence=41 recid=53 stamp=1109594930
channel c1: starting piece 1 at 2022-07-09 12:48:50
channel c1: finished piece 1 at 2022-07-09 12:48:51
piece handle=/home/oracle/backup/arc_1109594930_152 tag=tag20220709t124850 comment=none
channel c1: backup set complete, elapsed time: 00:00:01
finished backup at 2022-07-09 12:48:51
starting backup at 2022-07-09 12:48:51
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
including current control file in backup set
channel c1: starting piece 1 at 2022-07-09 12:48:52
channel c1: finished piece 1 at 2022-07-09 12:48:53
piece handle=/home/oracle/backup/cntrl_153_1_153 tag=tag20220709t124851 comment=none
channel c1: backup set complete, elapsed time: 00:00:01
finished backup at 2022-07-09 12:48:53
released channel: c1
4.2. 恢复db
利用备份做异机恢复(拷贝备份、pfile文件、密码文件等过程省略…)
4.2.1 启动数据库至nomount状态
sql> startup nomount oracle instance started. total system global area 1068937216 bytes fixed size 2260088 bytes variable size 331350920 bytes database buffers 729808896 bytes redo buffers 5517312 bytes
4.2.2 恢复控制文件
[oracle@ora11204 backup]$ rman target /
recovery manager: release 11.2.0.4.0 - production on sat jul 9 12:51:00 2022
米乐app官网下载 copyright (c) 1982, 2011, oracle and/or its affiliates. all rights reserved.
connected to target database: lhr11g (not mounted)
rman> restore controlfile from '/home/oracle/backup/cntrl_153_1_153';
starting restore at 2022-07-09 12:51:01
using target database control file instead of recovery catalog
allocated channel: ora_disk_1
channel ora_disk_1: sid=19 device type=disk
channel ora_disk_1: restoring control file
channel ora_disk_1: restore complete, elapsed time: 00:00:01
output file name=/u01/app/oracle/oradata/test/control01.ctl
finished restore at 2022-07-09 12:51:03
4.2.3 mount数据库
rman> alter database mount;
database mounted
released channel: ora_disk_1
4.2.4 restore database
rman> restore database;
starting restore at 2022-07-09 12:51:23
starting implicit crosscheck backup at 2022-07-09 12:51:23
allocated channel: ora_disk_1
channel ora_disk_1: sid=19 device type=disk
crosschecked 22 objects
finished implicit crosscheck backup at 2022-07-09 12:51:24
starting implicit crosscheck copy at 2022-07-09 12:51:24
using channel ora_disk_1
finished implicit crosscheck copy at 2022-07-09 12:51:24
searching for all files in the recovery area
cataloging files...
no files cataloged
using channel ora_disk_1
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00001 to /u01/app/oracle/oradata/test/system01.dbf
channel ora_disk_1: reading from backup piece /home/oracle/backup/db_lhr11g_20220709_4k1264p2_1_1
channel ora_disk_1: piece handle=/home/oracle/backup/db_lhr11g_20220709_4k1264p2_1_1 tag=tag20220709t124834
channel ora_disk_1: restored backup piece 1
channel ora_disk_1: restore complete, elapsed time: 00:00:03
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00003 to /u01/app/oracle/oradata/test/undotbs01.dbf
channel ora_disk_1: restoring datafile 00005 to /u01/app/oracle/oradata/test/example01.dbf
channel ora_disk_1: restoring datafile 00007 to /u01/app/oracle/oradata/test/test02.dbf
channel ora_disk_1: reading from backup piece /home/oracle/backup/db_lhr11g_20220709_4l1264p9_1_1
channel ora_disk_1: piece handle=/home/oracle/backup/db_lhr11g_20220709_4l1264p9_1_1 tag=tag20220709t124834
channel ora_disk_1: restored backup piece 1
channel ora_disk_1: restore complete, elapsed time: 00:00:03
channel ora_disk_1: starting datafile backup set restore
channel ora_disk_1: specifying datafile(s) to restore from backup set
channel ora_disk_1: restoring datafile 00002 to /u01/app/oracle/oradata/test/sysaux01.dbf
channel ora_disk_1: restoring datafile 00004 to /u01/app/oracle/oradata/test/users01.dbf
channel ora_disk_1: restoring datafile 00006 to /u01/app/oracle/oradata/test/test01.dbf
channel ora_disk_1: reading from backup piece /home/oracle/backup/db_lhr11g_20220709_4m1264pc_1_1
channel ora_disk_1: piece handle=/home/oracle/backup/db_lhr11g_20220709_4m1264pc_1_1 tag=tag20220709t124834
channel ora_disk_1: restored backup piece 1
channel ora_disk_1: restore complete, elapsed time: 00:00:03
finished restore at 2022-07-09 12:51:33
4.2.5 在rman中执行recover database
rman> recover database;
starting recover at 2022-07-09 12:51:49
using channel ora_disk_1
starting media recovery
archived log for thread 1 with sequence 39 is already on disk as file /home/oracle/1_39_1104664055.dbf
archived log for thread 1 with sequence 40 is already on disk as file /home/oracle/1_40_1104664055.dbf
archived log for thread 1 with sequence 41 is already on disk as file /home/oracle/1_41_1104664055.dbf
oracle error:
ora-01547: warning: recover succeeded but open resetlogs would get error below
ora-01152: file 1 was not restored from a sufficiently old backup
ora-01110: data file 1: '/u01/app/oracle/oradata/test/system01.dbf'
rman-00571: ===========================================================
rman-00569: =============== error message stack follows ===============
rman-00571: ===========================================================
rman-03002: failure of recover command at 07/09/2022 12:51:49
rman-06053: unable to perform media recovery because of missing log
rman-06025: no backup of archived log for thread 1 with sequence 38 and starting scn of 4593111 found to restore
rman-06025: no backup of archived log for thread 1 with sequence 37 and starting scn of 4591558 found to restore
结论:我们直接用传统的方式,在rman中执行recover database,会报错,需要37、38 两个已经被删除的archive log,无法完成recover。
4.2.6 在sqlplus中执行recover database
-- 先恢复归档全备中的归档日志
rman> restore archivelog from logseq 39 thread 1;
starting restore at 2022-07-09 13:02:02
using target database control file instead of recovery catalog
allocated channel: ora_disk_1
channel ora_disk_1: sid=1 device type=disk
channel ora_disk_1: starting archived log restore to default destination
channel ora_disk_1: restoring archived log
archived log thread=1 sequence=39
channel ora_disk_1: restoring archived log
archived log thread=1 sequence=40
channel ora_disk_1: restoring archived log
archived log thread=1 sequence=41
channel ora_disk_1: reading from backup piece /home/oracle/backup/arc_1109594930_152
channel ora_disk_1: piece handle=/home/oracle/backup/arc_1109594930_152 tag=tag20220709t124850
channel ora_disk_1: restored backup piece 1
channel ora_disk_1: restore complete, elapsed time: 00:00:01
finished restore at 2022-07-09 13:02:03
-- 执行 recover database操作
sql> recover database using backup controlfile until cancel;
ora-00279: change 4592370 generated at 07/09/2022 12:38:37 needed for thread 1
ora-00289: suggestion : /home/oracle/1_37_1104664055.dbf
ora-00280: change 4592370 for thread 1 is in sequence #37
结论:我们在sqlplus中执行recover database,也会报错,需要37、38 两个已经被删除的archive log,无法完成recover。
4.3. 解决办法
4.3.1 查询数据文件状态
sql> set line222
sql> col name for a60
sql> select a.file#,a.name,a.recover,a.checkpoint_change#,status from v$datafile_header a;
file# name recove checkpoint_change# status
---------- --------------------------------------------------------- ------ ------------------ -------------
1 /u01/app/oracle/oradata/test/system01.dbf 4593921 online
2 /u01/app/oracle/oradata/test/sysaux01.dbf 4593926 online
3 /u01/app/oracle/oradata/test/undotbs01.dbf 4593924 online
4 /u01/app/oracle/oradata/test/users01.dbf 4593926 online
5 /u01/app/oracle/oradata/test/example01.dbf 4593924 online
6 /u01/app/oracle/oradata/test/test01.dbf 4593926 online
7 /u01/app/oracle/oradata/test/test02.dbf 4592370 online
7 rows selected.
sql> select file#,name,status from v$datafile;
file# name status
---------- --------------------------------------------------------- --------------
1 /u01/app/oracle/oradata/test/system01.dbf system
2 /u01/app/oracle/oradata/test/sysaux01.dbf online
3 /u01/app/oracle/oradata/test/undotbs01.dbf online
4 /u01/app/oracle/oradata/test/users01.dbf online
5 /u01/app/oracle/oradata/test/example01.dbf online
6 /u01/app/oracle/oradata/test/test01.dbf online
7 /u01/app/oracle/oradata/test/test02.dbf recover
7 rows selected.
我们可以看到在v$datafile_header视图中,datafile 7 状态仍然为online,在v$datafile中状态为recover。
4.3.2 执行offline datafile 操作
sql> alter database datafile 7 offline drop;
database altered.
sql> set line222
sql> col name for a60
sql> select a.file#,a.name,a.recover,a.checkpoint_change#,status from v$datafile_header a;
file# name recove checkpoint_change# status
---------- ---------------------------------------------------- ------ ------------------ --------------
1 /u01/app/oracle/oradata/test/system01.dbf 4593921 online
2 /u01/app/oracle/oradata/test/sysaux01.dbf 4593926 online
3 /u01/app/oracle/oradata/test/undotbs01.dbf 4593924 online
4 /u01/app/oracle/oradata/test/users01.dbf 4593926 online
5 /u01/app/oracle/oradata/test/example01.dbf 4593924 online
6 /u01/app/oracle/oradata/test/test01.dbf 4593926 online
7 /u01/app/oracle/oradata/test/test02.dbf 4592370 offline
sql> recover database using backup controlfile until cancel;
ora-00279: change 4593921 generated at 07/09/2022 12:48:34 needed for thread 1
ora-00289: suggestion : /home/oracle/1_41_1104664055.dbf
ora-00280: change 4593921 for thread 1 is in sequence #41
specify log: {=suggested | filename | auto | cancel}
auto
ora-00279: change 4593946 generated at 07/09/2022 12:48:50 needed for thread 1
ora-00289: suggestion : /home/oracle/1_42_1104664055.dbf
ora-00280: change 4593946 for thread 1 is in sequence #42
ora-00278: log file '/home/oracle/1_41_1104664055.dbf' no longer needed for
this recovery
ora-00308: cannot open archived log '/home/oracle/1_42_1104664055.dbf'
ora-27037: unable to obtain file status
linux-x86_64 error: 2: no such file or directory
additional information: 3
结论:recover不在需要37、38 两个已经被删除的archive log,正常完成recover 操作。
4.3.3 查看redo logfile & tempfile
-- 确保redo logfile & tempfile路径存在或者rename file
sql> select member from v$logfile;
member
------------------------------------------------------------------------------------------------------------
/u01/app/oracle/oradata/test/redo01.log
/u01/app/oracle/oradata/test/redo02.log
/u01/app/oracle/oradata/test/redo03.log
sql> select name from v$tempfile;
name
------------------------------------------------------------
/u01/app/oracle/oradata/test/temp01.dbf
4.3.4 open数据库
sql> alter database open resetlogs;
database altered.
4.4 验证数据
sql> select count(*) from sxc.test2 ;
count(*)
----------
157838
sql> select count(*) from sxc.test1 ;
select count(*) from sxc.test1
*
error at line 1:
ora-00376: file 7 cannot be read at this time
ora-01110: data file 7: '/u01/app/oracle/oradata/test/test02.dbf'
4.5 尝试删除datafile 7
sql> alter tablespace test drop datafile '/u01/app/oracle/oradata/test/test02.dbf';
alter tablespace test drop datafile '/u01/app/oracle/oradata/test/test02.dbf'
*
error at line 1:
ora-03264: cannot drop offline datafile of locally managed tablespace
结论:datafile 7 删除报错,如果一定要处理的话,可以通过bbed的方式去修复后删除。