About Me

Saturday 8 October 2011

Excessive redolog generation during hot backups


Many of you must have heard or experienced that while taking hot backup ofdatabase LGWR process writes aggressively. Meaning that more redo data has beenwritten to redo log file and consecutively more archive logs gets generated.

Here is the common misconception we have in our mind. If some one ask, whyexcessive redo logs and archive logs are getting generated when we start a hotbackup of database ?? Quickly we answer .. Its simple, when we put tablespacein hot backup mode, Oracle will take a check point of tablespace and data filesbelonging to this tablespace will be freezed. Any user activity happening onobjects belonging to this tablespace wont write data to these datafiles,instead it will write data to redo log files. So obviously there will be moreredo log file generation.

Well, to some extent this is COMPLETELY WRONG !!!
I will straight way come to the point and explain you what happens when weput the tablespace in hot backup mode.
Your first assumption that datafiles belonging to the tablespace in hot backup mode is freezed is wrong. Datafiles are not freezed, only the datafileheaders will be freezed !! So simply imagine that when you put the tablespace in backup mode, Oracle will take a checkpoint and update the datafile headers with checkpoint SCN and there after it is freezed until we take tablespace out of backup mode.
Other datafile (other then header part) remains as normal and data changeshappens continuously to this datafile.

Now you may want to ask me “do I mean to say that datafiles gets updatedcontinuously even when we are coping the same to backup location ?”. The answeris YES. Never think that the datafile you are coping is “Consistent”. No,datafiles gets changed continuously !!!
You might want to ask couple of more questions then.

1) If we say that backup file is not consistent and changes continuously,then how come Oracle is able to recover the database when we restore thatdatafile?
2) If the data changes are anyway happening continuously on data files, thenwhy there is excess redo log generation ?

Thats it !! don’t ask me more then this. Let me explain answers to thesequestions.
Consider a typical case, where an Oracle database is installed on Linux platform. The standard Oracle block size if 8K and lets say that OS level data block size is 1K. Now when we put the tablespace in “Begin Backup” mode checkpoint has happened and datafile header is freezed. You found which are thefiles related to this tablespace and started copying using OS command. Now when you copy a datafile using OS command it is going to copy as per OS block size.Lets say when you start copying it gave 8 blocks to you to copy – that means you are copying 4K (1K X 4) to backup location. That means you are copying halfof Oracle block to backup location. Now this process of copy can be preempted by Server CPU depending on load. Lets say when you started copying after copy of those 8 block (4K, half of Oracle block), your process get preempted by CPU and it has allocated CPU time to some other important process. Mean while DBWRprocess changes that block that you have copied halfway (since datafile is not freezed and only header is freezed, continuous updates can happen to datafile).

After a while CPU returns back and gives you next 4 blocks to copy (rest ofthe half Oracle block). Now here is the problem !!! we copied half of theoracle block taken at time T0 and another half taken at time T1 and in-betweenthe data block got changed. Does this sounds consistent ? Not to me !! Such typeof block is called “Fractured Block”.

Well, since Oracle copies files like this it should do some thing, so thatduring recovery it wont face any problem.

Usually in case of a normal tablespace (which is not in begin backup mode),when a transaction happens oracle generates redo information and puts in redolog file. This is the bare minimum information that oracle generates in orderto redo the information. It does not copy the complete block. Where as in caseof begin backup mode, if a transaction happens and changes any block FOR THEFIST TIME, oracle copies the complete block to redo log file. This happens onlyduring first time. If subsequent transaction updates the same block again,oracle will not copy the complete block to redo, instead it will generateminimum information to redo the changes. Now because oracle has to copy thecomplete block when it changes for the first time in begin backup mode, we saythat excess redo gets generated when we put tablespace in begin backup mode.
Question arises, why Oracle has to copy the complete block to redo logfiles. As you have seen above that during copy of datafile, there can be many fractured blocks, and during restore and recovery its going to put those block back and try to recover. Now assume that block is fractured and oracle hasminimum information that it generates in the redo. Under such condition it wontbe possible for Oracle to recover such blocks. So instead Oracle just copiesthe entire block back from redo log files to datafiles during recovery process.This will make the datafile consistent. So recovery process is very importantwhich takes care of all fractured blocks and makes it possible to recover a database.
I hope this explains above 2 questions.

Now you can easily explain why hot backup is not possible if database is inNOARCHIVELOG mode.
When you take a backup using RMAN, it does not generate excessive redo logs.The reason is simple. RMAN is intelligent. It does not use OS block forcopying, instead it uses oracle blocks for copying datafiles so the files are consistent.

No comments:

Post a Comment