Backup and recovery refers to the various strategies and procedures involved in protecting your database against data loss and reconstructing the database after any kind of data loss.
1.0 Types of Backups
A backup is a copy of all or part of a database that can be used to reconstruct that database. Backups can be broadly divided into physical backups and logical backups.
1.0.1 Physical & Logical Backups
Physical backups are backups of the physical files used in storing and recovering your database, such as datafiles, control files, and archived redo logs. Ultimately, every physical backup is a copy of files storing database information to some other location, whether on disk or some offline storage such as tape. Physical backups are also called file system backups. They are made with RMAN or operating system utilities.
Logical backups contain logical data (for example, tables or stored procedures) exported from a database with the Oracle Data Pump Export utility (expdp) and stored in a binary file, for later re-importing into a database with the corresponding Data Pump Import utility (impdp). Note that RMAN does not perform logical backup and recovery.
A physical or logical backup can be whole or partial. A whole backup copies the entire database, while a partial backup copies only part of it.
An Oracle database can run in one of two modes: ARCHIVELOG or NOARCHIVELOG.
In ARCHIVELOG mode, a backup can take either of these forms:
- Online/Hot/Open Backup; An online backup is made while the database is open and online. It is in an inconsistent state, because the files being backed up do not contain all the changes made at all the system change numbers (SCNs). Inconsistent backups need Oracle recovery to be performed from the online and archived redo logs.
- Offline/Cold/Closed Backup; An offline backup is made while the database is shut down. It can be in either a consistent or an inconsistent state. A backup taken after a clean shutdown (for example, SHUTDOWN NORMAL) is consistent: all the datafiles and control files are checkpointed at the same SCN, so the backup does not need Oracle recovery. A backup taken after an instance failure or SHUTDOWN ABORT is inconsistent.
In NOARCHIVELOG mode, a backup can take this form:
- Offline/Cold/Closed Backup; Again, it can be in a consistent or an inconsistent state, but taking an inconsistent backup is not recommended in NOARCHIVELOG mode.
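The difference between a consistent and an inconsistent backup can be illustrated with a small toy model. This is only a sketch, not Oracle code: the `BackupFile` type and the `is_consistent` function are hypothetical names, and the model assumes each backed-up file simply records the SCN at which it was last checkpointed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupFile:
    """A backed-up datafile or control file (hypothetical toy model)."""
    name: str
    checkpoint_scn: int  # SCN at which the file was last checkpointed


def is_consistent(files: list[BackupFile]) -> bool:
    """Return True when every file was checkpointed at the same SCN."""
    return len({f.checkpoint_scn for f in files}) <= 1


# Cold backup after a clean shutdown: all files share one checkpoint SCN.
cold = [BackupFile("system01.dbf", 100), BackupFile("users01.dbf", 100)]

# Hot backup: files were copied at different moments while changes continued.
hot = [BackupFile("system01.dbf", 100), BackupFile("users01.dbf", 117)]

print(is_consistent(cold))  # True
print(is_consistent(hot))   # False
```

In this model the hot backup would need redo applied to bring both files to a common SCN before the database could be opened.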
The following figure depicts the backup types graphically.
1.1 Database Recovery Process
The database recovery process involves reconstructing the contents of all or part of a database from a backup. It is a two-phase process:
- Retrieving a copy of the datafile from backup, and
- Reapplying changes to the file since the backup from the archived and online redo logs, to bring the database to a desired SCN since the backup (usually, the present).
To restore a datafile or control file from backup is to retrieve the file onto disk from a backup location on tape, disk or other media, and make it available to the database server.
As an example, suppose a full backup of a database (copies of its datafiles and control file) is taken at SCN 100. Redo logs generated during the operation of the database capture all changes that occur between SCN 100 and SCN 500. Along the way, some logs fill and are archived. At SCN 500, the datafiles of the database are lost due to a media failure. The database is then returned to its transaction-consistent state at SCN 500 by restoring the datafiles from the backup taken at SCN 100, applying the transactions captured in the archived and online redo logs, and undoing the uncommitted transactions.
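The restore-and-recover sequence in this example can be sketched as a toy model. It is a deliberate simplification and not how Oracle is implemented: the `RedoRecord` type and `restore_and_recover` function are hypothetical, the "datafile" is a plain dictionary, and each redo record is assumed to carry its own before-image so that uncommitted work can be undone without modelling undo segments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RedoRecord:
    scn: int
    txn: str
    key: Optional[str] = None      # None marks a commit record
    before: Optional[int] = None   # value before the change (used for undo)
    after: Optional[int] = None    # value after the change


def restore_and_recover(backup, redo, target_scn):
    """Restore a copy of the backup, roll forward, then undo uncommitted work."""
    data = dict(backup)  # phase 1: restore the datafile contents from backup
    applied = [r for r in sorted(redo, key=lambda r: r.scn) if r.scn <= target_scn]
    committed = {r.txn for r in applied if r.key is None}

    # Phase 2a: roll forward by reapplying every change up to the target SCN.
    for r in applied:
        if r.key is not None:
            data[r.key] = r.after

    # Phase 2b: roll back changes of transactions that never committed.
    for r in reversed(applied):
        if r.key is not None and r.txn not in committed:
            if r.before is None:
                data.pop(r.key, None)
            else:
                data[r.key] = r.before
    return data


backup_at_scn_100 = {"balance": 50}
redo_log = [
    RedoRecord(scn=200, txn="T1", key="balance", before=50, after=80),
    RedoRecord(scn=250, txn="T1"),  # T1 commits
    RedoRecord(scn=400, txn="T2", key="balance", before=80, after=10),
    # T2 has no commit record before the failure at SCN 500
]

print(restore_and_recover(backup_at_scn_100, redo_log, target_scn=500))
# {'balance': 80}
```

The committed change from T1 survives, while the change made by the uncommitted transaction T2 is rolled back.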
1.1.1 Data Recovery Types
The preceding scenario outlined the basics of the restore-and-recovery process. Several variants on this scenario are important to your backup and recovery work. The following forms of data recovery scenarios can be used:
- Datafile Media Recovery
- Complete, Incomplete and Point-In-Time Recovery
- Crash Recovery
1.1.1.1 Datafile Media Recovery
Datafile media recovery (often simply called media recovery) is the most basic form of user-initiated data recovery. It can be used to recover from a lost or damaged current datafile, SPFILE or control file. It can also apply changes that were recorded in the online or archived redo logs but not in the datafiles of a tablespace that went offline without the OFFLINE NORMAL option.
The need to restore a datafile from backup is not detected automatically. The first step in performing media recovery is to manually restore the datafile by copying it from a backup. Once a datafile has been restored from backup, however, the database does automatically detect that this datafile is out of date and must undergo media recovery.
Situations that force you to perform media recovery are:
- You restore a backup of a datafile.
- You restore a backup control file (even if all datafiles are current).
- A datafile is taken offline (either by you or automatically by the database) without the OFFLINE NORMAL option.
1.1.1.2 Complete, Incomplete and Point-In-Time Recovery
Complete recovery is recovering a database to the most recent point in time, without the loss of any committed transactions. Generally, the term "recovery" refers to complete recovery.
Occasionally, however, you need to return a database to its state at a past point in time. For example, to undo the effect of a user error, such as dropping or deleting the contents of a table, you may want to return the database to its contents before the delete occurred. In incomplete recovery, also known as point-in-time recovery, the goal is to restore the database to its state at some previous target SCN or time.
Point-in-time recovery is one possible response to a data loss caused by, for instance, a user error or logical corruption that goes unnoticed for some time.
Point-in-time recovery is also your only option if you have to perform a recovery and discover that you are missing an archived log covering time between the backup you are restoring from and the target SCN for the recovery. Without the missing log, you have no record of the updates to your datafiles during that period. Your only choice is to recover the database from the point in time of the restored backup, as far as the unbroken series of archived logs permits, then perform an OPEN RESETLOGS and abandon all changes in or after the missing log. (If you discover that you have lost archived logs and your database is still up, you should do a full backup immediately.)
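The effect of a missing archived log can be shown with a short sketch. The function name `last_recoverable_sequence` is hypothetical, and the model assumes archived logs are identified by consecutive log sequence numbers; recovery can proceed only as far as the unbroken run of logs extends.

```python
def last_recoverable_sequence(backup_sequence, available_sequences):
    """Return the highest log sequence reachable without a gap.

    backup_sequence is the first log sequence needed after the restored backup.
    Returns None when even that first log is missing.
    """
    available = set(available_sequences)
    seq = backup_sequence
    last = None
    while seq in available:
        last = seq
        seq += 1
    return last


# Logs 11..16 were generated after the backup, but log 14 has been lost.
archived = [11, 12, 13, 15, 16]
print(last_recoverable_sequence(11, archived))  # 13
```

Here recovery stops after log 13, and the changes recorded in logs 14 through 16 are abandoned even though logs 15 and 16 still exist.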
1.1.1.3 Crash Recovery
The crash recovery process is a special form of recovery, which happens the first time an Oracle database instance is started after a crash (or SHUTDOWN ABORT). In crash recovery, the goal is to bring the datafiles to a transaction-consistent state, preserving all committed changes up to the point when the instance failed.
Unlike the forms of recovery performed manually after a data loss, crash recovery uses only the online redo log files and current online datafiles, as left on disk after the instance failure. Archived logs are never used during crash recovery, and datafiles are never restored from backup.
The database applies any pending updates in the online redo logs to the online datafiles of your database. The result is that, whenever the database is restarted after a crash, the datafiles reflect all committed changes up to the moment when the failure occurred. (After the database opens, any changes that were part of uncommitted transactions at the time of the crash are rolled back.)
The duration of crash recovery is a function of the number of instances needing recovery, amount of redo generated in the redo threads of crashed instances since the last checkpoint, and user-configurable factors such as the number and size of redo log files, checkpoint frequency, and the parallel recovery setting. You can set parameters in the database server that can tune the duration of crash recovery. You can also tune checkpointing to optimize recovery time.
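To see why checkpoint frequency affects crash recovery time, consider the following toy sketch. It assumes, as a simplification, that redo records are identified only by SCN and that everything at or below the last checkpoint SCN is already written to the datafiles. The function name is hypothetical.

```python
def redo_needed_for_crash_recovery(online_redo_scns, last_checkpoint_scn):
    """Return the SCNs of redo records that must be reapplied after a crash.

    Changes at or below the last checkpoint SCN are already in the datafiles,
    so only later redo has to be applied.
    """
    return [scn for scn in sorted(online_redo_scns) if scn > last_checkpoint_scn]


online_redo = [410, 420, 430, 440, 450]

# A recent checkpoint leaves little redo to apply; an old one leaves more.
print(redo_needed_for_crash_recovery(online_redo, last_checkpoint_scn=440))
# [450]
print(redo_needed_for_crash_recovery(online_redo, last_checkpoint_scn=400))
# [410, 420, 430, 440, 450]
```

More frequent checkpoints shorten the list of redo to reapply, at the cost of more write activity during normal operation.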
1.1.2 When Is Recovery from Backup Required?
While there are several types of problem that can halt the normal operation of an Oracle database or affect database I/O operations, only the following typically require DBA intervention and media recovery:
- User Errors
- Media Failure
- Application Errors
1.1.2.1 User Errors
User errors occur when, either due to an error in application logic or a manual misstep, data in your database is changed or deleted incorrectly. Data loss due to user error includes such missteps as dropping important tables or deleting or changing the contents of a table.
While user training and careful management of privileges can prevent most user errors, your backup strategy determines how gracefully you recover the lost data when user error does cause data loss.
1.1.2.2 Media Failure
A media failure is the failure of a read or write of a disk file required to run the database, due to a physical problem with the disk such as a head crash. Any database file can be vulnerable to a media failure. The appropriate recovery technique following a media failure depends on the files affected and the types of backup available.
1.1.2.3 Application Errors
Sometimes a software malfunction can corrupt data blocks. In a physical corruption, which is also called a media corruption, the database does not recognize the block at all: the checksum is invalid, the block contains all zeros, or the header and footer of the block do not match. If the corruption is not extensive, then you can often repair it easily with block media recovery.
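The three symptoms of physical corruption listed above can be illustrated with a toy block format. The layout used here is invented purely for illustration and is not Oracle's block format: a one-byte tag at the start and end of the block, followed by a one-byte checksum computed as the sum of the preceding bytes modulo 256. The names `make_block` and `find_corruption` are hypothetical.

```python
def make_block(payload: bytes, tag: int) -> bytes:
    """Build a toy block laid out as [tag][payload][tag][checksum]."""
    body = bytes([tag]) + payload + bytes([tag])
    return body + bytes([sum(body) % 256])


def find_corruption(block: bytes):
    """Return a description of the first problem found, or None if the block looks valid."""
    if not any(block):
        return "block contains all zeros"
    body, checksum = block[:-1], block[-1]
    if sum(body) % 256 != checksum:
        return "checksum is invalid"
    if body[0] != body[-1]:
        return "header and footer do not match"
    return None


good = make_block(b"rows", tag=7)
zeroed = bytes(len(good))
# Flip the bits of one payload byte without updating the checksum.
flipped = good[:2] + bytes([good[2] ^ 0xFF]) + good[3:]
# Mismatched tags with a checksum that is otherwise correct.
torn_body = bytes([7]) + b"rows" + bytes([8])
torn = torn_body + bytes([sum(torn_body) % 256])

print(find_corruption(good))     # None
print(find_corruption(zeroed))   # block contains all zeros
print(find_corruption(flipped))  # checksum is invalid
print(find_corruption(torn))     # header and footer do not match
```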
1.2 Instance Memory Used in Backups
This section looks at the memory areas that matter to RMAN. The principal memory structure of concern for RMAN backup and recovery is the System Global Area (SGA).
The particular substructures of SGA used by RMAN are the shared pool and the large pool. RMAN uses several Oracle PL/SQL packages. These packages must be loaded into the shared pool. The large pool is used by RMAN in specific cases and is not used by default, even if it is configured.
1.3 Database Structures Used in Database Recovery
- Datafiles and Data Blocks
- Redo Logs
- Control Files
- Undo Segments
1.4 Oracle Recovery Manager
RMAN is a stand-alone application that makes a client connection to the Oracle database to access internal backup and recovery packages. It is, at its very core, nothing more than a command interpreter that takes simplified commands you type and turns those commands into remote procedure calls (RPCs) that are executed at the database.
RMAN provides a number of features that are not available with user-managed backup methods, including:
- Incremental backups; which provide more compact backups (storing only changed blocks) and faster datafile media recovery (reducing the need to apply redo during datafile media recovery)
- Block media recovery; in which a datafile with only a small number of corrupt data blocks can be repaired without being taken offline or restored from backup
- Unused block compression; where RMAN can in some cases skip unused datafile blocks during backups
- Binary compression; which uses a compression mechanism integrated into the Oracle database server to reduce the size of backups
- Encrypted backups; which use encryption capabilities integrated into the Oracle database to store backups in an encrypted format
RMAN also reduces the administration work associated with your backup strategy. RMAN keeps an extensive record of metadata about backups, archived logs, and its own activities, known as the RMAN repository. In restore operations, RMAN can use this information to eliminate the need for you to identify backup files for use in restores in most situations. You can also generate reports of backup activity using the information in the repository.
1.4.1 Files That RMAN Can Back Up
RMAN can back up all database files needed for efficient recovery in the event of a failure. RMAN supports backing up the following types of files:
- Datafiles, and image copies of datafiles
- Control files, and image copies of control files
- Archived redo logs
- The current server parameter file
- Backup pieces, containing other backups created by RMAN
Note: Although the database depends on other types of files for operation, such as network configuration files, password files, and the contents of the Oracle home, these files cannot be backed up with RMAN. Likewise, some features of Oracle, such as external tables or the BFILE datatype, store data in files other than those listed here. RMAN cannot back up those files. You must use some non-RMAN backup solution for any files not in the preceding list.
1.4.2 RMAN Backup Destinations
RMAN can create and manage backups on disk and on tape, back up backups originally created on disk to tape, and restore database files from backups on disk or tape.
1.4.3 Full and Incremental Backups
Full backups are backups which include datafiles in their entirety. Full backups can be created with Recovery Manager or with operating system-level file copy commands.
1.4.4 Image Copies, Backup Sets and Backup Pieces