Home >>DBMS Tutorial >DBMS Data Recovery

DBMS Data Recovery

DBMS Data Recovery

Crash Recovery

With hundreds of transactions being executed every second, DBMS is a highly complex system. A DBMS 's durability and robustness depend on its complex architecture and the hardware and system software underlying it. If it fails or crashes in the middle of transactions, it is assumed some kind of algorithm or techniques to recover lost data will be followed by the system.

Failure Classification

We make generalizations a failure into various categories, as follows, to see where the problem has occurred.

Transaction failure

When it fails to execute or when it reaches a point from which it can't go any further, a transaction has to abort. This is called a failure of a transaction where only a few transactions or processes are hurt.

Reasons for a failure of a transaction may be-

  • Logical errors- that do not complete a transaction because there is a code error or an internal error condition.
  • System errors- If an active transaction is terminated by the database system itself because the DBMS is unable to execute it, or because of any system condition, it has to stop. For example, the system aborts an active transaction in the event of deadlock or resource unavailability.

System Crash

Within the system, there are problems that can cause the system to stop suddenly and cause the system to crash. For instance, power supply interruptions can cause the failure of underlying hardware or software failure.

Examples may include errors with the operating system.

Disk Failure

It was a common issue in the early days of technological evolution, where hard disk or storage drives used to fail frequently.

Disk failures include bad sector creation, disk lack of access, disk head crash, or any other failure that destroys all or part of the disk storage.

Storage Structure

We have the storage system mentioned already. In brief, the structure of storage can be split into two categories:

  • Volatile storage-A volatile storage does not survive system crashes, as the name suggests. Volatile storage devices are placed very close to the CPU; usually they are embedded into the chipset itself. Examples of volatile storage, for instance, are main memory and cache memory. They are fast but can store only a small amount of information.
  • Non-volatile storage-Such memories are made for system crashes to survive. They are huge in ability for data storage, but slower in accessibility. Hard-disks, magnetic tapes, flash memory and non-volatile (battery backed up) RAM may be examples.

Recovery and Atomicity

When a system crashes, multiple transactions may be executed and different files may be opened for them to modify the data items. Transactions are made of different operations, which are atomic in nature. However, according to DBMS' ACID properties, the atomicity of transactions as a whole must be retained, i.e. if all or none of the operations are performed.

When a DBMS recovers from a crash, the following should be maintained:

There are two types of techniques, which can aid a DBMS in recovering as well as preserving the atomicity of a transaction −

  • It should verify the states of all the transactions, which were being executed.
  • A transaction may be in the middle of any operation; in this case, the DBMS must ensure that the transaction is atomic.
  • It should check if the transaction can be done now or it needs to be rolled back.
  • There will be no transactions allowed to leave the DBMS in an inconsistent state.
  • Maintaining the logs of each transaction, and writing them into some stable storage before actually modifying the database.
  • Maintain shadow paging, where modifications are made to a volatile memory, and the actual database is updated later.

Log-based Recovery

Log is a sequence of records, which maintains the records of actions performed by a transaction. It is important that the logs are written prior to the actual modification and stored on a stable storage media, which is failsafe.

Recovery based on logs works as follows-

<Tn, Start>
<Tn, X, V1, V2>

It reads Tn has changed the value of X, from V1 to V2.

<Tn, commit>
  • The log file is stored on a stable media storage system.
  • When a transaction enters and begins execution of the method, a log is written about it.
  • When the transaction modifies an item X, it write logs as follows −
  • When the transaction finishes, it logs −

Two approaches can be used to modify the database.

  • Deferred database modification:All logs are written to the stable storage and when a transaction commits, the database is modified.
  • Immediate Database Modification-A real database modification follows each log. That is, immediately after every operation, the data can be accessed.

Recovery with Concurrent Transactions

The logs are interleaved when more than one transaction is executed in parallel. It would become difficult for the recovery system to backtrack all logs at the time of recovery, and then start recovering. Most modern DBMS use the 'checkpoints' concept to ease this situation.

Checkpoint

All the memory space available in the system can be used out by keeping and storing logs in real time and in various systems. The log file may grow too big to be handled at all as time passes. Checkpoint is a process that eliminates from the system all previous logs and permanently stores them in a storage disk. Checkpoint declares that the DBMS was in a consistent state until the point at which all transactions were committed.

Recovery

When a system crashes and recovers with concurrent transactions, it acts in the following way:

  • From the end to the last checkpoint, the recovery system reads the logs backwards.
  • Two lists, an undo-list and a redo-list, are held
  • If the recovery system sees a log with < Tn, Start > and < Tn, Commit > or only < Tn, Commit >, it places the transaction in the redo-list.
  • If the recovery system sees a < Tn, Start > log, but no commit or abort log has been found, the transaction will be undo-listed.

All the undo-list transactions are then undone and their logs are removed. Before saving their logs, all the transactions in the redo-list and their previous logs are removed and then redone.