Saturday, November 11, 2023

DBMS 2 notes

Database management system − important topics


1. Relational Database Design 

  • View
A view is one of the database objects in SQL. It logically represents a subset of data from one or more tables; by creating views, we can present logical subsets of a table's data. A view is a logical table based on a table or another view, in effect a window onto the base table, and it always depends on that base table. The view is stored as a SELECT statement in the data dictionary.

Advantages

  1. To restrict data access.
  2. To make complex queries easy.
  3. To provide data independence.
  4. To present different views of the same data.

  • Syntax 
CREATE [OR REPLACE] [FORCE|NOFORCE] VIEW view [(alias[, alias]...)]
AS SUBQUERY
[WITH CHECK OPTION [CONSTRAINT constraint]]
[WITH READ ONLY [CONSTRAINT constraint]]

  • Example
CREATE VIEW sal5
AS SELECT employee_id ID_NUMBER, last_name NAME, salary*12 ANN_SALARY
FROM emp
WHERE dept_id=50;

  • Introduction to the PL/pgSQL Exception clause

When an error occurs in a block, PostgreSQL will abort the execution of the block and also the surrounding transaction.

To recover from the error, you can use the exception clause in the begin...end block.

The following illustrates the syntax of the exception clause:


<<label>>
declare
begin
    statements;
exception
    when condition [or condition...] then
        handle_exception;
    [when condition [or condition...] then
        handle_exception;]
    [when others then
        handle_other_exceptions;]
end;

  • Handling exception examples

We’ll use the film table from the sample database for the demonstration.

Film table

1) Handling no_data_found exception example

The following example issues an error because the film with id 2000 does not exist.

do
$$
declare
    rec record;
    v_film_id int = 2000;
begin
    -- select a film
    select film_id, title
    into strict rec
    from film
    where film_id = v_film_id;
end;
$$
language plpgsql;


Output:

ERROR: query returned no rows

CONTEXT: PL/pgSQL function inline_code_block line 6 at SQL statement

SQL state: P0002

Trigger:

An SQL trigger allows you to specify SQL actions that should be executed automatically when a specific event occurs in the database. For example, you can use a trigger to automatically update a record in one table whenever a record is inserted into another table.

Triggers are, in fact, written to be executed in response to any of the following events −

  • A database manipulation (DML) statement (DELETE, INSERT, or UPDATE).
  • A database definition (DDL) statement (CREATE, ALTER, or DROP).
  • A database operation (SERVERERROR, LOGON, LOGOFF, STARTUP, or SHUTDOWN).
Benefits of Triggers
Triggers can be written for the following purposes −
  • Generating some derived column values automatically
  • Enforcing referential integrity
  • Event logging and storing information on table access
  • Auditing
  • Synchronous replication of tables
  • Imposing security authorizations
  • Preventing invalid transactions
Creating Triggers
The syntax for creating a trigger is −

CREATE [OR REPLACE ] TRIGGER trigger_name  
{BEFORE | AFTER | INSTEAD OF }  
{INSERT [OR] | UPDATE [OR] | DELETE}  
[OF col_name]  
ON table_name  
[REFERENCING OLD AS o NEW AS n]  
[FOR EACH ROW]  
WHEN (condition)   
DECLARE 
   Declaration-statements 
BEGIN  
   Executable-statements 
EXCEPTION 
   Exception-handling-statements 
END; 

  • Example
To start with, we will be using the CUSTOMERS table we had created and used in the previous chapters −

Select * from customers;

+----+----------+-----+-----------+----------+
| ID | NAME     | AGE | ADDRESS   | SALARY   |
+----+----------+-----+-----------+----------+
|  1 | Ramesh   |  32 | Ahmedabad |  2000.00 |
|  2 | Khilan   |  25 | Delhi     |  1500.00 |
|  3 | kaushik  |  23 | Kota      |  2000.00 |
|  4 | Chaitali |  25 | Mumbai    |  6500.00 |
|  5 | Hardik   |  27 | Bhopal    |  8500.00 |
|  6 | Komal    |  22 | MP        |  4500.00 |
+----+----------+-----+-----------+----------+
The following program creates a row-level trigger for the CUSTOMERS table that fires for INSERT, UPDATE, or DELETE operations performed on the table. The trigger displays the salary difference between the old and new values −
CREATE OR REPLACE TRIGGER display_salary_changes 
BEFORE DELETE OR INSERT OR UPDATE ON customers 
FOR EACH ROW 
WHEN (NEW.ID > 0) 
DECLARE 
   sal_diff number; 
BEGIN 
   sal_diff := :NEW.salary  - :OLD.salary; 
   dbms_output.put_line('Old salary: ' || :OLD.salary); 
   dbms_output.put_line('New salary: ' || :NEW.salary); 
   dbms_output.put_line('Salary difference: ' || sal_diff); 
END; 
/ 

When the above code is executed at the SQL prompt, it produces the following results

Trigger created.

Cursors

When Oracle processes an SQL statement, it creates a memory area known as the context area. A cursor is a pointer to this context area; PL/SQL controls the context area through the cursor. A cursor holds the rows (one or more) returned by a SQL statement, and this set of rows is referred to as the active set.

You can name a cursor so that it could be referred to in a program to fetch and process the rows returned by the SQL statement, one at a time. There are two types of cursors −

  • Implicit cursors
  • Explicit cursors
Implicit Cursors
Implicit cursors are automatically created by Oracle whenever an SQL statement is executed and there is no explicit cursor for the statement. Programmers cannot control implicit cursors or the information in them.

Whenever a DML statement (INSERT, UPDATE and DELETE) is issued, an implicit cursor is associated with this statement. For INSERT operations, the cursor holds the data that needs to be inserted. For UPDATE and DELETE operations, the cursor identifies the rows that would be affected.

In PL/SQL, you can refer to the most recent implicit cursor as the SQL cursor, which always has attributes such as %FOUND, %ISOPEN, %NOTFOUND, and %ROWCOUNT. The SQL cursor has additional attributes, %BULK_ROWCOUNT and %BULK_EXCEPTIONS, designed for use with the FORALL statement.

Explicit Cursors

Explicit cursors are programmer-defined cursors for gaining more control over the context area. An explicit cursor should be defined in the declaration section of the PL/SQL Block. It is created on a SELECT Statement which returns more than one row.

The syntax for creating an explicit cursor is −

CURSOR cursor_name IS select_statement; 

Working with an explicit cursor includes the following steps −

  • Declaring the cursor for initializing the memory
  • Opening the cursor for allocating the memory
  • Fetching the cursor for retrieving the data
  • Closing the cursor to release the allocated memory
Declaring the Cursor
Declaring the cursor defines the cursor with a name and the associated SELECT statement. For example −

CURSOR c_customers IS 
   SELECT id, name, address FROM customers; 

Opening the Cursor
Opening the cursor allocates the memory for the cursor and makes it ready for fetching the rows returned by the SQL statement into it. For example, we will open the above defined cursor as follows −

OPEN c_customers; 

Fetching the Cursor
Fetching the cursor involves accessing one row at a time. For example, we will fetch rows from the above-opened cursor as follows −

FETCH c_customers INTO c_id, c_name, c_addr; 

Closing the Cursor
Closing the cursor means releasing the allocated memory. For example, we will close the above-opened cursor as follows −

CLOSE c_customers;

Example
Following is a complete example to illustrate the concepts of explicit cursors −

DECLARE 
   c_id customers.id%type; 
   c_name customers.name%type; 
   c_addr customers.address%type; 
   CURSOR c_customers is 
      SELECT id, name, address FROM customers; 
BEGIN 
   OPEN c_customers; 
   LOOP 
   FETCH c_customers into c_id, c_name, c_addr; 
      EXIT WHEN c_customers%notfound; 
      dbms_output.put_line(c_id || ' ' || c_name || ' ' || c_addr); 
   END LOOP; 
   CLOSE c_customers; 
END; 
/

When the above code is executed at the SQL prompt, it produces the following result −

1 Ramesh Ahmedabad  
2 Khilan Delhi  
3 kaushik Kota     
4 Chaitali Mumbai  
5 Hardik Bhopal   
6 Komal MP  

PL/SQL procedure successfully completed. 


2. Transaction concepts 



A transaction can be defined as a group of tasks. A single task is the minimum processing unit which cannot be divided further.

Let’s take an example of a simple transaction. Suppose a bank employee transfers Rs 500 from A's account to B's account. This very simple and small transaction involves several low-level tasks.

A’s Account

Open_Account(A)

Old_Balance = A.balance

New_Balance = Old_Balance - 500

A.balance = New_Balance

Close_Account(A)

B’s Account

Open_Account(B)

Old_Balance = B.balance

New_Balance = Old_Balance + 500

B.balance = New_Balance

Close_Account(B)
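The low-level tasks above can be sketched as a toy Python simulation; the account table and the transfer helper are invented for illustration and are not part of any DBMS API. The point is atomicity: either both balances change, or neither does.

```python
accounts = {"A": 1000, "B": 200}

def transfer(src, dst, amount):
    # Snapshot the state so a failure can roll everything back (atomicity).
    snapshot = dict(accounts)
    try:
        if accounts[src] < amount:
            raise ValueError("insufficient balance")
        accounts[src] -= amount   # New_Balance = Old_Balance - 500
        accounts[dst] += amount   # New_Balance = Old_Balance + 500
    except Exception:
        accounts.clear()
        accounts.update(snapshot)  # rollback: no partial effects remain
        raise

transfer("A", "B", 500)
print(accounts)   # {'A': 500, 'B': 700}
```

If the debit succeeds but the credit step fails, the rollback restores the snapshot, so the database never shows a state where the 500 has left A without reaching B.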

  • ACID Properties

A transaction is a very small unit of a program and it may contain several low-level tasks. A transaction in a database system must maintain Atomicity, Consistency, Isolation, and Durability − commonly known as the ACID properties − in order to ensure accuracy, completeness, and data integrity.


  • Atomicity − This property states that a transaction must be treated as an atomic unit: either all of its operations are executed or none of them is. There must be no state in which the database reflects a partially completed transaction; a database state is defined either before the transaction begins or after it completes, aborts, or fails.


  • Consistency − The database must remain in a consistent state after any transaction. No transaction should have any adverse effect on the data residing in the database. If the database was in a consistent state before the execution of a transaction, it must remain consistent after the execution of the transaction as well.


  • Durability − The database should be durable enough to hold all its latest updates even if the system fails or restarts. If a transaction updates a chunk of data in a database and commits, then the database will hold the modified data. If a transaction commits but the system fails before the data could be written on to the disk, then that data will be updated once the system springs back into action.


  • Isolation − In a database system where more than one transaction is being executed simultaneously and in parallel, the property of isolation states that every transaction is carried out as if it were the only transaction in the system. No transaction may affect the existence of any other transaction.

Serializability

When multiple transactions are executed by the operating system in a multiprogramming environment, the instructions of one transaction may be interleaved with those of another.

Schedule − A chronological execution sequence of transactions is called a schedule. A schedule can contain many transactions, each comprising a number of instructions/tasks.

Types of schedule 


Serial Schedules:

Schedules in which the transactions are executed non-interleaved − i.e., no transaction starts until a running transaction has ended − are called serial schedules.

Example: Consider the following schedule involving two transactions T1 and T2.


T1            T2
R(A)
W(A)
R(B)
W(B)
              R(A)
              R(B)
where R(A) denotes that a read operation is performed on data item 'A'.

This is a serial schedule, since the transactions are performed serially in the order T1 -> T2.


Non-Serial Schedule:

This is a type of scheduling where the operations of multiple transactions are interleaved, which may give rise to concurrency problems. The transactions are executed in a non-serial manner while keeping the end result correct and the same as that of a serial schedule. Unlike a serial schedule, where one transaction must wait for another to complete all its operations, in a non-serial schedule a transaction proceeds without waiting for the previous transaction to complete, so the benefits of concurrent execution are retained.

The Non-Serial Schedule can be divided further into Serializable and Non-Serializable.


Serializable:

Serializability is used to maintain the consistency of the database. It is mainly applied to non-serial schedules to verify whether the scheduling will lead to any inconsistency; a serial schedule needs no serializability check, because it starts a transaction only when the previous transaction is complete. A non-serial schedule is said to be serializable only when it is equivalent to some serial schedule of the same n transactions. Since concurrency is allowed in this case, multiple transactions can execute at the same time, so a serializable schedule helps in improving both resource utilization and CPU throughput. Serializable schedules are of two types:

1.Conflict Serializable:

A schedule is called conflict serializable if it can be transformed into a serial schedule by swapping non-conflicting operations. Two operations are said to be conflicting if all of the following conditions are satisfied:

They belong to different transactions

They operate on the same data item

At Least one of them is a write operation
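As a sketch (not from the notes), conflict serializability can be checked mechanically: build a precedence graph with an edge Ti -> Tj for every conflicting pair where Ti's operation comes first, then test the graph for cycles. The schedule encoding and function name below are invented for illustration.

```python
def conflict_serializable(schedule):
    """schedule: list of (transaction, op, item); op is 'R' or 'W'."""
    txns = {t for t, _, _ in schedule}
    edges = set()
    for i, (ti, oi, xi) in enumerate(schedule):
        for tj, oj, xj in schedule[i + 1:]:
            # Conflicting: different transactions, same item, >= one write.
            if ti != tj and xi == xj and 'W' in (oi, oj):
                edges.add((ti, tj))
    # A cycle in the precedence graph means NOT conflict serializable.
    visited, stack = set(), set()
    def has_cycle(node):
        visited.add(node); stack.add(node)
        for a, b in edges:
            if a == node:
                if b in stack or (b not in visited and has_cycle(b)):
                    return True
        stack.discard(node)
        return False
    return not any(has_cycle(t) for t in txns if t not in visited)

# The serial schedule T1 -> T2 from the example above: serializable.
serial = [("T1","R","A"), ("T1","W","A"), ("T1","R","B"), ("T1","W","B"),
          ("T2","R","A"), ("T2","R","B")]
# Writes interleaved in both directions: not conflict serializable.
bad = [("T1","R","A"), ("T2","W","A"), ("T1","W","A")]
print(conflict_serializable(serial), conflict_serializable(bad))
```

In `bad`, the pair (T1 R(A), T2 W(A)) gives the edge T1 -> T2 and the pair (T2 W(A), T1 W(A)) gives T2 -> T1, forming a cycle, so no swapping of non-conflicting operations can make it serial.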

2. View Serializable:

A schedule is called view serializable if it is view equivalent to a serial schedule (one with no overlapping transactions). Every conflict-serializable schedule is view serializable, but a view-serializable schedule that contains blind writes need not be conflict serializable.

 Non-Serializable:

The non-serializable schedule is divided into two types, Recoverable and Non-recoverable Schedule.

1.Recoverable Schedule:

Schedules in which transactions commit only after all transactions whose changes they read commit are called recoverable schedules. In other words, if some transaction Tj is reading value updated or written by some other transaction Ti, then the commit of Tj must occur after the commit of Ti.

Example – Consider the following schedule involving two transactions T1 and T2.

T1            T2
R(A)
W(A)
              W(A)
              R(A)
commit
              commit

This is a recoverable schedule since T1 commits before T2, which makes the value read by T2 correct.

There can be three types of recoverable schedule:

  a. Cascading Schedule:

When a failure in one transaction leads to the rolling back or aborting of other dependent transactions, such scheduling is referred to as a cascading rollback or cascading abort.

b. Cascadeless Schedule:

Also called avoids cascading aborts/rollbacks (ACA). Schedules in which transactions read values only after all the transactions whose changes they are going to read have committed are called cascadeless schedules. This avoids a single transaction abort leading to a series of transaction rollbacks: a transaction is simply disallowed from reading uncommitted changes of another transaction in the same schedule.

In other words, if some transaction Tj wants to read a value updated or written by some other transaction Ti, then Tj must read it only after the commit of Ti.


Example: Consider the following schedule involving two transactions T1 and T2.


T1            T2
R(A)
W(A)
              W(A)
commit
              R(A)
              commit

This schedule is cascadeless, since the updated value of A is read by T2 only after the updating transaction, i.e. T1, commits.

c.Strict Schedule:

A schedule is strict if for any two transactions Ti, Tj, if a write operation of Ti precedes a conflicting operation of Tj (either read or write), then the commit or abort event of Ti also precedes that conflicting operation of Tj.

In other words, Tj can read or write updated or written value of Ti only after Ti commits/aborts.

Example: Consider the following schedule involving two transactions T1 and T2.


T1            T2
R(A)
              R(A)
W(A)
commit
              W(A)
              R(A)
              commit

This is a strict schedule since T2 reads and writes A which is written by T1 only after the commit of T1.

2.Non-Recoverable Schedule:

Example: Consider the following schedule involving two transactions T1 and T2.

T1            T2
R(A)
W(A)
              W(A)
              R(A)
              commit
abort

T2 read the value of A written by T1 and committed. T1 later aborted, therefore the value read by T2 is wrong; but since T2 had already committed, this schedule is non-recoverable.

Serial Schedule It is a schedule in which transactions are aligned in such a way that one transaction is executed first. When the first transaction completes its cycle, then the next transaction is executed. Transactions are ordered one after the other. This type of schedule is called a serial schedule, as transactions are executed in a serial manner.

Equivalence Schedules

An equivalence schedule can be of the following types −

Result Equivalence

If two schedules produce the same result after execution, they are said to be result equivalent. They may yield the same result for some value and different results for another set of values. That's why this equivalence is not generally considered significant.

View Equivalence

Two schedules are view equivalent if the transactions in both schedules perform similar actions in a similar manner.

For example −

  • If T reads the initial data in S1, then it also reads the initial data in S2.
  • If T reads the value written by J in S1, then it also reads the value written by J in S2.
  • If T performs the final write on the data value in S1, then it also performs the final write on the data value in S2.

Conflict Equivalence

Two operations are said to be conflicting if they have the following properties −

  • They belong to separate transactions.
  • They access the same data item.
  • At least one of them is a write operation.

Two schedules having multiple transactions with conflicting operations are said to be conflict equivalent if and only if −

  • Both the schedules contain the same set of Transactions.
  • The order of each conflicting pair of operations is the same in both schedules.

Note − View equivalent schedules are view serializable and conflict equivalent schedules are conflict serializable. All conflict serializable schedules are view serializable too.

States of Transactions

A transaction in a database can be in one of the following states −



Transaction States

  • Active − In this state, the transaction is being executed. This is the initial state of every transaction.

  • Partially Committed − When a transaction executes its final operation, it is said to be in a partially committed state.

  • Failed − A transaction is said to be in a failed state if any of the checks made by the database recovery system fails. A failed transaction can no longer proceed further.

  • Aborted − If any of the checks fails and the transaction has reached a failed state, then the recovery manager rolls back all its write operations on the database to bring the database back to its original state where it was prior to the execution of the transaction. Transactions in this state are called aborted. The database recovery module can select one of the two operations after a transaction aborts −

  1. Re-start the transaction
  2. Kill the transaction

  • Committed − If a transaction executes all its operations successfully, it is said to be committed. All its effects are now permanently established on the database system.
Note – It can be seen that:

Cascadeless schedules are stricter than recoverable schedules, i.e., a subset of recoverable schedules.
Strict schedules are stricter than cascadeless schedules, i.e., a subset of cascadeless schedules.
Serial schedules satisfy the constraints of all recoverable, cascadeless, and strict schedules, and hence are a subset of strict schedules.
Thus the relation between the schedule types is: serial ⊂ strict ⊂ cascadeless ⊂ recoverable.



3. Concurrency Control 




Concurrency Control in DBMS

Executing a single transaction at a time will increase the waiting time of the other transactions, which may delay the overall execution. Hence, to increase the overall throughput and efficiency of the system, several transactions are executed concurrently.

Concurrency control is a very important concept of DBMS which ensures the simultaneous execution or manipulation of data by several processes or users without resulting in data inconsistency.

Concurrency control provides a procedure that is able to control concurrent execution of the operations in the database. 

  • Concurrency Control Problems

There are several problems that arise when numerous transactions are executed simultaneously in a random manner. A database transaction consists of two major operations, "Read" and "Write", and it is very important to manage these operations in the concurrent execution of transactions in order to maintain the consistency of the data.


  • Dirty Read Problem(Write-Read conflict)

The dirty read problem occurs when one transaction updates an item but then fails due to some unexpected event, and before the transaction performs its rollback, some other transaction reads the updated value. This creates an inconsistency in the database. The dirty read problem arises from a Write-Read conflict between transactions in the database.

The dirty read problem can be illustrated with the below scenario between two transactions T1 and T2.

Transaction T1 modifies a database record without committing the changes.

T2 reads the uncommitted data changed by T1

T1 performs rollback

T2 has already read the uncommitted data of T1 which is no longer valid, thus creating inconsistency in the database.
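The dirty-read scenario above can be expressed as a small Python simulation; all names here are invented for illustration, and the two dictionaries merely stand in for the committed data and T1's uncommitted workspace.

```python
def dirty_read_demo():
    committed = {"A": 100}         # the only value that ever committed
    workspace = {}                 # T1's uncommitted writes
    workspace["A"] = 90            # T1: W(A), not yet committed
    t2_value = workspace["A"]      # T2: dirty read of the uncommitted A
    workspace.clear()              # T1: rollback, the write never happened
    # T2 acted on 90, a value that officially never existed.
    return t2_value, committed["A"]

print(dirty_read_demo())
```

If T2 had waited for T1 to commit or abort (as a cascadeless schedule requires), it would only ever have seen the committed value.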

  • Lost Update Problem

Lost update problem occurs when two or more transactions modify the same data, resulting in the update being overwritten or lost by another transaction. The lost update problem can be illustrated with the below scenario between two transactions T1 and T2.


T1 reads the value of an item from the database.

T2 starts and reads the same database item.

T1 updates the value of that data and performs a commit.

T2 updates the same data item based on its initial read and performs commit.

This results in the modification made by T1 being lost, overwritten by T2's write − the lost update problem.
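A minimal Python sketch of the same lost-update sequence (names invented): both transactions read A = 100, then each writes a value computed from its own stale read, so T1's update vanishes.

```python
def lost_update_demo():
    db = {"A": 100}
    t1_read = db["A"]          # T1: R(A)
    t2_read = db["A"]          # T2: R(A), same stale value
    db["A"] = t1_read + 10     # T1: W(A) and commit (A = 110)
    db["A"] = t2_read - 20     # T2: W(A) based on its stale read, commit
    return db["A"]             # T1's +10 is lost: result is 80, not 90

print(lost_update_demo())
```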

  • Advantages of Concurrency

In general, concurrency means, that more than one transaction can work on a system. The advantages of a concurrent system are:

Waiting Time: the time a process spends in the ready state before it gets the system to execute on; concurrency leads to less waiting time.

Response Time: the time taken to get the first response from the CPU; concurrency leads to less response time.

Resource Utilization: the share of a system's resources that is actually in use. Multiple transactions can run in parallel in a system, so concurrency leads to more resource utilization.

Efficiency: the amount of output produced in comparison to the given input; concurrency leads to more efficiency.

  • Disadvantages of Concurrency 

Overhead: Implementing concurrency control requires additional overhead, such as acquiring and releasing locks on database objects. This overhead can lead to slower performance and increased resource consumption, particularly in systems with high levels of concurrency.

Deadlocks: Deadlocks can occur when two or more transactions are waiting for each other to release resources, causing a circular dependency that can prevent any of the transactions from completing. Deadlocks can be difficult to detect and resolve, and can result in reduced throughput and increased latency.

Reduced concurrency: Concurrency control can limit the number of users or applications that can access the database simultaneously. This can lead to reduced concurrency and slower performance in systems with high levels of concurrency.

Complexity: Implementing concurrency control can be complex, particularly in distributed systems or in systems with complex transactional logic. This complexity can lead to increased development and maintenance costs.

Inconsistency: In some cases, concurrency control can lead to inconsistencies in the database. For example, a transaction that is rolled back may leave the database in an inconsistent state, or a long-running transaction may cause other transactions to wait for extended periods, leading to data staleness and reduced accuracy.


  • Concurrency Control Techniques in DBMS:

The concurrency control techniques in DBMS, also known as protocols, that are used to avoid concurrency problems and to maintain consistency and serializability are as follows:-

  • Lock-Based Protocol:-

In order to avoid issues related to inconsistency, one of the foremost requirements is to achieve isolation between all the transactions; to achieve that, a transaction takes a lock on a data item before any read/write operation on it.


  • The two variant locks used in the lock-based protocol are

  1. Shared Lock
  2. Exclusive Lock

Shared Lock:

It blocks write operations but enables read operations to take place; hence shared locks also go by the name read-only locks. They are denoted by 'S' in a transaction.

Exclusive Lock:

For certain data, it blocks both the read and write operations of other transactions; an exclusive lock is conventionally denoted by 'X'.


Four types of lock-based protocols are:-


1. Simplistic Lock Protocol:-

It locks the data item the moment it is about to be updated and unlocks it after the operation completes.


2. Pre-claiming lock protocol:-

Before execution begins, all the operations are analyzed and every lock the transaction needs is requested up front. The transaction is performed only if all those locks are available; otherwise a rollback is performed.


3. Two-Phase Locking:-

This technique works in two phases.

Growing phase: the transaction asks for and acquires the locks it needs; no lock is released in this phase.

Shrinking phase: after the transaction releases its first lock, it may not acquire any new locks; the remaining locks are released one by one after each operation.

4. Strict Two-Phase Locking:-

A slight modification of two-phase locking in which locks are not released after each operation; instead, once all the operations have finished executing and the commit is triggered, all locks are released together.
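The shared/exclusive compatibility rules and the collective release of strict two-phase locking can be sketched as a toy lock manager in Python. The class and its methods are invented for illustration; a real DBMS would make conflicting requests wait and would detect deadlocks rather than simply refusing the lock.

```python
class LockManager:
    def __init__(self):
        self.locks = {}   # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        """mode is 'S' (shared) or 'X' (exclusive); returns True if granted."""
        if item not in self.locks:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if mode == 'S' and held_mode == 'S':
            holders.add(txn)          # shared locks are compatible
            return True
        if holders == {txn}:          # sole holder may re-acquire or upgrade
            self.locks[item] = ('X' if 'X' in (mode, held_mode) else 'S', holders)
            return True
        return False                  # conflict: the requester must wait

    def release_all(self, txn):
        # Strict 2PL: every lock is released together at commit/abort time.
        for item in list(self.locks):
            mode, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]

lm = LockManager()
print(lm.acquire("T1", "A", "S"))   # True: first lock on A
print(lm.acquire("T2", "A", "S"))   # True: shared locks coexist
print(lm.acquire("T2", "A", "X"))   # False: T1 also holds a lock on A
lm.release_all("T1")
print(lm.acquire("T2", "A", "X"))   # True: T2 is now the sole holder
```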

  • Time-Based Protocol:-

Every transaction is tagged with a timestamp denoting when it entered the system, and every data item records when the latest read and write operations were performed on it. The Timestamp Ordering Protocol uses these timestamps to ensure serializability: it keeps the transaction's timestamp together with the data item's read time (R) and write time (W).

Let’s suppose A is our transaction and G is data and we want to perform a write operation on data.

If Timestamp(A) < R(G), the transaction's timestamp is earlier than the time the data was last read; a rollback is performed, because the data was read after the transaction's timestamp.

Also, if Timestamp(A) < W(G), the transaction's timestamp is earlier than the time the data was last written; a rollback is performed, because the data was written after the transaction's timestamp.

On the contrary, when we want to perform a read operation on the data:

If Timestamp(A) < W(G), the transaction's timestamp is earlier than the time the data was last written; a rollback is performed, because the data was written after the transaction's timestamp.

And if Timestamp(A) >= W(G), the operation is executed, because the transaction's timestamp is later than the last time the data was written, so serializability is maintained.
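The rules above can be condensed into two small check functions. This is a hedged Python sketch with invented names; a real implementation would also update R(G) and W(G) after each granted operation and restart rolled-back transactions with a fresh timestamp.

```python
def can_write(ts_txn, read_ts, write_ts):
    # Rollback if the item was read or written after this transaction's timestamp:
    # Timestamp(A) < R(G) or Timestamp(A) < W(G) means abort.
    return ts_txn >= read_ts and ts_txn >= write_ts

def can_read(ts_txn, write_ts):
    # Rollback if the item was written after this transaction's timestamp.
    return ts_txn >= write_ts

print(can_write(5, read_ts=3, write_ts=4))   # True: no later access exists
print(can_write(5, read_ts=7, write_ts=4))   # False: item read at time 7 > 5
print(can_read(5, write_ts=6))               # False: item written at time 6 > 5
```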


  • Validation-Based Protocol

This protocol is divided into three phases which are as follows:-


1. Read Phase:- All the data modifications made by the transaction are stored in a local buffer and reused when needed in later operations.


2. Validation Phase:- Validation is performed to ensure that the values in the buffer can replace the pre-existing actual values without violating serializability.


Validation Test: transactions are tested on the basis of their execution times, such that Timestamp(A) < Timestamp(B). The set of rules is:

Start(B) > End(A) − transaction B starts only after transaction A has finished executing, so serializability is not affected.

Validate(B) > End(A) > Start(B) − transaction A ends after transaction B starts but before B's validation phase, so B is validated against A at that point.

3. Write Phase:- If the validation phase succeeds, the buffered values are copied to the database; otherwise a rollback is performed.
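A minimal sketch of the validation test above, with invented names and timestamps as plain numbers. The real protocol additionally compares the read/write sets of the two transactions in the overlapping case; that check is omitted here.

```python
def validates(end_a, start_b, validate_b):
    """Can B (the younger transaction) be validated against A?"""
    # Rule 1: B started only after A finished; no overlap at all.
    if start_b > end_a:
        return True
    # Rule 2: A finished after B started but before B entered validation.
    # (A full implementation would also require A's write set to be
    # disjoint from B's read set in this case.)
    return validate_b > end_a > start_b

print(validates(end_a=10, start_b=11, validate_b=15))   # True: no overlap
print(validates(end_a=10, start_b=5,  validate_b=12))   # True: A ended before B validated
print(validates(end_a=10, start_b=5,  validate_b=8))    # False: A still running at B's validation
```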


  • FAQs Related to Concurrency Control in DBMS:

1. What is concurrency control in DBMS?

Ans. To increase throughput and remove waiting time and errors, the measures or techniques applied are known as concurrency control in DBMS.

2. What are protocols in DBMS?

Ans. Protocols can be defined as rules or more precisely, the concurrency control techniques in DBMS to avoid unnecessary errors in transactions.

3. What is serializability relative to concurrency control in DBMS?

Ans. Serializability stands for a correct order of execution of operations, i.e., an order equivalent to some serial schedule.



4.Crash Recovery



DBMS is a highly complex system with hundreds of transactions being executed every second. The durability and robustness of a DBMS depends on its complex architecture and its underlying hardware and system software. If it fails or crashes amid transactions, it is expected that the system would follow some sort of algorithm or techniques to recover lost data.


  • Failure Classification
To see where the problem has occurred, we generalize a failure into various categories, as follows −
  1. Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from where it can’t go any further. This is called transaction failure where only a few transactions or processes are hurt.


Reasons for a transaction failure could be −

  • Logical errors − Where a transaction cannot complete because it has some code error or any internal error condition.
  • System errors − Where the database system itself terminates an active transaction because the DBMS is not able to execute it, or it has to stop because of some system condition. For example, in case of deadlock or resource unavailability, the system aborts an active transaction.

2.  System Crash

There are problems − external to the system − that may cause the system to stop abruptly and crash. For example, an interruption in the power supply may cause the failure of underlying hardware or software; examples also include operating system errors.

3. Disk Failure

In the early days of technology evolution, it was a common problem that hard-disk drives or storage drives failed frequently.
Disk failures include the formation of bad sectors, unreachability of the disk, a disk head crash, or any other failure that destroys all or part of the disk storage.

  • Recovery and Atomicity
When a system crashes, it may have several transactions being executed and various files opened for them to modify the data items. Transactions are made of various operations, which are atomic in nature. But according to ACID properties of DBMS, atomicity of transactions as a whole must be maintained, that is, either all the operations are executed or none.

  • When a DBMS recovers from a crash, it should maintain the following −

  1. It should check the states of all the transactions, which were being executed.
  2. A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction in this case.
  3. It should check whether the transaction can be completed now or it needs to be rolled back.
  4. No transactions would be allowed to leave the DBMS in an inconsistent state.

  • There are two types of techniques, which can help a DBMS in recovering as well as maintaining the atomicity of a transaction −
  1. Maintaining the logs of each transaction, and writing them onto some stable storage before actually modifying the database.
  2. Maintaining shadow paging, where the changes are made on duplicate (shadow) pages in memory, and the actual database is updated only when the transaction completes.

  • Log-based Recovery

A log is a sequence of records that maintains a record of the actions performed by a transaction. It is important that the logs are written prior to the actual modification and stored on a stable storage medium, which is failsafe.

  • Log-based recovery works as follows −

  1. The log file is kept on a stable storage media.
  2. When a transaction enters the system and starts execution, it writes a log about it.

<Tn, Start>
  • When the transaction modifies an item X, it writes a log record as follows −

<Tn, X, V1, V2>
  • It reads: Tn has changed the value of X from V1 to V2.

  • When the transaction finishes, it logs −
<Tn, commit>
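Putting these records together, the transfer example used later in these notes (A = 500, B = 200, transferring 100 from A to B) would produce the following log sequence for a transaction T1:

```
<T1, Start>
<T1, A, 500, 400>
<T1, B, 200, 300>
<T1, commit>
```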
  • The database can be modified using two approaches −

  1. Deferred database modification − All logs are written on to the stable storage and the database is updated when a transaction commits.
  2. Immediate database modification − Each log record is immediately followed by the actual database modification. That is, the database is modified immediately after every operation.

  • Recovery with Concurrent Transactions

When more than one transaction is being executed in parallel, the logs are interleaved. At the time of recovery, it would become hard for the recovery system to backtrack through all the logs and then start recovering. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.

Checkpoint

Keeping and maintaining logs in real time and in a real environment may fill up all the memory space available in the system. As time passes, the log file may grow too big to be handled at all. A checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.

Recovery

When a system with concurrent transactions crashes and recovers, it behaves in the following manner −
  1. The recovery system reads the logs backwards from the end to the last checkpoint.
  2. It maintains two lists, an undo-list and a redo-list.
  3. If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just <Tn, Commit>, it puts the transaction in the redo-list.
  4. If the recovery system sees a log with <Tn, Start> but no commit or abort log found, it puts the transaction in undo-list.

All the transactions in the undo-list are then undone and their logs are removed. All the transactions in the redo-list are redone using their log records, and their logs are retained.

5. Database security 


Database security means keeping sensitive information safe and preventing the loss of data. The security of a database is controlled by the Database Administrator (DBA). 
  • The database administrator (DBA) is the central authority for managing a database system. The DBA’s responsibilities include granting privileges to users who need to use the system and classifying users and data in accordance with the policy of the organization. The DBA has a DBA account in the DBMS, sometimes called a system or superuser account, which provides powerful capabilities that are not made available to regular database accounts and users.
  • DBA-privileged commands include commands for granting and revoking privileges to individual accounts, users, or user groups and for performing the following types of actions:

1. Account creation. This action creates a new account and password for a user or a group of users to enable access to the DBMS.

2. Privilege granting. This action permits the DBA to grant certain privileges to certain accounts.

3. Privilege revocation. This action permits the DBA to revoke (cancel) certain privileges that were previously given to certain accounts.

4. Security level assignment. This action consists of assigning user accounts to the appropriate security clearance level.

The DBA is responsible for the overall security of the database system. Action 1 in the preceding list is used to control access to the DBMS as a whole, whereas actions 2 and 3 are used to control discretionary database authorization, and action 4 is used to control mandatory authorization.
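As a sketch, the first three actions correspond to standard SQL statements (the account and table names below are hypothetical):

```sql
-- 1. Account creation: a new account with a password
CREATE USER clerk1 WITH PASSWORD 'secret';

-- 2. Privilege granting: allow the account to read and insert rows of emp
GRANT SELECT, INSERT ON emp TO clerk1;

-- 3. Privilege revocation: cancel a previously granted privilege
REVOKE INSERT ON emp FROM clerk1;
```

Security level assignment (action 4) has no standard SQL command; it is handled by the DBMS's mandatory access control mechanism, where one is available.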

  • The following are the main control measures used to provide security of data in databases: 

1. Authentication
2. Access control
3. Inference control
4. Flow control
5. Database Security applying Statistical Method
6. Encryption 
These are explained as following below.  

  • Authentication : 
Authentication is the process of confirming whether a user has logged in only according to the rights provided to him to perform the activities of the database. A particular user can log in only up to his privilege level, and he cannot access other sensitive data. The privilege of accessing sensitive data is restricted by using authentication. 
By using authentication tools such as biometrics (retina scans and fingerprints), the database can be protected from unauthorized/malicious users. 
  • Access Control : 
The security mechanism of a DBMS must include some provisions for restricting access to the database by unauthorized users. Access control is done by creating user accounts and controlling the login process in the DBMS, so that access to sensitive data is possible only for those people (database users) who are allowed to access it, and is restricted for unauthorized persons. 
The database system must also keep track of all operations performed by a user throughout the entire login session. 
  • Inference Control : 
This method is known as a countermeasure to the statistical database security problem. It is used to prevent the user from exploiting any inference channel, and it protects sensitive information from indirect disclosure. 
Inferences are of two types: identity disclosure and attribute disclosure. 
  • Flow Control : 
This prevents information from flowing in a way that it reaches unauthorized users. Pathways through which information flows implicitly, in ways that violate the privacy policy of a company, are called covert channels. 
  • Database Security applying Statistical Method : 
Statistical database security focuses on the protection of confidential individual values stored in databases that are used for statistical purposes, i.e., to retrieve summaries of values based on categories. Such databases do not permit retrieval of individual information. 
This allows access to the database to get statistical information, such as the number of employees in the company, but not detailed confidential/personal information about a specific individual employee. 
  • Encryption : 
This method is mainly used to protect sensitive data (such as credit card numbers and OTPs). The data is encoded using an encryption algorithm. 
An unauthorized user who tries to access this encoded data will face difficulty in decoding it, but authorized users are given decryption keys to decode the data. 
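In PostgreSQL, for example, column values can be encrypted with the pgcrypto extension. The sketch below assumes the extension is available and that a hypothetical cards table has a bytea column card_no_enc:

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Store the card number encrypted with a symmetric key
INSERT INTO cards (holder, card_no_enc)
VALUES ('Alice', pgp_sym_encrypt('4111111111111111', 'my-secret-key'));

-- Authorized users holding the key can decode the data
SELECT holder, pgp_sym_decrypt(card_no_enc, 'my-secret-key') AS card_no
FROM cards;
```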



6. Database System Architecture



The general structure and parts of a database system are described by the database system architecture. It includes the following essential elements −

  • User Interface − Users can communicate with the database system using the user interface. It could take the form of a web-based interface, a GUI, or a command-line interface. Users may submit queries, enter data, and see query results or reports via the user interface.

  • Query Processor − The query processor executes and optimizes SQL queries after receiving them from users or applications. In order to get the required data and carry out any necessary activities, it analyses the query, chooses the most effective execution plan and communicates with other components. In order to reduce resource consumption and boost speed, the query processor makes sure that queries are processed as effectively as possible.

  • Storage Manager − Managing the actual physical storage of data on discs or other storage media is the responsibility of the storage manager. To read and write data, it communicates with the file system or storage subsystem. To facilitate data access and guarantee data integrity, the storage manager manages data archiving, retrieval, and indexing.

For instance, the storage manager oversees the allocation of disc space to guarantee effective storage when a new order is placed in the e-commerce application. It also saves the order details in the relevant tables.

  • Buffer Manager − Data transfer between memory and disc storage is controlled by the buffer manager, an important component. It reduces disc I/O operations and boosts efficiency by using a buffer cache to keep frequently used data pages in memory. The buffer manager makes sure that data caching and replacement procedures are effective in order to maximize memory consumption.

For instance, when a query is run that needs to access data from the disc, the buffer manager pulls the necessary data pages into the buffer cache from the disc. The need for disc access can be avoided by serving subsequent requests that access the same data from memory.

  • Transactions Manager − Database transactions' atomicity, consistency, isolation, and durability are all guaranteed by the transaction manager. To maintain data integrity and concurrency management, it maintains concurrent access to the data, takes care of transaction execution, and enforces transaction isolation levels.

For instance, the transaction manager makes sure that each order is executed as a separate transaction when several clients place orders at once, ensuring data integrity and avoiding conflicts.

  • Data Dictionary − The metadata regarding the database schema and objects is stored in the data dictionary, sometimes referred to as the metadata repository. It includes details on various database structures, including tables, columns, data types, constraints, indexes, and more. The DBMS uses the data dictionary to verify queries, uphold data integrity, and offer details on the database structure.

For instance, the data dictionary keeps tabs on the names, columns, data types, and constraints of the tables in the e-commerce application.

  • Concurrency Control − Multiple transactions can access and edit the database simultaneously without resulting in inconsistent data thanks to concurrency control methods. To regulate concurrent access and preserve data integrity, methods including locking, timestamp ordering, and multi-version concurrency control (MVCC) are utilized.

Concurrency control measures, for instance, make sure that two consumers updating their profiles in the same e-commerce application at the same time are serialized and applied appropriately to maintain data consistency.
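In SQL, the isolation behaviour of a transaction can be chosen explicitly. A minimal sketch, assuming a hypothetical customer table:

```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE customer SET address = '12 New Street' WHERE customer_id = 7;
COMMIT;
```

Under SERIALIZABLE, the DBMS guarantees that concurrent updates behave as if they had been executed one after another, aborting one of the transactions if that cannot be guaranteed.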

  • Backup and recovery − In order to safeguard against data loss and guarantee data availability, database systems must have backup and recovery processes. In the case of system failures or data corruption, recovery procedures are employed to restore the database to a consistent condition. Regular backups are performed to create copies of the database.

To guarantee that data can be restored in the event of hardware problems or unintentional data loss, for instance, frequent backups of the e-commerce database are made.




1 mark question 

a) List out function parameter with syntax.

b) Write down features of PL/pgSQL.

c) What is Transaction?

d) What are different types of loops available in PL/pgSQL?

e) Which are states of transaction?

f) What is Cursor?

2 mark question 

a) Explain advantages of PL/SQL.
b) Explain Consistency and Isolation property
c) Explain Structure of PL/pgSQL code block.

d) Explain Conditional statements in PL/pgSQL. 

e) Define terms: a) Implicit Cursor b) Explicit Cursor

f) Explain declaring function parameters in PL/pgSQL.

5 mark question 

a) Explain ACID Properties of transaction

b) Explain concept of View in detail with example

c) Explain States of Transaction in detail.


What are ACID Properties in DBMS?

Transactions refer to the single logical units of work that access and (possibly) modify the contents present in any given database. We can access the transactions using the read and write operations.

If we want to maintain database consistency, then certain properties need to be followed in the transactions known as the ACID (Atomicity, Consistency, Isolation, Durability) properties. Let us discuss them in detail.



A – Atomicity

Atomicity means that an entire transaction either takes place all at once or it doesn’t occur at all. It means that there’s no midway. The transactions can never occur partially. Every transaction can be considered as a single unit, and they either run to completion or do not get executed at all. We have the following two operations here:

  • Commit: If a transaction commits, the changes made are visible to us. Thus, atomicity is also called the 'all or nothing rule'.

  • Abort: If a transaction aborts, the changes made to the database are not visible to us.

Consider this transaction T that consists of operations T1 and T2: transferring 100 from account A to account B.



In case the transaction fails when the T1 is completed but the T2 is not completed (say, after write(A) but before write(B)), then the amount has been deducted from A but not added to B. This would result in a database state that is inconsistent. Thus, the transaction has to be executed in its entirety in order to ensure the correctness of the database state.
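In SQL, T1 and T2 become two UPDATE statements inside one transaction, so that either both take effect or neither does (the account table and column names are illustrative):

```sql
BEGIN;
UPDATE account SET balance = balance - 100 WHERE acc_no = 'A';  -- T1: deduct from A
UPDATE account SET balance = balance + 100 WHERE acc_no = 'B';  -- T2: add to B
COMMIT;  -- on any failure before this point, ROLLBACK undoes both updates
```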

C – Consistency

Consistency means that we have to maintain the integrity constraints so that any given database stays consistent both before and after a transaction. If we refer to the example discussed above, then we have to maintain the total amount, both before and after the transaction.

Total after T occurs = 400 + 300 = 700.

Total before T occurs = 500 + 200 = 700.

Thus, the given database is consistent. Here, an inconsistency would occur if T1 completed but T2 then failed; as a result, T would remain incomplete.

I – Isolation

Isolation ensures the occurrence of multiple transactions concurrently without a database state leading to a state of inconsistency. A transaction occurs independently, i.e. without any interference. Any changes that occur in any particular transaction would NOT be ever visible to the other transactions unless and until this particular change in this transaction has been committed or written to the memory.

The property of isolation ensures that when we execute the transactions concurrently, it will result in such a state that’s equivalent to the achieved state that was serially executed in a particular order.

Let A = 500, B = 500

Let us consider two transactions here, T and T'':

T  : Read(A); A := A * 100; Write(A); Read(B); B := B - 50; Write(B)
T'': Read(A); Read(B); sum := A + B

Suppose that T has been executed here till Read(B) and then T'' starts. As a result, the interleaving of operations takes place. And due to this, T'' reads the correct value of A but the incorrect value of B.

T'': (A + B = 50,000 + 500 = 50,500)

Thus, the sum computed here is not consistent with the sum that is obtained at the end of the transaction:

T: (A + B = 50,000 + 450 = 50,450).

It results in the inconsistency of the database due to the loss of a total of 50 units. The transactions must, thus, take place in isolation. Also, the changes must only be visible after they have been committed to the main memory.

D – Durability

The durability property states that once the execution of a transaction is completed, the modifications and updates on the database are written to and stored on the disk. These persist even after the occurrence of a system failure. Such updates become permanent and are stored in non-volatile memory. Thus, the effects of this transaction are never lost.

Uses of ACID Properties

In totality, the ACID properties of transactions provide a mechanism in DBMS to ensure the consistency and correctness of any database: every transaction acts as a group of operations forming a single unit, produces consistent results, operates in isolation from all other operations, and makes durably stored updates. Together, these ensure the integrity of data in any given database.

What Does View Mean?

A view is a subset of a database that is generated from a user query; its definition is stored as a permanent database object.

View

Views in SQL are virtual tables that do not physically exist the way base tables of the database do. Views are created by SQL statements that select from one or more tables, and they contain rows and columns. We can CREATE a VIEW by selecting fields from one or more tables of the database, and we can also UPDATE and DROP views according to our requirements.

Create SQL View

We can create View using the CREATE VIEW statement. A View can be created using a single table or multiple tables.

The basic Syntax for creating VIEW:

CREATE VIEW view_name AS
SELECT column1, column2, column3...
FROM table_name
WHERE [condition];

Examples: Suppose we have two tables. First, the Customer_Details table, which has the attributes Customer_id, Name, Address, and Age.

Second, the Customer_Order table, which has the attributes Customer_id, Name, Product, and Date.

  • Creating a view from a single Table:

Query

CREATE VIEW Customer_view AS
SELECT Customer_id, Name, Address
FROM  Customer_Details
WHERE Address = 'Miami';

The above CREATE VIEW statement would create a virtual table based on the result of the SELECT statement. Now, you can query the SQL VIEW as follows to see the output:

SELECT * FROM Customer_view;

In reality, there is no table named Customer_view. It's just a view that we are using.


Types Of Transaction States

There are six major types of Transaction states which are as given below

  1. Active state
  2. Partially committed state
  3. Committed state
  4. Failed state
  5. Aborted state
  6. Terminated state
A detailed diagram of the transaction states is given below.



1. Active State
When the instructions of the transaction are executing, the transaction is in the active state.
If all instructions of the transaction execute successfully, the transaction goes to the “partially committed state”; otherwise, it goes to the “failed state” from the active state.  

2. Partially Committed State

After the execution of all instructions of a transaction, the transaction enters the partially committed state from the active state.
At this stage, the changes are not yet permanent, because all the changes made by the transaction are still stored in the buffer of main memory.
3. Committed State
The committed state permanently stores all changes made by the transaction in the database.
Now, the transaction is considered fully committed.

4. Failed State

When a transaction is in the “active state” or “partially committed state” and some failure occurs that makes it impossible to resume its execution, it enters the failed state.
Note: At this stage, Transaction cannot go to “partially committed state” or “active state” .
5. Aborted State
A failed transaction can never be resumed, but it is possible to restart it. To restart the failed transaction, the rollback method comes into the picture.
When we roll back the failed transaction, all the changes made by that transaction earlier have to be undone.
6. Terminated State
This is the last stage of a transaction in its life cycle.
If any transaction comes from the “aborted state” or the “committed state”, then that transaction is terminated and the system is ready to execute new transactions.
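These states map onto the basic transaction-control commands; a minimal sketch using the transfer example (the account table is illustrative):

```sql
BEGIN;                                         -- transaction enters the active state
UPDATE account SET balance = balance - 100
WHERE acc_no = 'A';
-- all statements executed: partially committed state
COMMIT;                                        -- committed state; changes are permanent
-- or, if a failure occurs instead:
-- ROLLBACK;                                   -- aborted state; changes are undone
```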

Structure of PL/pgSQL

PL/pgSQL is a block-structured language. The complete text of a function definition must be a block. A block is defined as:
[ <<label>> ]
[ DECLARE
    declarations ]
BEGIN
    statements
END;
Each declaration and each statement within a block is terminated by a semicolon.
All key words and identifiers can be written in mixed upper and lower case. Identifiers are implicitly converted to lower-case unless double-quoted.
There are two types of comments in PL/pgSQL. A double dash (--) starts a comment that extends to the end of the line. A /* starts a block comment that extends to the next occurrence of */. Block comments cannot be nested, but double dash comments can be enclosed into a block comment and a double dash can hide the block comment delimiters /* and */.
Any statement in the statement section of a block can be a subblock. Subblocks can be used for logical grouping or to localize variables to a small group of statements.
The variables declared in the declarations section preceding a block are initialized to their default values every time the block is entered, not only once per function call. For example:
CREATE FUNCTION somefunc() RETURNS integer AS '
DECLARE
    quantity integer := 30;
BEGIN
    RAISE NOTICE ''Quantity here is %'', quantity; -- Quantity here is 30
    quantity := 50;
    --
    -- Create a subblock
    --
    DECLARE
        quantity integer := 80;
    BEGIN
        RAISE NOTICE ''Quantity here is %'', quantity; -- Quantity here is 80
    END;
    RAISE NOTICE ''Quantity here is %'', quantity; -- Quantity here is 50
    RETURN quantity;
END;
' LANGUAGE plpgsql;
It is important not to confuse the use of BEGIN/END for grouping statements in PL/pgSQL with the database commands for transaction control. PL/pgSQL's BEGIN/END are only for grouping; they do not start or end a transaction. Functions and trigger procedures are always executed within a transaction established by an outer query --- they cannot start or commit transactions, since PostgreSQL does not have nested transactions.

What are Cursors in SQL?

Cursors are user-defined iterative variables that allow us to access the query results or the results of the stored procedure.

Cursors are objects we define using a Declare statement and take a Select statement as the parameter. 

1. Explicit Cursor

An explicit cursor requires a declaration by the user using a SELECT statement. It goes over each record, but only a single row is processed at a time; once the reading of one row is complete, it moves to the next row.

2. Implicit Cursor

An implicit cursor is created internally by SQL whenever the user runs a DML query. Implicit cursors operate on a single row of the database.
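A minimal explicit-cursor sketch in PL/pgSQL, assuming the film table from the sample database used earlier:

```sql
do
$$
declare
 cur cursor for select film_id, title from film order by film_id;
 rec record;
begin
 open cur;
 loop
   fetch cur into rec;        -- process one row at a time
   exit when not found;       -- stop when no more rows
   raise notice 'film %: %', rec.film_id, rec.title;
 end loop;
 close cur;
end;
$$ language plpgsql;
```

Every DML statement, by contrast, gets an implicit cursor managed by the system, with no declaration needed.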

