Database Emergency Exit from Unforeseen Disasters

With the ongoing transformation in the fields of AI (Artificial Intelligence) and ML (Machine Learning), data plays a very important role. Organizations store large amounts of business data and retrieve it to gain insight into products and customers, which in turn drives the business. A database makes this possible, and a database management system (DBMS) provides the facilities to access and modify that data efficiently and effectively.

Databases are important for the following reasons:

  • Databases enable businesses to store and handle massive volumes of data efficiently and neatly. This makes it easy to retrieve particular information when needed.
  • Organizations can use databases to evaluate and interpret their data. With tools like pivot tables and SQL queries, organizations can gain valuable insight into patterns, trends, and other significant data that helps them make informed decisions.
  • By providing security features such as user access controls and backup and recovery capabilities, databases help businesses safeguard their data.
  • Databases provide a centralized location for storing data, which makes it easier to manage and access.
  • Databases use structured data storage systems, such as tables and fields, to organize data in a logical and consistent manner. This makes it easier to find and retrieve specific data and to perform analysis and reporting.

Losing an organization's data affects the business. Data can be lost in several ways, such as a system failure, a hardware failure, or a single point of failure (SPOF). Such an event is generally referred to as a disaster. A disaster can also be caused by natural hazards such as an earthquake, a fire, or another catastrophic event.

Being prepared for these kinds of disasters is a critical step for any organization that needs to recover its data. This is called "Disaster Recovery".


It is difficult to recover 100% of the data after such an event. However, the loss can be minimized. The targets for this are expressed as the RPO (Recovery Point Objective) and the RTO (Recovery Time Objective).
RPO is the tolerable amount of data loss during disaster recovery, and RTO is the tolerable application downtime.
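
To make the two objectives concrete, here is a minimal sketch that checks a backup schedule against an RPO and an RTO. The class name, function names, and the example numbers are all illustrative assumptions, and the model is deliberately simple: it assumes periodic backups only, so the worst-case data loss equals the interval between backups.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjectives:
    rpo: timedelta  # tolerable amount of data loss
    rto: timedelta  # tolerable application downtime


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    # With periodic backups only, a failure just before the next backup
    # loses everything written since the previous one.
    return backup_interval


def meets_objectives(objectives, backup_interval, estimated_restore_time):
    # Returns a pair: (RPO satisfied?, RTO satisfied?)
    rpo_ok = worst_case_data_loss(backup_interval) <= objectives.rpo
    rto_ok = estimated_restore_time <= objectives.rto
    return rpo_ok, rto_ok


# Hypothetical targets: lose at most 15 minutes of data, be back within 1 hour.
objectives = RecoveryObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=1))

nightly = meets_objectives(objectives, timedelta(hours=24), timedelta(minutes=45))
frequent = meets_objectives(objectives, timedelta(minutes=10), timedelta(minutes=45))

print(nightly)   # (False, True)
print(frequent)  # (True, True)
```

In this sketch a nightly backup misses the RPO even though the restore is fast enough for the RTO, which is why the plan items that follow focus on capturing changes more often.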

A disaster recovery plan can include one or more of the following:

  • Take regular backups to cloud storage or to a different host.
  • Take database snapshots regularly.
  • Replicate data to several remote databases.
  • Use storage disks configured with a RAID level.
  • Maintain recovery sites at different locations.
  • Keep a separate standby host for restoring the database during an emergency.



PostgreSQL is one of the most popular database management systems. It is highly stable, backed by more than 20 years of community development, which has contributed to its high levels of resilience, integrity, and correctness. PostgreSQL is used as the primary data store or data warehouse for many web, mobile, geospatial, and analytics applications.

pgBackRest is an open-source backup and recovery utility for PostgreSQL, used to take backups and restore them after a disaster. pgBackRest aims to be a reliable, easy-to-use backup and restore solution that can seamlessly scale up to the largest databases and workloads by utilizing algorithms that are optimized for database-specific requirements.
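
As a rough illustration of what driving pgBackRest looks like, the sketch below only builds the command lines and prints them; it does not run anything. The stanza name "demo" and the target timestamp are hypothetical, and the helper function names are made up for this example. The options shown (`--stanza`, `--type`, `--target`) follow the pgBackRest command-line documentation, but they should be checked against the installed version before use.

```python
import shlex


def backup_command(stanza, backup_type="full"):
    # pgBackRest backup types: full, diff (differential), incr (incremental)
    if backup_type not in {"full", "diff", "incr"}:
        raise ValueError(f"unknown backup type: {backup_type}")
    return ["pgbackrest", f"--stanza={stanza}", f"--type={backup_type}", "backup"]


def restore_to_time_command(stanza, target_time):
    # --type=time with --target asks for a restore up to a point in time
    return [
        "pgbackrest",
        f"--stanza={stanza}",
        "--type=time",
        f"--target={target_time}",
        "restore",
    ]


# "demo" is a placeholder stanza name, not something defined in this post.
full_backup = backup_command("demo", "full")
incr_backup = backup_command("demo", "incr")
restore = restore_to_time_command("demo", "2024-01-01 12:00:00+00")

for cmd in (full_backup, incr_backup, restore):
    # shlex.join quotes arguments that contain spaces, so the line is shell-safe
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the example safe to try on a machine that has no PostgreSQL cluster or repository configured.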



FIG: PostgreSQL Backup and Restore Using pgBackRest

Greenplum is a big data technology based on MPP architecture and the Postgres open source database technology. VMware Greenplum is an open source, shared-nothing, massively parallel processing (MPP) data warehouse designed for business intelligence processing and advanced data analytics. The enterprise-grade analytical database provides powerful and rapid analytics on very large volumes of data.

The Greenplum Disaster Recovery (GPDR) tool is based on pgBackRest and provides backup and restore for Greenplum Database. GPDR offers three different recovery mechanisms that help users make effective use of hardware resources such as storage, since nothing comes for free.

GPDR supports three restore types:
  1. Full Recovery
  2. Incremental Recovery
  3. Continuous Recovery

Many options are available for managing the backup repository so that storage space is used effectively and efficiently, including a number of backup expiry options.
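
The general idea behind backup expiry can be sketched as a retention rule. The example below is not GPDR's actual expiry logic or option set; it is a simplified model, with made-up labels and a hypothetical `keep_full` parameter, that keeps the newest full backups and expires older ones together with the incrementals that depend on them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Backup:
    label: str
    kind: str               # "full" or "incr"
    parent: Optional[str]   # label of the full backup that anchors an incremental


def expire(backups, keep_full):
    """Split backups (ordered oldest to newest) into (kept, expired)."""
    if keep_full < 1:
        raise ValueError("keep_full must be at least 1")
    fulls = [b.label for b in backups if b.kind == "full"]
    kept_fulls = set(fulls[-keep_full:])  # the newest keep_full full backups
    kept, expired = [], []
    for b in backups:
        # An incremental is only useful while its anchoring full backup exists.
        anchor = b.label if b.kind == "full" else b.parent
        (kept if anchor in kept_fulls else expired).append(b)
    return kept, expired


history = [
    Backup("F1", "full", None),
    Backup("I1", "incr", "F1"),
    Backup("F2", "full", None),
    Backup("I2", "incr", "F2"),
    Backup("I3", "incr", "F2"),
    Backup("F3", "full", None),
]

kept, expired = expire(history, keep_full=2)
print([b.label for b in kept])     # ['F2', 'I2', 'I3', 'F3']
print([b.label for b in expired])  # ['F1', 'I1']
```

Expiring an incremental together with its full backup is the key design choice here: an incremental cannot be restored on its own, so keeping it would only waste repository space.
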
The gpbackup and gprestore tools are already available, but GPDR is better suited to disaster recovery because of the following advantages:
  • PITR (point-in-time recovery) is supported.
  • File-based physical backup and restore.
  • Backups can be taken without interrupting database operation.
  • Lower RPO and RTO.
  • Multiple cloud storage options such as S3, GCP, and Azure are supported.
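
Point-in-time recovery from the list above generally works by restoring the most recent physical backup taken at or before the target time and then replaying archived changes up to that time. The sketch below shows only the selection step, under that general assumption; the function name and timestamps are hypothetical and this is not GPDR's implementation.

```python
from datetime import datetime
from typing import Optional


def pick_base_backup(backup_times: list[datetime], target: datetime) -> Optional[datetime]:
    # Only backups that finished at or before the target can serve as a base.
    candidates = [t for t in backup_times if t <= target]
    return max(candidates) if candidates else None


# Hypothetical daily backups completed at midnight.
backups = [
    datetime(2024, 1, 1),
    datetime(2024, 1, 2),
    datetime(2024, 1, 3),
]
target = datetime(2024, 1, 2, 15, 30)

base = pick_base_backup(backups, target)
replay_window = target - base  # span of changes to replay after the restore

print(base)           # 2024-01-02 00:00:00
print(replay_window)  # 15:30:00
```

The shorter the replay window, the faster the recovery, which is one reason frequent backups help the RTO as well as the RPO.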

There are many more advantages. Thanks for reading this blog, and goodbye until we meet again with more interesting information on this topic.





