SAP HANA System Replication Monitoring — the Splunk Way
Downtime is the consequence of outages, which may be intentional (for example, for system upgrades) or caused by unplanned faults. A fault can be due to equipment malfunction, software, or network failures, or even a major disaster such as a fire, a regional power loss, or a construction accident, which may take the entire datacenter out of service.
As an in-memory database, SAP HANA is concerned not only with maintaining the reliability of its data in the event of failures, but also with resuming operations, with most of that data loaded back into memory, as quickly as possible.
As an SAP Technical Architect who has worked with Fortune 500 companies on the design and implementation of SAP migration and replication projects, I have seen that SAP migration projects require large teams and significant manual effort to plan and execute the cutover activities needed for a successful migration.
At RHONDOS, with the help of SAP PowerConnect and consultants with years of SAP experience, we provide visibility and insights into SAP systems, which simplifies the planning and preparation for a successful SAP S/4HANA system replication. SAP-certified monitoring solutions like SAP PowerConnect have helped Fortune 500 companies with complex business processes and vast amounts of data to ensure business continuity, data availability, and disaster recovery, while improving performance, enhancing flexibility, enabling testing, and reducing costs.
So, how do we avoid undesired downtime situations? What type of support is available in SAP HANA for high availability?
SAP HANA supports the following recovery measures from failures:
Disaster recovery support:
Fault recovery support:
How does system replication work?
Once SAP HANA system replication is enabled, each server process on the secondary system establishes a connection with its primary system counterpart and requests a snapshot of the data. From then on, all logged changes in the primary system are replicated continuously. Whenever logs are persisted (meaning they are written to the log volumes of each service) in the primary system, they are also sent to the secondary system.
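The snapshot-then-log-shipping flow described above can be illustrated with a toy model. This is a minimal sketch for intuition only, not how SAP HANA is implemented: the `Primary` and `Secondary` classes, their methods, and the key/value data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Toy primary: holds data plus an ordered log of (key, value) changes."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, key, value):
        # Persist the change to the log, then apply it to the data.
        self.log.append((key, value))
        self.data[key] = value

    def snapshot(self):
        # Return a copy of the data and the log position it corresponds to.
        return dict(self.data), len(self.log)


@dataclass
class Secondary:
    """Toy secondary: starts from a snapshot, then replays shipped log entries."""
    data: dict = field(default_factory=dict)
    position: int = 0

    def initialize(self, primary):
        # Step 1: request a snapshot of the data from the primary.
        self.data, self.position = primary.snapshot()

    def receive(self, primary):
        # Step 2: apply every log entry persisted since our last position.
        for key, value in primary.log[self.position:]:
            self.data[key] = value
        self.position = len(primary.log)


primary = Primary()
primary.write("order:1", "created")

secondary = Secondary()
secondary.initialize(primary)        # initial snapshot
primary.write("order:1", "shipped")  # changes logged after the snapshot
primary.write("order:2", "created")
secondary.receive(primary)           # continuous log shipping

print(secondary.data == primary.data)
```

The point of the model is that the secondary only needs the snapshot once; afterwards, its state is kept current purely by replaying the log entries it receives.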
Common issues in system replication
What if the network connection between the primary and secondary site is lost?
The connection between the primary and the secondary system must be available for replication. If this is not the case for a certain time, the redo log cannot be shipped to the secondary system, the log segments start piling up on the primary, and the secondary system is not ready for takeover.
What if there are intermittent connectivity problems?
A common intermittent error is that the log buffer is not shipped in a timely fashion from the primary to the secondary site, so the system replication log replay backlog increases. A delayed log replay on the secondary system causes a longer takeover time.
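One way to reason about these backlogs is as differences between log positions: how far shipping lags behind what the primary has written, and how far replay lags behind what has been shipped. The following is a minimal sketch under that assumption. The function names, argument names, and sample numbers are hypothetical and are not taken from SAP PowerConnect or from a specific SAP HANA view.

```python
def replication_backlog(last_log_pos, shipped_log_pos, replayed_log_pos):
    """Return (shipping_backlog, replay_backlog) in log-position units."""
    # Log written on the primary but not yet shipped to the secondary.
    shipping_backlog = max(last_log_pos - shipped_log_pos, 0)
    # Log shipped to the secondary but not yet replayed there.
    replay_backlog = max(shipped_log_pos - replayed_log_pos, 0)
    return shipping_backlog, replay_backlog


def backlog_is_growing(samples):
    """True if the replay backlog strictly increases across consecutive samples."""
    backlogs = [replication_backlog(*sample)[1] for sample in samples]
    return len(backlogs) > 1 and all(b > a for a, b in zip(backlogs, backlogs[1:]))


# Hypothetical samples: (last_log_pos, shipped_log_pos, replayed_log_pos)
samples = [
    (1000, 1000, 990),
    (2000, 1990, 1900),
    (3000, 2950, 2500),
]

for sample in samples:
    print(replication_backlog(*sample))
print(backlog_is_growing(samples))
```

A steadily growing replay backlog is the signal to act on, since it translates directly into a longer takeover time.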
What is the impact on business if system replication issues occur?
If system replication in SAP HANA doesn’t go as planned, it can have a significant impact on a business, which can lead to:
Data loss: if replication fails, the secondary system may not have the latest data, which can result in data loss.
How does monitoring SAP HANA replication help?
Monitoring the replication process in real time helps ensure that the data on the backup system is consistent and up to date with the primary system, minimizing the risk of data loss or corruption. Detecting and resolving performance bottlenecks or issues quickly helps improve the overall performance of the SAP HANA system and helps organizations comply with regulatory requirements for data protection and disaster recovery.
SAP HANA Replication Monitoring Dashboard in Splunk
Prerequisite in SAP:
Prerequisite in Splunk:
Download software package: SAP PowerConnect for Splunk | Splunkbase
Dashboards — SAP HANA Replication Monitoring:
KPIs:
Panel 1: Trend of HANA replication issues over time.
The first panel visualizes when each type of issue occurs over the selected time interval: connection issues, configuration parameter mismatches, log replay backlog, increased log shipping backlog, ASYNC replication in-memory buffer overflow, inconsistent fallback snapshots, and system replication support issues in ESS.
Panel 2: Count of replication issues.
The pie chart shows the count of each replication issue, which reveals the most common HANA replication issue and reduces the time spent on analysis and resolution.
Panel 3: User action and next steps.
The third panel provides more details about the alert, for example:
HANA — Diagnostic Files
KPIs:
Panel 1: Top 10 Files Based on Size – the box indicates there was a dump for a log shipping timeout.
If the primary system does not receive the acknowledgment for a sent log buffer within the time defined by logshipping_timeout, it closes the connection to the secondary system to continue data processing. This is done to prevent the primary system from blocking transaction processing if there is a hang situation on the connection to the secondary system.
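The timeout rule above can be expressed as a small decision function. This is a sketch of the logic as described, not SAP HANA code: the function name, the returned labels, and the 30-second value used in the examples are illustrative assumptions. Only the parameter name logshipping_timeout comes from the text.

```python
def connection_action(sent_at, ack_at, now, logshipping_timeout):
    """Decide what the primary does about one sent log buffer.

    All times are in seconds; ack_at is None while no acknowledgment
    has been received from the secondary.
    """
    if ack_at is not None and ack_at - sent_at <= logshipping_timeout:
        # Acknowledged in time: replication continues normally.
        return "continue"
    if ack_at is None and now - sent_at <= logshipping_timeout:
        # Still inside the timeout window: keep waiting.
        return "wait"
    # No timely acknowledgment: drop the connection so the primary
    # is not blocked by a hanging secondary.
    return "close_connection"


print(connection_action(sent_at=0, ack_at=2, now=5, logshipping_timeout=30))
print(connection_action(sent_at=0, ack_at=None, now=10, logshipping_timeout=30))
print(connection_action(sent_at=0, ack_at=None, now=45, logshipping_timeout=30))
```

The design trade-off is visible in the last branch: the primary gives up replication for the moment in order to keep processing transactions.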
HANA System Replication Mini Checks
Mini Checks provide insights into several replication KPIs/metrics. We can configure or change the thresholds based on requirements to receive real-time alert notifications.
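Configurable thresholds of this kind boil down to comparing each metric against its limit and alerting on the ones that exceed it. The sketch below shows that idea in a generic way. The KPI names, threshold values, and the `evaluate_checks` function are hypothetical and do not reflect the actual Mini Checks or Splunk alert configuration.

```python
# Hypothetical thresholds; in practice these would be tuned per system.
THRESHOLDS = {
    "replay_backlog": 500,
    "shipping_delay_s": 60,
    "reconnect_count": 3,
}


def evaluate_checks(metrics, thresholds=THRESHOLDS):
    """Return the sorted names of KPIs whose value exceeds its threshold."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if metrics.get(name, 0) > limit
    )


# Hypothetical snapshot of current metric values.
current = {"replay_backlog": 750, "shipping_delay_s": 12, "reconnect_count": 5}

for kpi in evaluate_checks(current):
    print(f"ALERT: {kpi} exceeded its threshold")
```

Keeping the thresholds in one dictionary makes it easy to adjust them per environment without touching the evaluation logic.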
With proactive alerting and monitoring dashboards to aid in system replication, RHONDOS has helped a variety of customers from the entertainment and beverage industries implement a reliable, high-performance database system that can be used for disaster recovery, data consolidation, data integration, and real-time analytics. A reliable and fast failover mechanism is what ensures business continuity.
This guide was authored by Jetendra Pinninty of RHONDOS. Jetendra is an SAP PowerConnect and Splunk-certified consultant with more than 15 years of SAP experience.
For more articles and guides on SAP monitoring, head over to the RHONDOS blog. For an explanation of the SAP PowerConnect architecture and more, visit RHONDOS on YouTube.