SAPHanaSR_maintenance_examples(7)  SAPHanaSR  SAPHanaSR_maintenance_examples(7)

NAME
       SAPHanaSR_maintenance_examples - maintenance examples for
       SAPHanaController.

DESCRIPTION
       Maintenance examples for SAPHanaController. Please see
       ocf_suse_SAPHanaController(7), susHanaSR.py(7),
       susHanaSrMultiTarget.py(7), susTkOver.py(7), susChkSrv.py(7) and
       SAPHanaSR-manageAttr(8) for more examples, and read the REQUIREMENTS
       section below.

EXAMPLES
       * Check status of Linux cluster and HANA system replication pair.

       These steps should be performed before doing anything with the
       cluster, and after something has been done. See also
       cs_show_saphanasr_status(8) and section REQUIREMENTS below.

           # cs_clusterstate -i
           # crm_mon -1r
           # crm configure show | grep cli-
           # SAPHanaSR-showAttr
           # cs_clusterstate -i

       * Watch status of HANA cluster resources and system replication.

       This might be convenient when performing administrative actions or
       cluster tests. It does not replace the aforementioned checks. See
       also cs_show_saphanasr_status(8).

            # watch -n9 "crm_mon -1r --include=none,nodes,resources,failures;echo;SAPHanaSR-showAttr;cs_clusterstate -i|grep -v '#'"

       * Overview on stopping the HANA database at one site.

       This procedure works for scale-up and scale-out. No takeover will be
       done. Use this procedure when it is necessary to stop the HANA
       database. The HANA database should not be stopped by just stopping
       the Linux cluster or shutting down the OS. This particularly applies
       to scale-out systems. It might be good to define upfront which HANA
       site needs to be stopped. In case both sites need to be stopped, it
       might be good to define the order. Stopping the primary first should
       keep system replication in sync.
       How long a stop takes depends on database size, performance of the
       underlying infrastructure, SAP HANA version and configuration. Please
       refer to the SAP HANA documentation for details on tuning and
       stopping a HANA database. A command sketch follows below.

            1. Check status of Linux cluster and HANA system replication
            pair.
            2. Set the SAPHana or SAPHanaController multi-state resource
            into maintenance.
            3. Stop the HANA database at the given site by using "sapcontrol
            -nr <nr> -function StopSystem".
            4. Check that HANA is stopped.

       Note: Do not forget to end the resource maintenance after you have
       restarted the HANA database.
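
       A command sketch, assuming SID SLE, instance number 10 and the
       multi-state resource name mst_SAPHanaCon_SLE_HDB10 used in the
       examples below; adapt names and numbers to your installation:

            # cs_clusterstate -i
            # SAPHanaSR-showAttr
            # crm resource maintenance mst_SAPHanaCon_SLE_HDB10 on
            # su - sleadm -c "sapcontrol -nr 10 -function StopSystem HDB"
            # su - sleadm -c "sapcontrol -nr 10 -function GetSystemInstanceList"
            Once HANA has been restarted later:
            # crm resource refresh mst_SAPHanaCon_SLE_HDB10
            # crm resource maintenance mst_SAPHanaCon_SLE_HDB10 off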

       * Initiate an administrative takeover of the HANA  primary  from  one
       node to the other by using the Linux cluster.

       This procedure does not work for scale-out. On scale-up, it will
       stop the HANA primary. This might take a while. If you want to avoid
       waiting for the stopped primary, use the procedure below which
       suspends the primary. If the cluster should also register the former
       primary as secondary, AUTOMATED_REGISTER="true" is needed. Before the
       takeover is initiated, the status of the Linux cluster and the HANA
       system replication has to be checked. The takeover should be
       initiated as a forced migration of the multi-state SAPHanaController
       resource.
       Not working: regular migration, migration of the IP address,
       migration of the primitive SAPHanaController resource, setting the
       primary node standby.
       After the takeover of the primary has finished, the migration rule
       has to be deleted. If AUTOMATED_REGISTER="true" is set, the former
       primary will finally be registered as secondary, once the migration
       rule has been deleted.

           # crm_mon -1r
           # SAPHanaSR-showAttr
           # crm configure show | grep cli-
           # cs_clusterstate -i
           # crm resource move mst_SAPHanaCon_SLE_HDB10 force
           # cs_clusterstate -i
           # SAPHanaSR-showAttr
           # crm resource clear mst_SAPHanaCon_SLE_HDB10
           # SAPHanaSR-showAttr
           # cs_clusterstate -i

       Note:  Former versions of the Linux cluster used "migrate" instead of
       "move" and "unmigrate" instead of "clear".

       * Perform an SAP HANA takeover by using SAP tools.

       The procedure is described here for scale-out. It works for scale-up
       as well. The procedure will stop the HANA primary. This might take a
       while. If you want to avoid waiting for the stopped primary, use the
       procedure below which suspends the primary. The status of the HANA
       databases, system replication and Linux cluster has to be checked.
       The SAP HANA resources are set into maintenance, an sr_takeover is
       performed, and the old primary is registered as new secondary.
       Therefore, the correct secondary site name has to be used, see the
       later example. Finally, the SAP HANA resources are given back to the
       Linux cluster. See also section REQUIREMENTS below and the later
       example on determining the correct site name.

         1. On either node
           # crm_mon -1r
           # SAPHanaSR-showAttr
           # crm configure show | grep cli-
           # cs_clusterstate -i
           If everything looks fine, proceed.
           # crm resource maintenance mst_SAPHanaCon_SLE_HDB10
           # crm_mon -1r
         2. On the HANA primary master nameserver (e.g. node11)
           # su - sleadm
           ~> sapcontrol -nr 10 -function StopSystem HDB
           ~> sapcontrol -nr 10 -function GetSystemInstanceList

          Only proceed after you have made sure the HANA primary is down!

         3. On the HANA secondary master nameserver (e.g. node21)
           # su - sleadm
           ~> hdbnsutil -sr_takeover
           ~> cdpy; python3 ./systemReplicationStatus.py; echo RC:$?
           ~> cdpy; python3 ./landscapeHostConfiguration.py; echo RC:$?
           If everything looks fine, proceed.
          4. On the former HANA primary master nameserver, now future
          secondary master nameserver (e.g. node11)
            ~> hdbnsutil -sr_register --remoteHost=node21 --remoteInstance=10 --replicationMode=sync --name=site2 --operationMode=logreplay
           ~> sapcontrol -nr 10 -function StartSystem HDB
           ~> exit
         5. On the new HANA primary master nameserver (e.g. node21)
           ~> cdpy; python3 ./systemReplicationStatus.py; echo RC:$?
           ~> cdpy; python3 ./landscapeHostConfiguration.py; echo RC:$?
           ~> exit
           If everything looks fine, proceed.
         6. On either node
           # cs_clusterstate -i
           # crm resource refresh mst_SAPHanaCon_SLE_HDB10
           # crm resource maintenance mst_SAPHanaCon_SLE_HDB10 off
           # SAPHanaSR-showAttr
           # crm_mon -1r
           # cs_clusterstate -i

       * Overview on SAP HANA takeover using SAP tools and  suspend  primary
       feature.

       The procedure works for scale-up and scale-out. The status of the
       HANA databases, system replication and Linux cluster has to be
       checked. The SAP HANA resources are set into maintenance, an
       sr_takeover is performed with suspending the primary, and the old
       primary is registered as new secondary. Therefore, the correct
       secondary site name has to be used. Finally, the SAP HANA resources
       are given back to the Linux cluster. See also section REQUIREMENTS
       below and the later example on determining the correct site name. A
       command sketch follows the list below.

         1. Check status of Linux cluster and HANA, show current site names.
         2. Set SAPHanaController multi-state resource into maintenance.
          3. Perform the takeover, make sure to use the suspend primary
          feature:
           ~> hdbnsutil -sr_takeover --suspendPrimary
         4. Check if the new primary is working.
         5. Stop suspended old primary.
          6. Register old primary as new secondary, make sure to use the
          correct site name.
         7. Start the new secondary.
         8. Check new secondary and its system replication.
         9. Refresh SAPHanaController multi-state resource.
         10. Set SAPHanaController multi-state resource to managed.
         11. Finally check status of Linux cluster and HANA.
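
       A command sketch of the above steps for scale-up, reusing the names
       from the previous example (SID SLE, instance number 10, nodes node11
       and node21); the site name used for registration is a placeholder,
       use the exact site name known to the cluster, see the next example:

          1./2. On either node
            # cs_clusterstate -i; SAPHanaSR-showAttr
            # crm resource maintenance mst_SAPHanaCon_SLE_HDB10 on
          3./4. On the HANA secondary master nameserver (e.g. node21)
            # su - sleadm
            ~> hdbnsutil -sr_takeover --suspendPrimary
            ~> cdpy; python3 ./systemReplicationStatus.py; echo RC:$?
          5./6./7. On the old primary master nameserver (e.g. node11)
            # su - sleadm
            ~> sapcontrol -nr 10 -function StopSystem HDB
            ~> hdbnsutil -sr_register --remoteHost=node21 --remoteInstance=10 --replicationMode=sync --name=<exact-site-name> --operationMode=logreplay
            ~> sapcontrol -nr 10 -function StartSystem HDB
          8. On the new primary master nameserver (e.g. node21)
            ~> cdpy; python3 ./systemReplicationStatus.py; echo RC:$?
          9./10./11. On either node
            # crm resource refresh mst_SAPHanaCon_SLE_HDB10
            # crm resource maintenance mst_SAPHanaCon_SLE_HDB10 off
            # SAPHanaSR-showAttr; cs_clusterstate -i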

       * Check the two site names that are known to the Linux cluster.

       This is useful in case AUTOMATED_REGISTER is not yet set. In that
       case a former primary needs to be registered manually, with the
       former site name, as new secondary. The point is finding the site
       name that is already in use by the Linux cluster. That exact site
       name has to be used for registration of the new secondary. See also
       REQUIREMENTS of SAPHanaSR(7) and SAPHanaSR-ScaleOut(7).
       In this example, the node is suse11 on the future secondary site to
       be registered. The remote HANA master nameserver is suse21 on the
       current primary site. The lowercase SID is ha1.

           # crm configure show suse11 suse21
           # crm configure show SAPHanaSR | grep hana_ha1_site_mns
           # ssh suse21
            # su - ha1adm -c 'hdbnsutil -sr_state; echo rc: $?'
           # exit

       * Manually start the HANA primary if only one site is available.

       This might be necessary in case the cluster can not detect the
       status of both sites. This is an advanced task. A command sketch
       follows the list below.

       Before doing this, make sure HANA is not primary on the other site!

          1. Start the cluster on the remaining nodes.
          2. Wait and check that the cluster is running and in status idle.
          3. Become sidadm, and start HANA manually.
          4. Wait and check that HANA is running.
          5. In case the cluster does not promote the HANA to primary,
          instruct the cluster to migrate the IP address to that node.
          6. Wait and check that HANA has been promoted to primary by the
          cluster.
          7. Remove the migration rule from the IP address.
          8. Check if the cluster is in status idle.
          9. You are done, for now.
          10. Please bring back the other node and register that HANA as
          soon as possible. If the HANA primary stays alone for too long,
          the log area will fill up.
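
       A command sketch, assuming scale-up with the names from the examples
       above (SID SLE, instance number 10) and a hypothetical IP address
       resource rsc_ip_SLE_HDB10:

            # crm cluster start
            # cs_clusterstate -i
            # su - sleadm -c "sapcontrol -nr 10 -function StartSystem HDB"
            # su - sleadm -c "sapcontrol -nr 10 -function GetSystemInstanceList"
            If the cluster does not promote HANA to primary:
            # crm resource move rsc_ip_SLE_HDB10 <node>
            # cs_clusterstate -i; SAPHanaSR-showAttr
            # crm resource clear rsc_ip_SLE_HDB10
            # cs_clusterstate -i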

       * Start Linux cluster after node has been fenced.

       It is recommended to not configure the Linux cluster for always
       starting automatically on boot. It is better to start automatically
       only if cluster and/or node have been stopped cleanly. If the node
       has been rebooted by STONITH, the cluster should not start
       automatically. If the cluster is configured that way, some steps are
       needed to start the cluster after a node has been rebooted by
       STONITH. STONITH via SBD is used in this example. A sketch on
       disabling automatic start follows the commands below.

         # cs_clear_sbd_devices --all
         # cs_show_sbd_devices
         # crm cluster start
         # crm_mon -r
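
       Whether the Linux cluster starts automatically on boot is controlled
       by the pacemaker service. A minimal sketch for checking and
       disabling automatic start, assuming systemd is used:

          # systemctl is-enabled pacemaker
          # systemctl disable pacemaker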

       * Overview on maintenance procedure for Linux, HANA remains  running,
       on pacemaker-2.0.

       It is necessary to wait for each step to complete and to check the
       result. It is also necessary to test and document the whole
       procedure before applying it in production. See also section
       REQUIREMENTS below and the example on checking status of HANA and
       cluster above.

         1. Check status of Linux cluster and HANA, see above.
          2. Set HANA multi-state resource into maintenance mode.
            # crm resource maintenance mst_... on
         3. Set the Linux cluster into maintenance mode, on either node.
           # crm maintenance on
          4. Stop the Linux cluster on all nodes. Make sure to do that on
          all nodes.
           # crm cluster run "crm cluster stop"

         5. Perform Linux maintenance.

         6.  Start  Linux  cluster on all nodes. Make sure to do that on all
         nodes.
           # crm cluster run "crm cluster start"
         7. Set cluster ready for operations, on either node.
           # crm maintenance off
          8. Let the Linux cluster detect the status of the HANA
          multi-state resource, on either node.
            # crm resource refresh mst_...
          9. Set the HANA multi-state resource ready for operations, on
          either node.
            # crm resource maintenance mst_... off
         10. Check status of Linux cluster and HANA, see above.

       * Overview on simple procedure for stopping and temporarily disabling
       the Linux cluster, HANA gets fully stopped.

       This procedure can be used to update HANA, OS or hardware. HANA
       roles and resource status remain unchanged. It is necessary to wait
       for each step to complete and to check the result. It is also
       necessary to test and document the whole procedure before applying
       it in production. A command sketch follows below.

          1. Disabling pacemaker on HANA primary
          2. Disabling pacemaker on HANA secondary
          3. Stopping cluster on HANA secondary
            - HANA secondary will be stopped
            - system replication goes SFAIL
          4. Stopping cluster on HANA primary
            - HANA primary will be stopped
          5. Performing maintenance on OS or hardware
          6. Enabling pacemaker on HANA primary
          7. Enabling pacemaker on HANA secondary
          8. Starting cluster on HANA primary
            - HANA stays down
          9. Starting cluster on HANA secondary
            - HANA primary and secondary will be started
            - system replication recovers to SOK

         Note: HANA is not available from step 4 to step 9.
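
       A command sketch of the above steps, to be run on the respective
       node; disabling and enabling the pacemaker service via systemd is
       one way to do it and is an assumption here:

          1./2. On HANA primary, then on HANA secondary
            # systemctl disable pacemaker
          3./4. On HANA secondary, then on HANA primary
            # crm cluster stop
          6./7. On HANA primary, then on HANA secondary
            # systemctl enable pacemaker
          8./9. On HANA primary, then on HANA secondary
            # crm cluster start
            # cs_clusterstate -i; SAPHanaSR-showAttr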

       * Overview on update procedure for the SAPHanaSR-angi package.

       This procedure can be used to update RAs, HANA HADR provider hook
       scripts and related tools while HANA and the Linux cluster stay
       online. See also SAPHanaSR-manageAttr(8) for details on reloading
       the HANA HADR provider. A command sketch follows the list below.

          1. Check status of Linux cluster and HANA, see above.
          2. Set resources SAPHanaController and SAPHanaTopology to
          maintenance.
          3. Update RPM on all cluster nodes.
          4. Reload HANA HADR provider hook script on both sites.
          5. Refresh resources SAPHanaController and SAPHanaTopology.
          6. Set resources SAPHanaController and SAPHanaTopology from
          maintenance to managed.
          7. Check status of Linux cluster and HANA, see above.
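
       A command sketch, assuming SID HA1, instance number 00 and the
       hypothetical resource names cln_SAPHanaTop_HA1_HDB00 and
       mst_SAPHanaCon_HA1_HDB00; reloading the hook script via hdbnsutil is
       an assumption here, see SAPHanaSR-manageAttr(8) for details:

            # crm resource maintenance mst_SAPHanaCon_HA1_HDB00 on
            # crm resource maintenance cln_SAPHanaTop_HA1_HDB00 on
            # crm cluster run "zypper --non-interactive update SAPHanaSR-angi"
            On the master nameserver of each site:
            # su - ha1adm -c "hdbnsutil -reloadHADRProviders"
            # crm resource refresh cln_SAPHanaTop_HA1_HDB00
            # crm resource refresh mst_SAPHanaCon_HA1_HDB00
            # crm resource maintenance cln_SAPHanaTop_HA1_HDB00 off
            # crm resource maintenance mst_SAPHanaCon_HA1_HDB00 off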

       * Remove left-over maintenance attribute from overall Linux cluster.

       This could be done to avoid confusion caused by different
       maintenance procedures. See above overview on maintenance procedures
       with running Linux cluster. Before doing so, check for cluster
       attribute maintenance-mode="false".

           # SAPHanaSR-showAttr
           # crm_attribute --query -t crm_config -n maintenance-mode
           # crm_attribute --delete -t crm_config -n maintenance-mode
           # SAPHanaSR-showAttr

       * Remove left-over standby attribute from Linux cluster nodes.

       This could be done to avoid confusion caused by different
       maintenance procedures. See above overview on maintenance procedures
       with running Linux cluster. Before doing so for all nodes, check for
       node attribute standby="off" on all nodes.

           # SAPHanaSR-showAttr
           # crm_attribute --query -t nodes -N node1 -n standby
           # crm_attribute --delete -t nodes -N node1 -n standby
           # SAPHanaSR-showAttr

       * Remove left-over maintenance attribute from resource.

       This should usually not be needed.  See above overview on maintenance
       procedures with running Linux cluster.

           # SAPHanaSR-showAttr
            # crm_resource --resource cln_SAPHanaTop_HA1_HDB00 --delete-parameter maintenance --meta
           # SAPHanaSR-showAttr

       * Manually update global site attribute.

       In rare cases the global site attribute hana_<sid>_glob_prim or
       hana_<sid>_glob_sec is not updated automatically after a successful
       takeover, while all other attributes are updated correctly. The
       global site attribute stays outdated even after the Linux cluster
       has been idle for a while. In this case, that site attribute could
       be updated manually. Make sure everything else is fine and just the
       global site attribute has not been updated. Updating
       hana_<sid>_glob_sec for SID HA1 with site name VOLKACH:

           # crm configure show SAPHanaSR
            # crm_attribute --type crm_config --set-name SAPHanaSR --name hana_ha1_glob_sec --update VOLKACH
           # crm configure show SAPHanaSR

       * Upgrade scale-out srHook attribute from old-style to multi-target.

       As the final result of this upgrade, the RAs and hook script are
       upgraded from old-style to multi-target. Further, the Linux
       cluster's old-style global srHook attribute hana_${sid}_glob_srHook
       is replaced by the site-aware attributes
       hana_${sid}_site_srHook_${SITE}. New auxiliary attributes are
       introduced. The complete procedure and related requirements are
       described in detail in the manual page SAPHanaSR-manageAttr(8). The
       procedure at a glance (a partial command sketch follows the list):

          a. Initially check if everything looks fine.
          b. Set Linux cluster resources SAPHanaController and
          SAPHanaTopology into maintenance.
          c. Install multi-target aware SAPHanaSR-ScaleOut package on all
          nodes.
          d. Adapt sudoers permission on all nodes.
          e. Replace HANA HADR provider configuration on both sites.
          f. Reload HANA HADR provider hook script on both sites.
          g. Check Linux cluster and HANA HADR provider for matching
          defined upgrade entry state.
          h. Migrate srHook attribute from old-style to multi-target.
          i. Check Linux cluster for matching defined upgrade target state.
          j. Set Linux cluster resources SAPHanaController and
          SAPHanaTopology from maintenance to managed.
          k. Optionally connect third HANA site via system replication
          outside of the Linux cluster.
          l. Finally check if everything looks fine.
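
       A partial command sketch for steps b, c, f and j, assuming the
       hypothetical resource names cln_SAPHanaTop_HA1_HDB00 and
       mst_SAPHanaCon_HA1_HDB00; the attribute migration in step h is done
       with SAPHanaSR-manageAttr, see SAPHanaSR-manageAttr(8) for the exact
       options:

            # crm resource maintenance cln_SAPHanaTop_HA1_HDB00 on
            # crm resource maintenance mst_SAPHanaCon_HA1_HDB00 on
            # crm cluster run "zypper --non-interactive install SAPHanaSR-ScaleOut"
            On the master nameserver of each site:
            # su - ha1adm -c "hdbnsutil -reloadHADRProviders"
            # crm resource maintenance mst_SAPHanaCon_HA1_HDB00 off
            # crm resource maintenance cln_SAPHanaTop_HA1_HDB00 off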

FILES
REQUIREMENTS
       * For the current version of the resource agents that come with the
       software package SAPHanaSR-angi, support is limited to the scenarios
       and parameters described in the respective manual pages
       SAPHanaSR-angi(7), SAPHanaSR(7) and SAPHanaSR-ScaleOut(7).

       * Be patient. For detecting the overall HANA status, the Linux
       cluster needs a certain amount of time, depending on HANA and the
       configured intervals and timeouts.

       *  Before  doing  anything, always check for the Linux cluster's idle
       status, left-over migration constraints,  and  resource  failures  as
       well as the HANA landscape status, and the HANA SR status.

       * Maintenance attributes for cluster, nodes and resources must not be
       mixed.

       * The Linux cluster needs to be up and running to allow HA/DR
       provider events to be written into CIB attributes. The current HANA
       SR status might differ from the CIB srHook attribute after Linux
       cluster maintenance.

       * Manually activating a HANA primary, like starting a HANA primary
       or performing a takeover outside the Linux cluster, creates the risk
       of a duplicate-primary situation. The user is responsible for data
       integrity, particularly when activating a HANA primary. See also
       susTkOver.py(7).

       * When manually disabling or unregistering HANA system replication
       that is controlled by the Linux cluster, the SAPHanaController
       resource needs to be in maintenance mode. The user is responsible
       for data integrity.

       * HANA site names are discovered automatically when the RAs are
       activated the very first time. Those exact site names have to be
       used later for all manual tasks.

       * Just shutting down the cluster or OS while HANA is running is not
       a valid maintenance procedure. This is known to yield undesired
       results, particularly in scale-out clusters.

BUGS
       In case of any problem, please use your favourite SAP support process
       to open a request for the component BC-OP-LNX-SUSE. Please report any
       other feedback and suggestions to feedback@suse.com.

SEE ALSO
       ocf_suse_SAPHanaTopology(7), ocf_suse_SAPHanaController(7),
       susHanaSR.py(7), susHanaSrMultiTarget.py(7), susCostOpt.py(7),
       susTkOver.py(7), susChkSrv.py(7), SAPHanaSR-showAttr(8),
       SAPHanaSR(7), SAPHanaSR-ScaleOut(7), SAPHanaSR-manageAttr(8),
       SAPHanaSR-manageProvider(8), cs_clusterstate(8),
       cs_show_saphanasr_status(8), cs_wait_for_idle(8), crm(8),
       crm_simulate(8), crm_mon(8), crm_attribute(8),
       https://documentation.suse.com/sbp/sap/ ,
       https://www.suse.com/support/kb/doc/?id=000019253 ,
       https://www.suse.com/support/kb/doc/?id=000019207 ,
       https://www.suse.com/support/kb/doc/?id=000019142 ,
       https://www.suse.com/c/how-to-upgrade-your-suse-sap-hana-cluster-in-an-easy-way/ ,
       https://www.suse.com/c/tag/towardszerodowntime/ ,
       https://help.sap.com/doc/eb75509ab0fd1014a2c6ba9b6d252832/1.0.12/en-US/SAP_HANA_Administration_Guide_en.pdf

AUTHORS
       F.Herschel, L.Pinne.

COPYRIGHT
       (c) 2017-2018 SUSE Linux GmbH, Germany.
       (c) 2019-2025 SUSE LLC
       These maintenance examples come with ABSOLUTELY NO WARRANTY.
       For details see the GNU General Public License at
       http://www.gnu.org/licenses/gpl.html

