High Availability (AMF) Questions

Revision as of 17:33, 16 June 2011 by Karthick



High Availability (AMF)

Overview

OpenClovis ASP provides an SA Forum-compliant High Availability solution (the AMF in SAF terminology). This component controls the starting and stopping of applications, application redundancy configuration, and role assignment.

FAQ

  • What’s the best way for an application to determine the name/slot of the currently active controller in the cluster?
The relevant API is "clCpmMasterAddressGet". Example code (which also converts the node's address into its name) is as follows:
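#include <clCpmExtApi.h>

/* Logs the node name of the currently active controller (CPM master). */
ClRcT masterNodeGet(void)
{
    ClIocNodeAddressT node = 0;
    /* Ask the CPM for the address of the active controller. */
    ClRcT rc = clCpmMasterAddressGet(&node);
    /* Only look up the slot if the call succeeded and returned an address. */
    if(rc == CL_OK && node)
    {
        /* Look up the slot by id, using the node address as the slot id. */
        ClCpmSlotInfoT slotInfo = {.slotId = node } ;
        rc = clCpmSlotGet(CL_CPM_SLOT_ID, &slotInfo);
        if(rc == CL_OK)
            clLogNotice("MASTER", "GET", " Currently active controller node name is [%.*s]",
                        slotInfo.nodeName.length, slotInfo.nodeName.value);
    }
    return rc;
}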
  • How do component recovery and escalation work?
 If the recovery action is component restart and the component is marked as restartable,
 a component that restarts sgconfig.compRestartCountMax times within sgconfig.compRestartDuration
 is escalated to SU restart. If the SU restarts sgconfig.surestartcountmax times
 within sgconfig.surestartduration, recovery escalates to SU failover. If the SU is not marked
 as restartable, recovery escalates directly to SU failover. If SU failover happens
 nodeconfig.sufailovercountmax times within nodeconfig.sufailoverduration, recovery escalates
 to node failover.
 If nodeconfig.autorepair is set to TRUE (the default), the node is rebooted on
 node recovery escalations. In addition, if a component cleanup action fails (typically because
 the configured component cleanup script fails when run after an abnormal process exit or crash),
 recovery is escalated to node failover if compconfig.nodeRebootCleanupFail is set
 to TRUE (the default).
 The node reboot behavior on escalation/recovery can be controlled by exporting the following environment
 variables in MODEL/target.env, which is automatically copied to etc/asp.conf when the target images are built with make images.

 export CL_ASP_NODE_REBOOT_DISABLE=1

  The above flag disables the reboot of nodes. To restart the
  middleware (ASP) on node recovery, export:
 
 export CL_ASP_NODE_RESTART=1
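
The escalation ladder described in the recovery answer above can be pictured with a small standalone sketch. This is not OpenClovis code: the names RecoveryT, EscalationCounterT, PolicyT, counterHit and recoveryDecide are hypothetical, times are plain integers in milliseconds, and two points are assumptions rather than documented behavior: that the fault which brings a count up to its configured maximum is the one that escalates, and that a non-restartable component goes straight to the SU level.

```c
/* Hypothetical recovery levels, ordered from least to most disruptive. */
typedef enum { COMP_RESTART, SU_RESTART, SU_FAILOVER, NODE_FAILOVER } RecoveryT;

/* Hypothetical counter: "max" events within "duration" milliseconds. */
typedef struct {
    int  max;          /* e.g. sgconfig.compRestartCountMax */
    long duration;     /* e.g. sgconfig.compRestartDuration */
    int  count;        /* events seen in the current window */
    long windowStart;  /* time of the first event in the window */
} EscalationCounterT;

typedef struct {
    int compRestartable;            /* component marked as restartable */
    int suRestartable;              /* SU marked as restartable */
    EscalationCounterT compRestart; /* component restart limits */
    EscalationCounterT suRestart;   /* SU restart limits */
    EscalationCounterT suFailover;  /* SU failover limits (per node) */
} PolicyT;

/* Record one event at time "now". Returns 1 when the count reaches the
 * maximum inside the window (assumption: that event escalates), else 0. */
static int counterHit(EscalationCounterT *c, long now)
{
    if (c->count == 0 || now - c->windowStart > c->duration) {
        /* No open window, or the old one expired: start a new one. */
        c->windowStart = now;
        c->count = 0;
    }
    c->count++;
    if (c->count >= c->max) {
        c->count = 0;  /* reset so the next fault opens a fresh window */
        return 1;
    }
    return 0;
}

/* Pick the recovery action for a component fault reported at time "now". */
static RecoveryT recoveryDecide(PolicyT *p, long now)
{
    if (p->compRestartable && !counterHit(&p->compRestart, now))
        return COMP_RESTART;
    /* Assumption: a non-restartable component lands here directly. */
    if (p->suRestartable && !counterHit(&p->suRestart, now))
        return SU_RESTART;
    if (!counterHit(&p->suFailover, now))
        return SU_FAILOVER;
    return NODE_FAILOVER;
}
```

With both limits set to 2, a first fault yields COMP_RESTART and a second fault inside the component restart window yields SU_RESTART. Whether a node failover then reboots the node is a separate decision, governed by nodeconfig.autorepair and the environment variables above.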