Difference between revisions of "Doc:latest/evalguide/csa102"

Revision as of 11:50, 7 June 2013

csa102 Redundancy and Failover

Objective

This sample demonstrates basic HA (High Availability) and SU (Service Unit) fail-over functionality. The application has two components, both processing the same workload as csa101, that is, repeatedly printing "Hello World". The difference, however, is that in this case there is now an active component and a standby component, with only the active component performing the printing function.

csa102 is quite similar to csa101, and this section will discuss the areas in which they deviate.

What you will learn

  • How to keep track of HA states and respond to callbacks that request HA state changes.

How to create new Project Model csa102

The first step in setting up our example is to create a model which represents the system that we will eventually deploy. This is done through the SAFplus Platform IDE. In this chapter we will step through the following tasks:

  • Create a project area
  • Launch the SAFplus Platform IDE
  • Create the IDE project
  • Specify the Resource model (the types of physical hardware in our system)
  • Specify the Component model (the types of components or applications in our system)
  • Specify which Components run on which Resources
  • Specify important build and boot parameters

Creating a New Project Area

A project area is a directory on your system where you can develop SAFplus Platform models and generate the code corresponding to those models. It is the same as the project area created for csa101.

Create a project area using a script from the command line. To create a project area named projectarea1 in the /home/clovis directory you should execute the following command:

# cl-create-project-area /home/clovis/projectarea1

Launching the SAFplus Platform IDE

You can launch the SAFplus Platform IDE from the command line. If you chose to create symbolic links during installation, you can simply enter cl-ide from any directory. If you did not create symbolic links during installation, add the installation directory to the shell's search path and then launch the IDE:

    # cl-ide
    

The splash screen for SAFplus Platform IDE is displayed as illustrated in Figure SAFplus Platform IDE Opening Screen.

SAFplus Platform IDE Opening Screen


You will then be prompted to select a workspace in which to do your work as illustrated in Figure SAFplus Platform IDE Workspace Launcher.

The workspace you select should correspond to the project area that you created in the previous section (in this case "/home/clovis/projectarea1"). Note that the workspace includes a subdirectory of "/ide_workspace". This is done strictly for organizational purposes. It keeps the IDE models separate from the generated SAFplus Platform models and code. Select the workspace and click OK to launch SAFplus Platform IDE.

The SAFplus Platform IDE is launched and you are now looking at the main work area. For more information about the components of this main work area including the SAFplus Platform IDE menu and toolbar, see SAFplus Platform IDE User Guide.








Code

The code can be found in the following directory:

<project-area_dir>/eval/src/app/csa102Comp

This sample component is implemented in a few C modules that are quite similar to the csa101 module. We will discuss the additions in detail.

We change the logging from the default "application" stream to a custom stream. To do this, we include the header that declares our config routines, define the "clprintf" macro so that it calls "clAppLog" with a different stream, and define that stream as a global variable:

clCompAppMain.c
    
#include "../ev/ev.h"
...
#define clprintf(severity, ...)   clAppLog(gEvalLogStream, severity, 10, CL_LOG_AREA_UNSPECIFIED, CL_LOG_CONTEXT_UNSPECIFIED, __VA_ARGS__)
...
ClLogStreamHandleT  gEvalLogStream = CL_HANDLE_INVALID_VALUE;
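
The pattern above (one macro that routes every log call through a global stream handle) can be sketched in plain C without the SAFplus headers. This is only an illustration: `stream_log`, `app_printf`, `severity_t` and `g_eval_stream` are hypothetical names, and a `FILE *` stands in for the real log stream handle.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the SAFplus severity constants. */
typedef enum { SEV_ERROR, SEV_INFO } severity_t;

/* Global stream handle; NULL plays the role of an invalid handle
 * until the application opens the stream. */
static FILE *g_eval_stream = NULL;

/* Write one record to `stream`. Returns the number of characters
 * written, or -1 if the stream has not been opened yet. */
int stream_log(FILE *stream, severity_t sev, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (stream == NULL)
        return -1;

    n = fprintf(stream, "[%s] ", sev == SEV_ERROR ? "ERROR" : "INFO");
    va_start(ap, fmt);
    n += vfprintf(stream, fmt, ap);
    va_end(ap);
    n += fprintf(stream, "\n");
    return n;
}

/* All call sites use this macro, so switching to another stream
 * only requires changing the handle named here. */
#define app_printf(sev, ...) stream_log(g_eval_stream, (sev), __VA_ARGS__)
```

Because call sites only ever name the macro, pointing the application at a different stream is a one-line change, which is the same reason the sample defines clprintf instead of calling the log API directly.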
 

Next, the log stream is initialized in the application's "main" function:

clCompAppMain.c
   
    /*
     * Initialize the log stream
     */
    clEvalAppLogStreamOpen((ClCharT*)appName.value, &gEvalLogStream);

This function is implemented in the src/app/ev.c file.


As with csa101, the clCompAppAMFCSISet() function is called to set the component's HA state, and the following block of code assigns this requested state to the component, while verbosely detailing this process:

clCompAppMain.c
void clCompAppAMFCSISet(SaInvocationT       invocation,
                        const SaNameT       *compName,
                        SaAmfHAStateT       haState,
                        SaAmfCSIDescriptorT csiDescriptor)
{
    /*
     * Print information about the CSI Set
     */

    clprintf (CL_LOG_SEV_INFO, "Component [%.*s] : PID [%d]. CSI Set Received\n", 
              compName->length, compName->value, mypid);

    clCompAppAMFPrintCSI(csiDescriptor, haState);

    /*
     * Take appropriate action based on state
     */

    switch ( haState )
    {
        case SA_AMF_HA_ACTIVE:
        {
            /*
             * AMF has requested application to take the active HA state 
             * for the CSI.
             */
            pthread_t thr;
            
            clprintf(CL_LOG_SEV_INFO,"csa102: ACTIVE state requested; activating service");
            running = 1;
            pthread_create(&thr,NULL,activeLoop,NULL);
            
            saAmfResponse(amfHandle, invocation, SA_AIS_OK);
            break;
        }

        case SA_AMF_HA_STANDBY:
        {
            /*
             * AMF has requested application to take the standby HA state 
             * for this CSI.
             */
            clprintf(CL_LOG_SEV_INFO,"csa102: Standby state requested");
            running = 0;
            saAmfResponse(amfHandle, invocation, SA_AIS_OK);
            break;
        }

        case SA_AMF_HA_QUIESCED:
        {
            /*
             * AMF has requested application to quiesce the CSI currently
             * assigned the active or quiescing HA state. The application 
             * must stop work associated with the CSI immediately.
             */
            clprintf(CL_LOG_SEV_INFO,"csa102: Acknowledging new state quiesced");
            running = 0;

            saAmfResponse(amfHandle, invocation, SA_AIS_OK);
            break;
        }

        case SA_AMF_HA_QUIESCING:
        {
            /*
             * AMF has requested application to quiesce the CSI currently
             * assigned the active HA state. The application must stop work
             * associated with the CSI gracefully and not accept any new
             * workloads while the work is being terminated.
             */
            clprintf(CL_LOG_SEV_INFO,"csa102: Signaling completion of QUIESCING");
            running = 0;

            saAmfCSIQuiescingComplete(amfHandle, invocation, SA_AIS_OK);
            break;
        }
...
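
The four cases shown above reduce to two decisions: whether the workload should be running, and which acknowledgement is sent back to AMF. A minimal sketch of that mapping follows, assuming hypothetical names (`ha_state_t`, `reply_t`, `ha_action_for`) that only mirror the SA_AMF_HA_* cases in the callback and do not replace the real SAF types.

```c
#include <stdbool.h>

/* Hypothetical mirror of the four HA states handled in the callback. */
typedef enum { HA_ACTIVE, HA_STANDBY, HA_QUIESCED, HA_QUIESCING } ha_state_t;

/* Which acknowledgement the callback sends:
 * REPLY_RESPONSE           corresponds to saAmfResponse(),
 * REPLY_QUIESCING_COMPLETE corresponds to saAmfCSIQuiescingComplete(). */
typedef enum { REPLY_RESPONSE, REPLY_QUIESCING_COMPLETE } reply_t;

typedef struct {
    bool    running;  /* value assigned to the "running" flag */
    reply_t reply;    /* acknowledgement to send */
} ha_action_t;

/* Only the active state runs the workload; only the quiescing state
 * is acknowledged with the quiescing-complete call. */
ha_action_t ha_action_for(ha_state_t state)
{
    ha_action_t action;

    action.running = (state == HA_ACTIVE);
    action.reply   = (state == HA_QUIESCING) ? REPLY_QUIESCING_COMPLETE
                                             : REPLY_RESPONSE;
    return action;
}
```

Keeping this mapping in one small function makes it easy to check that every state that is not active stops the workload.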
  

In this case the application spawns a thread when it is assigned the active state, which is a very common strategy for threaded applications. Before spawning the thread it sets the global variable "running" to 1. When the application is quiesced (that is, when the active work assignment is taken away) or assigned standby, it sets "running" back to 0, which causes the active thread to exit its loop. The thread is simply defined as:

clCompAppMain.c
static char* show_progress(void);  /* defined below; declared here so activeLoop() can call it */

void* activeLoop(void* p)
{
    while (running)
    {
        clprintf(CL_LOG_SEV_INFO,"csa102: Threaded Hello World! %s", show_progress());
        sleep(2);
    }
    return NULL;
}

static char* show_progress(void)
{
    static char bar[] = "          .";
    static int progress = 0;

    /* Show a little progress bar */
    return &bar[sizeof(bar)-2-(progress++)%(sizeof(bar)-2)];
}
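
The index arithmetic in show_progress() is compact, so here is a sketch that exposes it step by step. It assumes the bar holds ten spaces followed by a dot; `progress_at` and `print_progress` are hypothetical helpers that take the call count as a parameter instead of keeping it in a static variable.

```c
#include <stddef.h>
#include <stdio.h>

/* Ten spaces followed by a dot: 11 characters plus the terminating NUL. */
static const char bar[] = "          .";

/* Same arithmetic as show_progress(): return a pointer into the bar so
 * that the dot is preceded by (step % width) spaces. */
const char *progress_at(unsigned step)
{
    size_t width = sizeof(bar) - 2;   /* 12 - 2 = 10 */

    /* step 0 points at the dot itself, step 9 points at bar[1],
     * and step 10 wraps around to the dot again. */
    return &bar[width - step % width];
}

/* Print the string produced by each of the first `count` calls. */
void print_progress(unsigned count)
{
    unsigned i;

    for (i = 0; i < count; i++)
        printf("call %2u: \"%s\"\n", i, progress_at(i));
}
```

Calling print_progress(12) shows the dot moving one column to the right per call and then wrapping, which matches the shape of the "Hello World!" lines in the application logs.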
  


The example also demonstrates a non-threaded approach. But first, some background: for both threaded and non-threaded applications, main must have a "dispatch" loop that handles incoming AMF notifications and calls the relevant callback. So to implement a single-threaded, SAF-aware application, the programmer must modify this dispatch loop, adding the active (and potentially standby) functionality:

clCompAppMain.c
main(...)

    do
    {
        struct timeval timeout;
        timeout.tv_sec = 2; timeout.tv_usec = 0;

        FD_ZERO(&read_fds);
        FD_SET(dispatch_fd, &read_fds);

        if( select(dispatch_fd + 1, &read_fds, NULL, NULL, &timeout) < 0)
        {
            if (EINTR == errno)
            {
                continue;
            }
            clprintf (CL_LOG_SEV_ERROR, "Error in select()");
            perror("");
            break;
        }
        if (FD_ISSET(dispatch_fd,&read_fds)) saAmfDispatch(amfHandle, SA_DISPATCH_ALL);
        
        if (running) clprintf(CL_LOG_SEV_INFO,"csa102: Unthreaded Hello World! %s", show_progress());  // Run the "active" code
        else clprintf(CL_LOG_SEV_INFO,"csa102: idle");
    }while(!unblockNow);      

  

This code should be very familiar to anyone who has written single-threaded "event loop" style code. As the snippet shows, select() is given an idle timeout (in a real application the timeout would be much smaller), and the application calls saAmfDispatch() only if select() indicates that there is data on the file descriptor. It then falls through to an "if" statement that checks whether we are active ("if (running)...") and writes a log entry if so; otherwise it logs that it is idle.

How to Run csa102 and What to Observe

As with the csa101 example we will use the SAFplus Platform Console to manipulate the administrative state of the csa102 service group.

  1. Start the SAFplus Platform Console
     # cd /root/asp/bin
     # ./asp_console
  2. Then put the csa102SGI0 service group into lock assignment state using the following commands.
     cli[Test]-> setc 1
     cli[Test:SCNodeI0]-> setc cpm
     cli[Test:SCNodeI0:CPM]-> amsLockAssignment sg csa102SGI0
    

    Because csa102 has two components, there are two application log files to view: /root/asp/var/log/csa102CompI0Log.latest and /root/asp/var/log/csa102CompI1Log.latest. Viewing these application logs with tail -f, you should see the following.

    /root/asp/var/log/csa102CompI0Log.latest
    Sun Jul 13 22:38:17 2008   (SCNodeI0.13418 : csa102CompEO.---.---.00029 :   INFO) 
     Component [csa102CompI0] : PID [13418]. Initializing
    
    Sun Jul 13 22:38:17 2008   (SCNodeI0.13418 : csa102CompEO.---.---.00030 :   INFO) 
        IOC Address             : 0x1
    
    Sun Jul 13 22:38:17 2008   (SCNodeI0.13418 : csa102CompEO.---.---.00031 :   INFO)
        IOC Port                : 0x80
    
    Sun Jul 13 22:38:17 2008   (SCNodeI0.13418 : csa102CompEO.---.---.00032 :   INFO)
     csa102: Instantiated as component instance csa102CompI0.
    
    Sun Jul 13 22:38:17 2008   (SCNodeI0.13418 : csa102CompEO.---.---.00033 :   INFO)
     csa102CompI0: Waiting for CSI assignment...
      
    /root/asp/var/log/csa102CompI1Log.latest
    Sun Jul 13 22:38:18 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00028 :   INFO)
     Component [csa102CompI1] : PID [13422]. Initializing
    
    Sun Jul 13 22:38:18 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00029 :   INFO)
        IOC Address             : 0x1
    
    Sun Jul 13 22:38:18 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00030 :   INFO)
        IOC Port                : 0x81
    
    Sun Jul 13 22:38:18 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00031 :   INFO)
     csa102: Instantiated as component instance csa102CompI1.
    
    Sun Jul 13 22:38:18 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00032 :   INFO)
     csa102CompI1: Waiting for CSI assignment...
      
  3. Next, unlock the service group using the following SAFplus Platform Console command.
    cli[Test:SCNodeI0:CPM]-> amsUnlock sg csa102SGI0
    

    and in the /root/asp/var/log/csa102CompI*Log.latest files we should see:

    /root/asp/var/log/csa102CompI0Log.latest
    Sun Jul 13 23:00:18 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00487 :   INFO)
     csa102: Hello World!       .
    
    Sun Jul 13 23:00:19 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00488 :   INFO)
     csa102: Hello World!        .
    
    Sun Jul 13 23:00:20 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00489 :   INFO)
     csa102: Hello World!         .
    
    Sun Jul 13 23:00:21 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00490 :   INFO)
     csa102: Hello World!          .
    
    Sun Jul 13 23:00:22 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00491 :   INFO)
     csa102: Hello World! .
    
    Sun Jul 13 23:00:23 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00492 :   INFO)
     csa102: Hello World!  .
      
    /root/asp/var/log/csa102CompI1Log.latest
    Sun Jul 13 22:43:00 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00043 :   INFO)
     csa102: New state is not the ACTIVE; deactivating service
      

    These can be watched in a separate terminal window using tail -f. csa102CompI0 is the active component in this case, and csa102CompI1 is the standby. Consequently, the "Hello World!" lines appear in csa102CompI0Log.latest and not in csa102CompI1Log.latest. They will continue to be logged to that file until the HA state of that component changes, for example, when the process logging those lines is killed. In the meantime the standby component, csa102CompI1, just waits until it is told to take over the workload.


Changing the HA state of the Client/Server

The easiest way to test component fail-over is to kill the process associated with the active component using the kill command. For this you need to know the process ID of the active component. To find the process ID, issue the following command from a bash shell.

# ps -eaf | grep csa102

This should produce an output that looks similar to the following.

root     15872 15663  0 13:49 ?        00:00:01 csa102Comp -p
root     16328 15663  0 13:56 ?        00:00:00 csa102Comp -p
root     17304 16145  0 14:11 pts/4    00:00:00 grep csa102

Notice the two entries that end with csa102Comp -p. These are our two component processes. The first one is usually the active process. This is the one that we will kill. In this case the process ID is 15872. So to kill the active component you issue the command:

# kill -9 15872

Note: If this step does not result in the active component being killed, then it is likely that the standby component was killed. In this case simply try killing the other process.

After executing the kill command you can see in the csa102CompI1 application log that the standby component is now active.

/root/asp/var/log/csa102CompI1Log.latest
Sun Jul 13 23:00:18 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00487 :   INFO)
 csa102: Hello World!       .

Sun Jul 13 23:00:19 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00488 :   INFO)
 csa102: Hello World!        .

Sun Jul 13 23:00:20 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00489 :   INFO)
 csa102: Hello World!         .

Sun Jul 13 23:00:21 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00490 :   INFO)
 csa102: Hello World!          .

Sun Jul 13 23:00:22 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00491 :   INFO)
 csa102: Hello World! .

Sun Jul 13 23:00:23 2008   (SCNodeI0.13422 : csa102CompEO.---.---.00492 :   INFO)
 csa102: Hello World!  .     .
  

This indicates that the standby component has taken over for the failed active component.

Looking in the csa102CompI0 application log you can see that this component was killed and has been restarted. Since csa102CompI1 took over as the active component this component now goes into the standby state.

/root/asp/var/log/csa102CompI0Log.latest
Sun Jul 13 22:53:02 2008   (SCNodeI0.13712 : csa102CompEO.---.---.00040 :   INFO)
 Component [csa102CompI0] : PID [13712]. Initializing

Sun Jul 13 22:53:02 2008   (SCNodeI0.13712 : csa102CompEO.---.---.00041 :   INFO)
    IOC Address             : 0x1

Sun Jul 13 22:53:02 2008   (SCNodeI0.13712 : csa102CompEO.---.---.00042 :   INFO)
    IOC Port                : 0x80

Sun Jul 13 22:53:02 2008   (SCNodeI0.13712 : csa102CompEO.---.---.00043 :   INFO)
 csa102: Instantiated as component instance csa102CompI0.

Sun Jul 13 22:53:02 2008   (SCNodeI0.13712 : csa102CompEO.---.---.00044 :   INFO)
 csa102CompI0: Waiting for CSI assignment...
  

You can continue to observe this failover by alternately killing the active component.
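
The role swap observed in the logs above can be captured in a toy model. This is not AMF code: `pair_t`, `pair_init`, `pair_kill` and `pair_active` are hypothetical names, and the model assumes that a killed component is always restarted and rejoins as standby, as described for the killed active component above.

```c
/* Toy model of a two-component redundancy pair. */
typedef enum { ROLE_STANDBY, ROLE_ACTIVE } role_t;

typedef struct {
    role_t role[2];      /* role of component 0 and component 1 */
    int    restarts[2];  /* how often each component was restarted */
} pair_t;

/* Component 0 starts as active, component 1 as standby. */
void pair_init(pair_t *p)
{
    p->role[0] = ROLE_ACTIVE;
    p->role[1] = ROLE_STANDBY;
    p->restarts[0] = 0;
    p->restarts[1] = 0;
}

/* Kill component idx (0 or 1). If it was active, its peer is promoted.
 * The killed component is restarted and rejoins as standby (assumption). */
void pair_kill(pair_t *p, int idx)
{
    int peer = 1 - idx;

    if (p->role[idx] == ROLE_ACTIVE)
        p->role[peer] = ROLE_ACTIVE;
    p->role[idx] = ROLE_STANDBY;
    p->restarts[idx]++;
}

/* Index of the component that currently holds the active role. */
int pair_active(const pair_t *p)
{
    return p->role[0] == ROLE_ACTIVE ? 0 : 1;
}
```

Killing the active component moves the active role to its peer, while killing the standby leaves the active role where it was, which is why the note above suggests trying the other process if the "Hello World!" output does not move.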

To stop csa102, use the following SAFplus Platform Console commands:

cli[Test:SCNodeI0:CPM]-> amsLockAssignment sg csa102SGI0

Successfully changed state of csa102SGI0 to LockAssignment

cli[Test:SCNodeI0:CPM]-> amsLockInstantiation sg csa102SGI0
cli[Test:SCNodeI0:CPM] -> end
cli[Test:SCNodeI0] -> end
cli[Test] -> bye

Successfully changed state of csa102SGI0 to LockInstantiation and exit.

Summary

This sample application has covered basic HA and failover, including how a component changes state between active and standby.