New monitoring modules are easy to develop. A module
may use SNMP requests or simply run a script (locally
or on a remote system) to collect the requested values.
The mechanisms to run modules in independent threads,
to interact with the operating system, and to control
an SNMP session are inherited from a basic monitoring
class. The user only has to provide the mechanism to
collect the values, parse the output, and generate a
result object. The module must also provide the names
of the parameters that it collects.
While the modules currently provided with MonALISA
are integrated in the service binary distribution, the
source code of some example modules is provided in the
${MonaLisa_HOME}/Service/usr_code directory. This is
also the directory in which users can develop their
own modules. The next section
contains instructions for creating and running new
modules.
1.2. How to Write a New Module
Creating a new module means writing a class that
extends the lia.Monitor.monitor.cmdExec class and
implements the lia.Monitor.monitor.MonitoringModule
interface. This interface has the following structure:
package lia.Monitor.monitor;
public interface MonitoringModule extends lia.util.DynamicThreadPoll.SchJobInt {
public MonModuleInfo init( MNode node, String args ) ;
public String[] ResTypes() ;
public String getOsName();
public Object doProcess() throws Exception ;
public MNode getNode();
public String getClusterName();
public String getFarmName();
public boolean isRepetitive() ;
public String getTaskName();
public MonModuleInfo getInfo();
}
The doProcess function is the one that collects and
returns the results. The return value is usually a
vector of lia.Monitor.monitor.Result objects, but it
can also be a single Result object.
The init function initializes the information the
module needs, such as the name of the cluster that
contains the monitored nodes, the name of the farm, and
the parameters for this module. It is the first
function called when the farm loads the module. Its
second parameter holds the arguments provided for the
module in the farm configuration file (see the section
on activating the modules); the module should parse
this string to obtain the parameter values.
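As an illustration of the parsing step, the sketch below
splits an argument string into name/value pairs. The
comma-separated key=value format and the ModuleArgs class
are assumptions made for this example only; they are not
part of MonALISA, and the real format is whatever your
module expects to find in the farm configuration file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of MonALISA: parses a module argument
// string such as "host=node1, port=161" into an ordered map.
final class ModuleArgs {

    private ModuleArgs() {
    }

    static Map<String, String> parse(String args) {
        Map<String, String> params = new LinkedHashMap<>();
        if (args == null || args.trim().isEmpty()) {
            return params; // no arguments configured
        }
        for (String token : args.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) {
                continue; // skip empty entries such as "a,,b"
            }
            int eq = t.indexOf('=');
            if (eq < 0) {
                params.put(t, ""); // flag without a value
            } else {
                params.put(t.substring(0, eq).trim(),
                           t.substring(eq + 1).trim());
            }
        }
        return params;
    }
}
```

A module's init function could call such a helper on its
second parameter and keep the resulting map for later use
in doProcess.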
The isRepetitive function tells whether the module
collects results only once or repeatedly. It returns
the value of the module's isRepetitive boolean
variable. If it is true, the module is called
periodically. The repetition interval is specified in
the <farm>.conf file; if it is not specified there, the
default interval is 30 seconds.
The other functions return module information that is
usually set in the init() method. The source code
examples in usr_code provide models for writing these
functions.
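The collect, parse and build-a-result sequence described
in this section can be outlined as follows. This is only
a sketch that compiles without the MonALISA jars:
SketchResult is a hypothetical stand-in for
lia.Monitor.monitor.Result, LoadModuleSketch does not
extend cmdExec, and the parameter names Load1 and Load15
as well as the whitespace-separated output format are
assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for lia.Monitor.monitor.Result: one set of
// named values measured on one node at one time.
final class SketchResult {
    final String nodeName;
    final long time;
    final Map<String, Double> values = new LinkedHashMap<>();

    SketchResult(String nodeName, long time) {
        this.nodeName = nodeName;
        this.time = time;
    }
}

// Outline of a module: collect raw output, parse it, build a result.
final class LoadModuleSketch {
    private static final String[] RES_TYPES = {"Load1", "Load5", "Load15"};

    private final String nodeName;
    private final Supplier<String> collector; // e.g. runs a script

    LoadModuleSketch(String nodeName, Supplier<String> collector) {
        this.nodeName = nodeName;
        this.collector = collector;
    }

    // Names of the parameters this module collects.
    String[] ResTypes() {
        return RES_TYPES.clone();
    }

    // Collects the values, parses the output and returns the results.
    List<SketchResult> doProcess() throws Exception {
        String raw = collector.get();
        if (raw == null) {
            throw new Exception("no output from collector");
        }
        String[] fields = raw.trim().split("\\s+");
        if (fields.length < RES_TYPES.length) {
            throw new Exception("unexpected output: " + raw);
        }
        SketchResult r = new SketchResult(nodeName, System.currentTimeMillis());
        for (int i = 0; i < RES_TYPES.length; i++) {
            r.values.put(RES_TYPES[i], Double.parseDouble(fields[i]));
        }
        List<SketchResult> out = new ArrayList<>();
        out.add(r);
        return out;
    }
}
```

The collector is passed in as a Supplier so that the
parsing logic can be exercised with a fixed string; in a
real module the collection step would rely on what the
base monitoring class provides.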
1.3. How to Activate a New Module
.... (myFarm.conf) ...
For MonALISA to be able to load the new module, the
path to the module's directory must be added to the
lia.Monitor.CLASSURLs property in the
${MonaLisa_HOME}/Service/ml.properties file. For
example:
lia.Monitor.CLASSURLs=file:${MonaLisa_HOME}/Service/usr_code/MyModule/
Multiple directories can be specified here, separated
by commas.
Examples of new modules can be found in
${MonaLisa_HOME}/Service/usr_code.
usr_code/MDS contains an example that writes the
received values into MDS. It uses a Unix pipe to
communicate between the dynamically loadable Java
module and the script that performs the update in the
LDAP server.
usr_code/SimpleWriter is a simple example that prints
all the values to standard output.
usr_code/UDPWriter is an example that writes the values
to UDP sockets.
2. Data Filters / Event Triggers
Filters make it possible to dynamically create any new
type of derived value from the collected values. For
example, a filter can evaluate the integrated traffic
over the last n minutes, or the number of nodes for which
the load is less than x. Filters may also send an email
to a list, or SMS messages, when predefined complex
conditions occur. Filters are executed in independent
threads, and any client can register for their output.
They may be used to help applications react when certain
conditions occur, or to present global values for large
computing facilities.
Each filter has its own thread in the MonALISA service,
so filters run independently of each other.
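The "integrated traffic over the last n minutes" case
mentioned above can be sketched as a sliding window. The
TrafficWindow class is hypothetical and independent of
the MonALISA filter API; it only shows the kind of
derived value a filter could compute from the samples it
receives.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: keeps traffic samples and reports the total
// over the last windowMillis milliseconds.
final class TrafficWindow {
    private static final class Sample {
        final long time;
        final double bytes;

        Sample(long time, double bytes) {
            this.time = time;
            this.bytes = bytes;
        }
    }

    private final long windowMillis;
    private final Deque<Sample> samples = new ArrayDeque<>();
    private double total;

    TrafficWindow(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Records one sample; samples are expected in time order.
    synchronized void add(long time, double bytes) {
        samples.addLast(new Sample(time, bytes));
        total += bytes;
    }

    // Drops samples that fell out of the window, then returns the
    // total of the remaining ones.
    synchronized double integrated(long now) {
        while (!samples.isEmpty()
                && samples.peekFirst().time <= now - windowMillis) {
            total -= samples.removeFirst().bytes;
        }
        return total;
    }
}
```

The methods are synchronized in case samples are added
from a different thread than the one that reads the
total.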
To write your own filters or triggers, follow these
steps:
- Your filter MUST extend
  lia.Monitor.Filters.GenericMLFilter.
- It must have a constructor with a String parameter
  (the farm name) in which you must call
  super(farmName). This constructor is used to
  dynamically instantiate your filter at runtime.
- Your filter MUST override the following methods:
  - public String getName()
    returns the filter name.
    It is a short name that identifies, in the
    client, the data sent by your filter. It is
    also used by MonALISA clients to inform the
    service that they are interested in the data
    processed by this filter. It MUST be unique,
    because all the filters in ML are identified
    by their name.
  - public monPredicate[] getFilterPred()
    returns an array of monPredicate(s).
    These predicates select, from the entire data
    flow, only the results that the filter is
    interested in. If the method returns null, the
    filter receives all the monitoring
    information.
  - public void notifyResult(Object o)
    This method is called every time a Result
    matches a predicate returned by
    getFilterPred(). The filter can save the
    result in a local buffer for later analysis,
    or it can take real-time decisions or actions
    if it is a trigger.
  - public Object expressResults()
    returns a vector of Gresults and/or Results.
    This method is called periodically to let the
    filter process the data that it has received.
    It should return a Vector of Gresults and/or
    Results that will be sent to all the
    registered clients, or null if no data should
    be sent to clients (e.g. the filter is a
    trigger).
  - public long getSleepTime()
    returns the interval, in milliseconds, between
    calls to expressResults().
    E.g.: if this method returns 2*60*1000, the
    function expressResults() will be called
    every 2 minutes.
- In your ml.properties file, add the path to the
  directory that holds the filter's .class files.
  The property is lia.Monitor.CLASSURLs (if there
  are several filters or directories, separate them
  with commas).
  E.g.:
  lia.Monitor.CLASSURLs=file:${MonaLisa_HOME}/Service/usr_code/FilterExamples/ExTrigger/
- In ml.properties you must also specify which
  filters should be loaded, separated by commas.
  E.g.:
  lia.Monitor.ExternalFilters=ExTrigger,ExLoadFilter
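The steps above can be outlined in code as follows. To
keep the sketch compilable without the MonALISA jars,
SketchFilterBase is a hypothetical stand-in for
lia.Monitor.Filters.GenericMLFilter that declares only
the methods listed above, with Object[] in place of
monPredicate[]. The filter name and the use of plain
numbers instead of Result objects are also assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for lia.Monitor.Filters.GenericMLFilter,
// reduced to the methods listed in the steps above.
abstract class SketchFilterBase {
    protected final String farmName;

    SketchFilterBase(String farmName) {
        this.farmName = farmName;
    }

    public abstract String getName();
    public abstract Object[] getFilterPred(); // monPredicate[] in the real class
    public abstract void notifyResult(Object o);
    public abstract Object expressResults();
    public abstract long getSleepTime();
}

// Buffers numeric values and periodically reports min, max and mean.
final class MinMaxMeanFilterSketch extends SketchFilterBase {
    private final List<Double> buffer = new ArrayList<>();

    MinMaxMeanFilterSketch(String farmName) {
        super(farmName); // constructor used for dynamic instantiation
    }

    @Override
    public String getName() {
        return "MinMaxMeanSketch"; // must be unique among filters
    }

    @Override
    public Object[] getFilterPred() {
        return null; // null: receive all monitoring information
    }

    @Override
    public synchronized void notifyResult(Object o) {
        if (o instanceof Number) {
            buffer.add(((Number) o).doubleValue());
        }
    }

    @Override
    public synchronized Object expressResults() {
        if (buffer.isEmpty()) {
            return null; // nothing to send to clients
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0;
        for (double v : buffer) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        Vector<Double> out = new Vector<>();
        out.add(min);
        out.add(max);
        out.add(sum / buffer.size());
        buffer.clear(); // start a new reporting interval
        return out;
    }

    @Override
    public long getSleepTime() {
        return 2 * 60 * 1000L; // expressResults() every 2 minutes
    }
}
```

A trigger would follow the same outline but act inside
notifyResult (for example when a value crosses a
threshold) and return null from expressResults.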
The Service/usr_code/FilterExamples directory contains
some simple examples of dynamic filters. One of them
(ExTrigger) is a simple alarm that sends an email if the
Load5 parameter on the master node reaches a threshold
value; the other one (ExLoadFilter) computes the min, max
and mean values for a cluster. The data flux between
MonALISA Service and clients can contain, more or less,
the following two classes:
Agents are entities loaded in the MonALISA service that
process the gathered monitoring data and communicate
with each other to solve a distributed task based on
these data.
An agent must conform to a given interface. Writing an
agent actually means creating a class that implements the
lia.Monitor.monitor.AgentI interface. This interface has
the following structure:
import lia.Monitor.DataCache.AgentsCommunication;
import lia.Monitor.monitor.AgentInfo ;
public interface AgentI {
public void init(AgentsCommunication comm);
public void doWork();
public String getName();
public String getGroup();
public String getAddress();
public AgentInfo getAgentInfo ();
public void processMsg(Object msg);
public void processErrorMsg (Object msg);
}
For an agent to be able to communicate, the
agent-to-agent communication environment has to be
initialized. An agent does this by implementing the init
method, which is called by the Agents Engine when it
first loads the agent.
Agents hosted on the monitoring service usually
communicate through the agents communication platform,
which is built over the TCP connections to all the proxy
services. The communication is reliable, secure, fast
and scalable.
The AgentsCommunication interface has methods to send
agent-to-agent messages (the sendMsg method) and
agent-to-proxy messages (the sendCtrlMsg method). The
latter are used to get information about other agents in
the distributed system (the list of agents in a group or
the number of agents in a group).
package lia.Monitor.DataCache;
public interface AgentsCommunication {
public void sendMsg (Object o);
public void sendCtrlMsg (Object o, String cmd);
}
Messages sent between agents have a specified
format:
public class AgentMessage implements java.io.Serializable {
public Integer messageID;
public Long timeStamp;
public Integer messageType;
public Integer priority;
public String agentAddrS;
public String agentAddrD;
public String agentGroupS;
public String agentGroupD;
public Integer messageTypeAck ;
public Object message ;
}
Messages sent between agents contain the following
fields:
- messageID - an integer sequence number for the
message.
- timeStamp - the time, in milliseconds, when the message
was sent from the source.
- messageType - the type of the message.
- priority - the message priority, a number between 1 and
10 (default 5). The proxy service forwards high-priority
messages faster than other messages.
- agentAddrS - the address of the source agent.
- agentAddrD - the address of the destination agent(s).
It can be a multicast address, in which case the message
is sent to all the agents registered in a group.
- agentGroupS - the group of the source agent. If the
source agent has not yet registered in a group, this
field is null. When the group is specified for the first
time, the agent is registered in it. If it is the first
agent to register in the specified group, the new group
is created in the proxy service.
- agentGroupD - the group of the destination
agent.
- messageTypeAck - if it is an ACK message, a
confirmation is required when it reaches the
destination.
- message - the actual content being transmitted. It can
be any serializable object.
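As an illustration, the fields can be filled in as shown
below. The AgentMessage class here is a local copy of the
structure listed above so that the example compiles on
its own. The helper class AgentMessageExample, the
addresses, and the value 0 used for messageType and
messageTypeAck are hypothetical; the actual type codes
are not given in this section.

```java
import java.io.Serializable;

// Local copy of the AgentMessage structure listed above.
class AgentMessage implements Serializable {
    public Integer messageID;
    public Long timeStamp;
    public Integer messageType;
    public Integer priority;
    public String agentAddrS;
    public String agentAddrD;
    public String agentGroupS;
    public String agentGroupD;
    public Integer messageTypeAck;
    public Object message;
}

// Hypothetical helper that builds a point-to-point message.
final class AgentMessageExample {

    private AgentMessageExample() {
    }

    static AgentMessage build(int id, String from, String to,
                              Serializable payload) {
        AgentMessage m = new AgentMessage();
        m.messageID = id;                          // sequence number
        m.timeStamp = System.currentTimeMillis();  // send time in ms
        m.messageType = 0;     // hypothetical type code
        m.priority = 5;        // default priority, range 1..10
        m.agentAddrS = from;   // agentName@farmID of the sender
        m.agentAddrD = to;     // agentName@farmID of the receiver
        m.agentGroupS = null;  // sender not registered in a group
        m.agentGroupD = null;  // not a group (multicast) message
        m.messageTypeAck = 0;  // hypothetical: no confirmation
        m.message = payload;   // any serializable object
        return m;
    }
}
```

Requiring a Serializable payload in the helper's
signature catches non-serializable objects at compile
time, even though the message field itself is declared
as Object.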
What an agent does is implemented in its doWork
function. An agent is loaded into the monitoring service
by calling the addAgent function of
lia.Monitor.DataCache.AgentsEngine.
Each time an agent is loaded, a new execution thread is
created. This thread executes the agent's doWork
function.
An agent is identified in the monitoring service by
its name, so every agent must have a unique name. Based
on this name and on the ID of the hosting monitoring
service, an agent has a distinct address in the whole
distributed system: agentName@farmID. An agent can also
register itself in an agent group. Agent groups make it
possible to send multicast messages to all agents
registered in a group. If the agent does not want to
register in a group, it does not set the group field.
The agent's name, group and address can be obtained by
calling the getName, getGroup, getAddress or getAgentInfo
methods. The last of these returns an object of type
AgentInfo, which contains all the information about an
agent. The lia.Monitor.monitor.AgentInfo class has the
following structure:
public class AgentInfo {
public String agentName;
public String agentGroup;
public String farmID;
public String agentAddr;
public AgentInfo (String agentName, String agentGroup, String farmID) {
this.agentName = agentName ;
this.agentGroup = agentGroup;
this.farmID = farmID;
this.agentAddr = agentName+"@"+farmID;
}
}
Messages can be received from other agents in the
distributed system. Messages are processed by the
processMsg method.
If a message sent by the agent cannot reach its
destination, an error message is returned to the sending
agent to notify it of the communication failure. The
error message is processed by the processErrorMsg
method.
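Putting the pieces of this section together, a minimal
agent could look like the sketch below. AgentI,
AgentsCommunication and AgentInfo are local copies of the
definitions shown in this section so that the code
compiles on its own. EchoAgentSketch, its fields, and the
plain String it sends (instead of an AgentMessage) are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local copies of the definitions shown in this section.
interface AgentsCommunication {
    void sendMsg(Object o);
    void sendCtrlMsg(Object o, String cmd);
}

class AgentInfo {
    public String agentName;
    public String agentGroup;
    public String farmID;
    public String agentAddr;

    public AgentInfo(String agentName, String agentGroup, String farmID) {
        this.agentName = agentName;
        this.agentGroup = agentGroup;
        this.farmID = farmID;
        this.agentAddr = agentName + "@" + farmID;
    }
}

interface AgentI {
    void init(AgentsCommunication comm);
    void doWork();
    String getName();
    String getGroup();
    String getAddress();
    AgentInfo getAgentInfo();
    void processMsg(Object msg);
    void processErrorMsg(Object msg);
}

// Hypothetical agent: announces itself once and records what it receives.
final class EchoAgentSketch implements AgentI {
    private final AgentInfo info;
    private volatile AgentsCommunication comm;
    final List<Object> received = Collections.synchronizedList(new ArrayList<>());
    final List<Object> failed = Collections.synchronizedList(new ArrayList<>());

    EchoAgentSketch(String name, String group, String farmID) {
        this.info = new AgentInfo(name, group, farmID);
    }

    @Override
    public void init(AgentsCommunication comm) {
        this.comm = comm; // keep the handle for later sends
    }

    @Override
    public void doWork() {
        // A real agent would build and send an AgentMessage here.
        if (comm != null) {
            comm.sendMsg("hello from " + info.agentAddr);
        }
    }

    @Override public String getName() { return info.agentName; }
    @Override public String getGroup() { return info.agentGroup; }
    @Override public String getAddress() { return info.agentAddr; }
    @Override public AgentInfo getAgentInfo() { return info; }

    @Override
    public void processMsg(Object msg) {
        received.add(msg); // message from another agent
    }

    @Override
    public void processErrorMsg(Object msg) {
        failed.add(msg); // a message that could not be delivered
    }
}
```

The lists are synchronized on the assumption that
incoming messages may be delivered on a thread other than
the one running doWork.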
An abstract class, lia.Monitor.Agents.AbstractAgent,
exists to simplify agent development. This class wraps
the AgentI interface, defining all the AgentI methods
except processMsg and doWork. It also provides a method
for creating messages:
public AgentMessage createMsg(
int messageID,
int messageType,
int messageTypeAck,
int priority,
String agentAddrD,
String agentGroupD,
Object message);