LogService

The DIET platform can be monitored using a system called LogService. This monitoring service lets you relay selected information from the platform to external tools. As shown in Figure 12.1, LogService is composed of three modules: LogComponent, LogCentral and LogTool.

Figure 12.1: DIET and LogService.

- A LogComponent is attached to a component and relays information and messages to LogCentral. LogComponents are typically used within the components one wants to monitor.
- LogCentral collects the messages received from LogComponents, then stores them or sends them to LogTools.
- LogTools connect to LogCentral and wait for messages. LogTools are typically used within monitoring tools.

The main benefit of LogService is that information is collected at a central point, LogCentral, which receives logEvents from the LogComponents attached to DIET elements (MA, LA and SeD). LogCentral can forward this information to several tools (LogTools), which are responsible for analysing these messages and presenting comprehensive information to the user.

LogService defines and implements several functionalities:

Filtering mechanisms
As few messages as possible should be sent to minimize network traffic. With respect to the three-tier model, communications should be minimized both between the applications (the LogComponents) and the collector (LogCentral), and between the collector and the monitoring tools (the LogTools). When a LogTool registers with LogCentral, it also registers a filter defining which messages the tool requires.
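To make the filtering idea concrete, here is a minimal sketch in Python of a collector that forwards a message only to the tools whose filter accepts it. It is not the LogService API: the class names (LogMessage, ToolFilter, Collector), the filter fields and the tag names used in the example are all hypothetical, and the sketch assumes a filter is simply a set of component names plus a set of tags.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogMessage:
    component: str  # name of the emitting component (hypothetical field)
    tag: str        # message type (hypothetical field)
    text: str


@dataclass
class ToolFilter:
    """Hypothetical filter: an empty set means 'accept everything'."""
    components: set = field(default_factory=set)
    tags: set = field(default_factory=set)

    def accepts(self, msg):
        component_ok = not self.components or msg.component in self.components
        tag_ok = not self.tags or msg.tag in self.tags
        return component_ok and tag_ok


class Collector:
    """Stand-in for the central collector: one inbox per registered tool."""

    def __init__(self):
        self._tools = {}  # tool name -> (filter, list of delivered messages)

    def register_tool(self, name, tool_filter):
        # The filter is supplied at registration time, as described above.
        self._tools[name] = (tool_filter, [])

    def publish(self, msg):
        # Deliver the message only to tools whose filter accepts it.
        for tool_filter, inbox in self._tools.values():
            if tool_filter.accepts(msg):
                inbox.append(msg)

    def inbox(self, name):
        return list(self._tools[name][1])
```

In this sketch, a tool registered with `ToolFilter(tags={"TASK_START"})` would only ever see messages carrying that (made-up) tag, while a tool registered with an empty `ToolFilter()` would receive everything.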

Message ordering
Event ordering is another important feature of a monitoring system. LogService handles this problem by introducing a global timeline. Each message receives a time-stamp when it is generated. However, the system time can differ from host to host. LogService measures this difference internally and corrects the time-stamps of incoming messages accordingly. The correction uses the time difference measured during the last ping that LogCentral sent to the LogComponent (pings are sent periodically to verify the “aliveness” of the LogComponent).
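The text does not say how LogService measures the clock difference, so the following Python sketch only illustrates one common approach: estimating the offset from a ping round trip under the assumption that network delay is the same in both directions. The function names and parameters are hypothetical.

```python
def estimate_offset(sent_at, remote_clock, received_at):
    """Estimate (remote clock - local clock) from one ping round trip.

    sent_at / received_at: local times at which the ping left and the reply
    arrived. remote_clock: the remote host's time reported in the reply.
    Assumes the reply was stamped halfway through the round trip.
    """
    midpoint = (sent_at + received_at) / 2.0
    return remote_clock - midpoint


def correct_timestamp(remote_timestamp, offset):
    """Map a time-stamp taken on the remote host onto the local timeline."""
    return remote_timestamp - offset
```

For example, if a ping sent at local time 100.0 returns at 100.5 and the remote host reported 105.25, the estimated offset is 5.0 seconds, so a message stamped 110.0 on that host is placed at 105.0 on the collector's timeline.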

However, incoming messages are still unsorted. The messages are therefore buffered for a short period of time in order to deliver a sorted stream of messages to the tools. Messages that arrive out of order within this period are sorted in the buffer and can thus be properly delivered. Although this induces a delivery delay for messages, this mechanism guarantees the proper ordering of messages within a certain tolerance. As tools should not rely on true real-time delivery of messages, this short delay is acceptable.
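The buffering step can be sketched with a priority queue keyed on the corrected time-stamp. This is an illustration in Python, not the LogService implementation; the class name ReorderBuffer and the `delay` parameter (the tolerance window, in seconds) are assumptions.

```python
import heapq
import itertools


class ReorderBuffer:
    """Hold messages for `delay` seconds, then release them in time-stamp order."""

    def __init__(self, delay):
        self.delay = delay
        self._heap = []
        # Tie-breaker so that equal time-stamps keep their arrival order
        # and message payloads are never compared with each other.
        self._counter = itertools.count()

    def push(self, timestamp, message):
        heapq.heappush(self._heap, (timestamp, next(self._counter), message))

    def pop_ready(self, now):
        """Return, oldest first, every message stamped at or before now - delay."""
        ready = []
        while self._heap and self._heap[0][0] <= now - self.delay:
            timestamp, _, message = heapq.heappop(self._heap)
            ready.append((timestamp, message))
        return ready
```

A message that arrives more than `delay` seconds late can still be delivered after a newer one; this is the "certain tolerance" mentioned above. A larger window tolerates more disorder at the cost of a longer delivery delay.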

The System State Problem
A problem that arises in distributed environments is keeping track of the state of the application. This state may, for example, contain information on connected servers, their relationships, the active tasks and many other pieces of information that depend on the application. The system state can be constructed from all events that occurred in the application. Some tools rely on this state to work properly.

The problem emerges if those specific tools do not receive all messages. This might occur as tools can connect to the monitor after the application has been started. In fact, this is quite probable as the lifetime of the distributed application can be much longer than the lifetime of a tool.

As a consequence, the system state must be maintained and stored. In order to maintain a system state in a general way, LogService does not store the system state itself, but all messages which are required to construct it. Those messages are identified by their tag and stored in a special list. This list is forwarded to each tool that connects. For the tool this process is transparent, since it simply receives a number of messages that represent the state of the application.

In order to further refine this concept, the list of important messages can also be cleaned up by LogService. This is necessary as components may connect and disconnect at runtime. After a disconnection of a component the respective information is no longer relevant for the system state. Therefore, all messages which originated at this component can be removed from the list. They have become obsolete due to the disconnection of the component and can be safely deleted in order to reduce the length of the list of important messages to a minimum.
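The state-keeping scheme described above (store the state-relevant messages, replay them to each newly connected tool, and drop a component's messages when it disconnects) can be sketched as follows. This Python sketch is illustrative only: the class StateStore, its method names and the tag names in the example are hypothetical, and it assumes state-relevant messages are recognised by membership in a configured set of tags.

```python
class StateStore:
    """Keep the messages needed to rebuild the system state."""

    def __init__(self, state_tags):
        self._state_tags = set(state_tags)  # tags that mark state-relevant messages
        self._messages = []                 # (component, tag, text) in arrival order

    def record(self, component, tag, text):
        # Only messages whose tag marks them as state-relevant are stored.
        if tag in self._state_tags:
            self._messages.append((component, tag, text))

    def component_disconnected(self, component):
        # Messages from a disconnected component are obsolete: drop them.
        self._messages = [m for m in self._messages if m[0] != component]

    def replay(self):
        # Copy of the list that would be sent to a newly connected tool.
        return list(self._messages)
```

A tool that connects late would first receive the result of `replay()` and then the live stream, so from its point of view the state messages look like any other messages.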

All DIET components implement the LogComponent interface. Through it, the DIET architecture relays information to LogCentral, and a LogTool can then connect to LogCentral to collect, store and analyse this information. LogService is available for download. See the web page http://graal.ens-lyon.fr/DIET/logservice.html for more information.

The DIET Team - Wed 29 Nov 2017 15:13:36 EST