Every change in a computer system generates events.
But in ITIL, you don't just watch for events that go wrong or create problems.
All events should be promptly identified, monitored and rectified if necessary.
Comprehensive event management makes IT operations more proactive and gives a more complete view of the infrastructure.
Definition of ITIL event management
Event management monitors the events that occur throughout the IT infrastructure, including those triggered by changes and improvements. This allows normal operations to continue while “exception conditions”, also called “exceptional events”, are detected.
What is an “exceptional event” in ITIL?
An exceptional event is an event that signals a problem, such as a server failure. Thousands, if not millions, of events occur every day in an IT infrastructure, and only a few of them are exceptional.
An event is any change of a configuration item (CI) from one state to another within the IT infrastructure. Exceptional events are significant because they require a response to remedy them.
For example, a server going from online to idle mode is an event. It is worth recording because it means that action can be taken if necessary. It only counts as an exception, however, if something has gone wrong and immediate action is required.
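The distinction between an ordinary event and an exception can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the state names and the `FAILURE_STATES` set are hypothetical and are not defined by ITIL.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: states that demand a response. A real organization defines its own.
FAILURE_STATES = {"failed", "unreachable"}


@dataclass
class Event:
    ci_name: str
    old_state: str
    new_state: str

    @property
    def is_exception(self) -> bool:
        # An event is exceptional only if the new state requires a response.
        return self.new_state in FAILURE_STATES


@dataclass
class ConfigurationItem:
    name: str
    state: str
    events: List[Event] = field(default_factory=list)

    def set_state(self, new_state: str) -> Optional[Event]:
        """Record an event only when the state actually changes."""
        if new_state == self.state:
            return None
        event = Event(self.name, self.state, new_state)
        self.state = new_state
        self.events.append(event)
        return event


server = ConfigurationItem("app-server-01", "online")
idle_event = server.set_state("idle")      # an event, but not an exception
failed_event = server.set_state("failed")  # an event that is also an exception
print(idle_event.is_exception, failed_event.is_exception)
```

Both state changes are recorded as events, but only the second one is flagged as requiring action.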
Event management tools
Notifications about events are sent either by the configuration items (CIs) themselves or by monitoring tools. There are two types of monitoring tools:
Active monitoring tools - query the status and availability of key configuration items. An exception generates an alert.
Passive monitoring tools - detect and correlate the alerts that CIs generate themselves.
For example, let's say that a switch on a network needs to stay “on”. An active monitoring tool would confirm this by sending “pings” to the switch at regular intervals. A failure to respond would be logged as a status change and would trigger an alert so that someone can act on it.
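The switch example can be sketched as a single polling step. This is a minimal sketch, not a real monitoring tool: the `poll` function and the switch name are hypothetical, and the `probe` callable stands in for an actual ping, so that the example runs without network access.

```python
from typing import Callable, Optional, Tuple


def poll(name: str, last_state: str,
         probe: Callable[[str], bool]) -> Tuple[str, Optional[str]]:
    """Probe a CI once and return (new_state, alert).

    `probe` is an assumed stand-in for a real ping: it returns True
    if the device answered and False if it did not.
    """
    new_state = "on" if probe(name) else "off"
    alert = None
    # Only a change to "off" is treated as an exception that needs an alert.
    if new_state != last_state and new_state == "off":
        alert = f"{name}: no response to ping, state changed from {last_state} to {new_state}"
    return new_state, alert


# Simulated probes: one device that answers and one that does not.
state, alert = poll("switch-7", "on", lambda _name: True)
print(state, alert)

state, alert = poll("switch-7", "on", lambda _name: False)
print(state, alert)
```

In a real tool the polling step would run on a schedule, and the alert would be routed to whoever is responsible for the switch.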
Examples of event management
Events fall into one of three categories:
Information - a task completed successfully, such as a user login or an email reaching its intended recipient.
Warning - a device or service reaches a threshold, such as a scheduled backup that did not run or a server's free memory falling below 10% of its usable memory.
Exception - an error raised when a system component behaves abnormally, such as a server that is down or a backup that failed.
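The three categories can be illustrated with a small classifier for a server. The function name, its parameters and the idea of checking reachability first are assumptions made for this sketch. The 10% threshold comes from the memory example above.

```python
from enum import Enum


class Category(Enum):
    INFORMATION = "information"
    WARNING = "warning"
    EXCEPTION = "exception"


def classify_server(reachable: bool, free_memory_ratio: float,
                    warn_below: float = 0.10) -> Category:
    """Classify a server status report into one of the three categories.

    free_memory_ratio is free memory divided by usable memory (0.0 to 1.0).
    """
    if not reachable:
        # A server that is down is abnormal behaviour: an exception.
        return Category.EXCEPTION
    if free_memory_ratio < warn_below:
        # Still running, but a threshold has been crossed: a warning.
        return Category.WARNING
    # Nothing unusual: the report is purely informational.
    return Category.INFORMATION


print(classify_server(True, 0.50).value)
print(classify_server(True, 0.05).value)
print(classify_server(False, 0.50).value)
```

Checking for the exception first means that a server that is down is never reported as a mere warning.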
Event management measures
You can define event management metrics during the design phase of IT services. Decide what types of events should be generated and how they will be generated for each type of CI. Typical event management measures include:
Observing services and components is essential for the system to function properly. You should therefore regularly record and report the selected state changes in the system, that is, the events. This helps you prioritize services and processes. In other words, knowing the cause of a problem and fixing it allows you to recognize the problem and stop it before it happens again.
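Recording and reporting events can be as simple as counting them per category. The sketch below assumes that events have already been recorded as (CI name, category) pairs. The sample data and the `summarize` helper are invented for illustration only.

```python
from collections import Counter

# Hypothetical recorded events: (CI name, category).
recorded_events = [
    ("app-server-01", "information"),
    ("app-server-01", "warning"),
    ("backup-job", "exception"),
    ("app-server-01", "information"),
]


def summarize(events):
    """Count recorded events per category for a periodic report."""
    return Counter(category for _ci, category in events)


report = summarize(recorded_events)
for category, count in report.most_common():
    print(f"{category}: {count}")
```

A report like this shows at a glance how many events needed attention, which can help decide which services and processes to prioritize.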