Monitoring and Event Management

Monitoring and event management is another service management practice whose purpose you need to recall, and this article also defines one key term.
The purpose of the monitoring and event management practice is to systematically observe services and service components, and to record and report selected changes of state identified as events. "Event" is the term that's going to be defined here. This practice identifies and prioritizes infrastructure, services, business processes, and information security events, and then establishes the appropriate response to those events, including responding to conditions that could lead to potential faults or incidents. In other words, this practice is how you and your organization notice trouble early and decide how to respond to it.


Now on to the term that you need to memorize: event. An event is any change of state that has significance for the management of a configuration item (CI) or an IT service. For example, an event might be a successful login or a failed login. It might be that bandwidth usage exceeds a certain threshold. It might be that the free space on a server drops below a certain threshold. These are all things that could be considered events. You'll want to decide what action, if any, to take based on these events.
What you need to do depends on what the event is. Depending on its severity, the event might be informational, a warning, or an alert. If it's something informational, like a successful login, that's fine: you don't have to take any action, and you might just log it so that you have documentation. But if you have something like a login failure that happened three times, that might become a warning, and you'll want to figure out why that account failed to log in three times. If you have something like the free space on a server dropping below 10 gigabytes, that might be an alert, because the server could crash if it ran out of disk space.
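To make the three severity levels concrete, here is a minimal Python sketch of how the examples above could be classified. It is only an illustration, not part of the practice itself: the `Event` class, the `classify` function, the event kind names, and the two threshold constants are all hypothetical, and the thresholds simply reuse the numbers from the examples (three failed logins, 10 gigabytes of free space).

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFORMATIONAL = "informational"
    WARNING = "warning"
    ALERT = "alert"


# Hypothetical thresholds, taken from the examples in the text.
FAILED_LOGIN_WARNING_COUNT = 3
FREE_SPACE_ALERT_GB = 10.0


@dataclass
class Event:
    kind: str           # assumed names: "login_success", "login_failure", "disk_free"
    value: float = 0.0  # failure count, or free space in gigabytes


def classify(event: Event) -> Severity:
    """Map an event to a severity using the example rules from the text."""
    if event.kind == "login_failure" and event.value >= FAILED_LOGIN_WARNING_COUNT:
        return Severity.WARNING
    if event.kind == "disk_free" and event.value < FREE_SPACE_ALERT_GB:
        return Severity.ALERT
    # Everything else is only logged for documentation.
    return Severity.INFORMATIONAL
```

With these assumed rules, `classify(Event("login_success"))` is informational, `classify(Event("login_failure", 3))` is a warning, and `classify(Event("disk_free", 8.5))` is an alert. In a real organization the event types and thresholds would come from your own monitoring policy.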


So that's the idea here with monitoring and event management - you want to make sure you understand what events there are and monitor them to decide what actions need to be taken. Some of those actions get handed off to other practices, such as problem management or incident management. But again, here inside of monitoring and event management, you're really focused on the event itself, which is any change of state that has significance for the management of a configuration item (CI) or an IT service.