Calculate criteria propagation values
Each metric has a numeric value. Let’s say X.
Also, for each metric, we expect it to have a state, i.e. red/yellow/green, representing a level of compliance.
By default, there are three states and a default numeric value for each of them:
- Green: 100
- Yellow: 85
- Red: 0
Every modelled node (a server, an application, etc.) has a value and a state. However, each node that is not a metric (that is, not a leaf node) has several dimensions: availability, capacity, service desk, service level, etc. That means each such node will have up to 4 values (and 4 states associated with those values). Most nodes will have at least availability and capacity values.
For example:
Server A
---- Availability metric 1: Green (100)
---- Availability metric 2: Yellow (85)
---- Capacity metric 1: Red (0)
---- Capacity metric 2: Green (100)
So, depending on the propagation criteria in server A, we will calculate a value for A’s availability based on values (100, 85) and capacity based on values (0,100).
Given a cluster of X elements, the state of each dimension of the cluster (availability, capacity, service desk, etc.) has to be calculated based on the state of those elements.
Choosing “best child” will propagate the best state of the children. Choosing “worst child” will propagate the worst. However, clusters are defined to guarantee high availability, so propagating the worst child is not a good idea. Propagating the best child is not always the right choice either. So let’s define a special criterion for clusters.
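As a point of comparison, the two simple criteria can be sketched in Python using the Server A example and the default state values listed earlier. The function name `propagate` and the data layout are assumptions made for illustration only, not part of the model.

```python
# Default numeric value of each state, as listed in the text.
STATE_VALUES = {"green": 100, "yellow": 85, "red": 0}

# Child metrics of Server A, grouped by dimension.
server_a = {
    "availability": ["green", "yellow"],
    "capacity": ["red", "green"],
}

def propagate(children, criterion):
    """Return the (state, value) picked by the 'best' or 'worst' child criterion."""
    pick = max if criterion == "best" else min
    state = pick(children, key=lambda s: STATE_VALUES[s])
    return state, STATE_VALUES[state]

for dimension, children in server_a.items():
    print(dimension, propagate(children, "best"), propagate(children, "worst"))
```

With "worst child", a single red metric turns the whole dimension red, which is what makes this criterion too strict for clusters.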
In this criterion, we will consider each state independently: reds first, yellows second, etc.
The same calculation will be done for each state.
Two thresholds between 0 and 100 will be defined, A and B. B is the percentage of nodes in a given state needed to consider the cluster to be in that state. A is the percentage of nodes in that state needed to consider the cluster to be in (state - 1), that is, the next less critical state.
-----|A|-----|B|----
Let’s assume that A is 15% and B is 75%. That can be changed at any time in the modelling.
In a cluster of 5 elements, there should be 4 reds (80% > 75%) for the cluster to be red and 1 red (20% > 15%) for the cluster to be yellow.
Since the same calculation is done for each state, it has to be done for yellows as well:
In a cluster of 5 elements, 4 yellows are required for the cluster to be yellow; with only 1 yellow the cluster is green.
Summarizing, the state of the cluster will be:
worst (calculation based on reds, calculation based on yellows)
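The state rule above can be sketched as follows. This is a minimal illustration, assuming strict "greater than" comparisons against A and B (as in "80% > 75%") and assuming that a share at or below A implies green; the names `implied_state` and `cluster_state` are hypothetical.

```python
# States ordered from most to least critical, as in the text.
ORDER = ["red", "yellow", "green"]

def implied_state(state, share, a=15.0, b=75.0):
    """Cluster state implied by `share` percent of the nodes being in `state`."""
    if share > b:
        shift = 0      # enough nodes: the cluster takes the state itself
    elif share > a:
        shift = 1      # the cluster takes (state - 1), one level less critical
    else:
        shift = 2      # too few nodes in this state to matter
    return ORDER[min(ORDER.index(state) + shift, len(ORDER) - 1)]

def cluster_state(children, a=15.0, b=75.0):
    """Worst of the states implied by the reds and by the yellows."""
    implied = [
        implied_state(s, 100.0 * children.count(s) / len(children), a, b)
        for s in ("red", "yellow")
    ]
    return min(implied, key=ORDER.index)

print(cluster_state(["red"] * 4 + ["green"]))      # red
print(cluster_state(["red"] + ["green"] * 4))      # yellow
print(cluster_state(["yellow"] + ["green"] * 4))   # green
```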
So, for each state, the highest threshold marks the point at which the cluster takes that state itself, and each lower threshold marks the next less critical state, in order of criticality.
Let’s see some examples:
We start from a scenario where we have three states in order of criticality: critical, warning and ok
We are going to suppose that the availability thresholds are:
- CRITICAL: 30
- WARNING: 80
- OK: 100
Example 1:
SERVER1 CRITICAL
SERVER2 OK
SERVER3 OK
SERVER4 OK
SERVER5 OK
-------|A = 15%|-------|B = 75%|-------
Status calculation:
For critical nodes, B (75%) is the percentage above which the cluster is critical. Between 15% and 75% the cluster is in warning, and at 15% or below it is ok.
Critical nodes make up 20% of the cluster (1 server out of 5), which lies between 15% and 75%, so the cluster is in warning.
Value calculation:
Let’s calculate the value to be propagated, given that the state is WARNING:
State Range (SR) = threshold of the state - threshold of the next more critical state = 80 - 30 = 50
Maximum value of the state to propagate (MV) = 80
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 20 - 15 = 5
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 75 - 15 = 60
MV - SR * CR / TR = 80 - 50 * 5/60 = 80 - 4.17 = 75.83, rounded to 76
So 76 will be the value to be propagated
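The value formula can be written as a small helper. This is only a sketch: the function name `propagated_value` and its parameter names are hypothetical, and the inputs are the Example 1 figures.

```python
def propagated_value(mv, sr, share, band_low, band_high):
    """MV - SR * CR / TR, with CR = share - band_low and TR = band_high - band_low."""
    cr = share - band_low        # how far the share is into the band
    tr = band_high - band_low    # full width of the band
    return mv - sr * cr / tr

# Example 1: 20% critical nodes fall in the 15%..75% band, which means WARNING.
value = propagated_value(mv=80, sr=80 - 30, share=20, band_low=15, band_high=75)
print(round(value, 2))  # 75.83
```

The value moves linearly from MV at the bottom of the band towards MV - SR at the top, so a larger share of critical nodes always propagates a lower value.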
Example 2:
SERVER1 WARNING
SERVER2 OK
SERVER3 OK
SERVER4 OK
SERVER5 OK
-----|A = 15%|---------|B = 75%|----
Status calculation:
In this case 20% of the nodes (1 server out of 5) are in warning. Since more than 75% is needed for the cluster to be in warning, the cluster is ok.
Value calculation:
Let’s calculate the value to be propagated, given that the state is OK:
State Range (SR) = threshold of the state - threshold of the next more critical state = 100 - 80 = 20
Maximum value of the state to propagate (MV) = 100
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 20 - 0 = 20
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 75 - 0 = 75
MV - SR * CR / TR = 100 - 20 * 20/75 = 100 - 5.33 = 94.67, rounded to 95
So 95 will be the value to be propagated
Example 3:
SERVER1 WARNING
SERVER2 WARNING
SERVER3 WARNING
SERVER4 WARNING
SERVER5 CRITICAL
-----|A = 15%|---------|B = 75%|----
Status calculation:
Here 20% of the nodes are critical, which means -> WARNING (because it is over 15%)
And 80% are in warning, which also means -> WARNING (because it is over 75%)
So in this case we have a WARNING
Value calculation:
Let’s calculate the value to be propagated, given that the state is WARNING:
Since the warning comes from several levels of criticality (warning and critical), we calculate the value for each of them and propagate the worst one
WARNING:
State Range (SR) = threshold of the state - threshold of the next more critical state = 80 - 30 = 50
Maximum value of the state to propagate (MV) = 80
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 80 - 75 = 5
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 100 - 75 = 25
MV - SR * CR / TR = 80 - 50 * 5/25 = 80 - 50 * 0.2 = 70
CRITICAL:
State Range (SR) = threshold of the state - threshold of the next more critical state = 80 - 30 = 50
Maximum value of the state to propagate (MV) = 80
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 20 - 15 = 5
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 75 - 15 = 60
MV - SR * CR / TR = 80 - 50 * 5/60 = 80 - 4.17 = 75.83, rounded to 76
So in this case we will propagate 70 because it is the worst value
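When several levels of criticality lead to the same cluster state, the step of "taking the worst" amounts to a minimum over the candidate values. A minimal sketch with the Example 3 figures follows; the helper name `value_in_band` is hypothetical.

```python
def value_in_band(mv, sr, share, low, high):
    """MV - SR * CR / TR for a share that falls in the band low..high."""
    return mv - sr * (share - low) / (high - low)

# Both contributions land in WARNING, so MV = 80 and SR = 80 - 30 = 50.
from_warning = value_in_band(80, 50, share=80, low=75, high=100)
from_critical = value_in_band(80, 50, share=20, low=15, high=75)

# The worst (lowest) candidate is the one that gets propagated.
propagated = min(from_warning, from_critical)
print(propagated)  # 70.0
```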
Example 4:
SERVER1 WARNING
SERVER2 WARNING
SERVER3 WARNING
SERVER4 CRITICAL
SERVER5 CRITICAL
-----|A = 15%|---------|B = 75%|-----
Status calculation:
Here 40% of the nodes are critical, which means -> WARNING
And 60% are in warning, which means -> OK
So in this case we have a WARNING
Value calculation:
Let’s calculate the value to be propagated, given that the state is WARNING:
State Range (SR) = threshold of the state - threshold of the next more critical state = 80 - 30 = 50
Maximum value of the state to propagate (MV) = 80
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 40 - 15 = 25
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 75 - 15 = 60
MV - SR * CR / TR = 80 - 50 * 25/60 = 80 - 20.83 = 59.17, rounded to 59
So 59 will be the value to be propagated
Example 5:
SERVER1 OK
SERVER2 OK
SERVER3 WARNING
SERVER4 WARNING
SERVER5 WARNING
-----|A = 15%|---------|B = 75%|----
Status calculation:
Here 60% of the nodes are in warning, which means -> OK
So in this case we have an OK
Value calculation:
State Range (SR) = threshold of the state - threshold of the next more critical state = 100 - 80 = 20
Maximum value of the state to propagate (MV) = 100
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 60 - 0 = 60
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 75 - 0 = 75
MV - SR * CR / TR = 100 - 20 * 60/75 = 100 - 20 * 0.8 = 84
So 84 will be the value to be propagated
Example 6:
SERVER1 OK
SERVER2 CRITICAL
SERVER3 CRITICAL
SERVER4 CRITICAL
SERVER5 CRITICAL
-----|A = 15%|---------|B = 75%|----
Status calculation:
Here 80% of the nodes are critical, which means -> CRITICAL
So in this case we have a CRITICAL
Value calculation:
State Range (SR) = threshold of the state - threshold of the next more critical state (0, since critical is the most critical state) = 30 - 0 = 30
Maximum value of the state to propagate (MV) = 30
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 80 - 75 = 5
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 100 - 75 = 25
MV - SR * CR / TR = 30 - 30 * 5/25 = 30 - 30 * 0.2 = 24
So 24 will be the value to be propagated
Now consider a scenario with four states in order of criticality: critical, semicritical, warning and ok
We are going to suppose that the availability thresholds are:
- CRITICAL: 30
- SEMICRITICAL: 40
- WARNING: 80
- OK: 100
Example 1:
SERVER1 SEMICRITICAL
SERVER2 OK
SERVER3 SEMICRITICAL
SERVER4 WARNING
SERVER5 CRITICAL
-----|A = 15%|------|B = 45%|-----|C = 75%|-----
Status calculation:
Here 20% of the nodes are critical, which means -> WARNING
40% are semicritical, which means -> OK
And 20% are in warning, which means -> OK
So in this case we have a WARNING
Value calculation:
State Range (SR) = threshold of the state - threshold of the next more critical state = 80 - 40 = 40
Maximum value of the state to propagate (MV) = 80
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 20 - 15 = 5
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 45 - 15 = 30
MV - SR * CR / TR = 80 - 40 * 5/30 = 80 - 6.67 = 73.33, rounded to 73
So 73 will be the value to be propagated
Example 2:
SERVER1 CRITICAL
SERVER2 CRITICAL
SERVER3 CRITICAL
SERVER4 OK
SERVER5 WARNING
-----|A = 15%|------|B = 45%|-----|C = 75%|-----
Status calculation:
Here 60% of the nodes are critical, which means -> SEMICRITICAL
And 20% are in warning, which means -> OK
So in this case we have a SEMICRITICAL
Value calculation:
State Range (SR) = threshold of the state - threshold of the next more critical state = 40 - 30 = 10
Maximum value of the state to propagate (MV) = 40
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 60 - 45 = 15
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 75 - 45 = 30
MV - SR * CR / TR = 40 - 10 * 15/30 = 40 - 10 * 0.5 = 35
So 35 will be the value to be propagated
Example 3:
SERVER1 CRITICAL
SERVER2 SEMICRITICAL
SERVER3 SEMICRITICAL
SERVER4 SEMICRITICAL
SERVER5 WARNING
SERVER6 WARNING
-----|A = 15%|------|B = 45%|-----|C = 75%|-----
Status calculation:
Here 16.67% of the nodes are critical (1 server out of 6), which means -> WARNING
50% are semicritical, which means -> WARNING
And 33.33% are in warning, which means -> OK
So in this case we have a WARNING
Value calculation:
Since the warning comes from several levels of criticality (semicritical and critical), we calculate the value for each of them and propagate the worst one
SEMICRITICAL:
State Range (SR) = threshold of the state - threshold of the next more critical state = 80 - 40 = 40
Maximum value of the state to propagate (MV) = 80
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 50 - 45 = 5
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 75 - 45 = 30
MV - SR * CR / TR = 80 - 40 * 5/30 = 80 - 6.67 = 73.33
CRITICAL:
State Range (SR) = threshold of the state - threshold of the next more critical state = 80 - 40 = 40
Maximum value of the state to propagate (MV) = 80
Cluster State Range to propagate (CR) = state percentage - minimum percentage needed to reach the state = 16.67 - 15 = 1.67
Total Range of the cluster state to propagate (TR) = maximum percentage of the state band - minimum percentage needed to reach the state = 45 - 15 = 30
MV - SR * CR / TR = 80 - 40 * 1.67/30 = 80 - 2.22 = 77.78
So in this case we will propagate 73.33 because it is the worst value
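Putting the pieces together, the whole criterion can be sketched for any number of states. This is one possible reading of the examples, not a definitive implementation. It assumes strict "greater than" comparisons against the percentage thresholds, that a state with no nodes contributes OK with value 100, and that the worst candidate is the one with the most critical state and then the lowest value. The function name `cluster_propagation` and its arguments are hypothetical.

```python
def cluster_propagation(children, state_values, cuts):
    """Return (state, value) for a cluster.

    children:     list of child state names
    state_values: (name, threshold) pairs, most critical first (two or more)
    cuts:         ascending percentage thresholds (A, B, ...), one fewer than states
    """
    if not children or len(cuts) != len(state_values) - 1:
        raise ValueError("need children and one threshold fewer than states")
    names = [name for name, _ in state_values]
    values = [value for _, value in state_values]
    last = len(names) - 1
    lows = sorted(cuts, reverse=True) + [0.0]      # lower bound of each band
    highs = [100.0] + sorted(cuts, reverse=True)   # upper bound of each band
    candidates = []
    for i in range(last):                          # the least critical state is skipped
        share = 100.0 * children.count(names[i]) / len(children)
        # Number of levels the cluster drops below state i for this share.
        shift = next(k for k, low in enumerate(lows) if share > low or k == last)
        target = min(i + shift, last)
        if target == last:
            low, high = 0.0, highs[last - i]       # every remaining band means OK
        else:
            low, high = lows[shift], highs[shift]
        mv = values[target]
        sr = mv - (values[target - 1] if target > 0 else 0)
        candidates.append((target, mv - sr * (share - low) / (high - low)))
    target, value = min(candidates)                # most critical state, then lowest value
    return names[target], value

four = [("CRITICAL", 30), ("SEMICRITICAL", 40), ("WARNING", 80), ("OK", 100)]
servers = ["CRITICAL"] * 3 + ["OK", "WARNING"]
print(cluster_propagation(servers, four, [15, 45, 75]))  # ('SEMICRITICAL', 35.0)
```

The same function reproduces the three-state examples when called with the three thresholds (30, 80, 100) and the percentages 15 and 75.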