FireHydrant Runbooks can customize their execution based on various conditions. Conditions control when a Runbook attaches to an incident and when the steps within it execute. Any Custom Fields you configure are also available as conditions.
Runbook-level conditions determine whether a Runbook attaches to an incident. This allows you to create Runbooks tailored to specific situations, such as only SEV1 incidents, or only incidents where my-specific-service is impacted.
FireHydrant continuously scans your incidents for matching Runbook conditions. This means Runbooks can attach to any of your incidents at any point in the incident lifecycle (with some exceptions; see Condition Expiration below).
- Click Runbooks in the navigation, then click the Runbook you want to configure.
- On the summary page, click Edit Runbook to configure changes for the Runbook.
- On the edit page, you can configure Runbook conditions in the right panel under Execution rules.
Individual steps within Runbooks can also be conditionally executed.
- Available rules
- Step execution rules include all of the conditions available at the Runbook level, plus Previous Runbook step, which explicitly makes the current step depend on the previous step completing or failing before it executes.
- You can configure certain Runbook steps to repeat on intervals. See Repeating Steps.
- You can also configure whether a step should immediately evaluate and execute, or whether it should be triggered manually by a user. This can be done via the UI or via /fh runbooks in Slack (see below).
All steps within a Runbook execute concurrently as soon as the Runbook attaches. To ensure steps execute sequentially, use the Previous Runbook step condition and arrange the steps in the order you want them to execute.
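The ordering behavior described above can be illustrated with a small model. This is only a sketch, not FireHydrant code: the `Step` class, the `requires_previous` flag (standing in for the Previous Runbook step condition), the `execution_waves` function, and the step names are all hypothetical. It groups steps into "waves": steps without the condition start in the first wave, and a step with the condition starts one wave after the step listed before it.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    # Hypothetical flag standing in for the "Previous Runbook step" condition.
    requires_previous: bool = False


def execution_waves(steps: list[Step]) -> list[list[str]]:
    """Group step names by the earliest wave in which each could start."""
    waves: list[list[str]] = []
    previous_wave = -1
    for index, step in enumerate(steps):
        if step.requires_previous and index > 0:
            # Wait for the step listed immediately before this one.
            wave = previous_wave + 1
        else:
            # No dependency: starts as soon as the Runbook attaches.
            wave = 0
        while len(waves) <= wave:
            waves.append([])
        waves[wave].append(step.name)
        previous_wave = wave
    return waves


# Illustrative step names only.
steps = [
    Step("Create Jira ticket"),
    Step("Create Slack channel", requires_previous=True),
    Step("Notify on-call"),
]
print(execution_waves(steps))
```

In this model, "Create Jira ticket" and "Notify on-call" land in the first wave together, while "Create Slack channel" lands in the second because it depends on the step listed before it. This is why step order matters once the condition is in use.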
- Follow the same steps above to enter edit mode for the Runbook in question.
- Find the specific step to which you'd like to add conditions and click the writing icon on the right side.
- In the modal that opens for that step, click the Conditions & scheduling tab. This shows the options for applying rules that dictate when this specific step runs.
Some Runbook steps can repeat. For example, a step can remind the incident channel to post updates every 30 minutes. If a step can be repeated, you will see this as an extra option under Scheduling.
The shortest interval allowed today is 5 minutes.
Once a recurring step has been scheduled, you'll see it on the Runbooks tab of the Command Center page. You can also stop repeating Runbook steps from there.
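To make the interval rule concrete, here is a minimal sketch of how a repeating schedule could be computed. The function name `repeat_times` and the assumption that the first run happens at the start time are illustrative only and do not come from FireHydrant. The only value taken from the text is the 5-minute minimum.

```python
from datetime import datetime, timedelta

# Shortest repeat interval stated in the documentation above.
MIN_INTERVAL = timedelta(minutes=5)


def repeat_times(start: datetime, interval: timedelta, count: int) -> list[datetime]:
    """Return the first `count` run times of a repeating step.

    Assumes (for illustration) that the first run happens at `start`.
    """
    if interval < MIN_INTERVAL:
        raise ValueError("interval must be at least 5 minutes")
    return [start + interval * n for n in range(count)]


# Example: a reminder every 30 minutes starting at noon.
for run_at in repeat_times(datetime(2024, 1, 1, 12, 0), timedelta(minutes=30), 3):
    print(run_at.isoformat())
```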
Runbooks and steps do not evaluate their conditions indefinitely. FireHydrant expires condition checks for Runbooks and steps in the situations described below.
FireHydrant will stop checking incidents for matching Runbook conditions when one of the following occurs:
- The incident reaches the Resolved milestone
  - After an incident is resolved, no new Runbooks will automatically attach, even if conditions match. You will need to attach them manually. Even a condition such as "Current milestone = Resolved" will not automatically attach the Runbook.
- The Runbook has already attached
  - Detaching and reattaching the same Runbook to the same incident is not currently supported. Other Runbooks not yet attached will still be evaluated as normal.
- The incident is older than 30 days
  - All conditions expire after 30 days, even if an incident has not been resolved yet. Any changes to an incident after 30 days will not trigger a new Runbook attachment.
If you want Runbooks to attach post-resolution, you can work around this limitation by adding an Attach a Runbook step to other Runbooks that do attach pre-resolution.
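The three expiration rules for Runbook attachment can be summarized as a single check. The sketch below is a simplified model, not FireHydrant's implementation: the `Incident` class, its fields, and `still_evaluated` are hypothetical names, and it assumes incident age is measured from creation time, which the text does not specify.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Conditions expire after 30 days, per the rules above.
MAX_INCIDENT_AGE = timedelta(days=30)


@dataclass
class Incident:
    created_at: datetime  # assumption: age is measured from creation
    resolved: bool = False
    attached_runbooks: set[str] = field(default_factory=set)


def still_evaluated(incident: Incident, runbook: str, now: datetime) -> bool:
    """Return True if conditions for `runbook` would still be checked."""
    if incident.resolved:
        return False  # no automatic attachment after resolution
    if runbook in incident.attached_runbooks:
        return False  # a Runbook is not reattached to the same incident
    if now - incident.created_at > MAX_INCIDENT_AGE:
        return False  # all conditions expire after 30 days
    return True


now = datetime(2024, 3, 1)
incident = Incident(created_at=now - timedelta(days=2))
print(still_evaluated(incident, "sev1-response", now))
```

Note that a `False` result here only means automatic attachment stops; manual attachment and the Attach a Runbook step workaround described above are outside this model.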
FireHydrant will expire steps or certain aspects of steps when one of the following occurs:
- When the incident reaches the Resolved milestone
  - FireHydrant terminates any recurring steps when an incident is resolved.
  - This does not apply to post-incident steps/conditions (e.g. When current milestone is Retrospective Completed). Those will still execute as normal post-resolution.
- When a step has been polling for longer than 30 days
  - If a step has been waiting and checking for matching conditions for longer than 30 days, the step will expire.

The following conditions are available:

- Incident Slack channel - Whether the incident has an existing Slack channel
- Incident Microsoft Teams channel - Whether the incident has an existing Microsoft Teams channel
- Current milestone - If the current milestone matches the supplied conditions
- Current severity - If the current severity matches the supplied conditions
- Current priority - If the current priority matches the supplied conditions
- Previous Runbook step (STEP only) - Executes a Runbook step only if the prior step has started, errored, or completed.
- Incident tags - If the current tags match the supplied conditions
- Incident ticket - Whether the incident already has an incident ticket created
- Time since milestone [...] - Checks the time elapsed from when the incident first entered the specified milestone
- Incident assigned roles - Whether the specified role(s) has/have at least one person assigned
- Incident assigned teams - Whether the specified team(s) have been assigned to the incident
- Incident impacted infrastructure - Whether the incident has any of the specified Service Catalog components marked as impacted
- Incident impacted service tiers - Whether any of the specified Service Catalog components impacted on the incident match the specified tiers
- Incident attached Runbooks - Whether the specified Runbooks are also attached to the incident
- When invoked (RUNBOOK only) - Attaches the Runbook only when a user manually invokes it during an incident
- [Any Custom Field] - Specifies conditions according to your defined custom fields
When an incident reaches the Resolved milestone or beyond, Runbook automation switches to a delayed model that executes every 10 minutes.
For example, if you have a step to Archive Incident Channel when the Milestone is Retrospective Completed, it may not archive immediately after you've completed your retro.
However, steps will consistently execute within 10 minutes after an incident is resolved.
Several Runbook steps have prerequisites. As noted above, all steps within a Runbook execute concurrently upon attachment to an incident. Consequently, you'll need to be aware of potential race conditions for steps that depend on something another step creates. Some examples:
- Suppose you want to Add a Bookmark to Incident Channel. In the conditions for that step, you will need to require that the incident channel exists. This delays the step until the channel is confirmed created. Otherwise, the step may try to set a bookmark in a non-existent channel and fail.
- If you want to name the incident channel after the Jira ticket, either set the condition to Previous step has completed and put the Slack channel step just after the Jira step, or use the Incident ticket exists condition.
In general, if a Runbook step depends on some condition that another Runbook step changes, then you'll want to pay careful attention to the conditions of downstream, dependent steps to avoid race conditions.
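The effect of gating a dependent step on a condition can be shown with a toy simulation. Everything here is hypothetical and for illustration only: `GatedStep`, `run`, and the `channel_exists` state key are invented names, and the loop is a simplified stand-in for however FireHydrant actually re-evaluates conditions.

```python
from dataclasses import dataclass
from typing import Callable

State = dict[str, bool]


@dataclass
class GatedStep:
    name: str
    condition: Callable[[State], bool]  # when the step may run
    effect: Callable[[State], None]     # what the step changes


def run(steps: list[GatedStep], state: State) -> list[str]:
    """Re-evaluate pending steps until none can run; return execution order."""
    pending = list(steps)
    order: list[str] = []
    progressed = True
    while pending and progressed:
        progressed = False
        for step in list(pending):  # iterate over a copy while removing
            if step.condition(state):
                step.effect(state)
                order.append(step.name)
                pending.remove(step)
                progressed = True
    return order


steps = [
    # Listed first, but gated on the channel existing.
    GatedStep("Add bookmark", lambda s: s["channel_exists"], lambda s: None),
    GatedStep("Create channel", lambda s: True, lambda s: s.update(channel_exists=True)),
]
print(run(steps, {"channel_exists": False}))
```

Even though the bookmark step is listed first, the condition holds it back until the channel step has changed the state, so the printed order is "Create channel" followed by "Add bookmark". Without the condition, the bookmark step would run against a channel that does not yet exist.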