
Alerting

Alerts are a Prometheus feature that notifies you when something is wrong in the system.

You can read more about alerting in the Prometheus documentation.

By default, alerts are shown in the overview dashboard, but you can also configure them to be sent to external systems such as email or Slack.

In general, Prometheus generates an alert when a specific condition holds for a certain period of time. The alert is then sent to the Alertmanager, which can forward it to external systems.

The Alertmanager also acts as a data source for Grafana, so active alerts are shown in the dashboard.

[Figure: monitoring_stack.png]

Prometheus Alerts

The Prometheus alerts are defined in the prometheus.rules.yml file that is located in the prometheus directory.

Each alert consists of:

  • Name

  • What happened

  • For how long

  • What to report

For example, let us look at the InstanceDown alert that comes by default as part of Scylla-Monitoring.

- alert: InstanceDown
  expr: up == 0
  for: 60s
  labels:
    severity: "2"
  annotations:
    description: '{{ $labels.instance }} has been down for more than 60 seconds.'
    summary: Instance {{ $labels.instance }} down

The expr field is a valid Prometheus expression. In this case, it checks whether the up metric equals 0.

The for field sets the minimum time the condition must hold before the alert becomes active. Until then, the alert is in the pending state and is not sent.

The labels section holds additional labels that are added to the alert. In the example, the severity label is set to 2.

The annotations section holds special labels that add a description to the alert. They can be used in emails and notifications when the alert is forwarded to external systems.
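To show how the four parts fit together in a rules file, here is a minimal sketch that assumes the standard Prometheus rule-file layout (a list of groups, each holding a list of rules). The group name, the alert name, and the job label value are hypothetical and are used only for illustration.

```yaml
# Sketch only: "example.rules", "ExporterDown" and the job value are assumptions.
groups:
  - name: example.rules
    rules:
      # Name: identifies the alert in the dashboard and in notifications.
      - alert: ExporterDown
        # What happened: the scrape target of this job is unreachable.
        expr: up{job="node_exporter"} == 0
        # For how long: the condition must hold this long before the alert
        # leaves the pending state and is sent.
        for: 60s
        # What to report: extra labels and human-readable annotations.
        labels:
          severity: "2"
        annotations:
          description: '{{ $labels.instance }} exporter has been down for more than 60 seconds.'
          summary: 'Exporter on {{ $labels.instance }} down'
```

If promtool is available, running promtool check rules on the file is one way to catch syntax errors before reloading Prometheus.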

Alertmanager

The Alertmanager receives the alerts from the Prometheus server and is responsible for deduplicating, grouping, and routing them.

The Alertmanager configuration is found in the rule_config.yml file, located in the prometheus directory. The configuration consists of three parts:

  • Routes: Represent a routing tree. The most specific matching rule wins.

  • Inhibition: Mutes an alert based on the presence of another alert.

  • Receiver: Sends a notification via email, Slack, etc.

Check the Alertmanager documentation for details on how to configure a specific receiver.
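As an illustration of the three parts, the following is a minimal sketch of an Alertmanager configuration. It is not the file shipped with the monitoring stack: the receiver names, email addresses, SMTP host, Slack webhook URL, and severity values are all placeholders chosen for the example.

```yaml
# Sketch only: every name, address and URL below is a placeholder.
global:
  smtp_smarthost: smtp.example.com:587
  smtp_from: alertmanager@example.com

# Routes: a tree; alerts fall through to the top-level receiver unless
# a more specific child route matches.
route:
  receiver: default-email
  group_by: [alertname]
  routes:
    - match:
        severity: "1"
      receiver: team-slack

# Inhibition: while InstanceDown is firing for an instance, mute
# severity "4" alerts that carry the same instance label.
inhibit_rules:
  - source_match:
      alertname: InstanceDown
    target_match:
      severity: "4"
    equal: [instance]

# Receivers: where notifications are actually sent.
receivers:
  - name: default-email
    email_configs:
      - to: oncall@example.com
  - name: team-slack
    slack_configs:
      - api_url: https://hooks.slack.com/services/REPLACE/ME
        channel: "#alerts"
```

Newer Alertmanager releases prefer the matchers syntax over match and source_match, so check which form your version expects.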

In the default configuration, the Grafana server will use the Alertmanager as a data source for the alert table found in the overview dashboard.

Tips When Adding an Alert

Having an alert that does not work is worse than having no alert at all.

There are several ways to make sure your alert is set up correctly:

  • Start by making sure that the full path (from the alert to mail or the dashboard) is working. A simple way to do this is to negate the alert's logic.

  • Force the alert to be active by reducing the limits or the duration.

  • When possible, simulate the actual scenario and verify that the alert works as expected.

For example, consider an alert for low available disk space.

Assume that you want an alert when free disk space drops below 10%. Start with an alert that fires when free disk space is above 10%. This shows that the alert becomes active and is propagated to the dashboard and to any external system you are using. You can then set the threshold to a value close to the current situation.

You can now be confident that the alert will work once you set the expression to fire when free space is below 10%. If possible, still simulate that situation, in this case by creating some big files.
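The disk-space walkthrough can be sketched as two rules. This is an illustration under assumptions, not the rule shipped with the monitoring stack: it assumes the node_exporter filesystem metrics node_filesystem_avail_bytes and node_filesystem_size_bytes are being scraped and that the data directory is mounted at /var/lib/scylla. The alert names and durations are hypothetical.

```yaml
# Step 1 (temporary): negated logic. On a healthy node with more than
# 10% free space this fires almost immediately, which verifies the path
# from Prometheus to the dashboard and to any external receiver.
- alert: DiskSpaceAlertPathTest
  expr: (node_filesystem_avail_bytes{mountpoint="/var/lib/scylla"} / node_filesystem_size_bytes{mountpoint="/var/lib/scylla"}) * 100 > 10
  for: 10s
  labels:
    severity: "4"
  annotations:
    description: 'Test alert: {{ $labels.instance }} has more than 10% free disk space.'
    summary: 'Alert path test on {{ $labels.instance }}'

# Step 2 (final): the real condition. Replace the test rule with this
# one after the path has been verified.
- alert: DiskSpaceLow
  expr: (node_filesystem_avail_bytes{mountpoint="/var/lib/scylla"} / node_filesystem_size_bytes{mountpoint="/var/lib/scylla"}) * 100 < 10
  for: 30s
  labels:
    severity: "2"
  annotations:
    description: '{{ $labels.instance }} has less than 10% free disk space.'
    summary: 'Low disk space on {{ $labels.instance }}'
```

Remember to remove the test rule once the final rule is in place, so that it does not keep firing on healthy nodes.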

Starting with a simple working expression helps you bisect problems.
