SQL Server Alerts - Manage with miniDBA


The following are all the SQL Server alerts that are included with miniDBA out of the box. They cover the categories of Disk, Query, Memory, CPU, Availability and General. To learn how to configure them for different priorities, thresholds and communications please see Alert Configuration.
Note:- This list is extensable by using your own Custom SQL Alerts.

Disk

Database Off-line
Any databases are not in an online state
Free Disk Space
Free disk space has fallen below a certain threshold
This may mean that the drive will fill completely and the server will not be able to function correctly
Backup Restore
Backup/Restore in progress
This can negatively effect database performance of any databases that share the same drive as the database being backed up.
Can often occur at unexpected times and require better scheduling
File Size Limit
Data or Log file is near maximum size limit
Either the file will autogrow which may impact performance during the operation or it will run out of space if set to have a size limit
Disk Queue Length
IO items waiting to be serviced by the disk
High disk queue length can signify poor IO configuration.
Check if multiple databases sharing the same drive.
The thresholds are device dependant so experiment with the settings
Tran Log % Used
Transaction Log file at high % use may need increasing in size
A full or near full transaction log may cause bad database performance and will likely have a run-away query with poor design causing it
File IO Stall Time
Average time in ms, waiting for the IO subsystem to perform operations
Lower waiting times are better - the IO Ms is calculated across total stall ms / read & write operations, giving an average rather than an on the spot calculation
Write Log Time
Time in ms, per second, the system has to wait for the IO subsystem to perform log writes
The figure is server wide not specific to any database and indicates a performance bottleneck around whichever drive the offending log file(s) are on
File Auto Growths
Data or Log file auto growths occurred which will take longer to perform as the database gets bigger
Avoid autogrowths by giving the files a larger, more appropriate maximum size
IO (Checksum) Error
IO error (incorrect checksum) occurred when reading a data file. The read eventually completed successfully but encountered a checksum error. Check the device for other errors.

Memory

OS Memory State
The OS is lacking memory available for programs. SQL Server will attempt to trim its working set of memory pages
Instance Memory
Memory used by this instance (Mb)
Use this alert to tell you if the allocated memory is actually being used or not while tuning memory settings
Page Life Expectancy
If the life expectancy of memory in the buffer pool is very low it can indicate high memory pressure
Buffer Cache Hit Ratio
The percentage of reads that use the buffer cache, if dipped below will trigger the alert
Plan Cache Hit Ratio
The percentage of queries that reuse execution plans from the plan cache, if dipped below will trigger the alert
Insufficient System Memory
SQL Server does not have enough memory available to execute the current query. Adjusting the minimum and maximum memory settings can help with this problem.

CPU

Instance Cpu %
% of the Cpu used by this Sql Server instance
Sustained high usage may indicate poor T-SQL or index design
Non Instance Cpu %
% of the Cpu used by processes other than this instance
Ideally there should be minimal activity by any non SQL Server processes on the server.
Check virus scanners and other maintenance software is configured correctly

Query

Sort Warning
Indicates that sort operations do not fit into memory and are spilling to TempDb.
Performance of the query can be affected because multiple passes over the data are required to sort the data
Blocking Processes
Processes being blocked by other processes
A degree of this is to be expected in busy environments as locks are taken out on objects
The MaxThreshold represents the number of blocking processes that it takes to trigger an alert
Deadlocks
Number of deadlocks detected in 1 minute
Processes deadlocked by other processes
Often caused by locks being held for too long, in turn caused by poor index design
Long Running Queries
Active processes that have been running the same sql batch for a long period of time
Huge queries may be consuming way too much of the servers resources
The MaxThreshold represents number of minutes the batch must run to trigger the alert

Availability

No Connection
The Sql Server cannot be contacted, either miniDBA does not have permission to connect or the server is down
Maximum Connections
The number of concurrent client connections to the instance.
This may indicate periods of very high use or inefficient client application code.
Agent Not Running
SQL Server Agent is not running
No jobs will be able to run if the agent is not running
Always On Enabled But Failed
The Always On feature is enabled but it failed to start properly and is an error state
Always ON Failed To Start
The alert is raised when the HADR component os SQL Server is enabled but did not start properly
Always ON Failover
The alert is raised when the primary replica fails over from one cluster node to another
Always ON Replica Unhealthy
The alert is raised when replica synchronization state becomes unhealthy
Always ON Database Unhealthy
The alert is raised when database synchronization state becomes unhealthy
Always ON Transaction Delay Too Long
The alert is raised when the time a primary replica has to wait for secondary replicas to harden transmitted log data becomes too much.
Database Mirroring State Change
Indicates when the state of a mirrored database changes

General

Failed Job
SQL Server Agent jobs that have failed
If the latest execution failed, it is raised as an alert
If this server is expected to have failing jobs then disable this alert
Error Log High Severity
Errors in the Sql Server error log of severity 17 or above
Error Log Medium Severity
Errors in the Sql Server error log of severity 15 or 16
These levels of severity covers user correctable errors such as problems with High Availability Groups or queries that throw errors
Failed Login
A connection was attempted to the SQL Server but the login was incorrect
Custom T-SQL Query
The alert is raised when the custom TSQL query is executed (every 1 minute) and the result is over a pre-defined threshold or boolean value. Read more about Custom SQL Alerts