I am currently considering which alerting solution to build for my monitoring setup. One idea is a voice-based solution, so today I tested Mimic 3, the text-to-speech engine from Mycroft.
The current idea is to use Mimic 3, or an alternative, to generate an audio file that a SIP client then plays during a call.
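To make the idea a bit more concrete, here is a minimal Python sketch of the first half: turning an alert text into a WAV file. It assumes the `mimic3` command-line tool is installed and on the `PATH`, and that it writes WAV data to standard output when given the text as an argument. The function names (`build_mimic3_command`, `synthesize_alert`, `wav_info`) and the voice key in the usage comment are my own placeholders, and the SIP part is left out entirely.

```python
import shutil
import subprocess
import wave
from pathlib import Path
from typing import List, Optional, Tuple


def build_mimic3_command(text: str, voice: Optional[str] = None) -> List[str]:
    """Build the argument list for the mimic3 CLI (assumed to be installed)."""
    cmd = ["mimic3"]
    if voice:
        # Voice keys look like "en_US/vctk_low"; which ones exist depends on the install.
        cmd += ["--voice", voice]
    cmd.append(text)
    return cmd


def synthesize_alert(text: str, wav_path: str, voice: Optional[str] = None) -> Path:
    """Run mimic3 and store the WAV data it prints to stdout in wav_path."""
    if shutil.which("mimic3") is None:
        raise RuntimeError("mimic3 was not found on PATH")
    result = subprocess.run(
        build_mimic3_command(text, voice),
        check=True,           # raise CalledProcessError if mimic3 exits non-zero
        capture_output=True,  # the WAV bytes end up in result.stdout
    )
    path = Path(wav_path)
    path.write_bytes(result.stdout)
    return path


def wav_info(wav_path: str) -> Tuple[int, int, int]:
    """Return (sample rate in Hz, channel count, frame count) of a WAV file."""
    with wave.open(str(wav_path), "rb") as wav:
        return wav.getframerate(), wav.getnchannels(), wav.getnframes()


# Hypothetical usage, requires a working mimic3 installation:
#   path = synthesize_alert("Disk usage on host web01 is critical.", "alert.wav")
#   print(wav_info(path))
```

The `wav_info` helper is there because a SIP client or codec may expect a specific sample rate and channel layout, so it is worth checking what the generated file actually contains before trying to play it in a call.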