Welcome to panic’s documentation!¶
PANIC is a set of tools (api, Tango device server, user interface) that provides:
- Periodic evaluation of a set of conditions.
- Notification (email, sms, pop-up, speakers)
- Keep a log of what happened. (files, Tango Snapshots)
- Taking automated actions (Tango commands / attributes)
- Tools for configuration/visualization
Contents:
PANIC Description¶
PANIC, a python Alarm System for TANGO¶
Contents
Description¶
PANIC is a set of tools (api, Tango device server, user interface) that provides:
- Periodic evaluation of a set of conditions.
- Notification (email, sms, pop-up, speakers)
- Keep a log of what happened. (files, Tango Snapshots)
- Taking automated actions (Tango commands / attributes)
- Tools for configuration/visualization
The Panic package contains the python AlarmAPI for managing the PyAlarm device servers from a client application or a python shell. The panic module is used by PyAlarm, Panic Toolbar and Panic GUI.
PANIC IS TESTED ON LINUX ONLY, WINDOWS/MAC MAY NOT BE FULLY SUPPORTED IN MASTER BRANCH
The optional panic submodules are:
panic.ds : PyAlarm device server panic.gui : Placeholder for the PanicGUI application
See the docs at: http://www.pythonhosted.org/panic
Recipes are also available at: https://github.com/tango-controls/PANIC/tree/documentation/doc/recipes
Get the latest release of Panic from: https://github.com/tango-controls/PANIC/releases
See CHANGE log in panic/CHANGES file
PyAlarm Device Server¶
panic.ds.PyAlarm Device Class
PyAlarm is the alarm device server used by ALBA Alarm System, it requires PyTango and Fandango modules, both available from tango-cs.sourceforge.net
Some configuration panels in the GUI require PyAlarm to be available in the PYTHONPATH, to do so you can add the PyAlarm.py folder to the PYTHONPATH variable or copy the PyAlarm.py file within the panic folder; so it could be loaded as a part of the module.
Panic GUI¶
panic.gui.AlarmGUI Class
Panic is an application for controlling and managing alarms. It depends on panic and taurus libraries.
It allows the user to visualize existing alarms in a clear form and adding/editing/deleting alarms. In edit mode user can change name, move alarms to another device, change descriptions and modify formulas. Additional widgets in which the app is equipped allows alarm history viewing, phonebook editing and device settings manipulation.
Changelog¶
PANIC 7.3.0¶
- 7.3.0
- solve disabled bug, add event pushing save regexp of userfilters between () allow multiple disable/acknowledge Merge branch ‘documentation’ fix setup.py
7.2.3 - fix bug on snaps widget 7.2.2 - fix bug when UserTimeout is not specified 7.2.1 - Merge panic-wiki-links by s2Innovation
PANIC_DEFAULT deprecated as option Merge branch ‘S2Innovation-panic-2-wiki-links’ into develop Solve bug on empty PanicUserTimeout property Merge branch ‘panic-2-wiki-links’ of https://github.com/S2Innovation/PANIC into S2Innovation-panic-2-wiki-links
7.2.0 - added back taurus3 compatibility, solved bug on history widget
7.1.2 - Solved bugs at gui startup
7.1.1 - Solved bugs on server init, add test_telegram.py
7.1.0 - Merged pull requests from github
S2Innovation: Solve -r problems, added MailRDashOptio for backwards compatibility Gabriel Jover: Add Telegram messaging, using SendTelegram command and TGConfig property Daniel Roldan: patch in setup.py for Debian packaging
7.0.0 AlarmHandler compatibility, use TANGO properties instead of PyAlarm class
6.5.1 Fixes on new/delete alarms by s2innovation
6.5.0: Fix New Alarm button bug
“New Alarm” button fix by S2innovation Improve Status Message update kpi
6.4.1: solved high cpu issues in pyalarm
Solve High CPU bug: init_callbacks(50ms) added to panic.engine and PyAlarm fix case bug in panic.view.check_multi_host Avoid api.devices clear during load, do 1by1 update instead
Temporary patch: Multiple disabling/ack disabled to prevent PyAlarm exceptions. see CS4-526 Add Enabled sorting as “PreCondition” to sort alarms by operation modes Solve bugs on editing formulas or looking at remote values remove unnecessary imports Solve bug on writing arrays on ACTION Execute SNAP/actions on DEBUG Alarms (only sms ignored) Add warning on reset of active alarms Remove unused severity checkboxes from main window (replaced by priority filter) Force devices update on PhoneBook values change remove wrong multi-host warning Return exception strings on evaluation from client
Add warnings on modifying Device config, unify calls on gui/editor Add exceptions on wrong tag name, use UNACK as active state Extend regexp syntax to key=value and & clauses add minimal widget for panic system status
PANIC 6.3.1¶
PANIC 7 GUI is ready. PanicEngine/PanicViewDS devices are pending.
Added QAlarmPanels, searches and userfilters to GUI
Zillions of bugs solved.
PANIC 6.2.1¶
IEC & Elettra compatibility, QAlarmPanel widget, new GUI usable
AlarmSummary/GetAlarmInfo added to PyAlarm for GUI connection; format agreed with Elettra for cross-compatibility Requires Fandango > 13.2 Alarm States and Fields renamed to match IEC terminology New GUI using AlarmView to query attribute values as arrays, AlarmView backwards compatibility with Panic <6 Performance improvement using latest Fandango (cached DB/DeviceProxy searches) Split and refactor gui module in gui/actions/views New Alarm object state machine based on json-like arrays (AlarmSummary) Increase ERROR alarms visibility Allow ‘~’ for negated regexp searches New QAlarmPanel widget New methods for .json exporting / web browsing Many GUI bugs solved Add ArchivingBrowser launcher to toolbar Solve validation issues on AlarmForm editor API refactoring, solved internal imports problem
PANIC 6.0¶
Package refactored to build valid system/PIP/rpm packages PANIC Migrated to github. Development moved to develop branch, stable to master
Main new features:
- enhanced logs and actions
- properties managed in the API side
- added multiple test cases
- added GlobalReceivers and Defines
- logs: record local or remote using text or json
- enabled plugin methods for user validation
- solved many, many bugs
Dropped features from this release:
- gui refactoring
- alarm collections
- IEC compliance (in progress)
- kibana integration
Summary of changes since PANIC 5.4 (Last Sourceforge release):
PyAlarm device server:
All panic times are now seconds, added deprecated message for pollings in milliseconds Quality of failed alarms set to INVALID (~DISABLED)
Solve bug on PyAlarm.GenerateReport command Solve bug on “zombie” alarm deleted/removed Move Reset notification to send_alarm() Replace phonebook entries on ACTION receivers reload global_receivers on init()
free_alarm and send_alarm methods refactored for better actions Added SendAlarm as command kill/pause events added Added MemUsage/LastUpdate attributes Properties definition moved to panic.properties Add invalid quality to disabled alarms Add FAULT state when many alarms are failed Add LastUpdate and MemUsage attributes Avoid update_locals to reread attributes if check=False Add receiver defines: $ALARM/TAG/NAME/DEVICE/DESCRIPTION/VALUES/REPORT/DATE/JSON
Implement Pause() command and kill/pause methods for thread management. LogFiles capable of saving remotely using fandango.device.FolderDS
Solved bugs on trigger_action (see documentation on github) Properties definition moved to panic.properties Added global receivers property Add Reports Cache, refactor send_alarm logging Refactor PyAlarm.parse_receivers Alarm actions will be executed before mail and snapshot. Added replacement of $DESCRIPTION,$ALARM in actions
- GUI:
- New panic Icon Reduce gui dependencies to speed up startup GUI adapted to Taurus 4 launcher renamed to just “panic” Solve GUI bug on empty ActiveAlarms attribute Solve getParams deprecation on Taurus4 Added User login access via UserValidator widget
API:
Logging added to AlarmAPI Added Alarm.disabled flag to_dict and ping() methods for Alarms and AlarmDS objects get_active_alarms() method added to AlarmDS AlarmDS.Enable/Disable methods now can enable both Devices and individual Alarms Added export_to_csv and export_to_dict API methods panic.Evaluate() timeout set to 1000. allow multiple filters on GlobalReceivers Add test cases for Group/Action/Clock/Reset Solve API bug on empty receivers Solve bug on AlarmAPI.put_db_properties (Wrote to device instead of free property) Solve bug in AlarmsAPI.get_global_properties Solved bug on phonebook parsing Group macro refactored (see documentation/recipes) Add api.split_formula() Change api.evaluate() timeout and checks
Release 5.4 - 2015/12¶
Changes in API and device server to solve several multi-host evaluation issues. Small patches required by SKA project.
Release 5.2 - New evaluate() from API/GUI, added user admins for alarms¶
evaluate() method adapted to be usable by GUI and test evaluation on a remote PyAlarm device Bug solved on SentEmails recording.
Release 5.1 - May 2015¶
PyAlarm: added try/except to update_locals() method API: get_admins_for_alarm() method added to enable some minimal access control.
Release 5.0 - May 2015¶
NOTE: Requires Fandango update, to use NaN values and new TangoEval macros
API: Improved group macro to use only cached values for evaluation API: Improved caselessness on API Eval: Added EvalTimes dictionary to keep the time needed on each EvaluateFormula() call. Eval: Added DEVICE, ALARMS, PANIC objects to locals() Eval: Added own Attribute values to Eval cache to avoid deadlocks when evaluating itself Enabled property, converted from string to DevVarStringArray to allow time and formulas Attributes: Removed locks from read methods (UI was locking the evaluation trend), lock is needed only on write/update actions Solved bug that didn’t send VALUES on State/Attribute exception RethrowAttribute, from boolean to string to allow choosing False/0/NaN/None Emails: solved problems string arrays and ‘r’ and ‘”’ characters Traces: shortened strings
Release 4.20¶
@pending: solve threads and UseProcess issues MaxAlarmsPerDay property removed (was unused) Added new GROUP macro to formula evaluation. SnapContext: Using modify instead of create context when this already exists Added methods to taurus-like get_model from alarms Email report refactored to show values in rows and parse state values. Added methods do import/export alarm configurations from .csv files EvaluateFormula converted in a Tango command callable by clients Solved bugs using SNAP as receiver. Snap: Using newest context when several match the alarm name Eval cache reduced to AlarmThreshold+1 to adjust .delta to AlarmThreshold Solved bug in SMS sending when source contains non-alpha characters
Release 4.19 @pending: solve threads and UseProcess issues Solved bugs in alarm parsing, loading .csv and loading alarms from device Added AlarmValueLabel widget
Release 4.18 Disable screen in launch script, replaced by Tango logging Added DDebug device for debugging threads AlarmsAPI.load() time consumption reduced to avoid timeouts RethrowState=False and RethrowAttribute=False will disable exception propagation from TangoEval, it allows to manage exceptions as None in the alarm formulas. Added IgnoreExceptions property Using polling period instead of timeout as keeptime on TangoEval Renamed method get_attribute_values to get_last_values
In AlarmAPI:
Added filter_alarms, export_to_csv, modify methods get/filter* methods modified to allow custom alarm lists Bugs solved in load_from_csv getCurrent will return last API instance used
Added IgnoreExceptions property children() replaced by get_basic_alarms method parse_variables replaced by parse_attributes and evaluate()
Release 4.17 2013/09/09
Added VersionNumber attribute Added methods Status/dev_status to remove automatic messages on qualities. Added self.update_locals() for a better update of alarm values, periods and conditions reviewed. fandango.threads.WorkerProcess has been optimized, PyAlarm modified to use new pause method Methods returning sorted lists
Release 4.16
References to taurus removed if UseTaurus property is False (default) Minimum polling reduced to 250 ms Using panic.PyAlarmDefaultProperties to have consitency between api/gui/device Solved bug that caused timeouts on alarm exception (time wait before finally clause) Disable method is now capable to disable alarms only for TIMEOUT argument If Enabled property is an integer, alarm changes will be ignored for INT seconds at startup; it should allow to restart devices w/out resending all active alarms; a ResetAll() can be used to rethrow all messages if wanted. CheckDisabled will manage alarm reactivation after timeout
The Panic module has been renamed to panic; several bugs have been solved and methods for enabling/disabling alarms have been added.
4.15 September 2012 Disabled LogFile by default BETA: Cache added to TangoEval to try alarm on transition.
4.14, September 2012 Added StartupDelay property Bugs solved in Snap context creation. Added user message to alarm RESET emails.
Installing PANIC on a New System¶
Dependencies¶
PANIC is available from Github, PyPI and as Debian or SuSE packages.
If you install from SuSE or Debian packages dependencies will be automatically installed.
If not, then you’ll need Tango, PyTango and Fandango for the server side (including its dependencies, ZMQ, numpy, …).
For the client side you’ll also need Taurus library and PyQt4.
You should be able to get all these packages also from www.tango-controls.org
Run the GUI and create a PyAlarm¶
Running “setup.py install” should install the panic-gui script in your system.
But if you don’t want to install the application you can just run python panic/gui/gui.py to launch the client.
In your first run it will apply completely empty. Just create your first PyAlarm instance going to the “Config” icon in the toolbar and pushing “Create New” button.
Now you can create your first PyAlarm pushing “New” in the main widget. You’ll be prompted to fill the gaps, for a first installation I recommend this alarm:
TAG: TEST_LOG Description: just testing Severity: WARNING Receivers: your_mail@your_domain.com Formula: True
This simple alarm will allow you to check if email sending works properly.
Run the PyAlarm Server¶
Use Astor or the shell to start your newly created PyAlarm:
python ds/PyAlarm.py TEST -v4
After ~45 seconds (if you didn’t modified the default configuration) you’ll receive your first email from PANIC.
Now head to the configuration docs to know all the options you have for tuning the behaviour.
PyAlarm Device Server User Guide¶
Contents
- PyAlarm Device Server User Guide
Description¶
This device server is used as a alarm logger, it connects to the list of attributes provided and verifies its values.
Its focused on notifying Alarms by log files, Mail, SMS and (some day in the future) electronic logbook.
You can acknowledge these alarms by a proper command.
Internal Structure¶
The device server behaviour relies on three python objects: AlarmAPI, updateAlarms thread and TangoEval.
Each alarm is independent in terms of formula and receivers; but all alarms within the same PyAlarm device will share a common evaluation environment determined by PyAlarm properties.
The AlarmAPI¶
This object encapsulates the access to the alarm configurations database. Tango Database is used by default, all alarm configurations are stored as device properties of each declared PyAlarm device (AlarmList, AlarmReceivers, AlarmSeverities).
The api object allows to load alarms, reconfigure them and transparently move Alarms between PyAlarm devices.
The updateAlarms thread¶
This thread will be executed periodically at a rate specified by the PollingPeriod. All Enabled alarms will be evaluated at each cycle; and if evaluated to a True value (understood as any value not in (0,”“,None,False,[],{})).
Once an Alarm has been active by a number of cycles equal to the device AlarmThreshold it will become Active. Then the PyAlarm will process all elements of the AlarmReceivers list.
The TangoEval engine¶
This engine will automatically replace each Tango attribute name in the formula by its value. It will also provide several methods for searching attribute names in the tango database.
Amongst other features, all values are kept in a cache with a depth equal to the AlarmThreshold+1. This cache allows to create alarms using .delta or inspecting the cache for specific behaviors.
Alarm Syntax Recipes¶
Alarms are parsed and evaluated using fandango.TangoEval class.
Sending a Test Message at Startup¶
This alarm formula is just “True” ; therefore will be enabled immediately sendin an email message to test@tester.com
AlarmList -> DEBUG:True
AlarmDescriptions -> DEBUG:The PyAlarm Device $NAME has been restarted
AlarmReceivers -> DEBUG: test@tester.com
Testing a device availability¶
It is done if you put directly the name of the device or its State as a condition by itself. In the second case and alarm will be triggered either if the Pressure is above threshold or the device is not reachable.
PRESSURE:SR/VC/VGCT/Pressure > 1e-4
STATE_AND_PRESSURE:?SR/VC/VGCT and SR/VC/VGCT/Pressure > 1e-4
Getting Tango state/attribute/value/quality/time/delta in formulas¶
The Alarm syntax allows to add the following clauses to the attribute name (value returned by default):
some/device/name{/attribute}{.value/all/time/quality/delta/exception}
attribute: if no attribute name is given, then device state is read.
PLC_Alarm: BL22/CT/EPS-PLC-01 == FAULT
value: default, returns the value of the attribute
Pressure_Alarm: BL22/CT/EPS-PLC-01/CC1_AF.value > 1e-5
time: returns the epoch in seconds of the last value read
Not_Updated: BL22/CT/EPS-PLC-01/CPU_Status.time < (now-60)
quality : returns the tango quality value (ATTR_VALID, ATTR_INVALID, ATTR_WARNING, ATTR_ALARM).
Temperature_Alarm: BL22/CT/EPS-PLC-01/OP_WBAT_OH01_01_TC11.quality == ATTR_ALARM
delta : returns the variation of the value in the last N=AlarmThreshold reads (stored in TangoEval.cache array of size AlarmThreshold+1)
Valve_Just_Closed: BL22/CT/EPS-PLC-01/VALVE_11.delta == -1
exception : True if the attribute is unreadable, False otherwise
Not_Found: BL22/CT/EPS-PLC-01/I_Dont_Exist.exception
all : returns the raw attribute object as returned by PyTango.DeviceProxy.read_attribute method.
Creating a periodic self-reset alarm¶
A simple clock alarm would use the current time and will set AlarmThreshold, PollingPeriod and AutoReset properties. See this example:
A single formula clock would be more hackish; this alarm will execute a command on its own formula
PERIODIC:(FrontEnds/VC/Elotech-01/Temperature and FrontEnds/VC/VGCT-01/P1 \
and (1920<(now%3600)<3200)) or (ResetAlarm('PERIODIC') and False)
Enabling search, expression matching and list comprehensions¶
Having the syntax dom/fam/mem/attr.quality
whould allow us to call attrs like:
any([ATTR_ALARM==s+'.quality' for s in FIND('dom/fam/*/pressure')])
One way may be using QUALITY, VALUE, TIME key functions:
any([ATTR_ALARM==QUALITY(s) for s in FIND('dom/fam/*/pressure')])
The use of FIND allows PyAlarm to prepare a list Taurus models that can be redirected from an <pre>event_received(…)</pre> hook.
Some list comprehension examples¶
any([s for s in FIND(SR/ID/SCW01/Cooler*Err*)])
equals to
any(FIND(SR/ID/SCW01/Cooler*Err*))
The negate:
any([s==0 for s in FIND(SR/ID/SCW01/Cooler*Err*)])
is equivalent to
any(not s for s in FIND(SR/ID/SCW01/Cooler*Err*)])
is equivalent to
not all(FIND(SR/ID/SCW01/Cooler*Err*))
is equivalent to
[s for s in FIND(SR/ID/SCW01/Cooler*Err*) if not s]
Grouping Alarms in Formulas¶
The proper way is (for readability I use upper case letters for alarms):
ALARM_1: just/my/tango/attribute_1
ALARM_2: just/my/tango/attribute_2
then:
ALARM_1_OR_2: ALARM_1 or ALARM_2
or:
ALARM_1_OR_2: any(( ALARM_1 , ALARM_2 ))
or:
ALARM_ANY: any( FIND(my/alarm/device/ALARM_*) )
Any alarm you declare becomes both a PyAlarm attribute and a variable that you can anywhere (also in other PyAlarm devices). You don’t trigger any new read because you just use the result of the formula already evaluated.
The GROUP is used to tell you that a set of conditions has changed from its previous state. GROUP instead will be triggered not if any is True, but if any of them toggles to True. It forces you to put the whole path to the alarm:
GROUP(my/alarm/device/ALARM_[12])
PyAlarm Device Properties¶
Distributing Alarms between servers¶
Alarms can be distributed between PyAlarm servers using the PyAlarm/AlarmsList property. A Panic system works well with 1200+ alarms distributed in 75 devices, with loads between 5 and 70 attrs/device. But instead of thinking in terms of N attrs/pyalarm you must distribute load trying to group all attributes from the same host or subsystem.
There are two reasons to do that (and also apply to Archiving):
- When a host is down you’ll have a lot of proxy threads in background trying to reconnect to lost devices. If alarms are distributed on rough numbers it becomes a lot of timeouts spreading through the system. When alarms are grouped by host you isolate the problems.
- Same applies for very event-intensive devices. Devices that generate a lot of information will need lower attrs/pyalarm ratio than devices that do not change so much.
But, it is a good advice to keep the overall number of alarms in the system below 10K alarms. For manageability of the log system and avoid avalanches of useless information the logical number of alarms should be around or below 1000.
Alarm Declaration Properties¶
AlarmList¶
Format of alarms will be:
TAG1:LT/VC/Dev1
TAG2:LT/VC/Dev1/State
TAG3:LT/VC/Dev1/Pressure > 1e-4
NOTE: This property was previously called AlarmsList; it is still loaded if AlarmList is empty for backward compatibility
AlarmDescriptions¶
Description to be included in emails for each alarm. The format is:
TAG:AlarmDescriptions...
NOTE: Special Tags like $NAME (for name of PyAlarm device) or $TAG (for name of the Alarm) will be automatically replaced in description.
AlarmReceivers¶
TAG1:vacuum@accelerator.es,SMS:+34935924381,file:/tmp/err.log
vacuum@accelerator.es:TAG1,TAG2,TAG3
Other options are SNAP or ACTION:
user@cells.es,
SMS:+34666777888, #If SMS sending available
SNAP, #Alarm changes will be recorded in SNAP database.
ACTION(alarm:command,mach/alarm/beep/play_sequence,$DESCRIPTION)
Or Telegram messages, see:
Adding ACTION as receiver¶
Executing a command on alarm/disable/reset/acknowledge:
ACTION(alarm:command,mach/alarm/beep/play_sequence,$DESCRIPTION)
The syntax allow both attribute/command execution and the usage of multiple typed arguments:
ACTION(alarm:command,mach/dummy/motor/move,int(1),int(10))
ACTION(reset:attribute,mach/dummy/motor/position,int(0))
Also commands added to the Class property @AllowedCommands@ can be executed:
ACTION(alarm:system:beep&)
PhoneBook (not implemented yet)¶
File where alarm receivers aliases are declared; e.g.
User:user@accelerator.es;SMS:+34666555666
Default location is: `` $HOME/var/alarm_phone_book.log ``
If User and Operator are defined in phonebook, AlarmsReceivers can be:
TAG2:User,Operator
REMINDER / RECOVERED / AUTORESET messages¶
Reminder¶
If a number of seconds is set, a reminder mail will be sent while the alarm is still active, if 0 no Reminder will be sent.
AlertOnRecovery¶
A message is sent if an alarm is active but the conditions of the attributes return to a safe value. To enable the message the content of this property must contain ‘email’, ‘sms’ or both. If disabled no RECOVERY/AUTO-RESET messages are sent.
Snapshot properties¶
UseSnap¶
If false no snapshots will be trigered (unless specifically added to receivers using “SNAP” ),
CreateNewContexts¶
It enables PyAlarm to create new contexts for alarms if no matching context exists in the database.
Alarm Configuration Properties¶
(In future releases these properties could be individually configurable for each alarm)
Enable : If False forces the device to Disabled state and avoids messaging.
LogFile : File where alarms are logged Default: “/tmp/alarm_$NAME.log”
FlagFile : File where a 1 or 0 value will be written depending if theres active alarms or not.n<br>This file can be used by other notification systems. Default: “/tmp/alarm_ds.nagios”
PollingPeriod : Periode in seconds. in which all attributes not event-driven will be polled. Default: 60000
MaxAlarmsPerDay : Max Number of Alarms to be sent each day to the same receiver. Default: 3
AlarmThreshold : Min number of consecutive Events/Pollings that must trigger an Alarm. Default: 3
FromAddress : Address that will appear as Sender in mail and SMS Default: “controls”
SMSConfig : Arguments for sendSMS command Default: “:”
MaxMessagesPerAlarm : To avoid the previous property to send a lot of messages continuously this property has been added to limit the maximum number of messages to be sent each time that an alarm is enabled/recovered/reset.
StartupDelay : Time that PyAlarm waits before starting the Alarm evaluation threads.
EvalTimeout : Timeout for read_attribute calls, in milliseconds .
UseProcess : To create new OS processes instead of threads.
Device Server Example¶
These will be the typical properties of a PyAlarm device
#---------------------------------------------------------
# SERVER PyAlarm/AssemblyArea, PyAlarm device declaration
#---------------------------------------------------------
PyAlarm/AssemblyArea/DEVICE/PyAlarm: "LAB/VC/Alarms"
# --- LAB/VC/Alarms properties
LAB/VC/Alarms->AlarmDescriptions: "OVENPRESSURE:The pressure in the Oven exceeds Range",\
"ADIXENPRESSURE:The pressure in the Roughing Station exceeds Range",\
"OVENTEMPERATURE:The Temperature of the Oven exceeds Range",\
"DEBUG:Just for debugging purposes"
LAB/VC/Alarms->AlarmReceivers: OVENPRESSURE:somebody@cells.es,someone_else@cells.es,SMS:+34999666333,\
ADIXENPRESSURE:somebody@cells.es,someone_else@cells.es,SMS:+34999666333,\
OVENTEMPERATURE:somebody@cells.es,someone_else@cells.es,SMS:+34999666333,\
DEBUG:somebody@cells.es
LAB/VC/Alarms->AlarmsList: "OVENPRESSURE:LAB/VC/BestecOven-1/Pressure_mbar > 5e-4",\
"OVENRUNNING:LAB/VC/BestecOven-1/MaxValue > 70",\
"ADIXENPRESSURE:LAB/VC/Adixen-01/P1 > 1e-4 and OVENRUNNING",\
"OVENTEMPERATURE:LAB/VC/BestecOven-1/MaxValue > 220",\
"DEBUG:OVENRUNNING and not PCISDOWN"
LAB/VC/Alarms->PollingPeriod: 30
LAB/VC/Alarms->SMSConfig: ...
Mail Messages¶
PyAlarm allows to send mail notifications. Each alarm may be configured with AlarmReceivers property to provide notification list. There is also a GobalReceivers property which allows to define notification for all alarms.
PyAlarm supports two ways of sending mails configured with the MailMethod class property:
- using mail shell command, when MailMethod is set to mail, which is default,
- or using smtplib python library when MailMethod is set to smtp[:host[:port]].
When using mail method it setup from variable as ‘-S’ option (see: https://linux.die.net/man/1/mail ). However, some setups may require to use -r option additionally. To enable it set MailDashRoption class property with a proper mail address.
As it is now, mail messages are formatted as the following:
Format of Alarm message¶
Subject: LAB/VC/Alarms: Alarm RECOVERED (OVENTEMPERATURE)
Date: Wed, 12 Nov 2008 11:52:39 +0100
TAG: OVENTEMPERATURE
LAB/VC/BestecOven-1/MaxValue > 220 was RECOVERED at Wed Nov 12 11:52:39 2008
Alarm receivers are:
somebody@cells.es
someone_else@cells.es
Other Active Alarms are:
DEBUG:Fri Nov 7 18:37:35 2008:OVENRUNNING and not PCISDOWN
OVENRUNNING:Fri Nov 7 18:37:17 2008:LAB/VC/BestecOven-1/MaxValue > 70
Past Alarms were:
OVENTEMPERATURE:Fri Nov 7 20:49:46 2008
Format of Recovered message¶
Subject: LAB/VC/Alarms: Alarm RECOVERED (OVENTEMPERATURE)
Date: Wed, 12 Nov 2008 11:52:39 +0100
TAG: OVENTEMPERATURE
LAB/VC/BestecOven-1/MaxValue > 220 was RECOVERED at Wed Nov 12 11:52:39 2008
Alarm receivers are:
somebody@cells.es
someone_else@cells.es
Other Active Alarms are:
DEBUG:Fri Nov 7 18:37:35 2008:OVENRUNNING and not PCISDOWN
OVENRUNNING:Fri Nov 7 18:37:17 2008:LAB/VC/BestecOven-1/MaxValue > 70
Past Alarms were:
OVENTEMPERATURE:Fri Nov 7 20:49:46 2008
PANIC Recipes¶
Alarms Distribution¶
About distributing load (answer to paul bell, 2014)¶
We have 1200+ alarms and system works quite well with it. But regarding distribution of PyAlarm devices and servers the rules must be more intelligent.
Instead of thinking in terms of N attrs/pyalarm you must distribute load trying to group all attributes from the same host or subsystem.
There are two reasons to do that (and also apply to Archiving):
- When a host is down you’ll have a lot of proxy threads in background trying to reconnect to lost devices. If alarms are distributed on rough numbers it becomes a lot of timeouts spreading through the system. When alarms are grouped by host you isolate the problems.
- Same applies for very event-intensive devices. Devices that generate a lot of information will need lower attrs/pyalarm ratio than devices that do not change so much.
Apart of that … if you have 1000 alarms just for the linac then you may have a wrong specification. I use to say than “all” should be in the order of 10K ; by experience any number about that is too much. If you need more than 10K of a kind what you really need is to add a level of abstraction (do not check all gauges of a vacuum section, just had an attribute where you can read the max value).
It applies to all Tango systems I’ve seen (alarms, archiving, save/restore, pool, device tree, …); if you reach a number above 10K then you must add an abstraction layer. It’s not only that you reach a performance limit, also your users will feel too dazed and confused when searching for things.
e.g. Our accelerator group requested 1200 alarms … and after some months they asked for a filter to show only the 240 they really care about.
Alarm Formulas Examples¶
Contents
- Alarm Formulas Examples
- Sending a Test Message at Startup
- Testing a device availability
- Getting Tango state/attribute/value/quality/time/delta in formulas
- Creating a periodic self-reset alarm
- Enabling search, expression matching and list comprehensions
- Some list comprehension examples
- Grouping Alarms in Formulas
- Alarm on delta and value
- Generating Clock Signals
Alarms are parsed and evaluated using fandango.TangoEval class.
Sending a Test Message at Startup¶
This alarm formula is just “True” ; therefore will be enabled immediately sendin an email message to test@tester.com
AlarmList -> DEBUG:True
AlarmDescriptions -> DEBUG:The PyAlarm Device $NAME has been restarted
AlarmReceivers -> DEBUG: test@tester.com
Testing a device availability¶
It is done if you put directly the name of the device or its State as a condition by itself. In the second case and alarm will be triggered either if the Pressure is above threshold or the device is not reachable.
PRESSURE:SR/VC/VGCT/Pressure > 1e-4
STATE_AND_PRESSURE:?SR/VC/VGCT and SR/VC/VGCT/Pressure > 1e-4
Getting Tango state/attribute/value/quality/time/delta in formulas¶
The Alarm syntax allows to add the following clauses to the attribute name (value returned by default):
some/device/name{/attribute}{.value/all/time/quality/delta/exception}
attribute: if no attribute name is given, then device state is read.
PLC_Alarm: BL22/CT/EPS-PLC-01 == FAULT
value: default, returns the value of the attribute
Pressure_Alarm: BL22/CT/EPS-PLC-01/CC1_AF.value > 1e-5
time: returns the epoch in seconds of the last value read
Not_Updated: BL22/CT/EPS-PLC-01/CPU_Status.time < (now-60)
quality : returns the tango quality value (ATTR_VALID, ATTR_INVALID, ATTR_WARNING, ATTR_ALARM).
Temperature_Alarm: BL22/CT/EPS-PLC-01/OP_WBAT_OH01_01_TC11.quality == ATTR_ALARM
delta : returns the variation of the value in the last N=AlarmThreshold reads (stored in TangoEval.cache array of size AlarmThreshold+1)
Valve_Just_Closed: BL22/CT/EPS-PLC-01/VALVE_11.delta == -1
exception : True if the attribute is unreadable, False otherwise
Not_Found: BL22/CT/EPS-PLC-01/I_Dont_Exist.exception
all : returns the raw attribute object as returned by PyTango.DeviceProxy.read_attribute method.
Creating a periodic self-reset alarm¶
A simple clock alarm would use the current time and will set AlarmThreshold, PollingPeriod and AutoReset properties. See this example:
A single formula clock would be more hackish; this alarm will execute a command on its own formula
PERIODIC:(FrontEnds/VC/Elotech-01/Temperature and FrontEnds/VC/VGCT-01/P1 \
and (1920<(now%3600)<3200)) or (ResetAlarm('PERIODIC') and False)
Enabling search, expression matching and list comprehensions¶
Having the syntax dom/fam/mem/attr.quality
whould allow us to call attrs like:
any([ATTR_ALARM==s+'.quality' for s in FIND('dom/fam/*/pressure')])
One way may be using QUALITY, VALUE, TIME key functions:
any([ATTR_ALARM==QUALITY(s) for s in FIND('dom/fam/*/pressure')])
The use of FIND allows PyAlarm to prepare a list Taurus models that can be redirected from an <pre>event_received(…)</pre> hook.
Some list comprehension examples¶
any([s for s in FIND(SR/ID/SCW01/Cooler*Err*)])
equals to
any(FIND(SR/ID/SCW01/Cooler*Err*))
The negate:
any([s==0 for s in FIND(SR/ID/SCW01/Cooler*Err*)])
is equivalent to
any(not s for s in FIND(SR/ID/SCW01/Cooler*Err*)])
is equivalent to
not all(FIND(SR/ID/SCW01/Cooler*Err*))
is equivalent to
[s for s in FIND(SR/ID/SCW01/Cooler*Err*) if not s]
Grouping Alarms in Formulas¶
The proper way is (for readability I use upper case letters for alarms):
ALARM_1: just/my/tango/attribute_1
ALARM_2: just/my/tango/attribute_2
then:
ALARM_1_OR_2: ALARM_1 or ALARM_2
or:
ALARM_1_OR_2: any(( ALARM_1 , ALARM_2 ))
or:
ALARM_ANY: any( FIND(my/alarm/device/ALARM_*) )
Any alarm you declare becomes both a PyAlarm attribute and a variable that you can anywhere (also in other PyAlarm devices). You don’t trigger any new read because you just use the result of the formula already evaluated.
The GROUP is used to tell you that a set of conditions has changed from its previous state. GROUP instead will be triggered not if any is True, but if any of them toggles to True. It forces you to put the whole path to the alarm:
GROUP(my/alarm/device/ALARM_[12])
Alarm on delta and value¶
This alarm will be triggered whenever a channel (HV*Code attributes) changes its value (delta!=0) and the new value is OFF (value=0)
any([(changed and value==0) for changed,value in
zip( FIND(bl*/vc/ipct*/hv*code.delta) ,
FIND(bl*/vc/ipct*/hv*code.value) )])
Generating Clock Signals¶
Playing with PollingPeriod, AlarmThreshold and AutoReset properties is possible to achieve an square signal that keeps the alarm active/inactive at regular intervals.
CLOCK=NOT CLOCK
The AlarmThreshold applies to both activation and reset of the alarm, so it has to be added to the AutoReset period to regulate the duty cycle. Keeping the PollingPeriod and AutoReset values very small will generate an accurate frequency (do not expect high accuracy, that’s a trick for testing but not a proper signal generator).
My values for a 10 seconds alarm cycle are:
.. code-block:: python
PollingPeriod = 0.1 AlarmThreshold = 50 AutoReset = 0.0001
If you want a more accurate alarm, you can also use the NOW() function. This example generates a switch every second
CLOCK = NOW()%2<1
PollingPeriod=1
AlarmThreshold-1
AlarmStates¶
Contents
State transitions¶
Alarm States and Severities are defined in panic.properties module.
With PyAlarm > 6.1; GUI will read the current Alarm state from the AlarmList attribute.
For compatibility with older versions, the events of ActiveAlarms will be used instead:
- If ActiveAlarms doesn’t cotain tag, alarm.active will be 0, state = NORM
- Activealarms contains tag, alarm.active = activealarms timestamp, state = ACTIVE
- ActiveAlarms is None or Exception, alarm.active will be set to -1. state = ERROR
Disabled States¶
Their meanings are:
- OOSRV = Device server is Off (not exported), no process running
- DSUPR = Enabled property is False
- SHLVD = Alarm is listed in DisabledAlarms attribute (temporary disabled)
- ERROR = Device is alive but the alarm is not being evaluated (exported=1 and thread dead or exception).
Hierarchies In Alarms¶
TOP/BOTTOM¶
The TOP/BOTTOM just provides a filter for finding alarms where the value of another alarm is used directly in the formula. It is case sensitive, so you can use lower/upper case to show/hide alarms in these filters.
To use hierarchies, alarms shall be written using the result of previous ones:
GAB1 = any([t >5 for t in FIND(tc1:10000/LMC/C01/GAB/*)])
GAB2 = any([t >5 for t in FIND(tc1:10000/LMC/C02/GAB/*)])
GAB_ALL= GAB1 or GAB2
OTHER = tc1:10000/LMC/C02/Other/State != ON
CAPITAL = GAB_ALL or OTHER
Then, the filter by hierarchy will return:
TOP (alarms that depend on others): CAPITAL, GAB12
BOTTOM (alarms isolated or referenced from others): OTHER, GAB_ALL, GAB1, GAB2
In this case GAB_ALL appears in both lists; to avoid that just rewrite it using lower case attribute names:
GAB_ALL = any(FIND('lmc1:10000/lmc/alarms/01/gab*'))
Now you should have only “CAPITAL” as TOP Alarm.
You can reproduce this behaviour from the api calling:
panic.AlarmAPI().filter_hierarchy('TOP')
Alarm GROUP¶
For an expression matching multiple alarms or attributes, GROUP returns a new formula that will evaluate to True if any of the alarm changes to active state (.delta) or matches a given condition:
GROUP(ALARM1, ALARM2, ALARM3)
Thus, GROUP will be activated when any of the three alarms switches to active; and immediately reset to wait for the next change. In this way you get a notification for any new activation of the three alarms.
NOTE: BY DEFAULT IS NOT LIKE any(FIND(*)); it will react only on change, not taking in account previous states!
NOTE2: you must tune your PyAlarm properties to have AlarmThreshold = 1 and AutoReset <= 3 to take profit of this feature.
NOTE3: The GROUP activation will be just a peak when using .delta (default); take this in account when setting up several levels of alarms as fast peaks may not be noticed if higher level alarms have long thresholds.
It uses the read_attribute schema from TangoEval, thus using .delta to keep track of which values has changed. For example, GROUP(test/alarms/*/TEST_[ABC]) will be replaced by:
any([t.delta>0 for d in FIND(test/alarms/*/TEST_[ABC].all)])
But, as regular expressions may trigger unexpected results, the syntax with explicit ALARM names is prefered.
The GROUP macro can be called with one or several expressions separated by commas and a condition separated by semicolon:
GROUP(expression1[,expression2;condition)
Expressions may contain a device name or not. If no device name is passed then it will search for it in the alarm list:
expression=[a/dev/name*/]attribute*
Thus, a valid GROUP expression is:
GROUP(LOCAL_ALARM1,t01:10000/an/alarm/dev/ALARM2)
Or
GROUP(LOCAL_ALARM1,t01:10000/an/alarm/dev/ALARM2;x>=1)
In the first case you’ll get a peak when any of them changes from 0 to 1; in the second case you’ll get if any of them is already on 1 (so a change in the second alarm will not trigger a second peak).
Future Releases¶
In future releases the GROUP macro will be capable of evaluating any tango attribute and not only alarms. As of 6.0 this feature is not yet supported
If the condition is empty then PyAlarm checks any .delta != 0. It can be modified if the formula contains a semicolon “;” and a condition using ‘x’ as variable; in this case it will be used instead of delta to check for alarm:
GROUP(bl09/vc/vgct-*/p[12];x>1e-5) => [x>1e-5 for x in FIND(bl09/vc/vgct-*/p[12])]
Special Alarm Recipes¶
Special keys used in Alarm formulas¶
- DEVICE: PyAlarm device name
- DOMAIN,FAMILY,MEMBER: Parts of the device name
- ALARMS: Alarms managed by this device
- PANIC: API containing all declared alarms
- t: time since the device was started
- T(…): string to time
- str2time(…): string to time
- now, NOW(): current timestamp
- DEVICES: instantiated devices
- DEV(device): DeviceProxy(device)
- NAMES(expression’): Finds all attributes matching the expression and return its names.
- CACHE: Saved values
- PREV: Previous values
- READ(attr): TangoEval.read_attribute(attr)
- FIND(‘expression’): Finds all attributes matching the expression and return its values.
Expiration Date¶
Disabling or re-enabling after a given date
A temporal condition can be achieved using the T() macro in the formula.
To disable an Alarm after a given date:
T() < T('2013-04-23') and D/F/M.A > V1
To re-enable it after a maintenance period:
T() > T('2013-04-23') and D/F/M.A > V1
Accessing PyAlarm Values CACHE¶
The PyAlarm CACHE dictionary contains the last values stored for each tango attribute that appeared in the formulas. The size of cache is AlarmTrheshold + 1
Usage:
PASS_BY_0=[(k,v.time.tv_sec,str(v.value)) for k,t in CACHE.items() for v in t if v.value==0]
This will trigger alarm if ALL values in the cache are equal, it is NOT the same as Delta because it checks only the first and last values:
not (lambda l:max(l)-min(l))([v.value for v in CACHE['wr/rf/circ-1/heartbeat']])
Clock: Alarm triggered by time¶
This alarm will be enabled/disabled every 5 seconds.
First, create a new PyAlarm device:
import fandango as fn
fn.tango.add_new_device('PyAlarm/Clock','PyAlarm','test/pyalarm/clock')
Add the new alarm (formula will use current time to switch True/False very 5 seconds)
from panic import AlarmAPI
alarms = AlarmAPI()
alarms.add(device='test/pyalarm/clock',tag='CLOCK',formula='NOW()%10<5')
Start your device server using Astor, fandango or manually
import fandango as fn
fn.Astor('test/pyalarm/clock').start_servers(host='your_hostname')
Then, configure the device properties to react every second for both activation and reset:
dtest = alarms.devices['test/pyalarm/clock']
dtest.get_config()
dtest.config['Enabled'] = 1
dtest.config['AutoReset'] = 1
dtest.config['AlarmThreshold'] = 1
dtest.config['PollingPeriod'] = 1
alarms.put_db_properties(dtest.name,dtest.config)
dtest.init()
This is the result you can expect when plotting test/pyalarm/clock/CLOCK in a taurustrend:

Exception Management¶
Alarm properties that control if exceptions trigger alarms or not …
- ‘RethrowState’:
- [PyTango.DevBoolean, “Whether exceptions in State reading will be rethrown.”, [ True ] ],#Overriden by panic.DefaultPyAlarmProperties
- ‘RethrowAttribute’:
- [PyTango.DevBoolean, “Whether exceptions in Attribute reading will be rethrown.”, [ False ] ],#Overriden by panic.DefaultPyAlarmProperties
- ‘IgnoreExceptions’:
- [PyTango.DevBoolean, “If True unreadable values will be replaced by None instead of Exception.”, [ True ] ],#Overriden by panic.DefaultPyAlarmProperties
Grouping Alarms¶
The proper way is (for readability I use upper case letters for alarms):
ALARM_1: just/my/tango/attribute_1 ALARM_2: just/my/tango/attribute_2
then:
ALARM_1_OR_2: ALARM_1 or ALARM_2
or:
ALARM_1_OR_2: any(( ALARM_1 , ALARM_2 ))
or:
ALARM_ANY: any( FIND(my/alarm/device/ALARM_*) )
Any alarm you declare becomes both a PyAlarm attribute and a variable that you can anywhere (also in other PyAlarm devices). You don’t trigger any new read because you just use the result of the formula already evaluated.
The GROUP is used to tell you that a set of conditions has changed from its previous state. GROUP instead will be triggered not if any is True, but if any of them toggles to True. It forces you to put the whole path to the alarm:
GROUP(my/alarm/device/ALARM_[12])
How PyAlarm Device Server Works¶
This document tries to summarize how PyAlarm processes alarms and executes its actions. A full explanation of alarm syntax and each property is available in the PyAlarm user guide, but here I provide a summary for convenience.
The device server behaviour relies on three python objects: AlarmAPI, updateAlarms thread and TangoEval.
Each alarm is independent in terms of formula and receivers; but all alarms within the same PyAlarm device will share a common evaluation environment determined by PyAlarm properties.
Contents
The AlarmAPI¶
This object encapsulates the access to the alarm configurations database. Tango Database is used by default, all alarm configurations are stored as device properties of each declared PyAlarm device (AlarmList, AlarmReceivers, AlarmSeverities).
The api object allows to load alarms, reconfigure them and transparently move Alarms between PyAlarm devices.
The updateAlarms thread¶
This thread will be executed periodically at a rate specified by the PollingPeriod. All Enabled alarms will be evaluated at each cycle; and if evaluated to a True value (understood as any value not in (0,”“,None,False,[],{})).
Once an Alarm has been active by a number of cycles equal to the device AlarmThreshold it will become Active. Then the PyAlarm will process all elements of the AlarmReceivers list.
AlertOnRecovery and AlarmReset¶
Whenever an alarm formula becomes True; a counter starts to increase until it reaches the AlarmThreshold value, becoming an active alarm.
This counter is kept at AlarmThreshold value and starts decreasing once the formula is no longer True. If the counter reaches 0 (its minimum value) the alarm will be still active but its new state will be RECOVERED, an email will be sent to receivers if AlertOnRecovery property is True.
Then, if the AlarmReset value (in seconds) is distinct from 0, a time count starts from the point of RECOVERY. If there’s no change in the alarm state during this time count, the alarm will be automatically RESET (notifying receivers or not depending on configuration).
So, if you need an alarm to have a fast recovery keep in mind that you’ll have to apply a delay equal to AlarmThreshold+PollingPeriod to the value that you have set as AutoReset.
The TangoEval engine¶
This engine will automatically replace each Tango attribute name in the formula by its value. It will also provide several methods for searching attribute names in the tango database.
Amongst other features, all values are kept in a cache with a depth equal to the AlarmThreshold+1. This cache allows to create alarms using .delta or inspecting the cache for specific behaviors.
PANIC Setup¶
by Sergi Rubio — 2006, 2016
Contents
Description¶
The Package for Alarms and Notification of Incidents from Controls
PANIC Alarm System is a set of tools (api, Tango device server, user interface) that provides:
- Periodic evaluation of a set of conditions.
- Notification (email, sms, pop-up, speakers)
- Keep a log of what happened. (files, Tango Snapshots)
- Taking automated actions (Tango commands / attributes)
- Tools for configuration/visualization.
Other Documentation in this same repository
- PANIC presentation at PCAPAC‘14: Panic Talk at PCAPAC‘14
- The Panic python API: PanicAPI.rst
- The PyAlarm User Guide: PyAlarmUserGuide.rst
- The Panic UI manual: panicdoc.html
Launch your PANIC System in few steps¶
Dependencies¶
You must have PyTango + Tango + MySQL up and running and your TANGO_HOST and PYTHONPATH environment variables properly set.
PyTango is available at PyPI: https://pypi.python.org/pypi/PyTango
Get the code¶
ALL OF THIS IS DEPRECATED; GET THE PACKAGES FROM https://github.com/tango-controls INSTEAD
Fandango library (functional tools for tango) is required to be in your PYTHONPATH:
svn co https://tango-cs.svn.sourceforge.net/svnroot/tango-cs/share/fandango/trunk/fandango fandango
You can download PyAlarm and the panic api from tango-ds at sourceforge:
svn co https://svn.code.sf.net/p/tango-ds/code/DeviceClasses/SoftwareSystem/PyAlarm/trunk
The PANIC User Interface is available in the /clients branch:
svn co https://svn.code.sf.net/p/tango-ds/code/Clients/python/Panic/trunk
Setup your Tango database¶
Create your devices from a python console (or Jive):
import PyTango
db = PyTango.Database()
def add_new_device(server,klass,device):
dev_info = PyTango.DbDevInfo()
dev_info.name = device
dev_info.klass = klass
dev_info.server = server
get_database().add_device(dev_info)
#Create a PyAlarm device
add_new_device('PyAlarm/1','PyAlarm','test/alarms/1')
#I'll add a simulator, but you can't use TangoTest or whatever device you want:
add_new_device('PySignalSimulator/1','PySignalSimulator','test/sim/1')
db.put_device_property('test/sim/1',{'DynamicAttributes':['A=t%100']})
From shell, launch your PyAlarm and Simulator devices:
# python PyAlarm/PyAlarm.py 1 &
# python PySignalSimulator/PySignalSimulator.py 1 &
Create a TEST_ALARM using the API:
import panic
alarms = panic.api()
alarms.add('TEST_ALARM',formula='(test/sim/1/A%15 > 5)',description='test',receivers='your@mail')
Run the panic application and configure your Alarms¶
python Panic/gui.py
See the application manual: http://plone.tango-controls.org/tools/panic/panic-ui/
If you want to see faster changes in the alarm cycle try to set the following configuration values (Tools->Adv.Config):
PollingPeriod = 1
AlarmThreshold = 1
AutoReset = 5
Notification Services
The syntax for sending an email (from linux, you’ll need the “mail” command available in the system, from windows you’ll have to set as receiver a command from a device running in a linux machine):
DeviceProxy("your/alarm/device").command_inout("SendMail",["Bonjour,\n\nthis is a test message\n\nau revoire","RE: testing","your-name@tango-controls.org"])
The other command we have for notification is SendSMS; but it requires our smslib.py file that is specific to our SMS provider (it uses http transactions to send the messages). If you’re interested on it you’ll have to write your own smslib.py file to use it.
FestivalDS, Speech and pop-ups¶
There’s another notification device you can use, the FestivalDS. It provides speech synthesizing and pop-ups in a linux environment (it requires “festival” and “libnotify-bin” linux packages):
https://svn.code.sf.net/p/tango-ds/code/DeviceClasses/InputOutput/FestivalDS/trunk
The commands are:
Play(string): speech to speakers
Beep(): beep!
Play_sequence(string): it just makes some beeps before and after the speech
PopUp(title,text,[seconds]): shows a pop-up with title/text for the given time
And that’s all regarding our current notifiers, for database we don’t have anything yet, as we use the device properties to store all the data. You’ll find more information in the PyAlarm user guide.
Exception Management in Panic Alarms¶
The exception management will be done using the _raise=RAISE argument of the TangoEval.eval method.
Three properties control if exceptions will enable the alarm or will be simply ignored.
IgnoreExceptions: | |
---|---|
if False then all exceptions will be registered as FailedAlarms and the PyAlarm will change to FAULT whenever an exception is encountered. If no rethrow option is active, FailedAlarms will be displayed in grey in AlarmGUI as “disabled”. | |
RethrowAttribute: | |
if True, any exception in the formula will set the alarm as active. PyAlarm state will change to ALARM or FAULT if IgnoreExceptions is False and all alarms are in failed state. | |
RethrowState: | if True, only alarms reading State attributes will be activated by exception. PyAlarm state will change to ALARM or FAULT if IgnoreExceptions is False and all alarms are in failed state. |
So, in case of having an alarm reading a faulty attribute, the status of the alarm will be:
DISABLED: | If IgnoreExceptions=False and RethrowAttribute=False |
---|---|
NOT ACTIVE: | If IgnoreExceptions=True and RethrowAttribute=False |
ACTIVE: | If IgnoreExceptions=False and RethrowAttribute=True |
ACTIVE: | If IgnoreExceptions=True and RethrowAttribute=True |
Using the PANIC python API¶
Contents
The Panic Module¶
Panic contains the python AlarmAPI for managing the PyAlarm device servers from a client application or a python shell. The panic module is part of the Panic bliss package.:
import panic
alarms = panic.api()
Browsing existing alarms¶
The AlarmAPI is a dictionary-like object containing Alarm objects for each registered Alarm tag. In addition the AlarmAPI.get method allows caseless search by tag, device, attribute or receiver:
alarms.get(self, tag='', device='', attribute='', receiver='')
alarms.get(device='boreas')
Out[232]:
[Alarm(BL29-BOREAS_STOP:The BakeOut controller has been stop),
Alarm(BL29-BOREAS_PRESSURE_1:),
Alarm(BL29-BOREAS_PRESSURE_2:),
Alarm(BL29-BOREAS_START: BL29-BOREAS bakeout started
...]
alarms.get(receiver='eshraq')
Out[234]:
[Alarm(RF_LOST_EUROTHERM:),
Alarm(OVEN_COMMS_FAILED:Oven temperatures not updated in the last 5 minutes),
Alarm(RF_PRESSURE:The pressure in the cavity exceeds Range),
Alarm(OVEN_TEMPERATURE:The Temperature of the Oven exceeds Range),
Alarm(RF_EUROTHERM:),
Alarm(RF_LOST_MKS:),
Alarm(RF_TEMPERATURE_MAX2:),
...]
alarms['RF_LOST_MKS'].receivers
Out[237]: '%SRUBIO,%ESHRAQ,%VACUUM,%LOTHAR,%JNAVARRO'
Adding / Removing alarms¶
The add/remove methods take care of properties modification:
alarms.add('RF_ON_FIRE','rf/ct/alarms',formula='rf/ct/plc-01/temperature>1000.',message='FIRE!',receivers='rf@cells.es,plc@cells.es')
alarms.remove('RF_ON_FIRE')
Modifying alarms¶
Each Alarm object contains strings with its configuration, if you modify it you must call Alarm.write() method to update the alarm device. An Alarm.rename() method is also available.
In [235]: alarms[‘RF_LOST_MKS’].device Out[235]: ‘sr/rf/alarms’
In [236]: alarms[‘RF_LOST_MKS’].formula Out[236]: ‘SR/RF/VGCT-01/State==UNKNOWN or SR/RF/VGCT-02/State==UNKNOWN’
In [237]: alarms[‘RF_LOST_MKS’].receivers Out[237]: ‘%SRUBIO,%ESHRAQ,%VACUUM,%LOTHAR,%JNAVARRO’
In [238]: alarms[‘RF_LOST_MKS’].write()
Modifying a receiver in all alarms¶
And a fast way for updating alarm receivers:
[a.replace_receiver('%DFERNANDEZ','%SRUBIO') for a in alarms.get(receiver='fernandez')]
PanicAdminUsers property¶
Contents
The PanicAdminUsers property will contain all users enabled to modify an alarm.
Although, any user identified as an email receiver of an alarm will be allowed to change it.
The propery is check from the get_admins_for_alarm() method in AlarmAPI.
The method will be used to call the setAllowedUsers() of a validator plugin.
The methods that the i*ValidatedWidget decorator requires of a validator are:
- setLogging()
- setAllowedUsers()
- setLogMessage()
- exec_()
User validation in the GUI will be kept for consecutive actions as long as the allowed users list for each action doesn’t change. If a new action is required on an Alarm with different receivers, the login will be asked again.
The login will be kept for a time defined by PyAlarm.PanicUserTimeout property. This time is 60 seconds by default.
PyAlarm Startup Modes¶
The PyAlarm Startup is controlled by StartupDelay and Enabled properties.
StartupDelay will put the PyAlarm in PAUSED state after a restart; to not start to evaluate formulas immediately but after some seconds, thus giving time to other devices to start.
The Enabled property will instead control the notification actions:
- If False, no notification will be triggered.
- If True, all notifications can be sent once StartupDelay has passed.
- If a Number is given, all notifications triggered between startup and
t+Enabled
will be ignored. Enabled>(AlarmThreshold*PollingPeriod)
: “Silent restart”, activates the Alarms that were presumably active before a restart; but do not retriggers the notifications.
Enabled = 120
is the typical case; not triggering notifications until the device has been running for at least 3 minutes.
If Enabled = False
or while t < Start+Enabled
the PyAlarm State will be DISABLED.
PyAlarm timing configuration¶
- StartupDelay: the device will wait before starting to evaluate the alarms (e.g. giving some time to the system to recover from a powercut).
- Enabled: if False or 0 the PyAlarm it equals to disabling all alarm actions of the device; if it is True the behavior will be the normal expected; if it has a numeric value (e.g. 120) it means that the device will evaluate the alarms but not execute actions during the first 120 seconds (thus alarms can be activated but no action executed). It is used to prevent a restart of the device to re-execute all alarms that were already active.
- EvalTimeout: The proxy timeout used when evaluating the attributes (any read attribute slower than timeout will raise exception).
- AlarmThreshold: number of cycles that an alarm must evaluate to True to be considered active (to avoid alarms on “glitches”).
- RethrowAttribute/RethrowState: Whether exceptions on reading attributes or states should be rethrown to higher levels, thus causing the alarm to be triggered. By default alarms are enabled if an State attribute is not readable (RethrowState=True), but when a numeric attribute is not readable its value is just replaced by None (RethowAttribute=False) and the formula evaluated normally.
- Reminder: A new email will be sent every XX seconds if the alarm remains active. When AlertOnRecovery is True an email will be sent also every time when the formula result oscillates from True to False.
- UseProcess: This is an experimental feature, like UseTaurus and others. In general, I advice you to not modify any parameter that is not detailed in the PyAlarm user guide as you may obtain unexpected results. Some parameters are used to test new features still under development and their behavior may vary between commits.
Regarding actions on recovery … this option is planned but not yet fully available. Actually just emails are sent when AlertOnRecovery is True. This feature may be implemented in the next 6 months or so but the syntax is still to be decided.
Testing your PyAlarm installation¶
This script will check the current performance of your PyAlarm devices:
> TANGO_HOST=your_hostname:10000 python panic/extra/report.py check
PANIC Receivers, Logging and Actions¶
Contents
Alarm Receivers¶
Allowed receivers are email, sms, action and shell commands.
SMS / Mail Config¶
These CLASS properties will control how SMS and Mail is configured:
SMSConfig
SMSMaxLength
SMSMaxPerDay
FromAddress
MailMethod
Global Receivers¶
The PyAlarm class property “GlobalReceivers” allows to set receivers that will be applied to all Alarms; independently of the device that is managing them.
The syntax is:
GlobalReceivers
{regexp}:{receivers}
.*:oncall@facility.dom
Logging¶
Alarm logging can be managed in three ways: local logs, remote logs via FolderDS or Snapshoting.
All the logging methods support defined variables ($ALARM, $DATE, $DEVICE, $MESSAGE, $VALUES, $…)
Local LogFile¶
Simply set the LogFile property to your preferred local file path:
LogFile = /tmp/pyalarm/$NAME_$DATE_$MESSAGE.log
Remote LogFile¶
You can use the fandango.FolderDS device to specify a remote logfile destination on the LogFile property:
# LogFile = tango://[folderds/device/name]/[logfile_name]
LogFile = tango://sys/folder/panic-logs/$NAME_$DATE_$MESSAGE.log
You can have both local and remote logging by setting LogFile to a local file and adding an ACTION receiver:
LogFile = /tmp/pyalarm/$NAME_$DATE_$MESSAGE.log
AlarmReceivers = ACTION(alarm:command,controls02:10000/test/folder/tmp-folderds/SaveText,
'$NAME_$DATE_$MESSAGE.txt','$REPORT')
FolderDS documentation: https://github.com/tango-controls/fandango/blob/documentation/doc/devices/FolderDS.rst
Using SNAP database¶
This database logging will save the alarm state and all associated attributes every time that the alarm is activated/reset.
You should have configured previously an Snapshoting Database (java/mysql service by Soleil).
Then you have to:
- Set the CreateNewContexts property of PyAlarm to True (it will automatically create a new context on alarm triggering)
- Or create manually a new context in the database using Bensikin.
- Set UseSnap=True to trigger snapshots for all alarms
- Or simply add the SNAP receiver.
Creating a context manually instead of doing it with PyAlarm may allow you to store Tango attributes that do not appear in the formula, thus enabling a sort of alarm-triggered archiving mode.
Triggering Actions from PyAlarm¶
See basic details on the user guide:
Here you have some more examples:
# Send an email (equivalent to just %MAIL:address@mail.com)
%SENDMAIL:ACTION(alarm:command,lab/ct/alarms/SendMail,$DESCRIPTION,$ALARM,address@mail.com)
# Reset another alarm, DONT USE [] TO CONTAIN ARGUMENTS!
%RESET:ACTION(alarm:command,test/pyalarm/logfile/resetalarm,'TEST','$NAME_$DATE_$DESCRIPTION')
# Reload another device
%INITLOG:ACTION(alarm:command,test/pyalarm/logfile/init)
# Write a tango attribute
%WRITE:ACTION(alarm:attribute,sys/tg_test/1/string_scalar,'$NAME_$DATE_$VALUES')
# Execute a command in another tango host
# in this example a FolderDS saves the alarm log
%LOG:ACTION(alarm:command,controls02:10000/test/folder/tmp-folderds/SaveText,'$NAME_$DATE_$MESSAGE.txt','$REPORT')
Then declare the AlarmReceivers like:
ACTION(alarm:command,mach/dummy/motor/move,int(1),int(10))
ACTION(reset:attribute,mach/dummy/motor/position,int(0))
The first field is one of each PyAlarm.MESSAGE_TYPES:
ALARM
ACKNOWLEDGED
RECOVERED
REMINDER
AUTORESET
RESET
DISABLED
Available keywords (managed by PyAlarm.parse_devices()) in ACTION are:
$TAG / $NAME / $ALARM
$DEVICE
$DATE / $DATETIME
$MESSAGE
$VALUES
$REPORT
$DESCRIPTION
PyAlarm Using Events With Taurus¶
Setting up a PyAlarm getting Tango events from Taurus¶
We will test events using the CLOCK alarm created in the previous recipe (polling should be enabled, this example uses polling on CLOCK attribute at 10 ms):
Then, create a new PyAlarm device and the event-based alarm:
import fandango as fn
fn.tango.add_new_device('PyAlarm/events','PyAlarm','test/pyalarm/events')
from panic import AlarmAPI
alarms = AlarmAPI()
alarms.add(device='test/pyalarm/events',tag='EVENTS',formula='test/pyalarm/clock/clock')
Start your device server using Astor, fandango or manually
import fandango as fn
fn.Astor('test/pyalarm/events').start_servers(host='your_hostname')
Then, configure the device properties to read attributes using Taurus and react as fast as possible Taurus will take care of subscribing to events and update cached values.
dtest = alarms.devices['test/pyalarm/events']
dtest.config['UseTaurus'] = True
dtest.config['AutoReset'] = 0.05
dtest.config['Enabled'] = 10
dtest.config['AlarmThreshold'] = 1
dtest.config['PollingPeriod'] = 0.05
alarms.put_db_properties(dtest.name,dtest.config)
dtest.init()
This is the result you can expect when showing both alarm attributes (test/pyalarm/clock/clock and test/pyalarm/events/events) in a taurustrend:

Is this approach really Event-Based?¶
Yes, but not asynchronously. PyAlarm will use Taurus to catch Tango Events and buffer them; but alarms are still triggered by the internal polling thread of PyAlarm. It means that the PyAlarm.PollingPeriod property effectively filters how often incoming events are processed.
But, delegating event collection to Taurus allows to not execute read_attribute in the polling thread; allowing to very small PollingPeriod values (10-20 ms)
As seen in this picture, it allows to have a very fast reaction from the Alarm attributes respect to the trigger:

This approach, however, is costly in terms of cpu usage if using polling periods below 100 ms. A pure-asynchronous event implementation of the PyAlarm is still pending.