Limitations
Missing AutoAction policy violation on Unravel UI
For an AutoAction policy, if the threshold value is 0 and the metric value is also 0, no violation event is generated, so no such event appears on the Unravel UI.
Alerting on running apps
Applications of the following types do not support real-time alerts, that is, alerts raised while the app is running. Alerts for policy violations that occurred during the run are generated only after the app has finished.
Impala
Hive-on-Tez
Running duration versus final duration inconsistency
For apps of the following types, Unravel internally calculates and publishes the current duration in real time, that is, while the app is in the running state. Upon the app's completion, Unravel receives the actual end time and performs the final duration calculation. This can lead to an inconsistency where the aggregated duration published during the running state is greater than the duration published upon the app's completion.
Workflow
Missing AutoAction violation badge
The AutoAction violation badge is not displayed for the following app types.
Workflow
Running and failed Impala apps.
Unsupported
The following items are unsupported by the AutoAction feature.
Kill action for the following apps
Hive
Workflow
Databricks
Move to Queue action for the following apps
Hive
Impala
Workflow
Cloud type setups
Unravel for EMR, Databricks, Dataproc, and HDInsight
Kill and move actions for all types of apps.
Rules that span multiple clusters.
In multi-cluster configurations, AutoActions doesn't differentiate between the entities of each cluster, so a policy targets all monitored clusters. For instance, creating a rule to target the root queue results in the queue being monitored on all clusters.
Workaround:
If the cluster ID is known, isolate the policy for the cluster using policy options.
Policy options use the internal Hadoop cluster ID instead of the Unravel cluster ID/name. To specify a Hadoop cluster in the policy options section, you must first obtain its internal cluster ID. It can be obtained from the HDFS namenode, where it’s stored in {dfs.namenode.name.dir}/current/VERSION.
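As a rough illustration of the lookup described above, the following Python sketch pulls the cluster ID out of the contents of a VERSION file. It assumes the file uses a simple key=value layout with a `clusterID` key, which is how HDFS normally writes it; the function names and the sample values are hypothetical placeholders, not taken from a real cluster or from Unravel.

```python
from pathlib import Path
from typing import Optional


def read_cluster_id(version_text: str) -> Optional[str]:
    """Return the clusterID value from VERSION file contents, or None if absent.

    Assumes a key=value layout; lines starting with '#' are treated as comments.
    """
    for line in version_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "clusterID":
            return value.strip()
    return None


def read_cluster_id_from_file(path: str) -> Optional[str]:
    """Read a VERSION file from disk and return its clusterID (hypothetical helper)."""
    return read_cluster_id(Path(path).read_text())


# Sample contents made up for illustration; real values will differ.
SAMPLE_VERSION = """\
#Mon Jan 01 00:00:00 UTC 2024
namespaceID=123456789
clusterID=CID-00000000-0000-0000-0000-000000000000
cTime=0
storageType=NAME_NODE
layoutVersion=-64
"""

print(read_cluster_id(SAMPLE_VERSION))
```

To use this on a namenode host, pass the resolved path of the VERSION file (with {dfs.namenode.name.dir} replaced by the configured directory) to `read_cluster_id_from_file`.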
On exceptionally rare occasions, when a transport message protocol synchronization error occurs, an AutoAction can be triggered up to 180 seconds after the violation occurs. No data loss is expected.
Recent Events & Alerts shows the events across all clusters regardless of the currently selected cluster.
Application Master (AM) level metrics, such as job metrics and job counters, aren't collected by the EMR sensor by default and therefore can't be used in AutoAction policies. Collection of AM metrics can be enabled manually using the “am-polling” option of the EMR sensor.
Note
Prior to Unravel v4.5.2.0 Cloud Release, AutoActions aren't supported.
AutoActions properties
See AutoAction properties for general AutoActions and AutoAction daemon properties.