Notification for Long-Running Streaming Applications

Overview

The Appstore application monitors long-running streaming applications and notifies administrators about them. It fetches data from a Spark application index, identifies streaming applications that have been running longer than a user-specified threshold, and sends an email notification. For each application, the notification includes a link to the Unravel UI for further investigation.

Streaming Applications Identification

Streaming applications are identified by the type field in the _source of each Spark hitdoc. An application is classified as a streaming application if its type value matches any of the following:

  • structured-streaming

  • spark-streaming

  • streaming-sql

  • streaming
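The classification rule above can be sketched in a few lines of Python. This is a minimal illustration, not the Appstore application's actual code: the names STREAMING_TYPES and is_streaming are hypothetical, and the sketch assumes the type field sits under _source, as in the example hitdoc on this page.

```python
# Hypothetical helper: the set of type values listed above.
STREAMING_TYPES = frozenset(
    {"structured-streaming", "spark-streaming", "streaming-sql", "streaming"}
)


def is_streaming(hitdoc: dict) -> bool:
    """Return True if the hitdoc's _source.type is one of the streaming types."""
    source = hitdoc.get("_source", {})
    return source.get("type") in STREAMING_TYPES


# Usage: a trimmed-down hitdoc with only the fields the check needs.
print(is_streaming({"_source": {"kind": "spark", "type": "structured-streaming"}}))
```

A hitdoc with a missing or unlisted type value is treated as non-streaming in this sketch.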

Example of a Spark hitdoc:

{ "_index": "app-20240818_24", "_type": "apps", "_id": "app-20240822162019-0000-0822-161547-utt2zva2", "_version": 10404, "_score": 3, "_routing": "app-20240822162019-0000", "_source": { "kind": "spark", "appt": "wfi", "id": "app-20240822162019-0000", "name": "job-284887302010684-run-784605153770294-Job_cluster_sql", "appid": "app-20240822162019-0000-14572172431119038908dbx", "startTime": "2024-08-22T16:20:12.562Z", "finishedTime": "2024-08-22T17:57:20.265Z", "duration": 5827703, "clusterId": "job-284887302010684-run-784605153770294-Job_cluster_sql", "clusterUid": "0822-161547-utt2zva2", "clusterTg": "adb-4202288953632492.12.azuredatabricks.net", "status": "S", "userName": "mjose@unraveldata.com", "queue": "redteam-4795hf", "user": "mjose@unraveldata.com", "wn": "284887302010684_4202288953632492", "wt": "app-20240822162019-0000", "wi": "app-20240822162019-0000-14572172431119038908dbx", "nick": "spark", "totalDfsBytesRead": 0, "totalDfsBytesWritten": 0, "numEvents": 0, "cents": 0.6442849636077881, "aid": "", "userType": "", "inputTables": [], "outputTables": [], "type": "structured-streaming", "db": ["default"], "instances": [], "vcoreSeconds": 0, "memorySeconds": 0, "key": "YARN", "numApps": 0, "numMRJobs": 1430, "mrJobIds": [], "totalMapTasks": 5720, "totalReduceTasks": 0, "totalMapSlotDuration": 136506, "totalReduceSlotDuration": 0, "sm": 5720, "km": 0, "fm": 0, "sr": 0, "kr": 0, "fr": 0, "numSparkApps": 1430, "totalSparkTasks": 5720, "ss": 5720, "ks": 0, "fs": 0, "totalSparkSlotDuration": 136506, "shuffleBytesRead": 0, "shuffleBytesWritten": 0, "processingDelay": 0, "totalDelay": 0, "jobId": 284887302010684, "runId": 784605153770294, "runName": "structured streaming job", "wsId": 4202288953632492, "wsName": "redteam-4795hf", "wsInstance": "adb-4202288953632492.12.azuredatabricks.net", "setupDuration": 299000, "cleanupDuration": 0, "clusterType": "AUTOMATED", "dbuCost": 0.2428209, "dbus": 1.6188059 }
}
Installing and opening the Streaming app

The Appstore application takes the following inputs to configure the notification settings:

Input Parameters

  • Time: The running-time threshold. Only streaming applications that have been running longer than this value are reported.

  • TopX: The number of top long-running applications to include in the email notification.

  • Email: The recipient email address for notifications.
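How the Time and TopX inputs could combine is sketched below. This is an illustration under stated assumptions, not the application's implementation: the function names are hypothetical, the threshold is passed as a timedelta because the source does not state the unit of Time, and running time is assumed to be measured from the startTime field of the hitdoc's _source to the current time.

```python
from datetime import datetime, timedelta, timezone


def parse_start(ts: str) -> datetime:
    """Parse a startTime such as '2024-08-22T16:20:12.562Z' as UTC."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


def top_long_running(sources, threshold: timedelta, top_x: int, now: datetime):
    """Return up to top_x (id, elapsed) pairs, longest-running first.

    sources: iterable of _source dicts with 'id' and 'startTime' keys.
    Only applications whose elapsed time exceeds the threshold are kept.
    """
    over = []
    for src in sources:
        elapsed = now - parse_start(src["startTime"])
        if elapsed > threshold:
            over.append((src["id"], elapsed))
    over.sort(key=lambda pair: pair[1], reverse=True)
    return over[:top_x]


# Usage with made-up application ids and a fixed "now" for reproducibility.
apps = [
    {"id": "app-a", "startTime": "2024-08-22T16:20:12.562Z"},
    {"id": "app-b", "startTime": "2024-08-22T23:30:00.000Z"},
]
now = datetime(2024, 8, 23, 0, 0, 0, tzinfo=timezone.utc)
for app_id, elapsed in top_long_running(apps, timedelta(hours=1), 5, now):
    print(app_id, elapsed)
```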

Output

Once the application is configured and triggered, it sends email notifications and logs each one with the following details:

Email Log Table:

  • Email Sent Time: Timestamp of when the email was sent.

  • To User: Email address of the notification recipient.

  • TopK: The number of top applications included in the email (the value of the TopX input parameter).

  • Time: The time threshold that was applied to select long-running applications.
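One way to picture a row of the Email Log Table is as a small record type. The sketch below is purely illustrative: the class and field names are hypothetical, and the threshold is stored as entered because the source does not state its unit.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class EmailLogEntry:
    """Hypothetical shape of one Email Log Table row."""
    email_sent_time: datetime  # Email Sent Time
    to_user: str               # To User
    top_k: int                 # TopK
    time_threshold: str        # Time (unit not specified; kept as entered)


def make_log_entry(to_user: str, top_k: int, time_threshold: str,
                   sent_at: Optional[datetime] = None) -> EmailLogEntry:
    """Build a log row, defaulting the sent time to the current UTC time."""
    return EmailLogEntry(sent_at or datetime.now(timezone.utc),
                         to_user, top_k, time_threshold)


# Usage with a placeholder recipient address.
print(make_log_entry("admin@example.com", 5, "24"))
```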

[Image: Long-running-streaming-app.png]