Impala Resource Pool Analysis report

The Impala Resource Pool Analysis report can be used to analyze Impala data, understand the usage of Impala queues, and compare it against the resource pool configurations.

Generating Impala Resource Pool Analysis report

Click Generate Reports > New.

In the New Report dialog box, enter the following details.

Items	Description
General
Name	Name of the report.
Environment	Select your platform. Only the reports corresponding to the selected platform are listed in Report type. If you select the All option, all reports are listed.
Report type	Type of report. Select Impala Resource Pool Analysis.
Schedule	Select the checkbox to schedule the report to run hourly, daily, weekly, or monthly. You can also set the schedule using a cron expression. You can expand the Example drop-down and select the corresponding options. The next four sample run times are displayed for reference.
Retention	The number of days to retain the report files. All the reports are stored in the `unity-one/src/assets/reports/jobs` directory. After the retention period ends, the report files are automatically purged.
Parameters
Look Back	The number of days to look back when selecting applications for report generation. A notification displayed above this option shows the period for which data is available for the report.
Use Exact Date-Time	Select this option and choose the from and to dates to run the report for that exact time range. Note that when this option is used with a schedule, the same report is generated repeatedly.
Interval	Time series interval. It represents the granularity level at which you want to perform the analysis. The shorter the interval, the longer it takes to generate the report.
Queue Config File Path (Optional)	Path to the queue.json file. This field is optional. The Cloudera queue configurations are fetched automatically if `unravel.properties` is configured with the credentials for the Cloudera Manager API. If the credentials are not configured, the queue configuration can be provided manually.
Users	Select the users who must be included in the report. You can select multiple users. If not selected, all the users will be considered in the report.
Tag Names	Select the tag names that you want to be included in the report. You can select multiple tag names. If not selected, all the tag names will be considered in the report.
Tag Values	Select the tag values, corresponding to the tag names you have selected, that you want to include in the report. Options for tag values will only show when you select a tag. You can select multiple tag values. If none are selected, all tag values detected for a tag will be included in the report.
Pools	Select the Cloudera Manager queues that you want to be included in the report. You can select multiple pools.
Clusters	Select the clusters that you want to be included in the report. You can select multiple clusters.
Notifications
Email to	Email address to notify when the report is generated. You can specify multiple email addresses separated by commas. You can also select the Attach Files to Email checkbox to receive the reports as an attachment.
Advance Options
Profile Memory	Select this option to generate logs that help troubleshoot cases where the report takes excessive time to generate or fails to generate. Caution: This option significantly increases the report run time.

Click OK. The generated reports will be listed under Reports on the App UI.
Select the generated report and then click Run. After the report is successfully run, the details of the report runs are listed in the Run box on the right.
Click the following:
- HTML files link to view the report details.
- Input parameters link to view the parameters you chose to run the report.
- Log file link to view the logs of the report.

Using Impala Resource Pool Analysis report

The following reports are generated:

Summary report
Contains stats on various metrics for each resource pool detected in the data collected by Unravel.
Detailed report
Contains the following information:
- Pool Configuration: Settings for each resource pool.
- Query Level Admission Result: Provides a view of the SLAs as perceived by the clients. The table presents two main metrics: the admission control status and the wait times of all the queries in the data.
- Query Level Performance and Resource Usage: Summarizes statistics for the following metrics: query duration, number of nodes on which the queries ran, maximum aggregate memory, and per-node peak memory. A few links are also provided to enable deeper analysis.
- Temporal Baseline of Aggregate Memory Usage: Summarizes aggregate memory usage over time. The data is sliced into one-second intervals, and the aggregate memory usage of all the queries in each interval is summed. The statistics are computed over the resulting series of aggregate memory across time intervals.
- Top Time Intervals by Aggregate Memory Used: Lists the top five intervals per resource pool by total aggregate memory, to help interpret the temporal baseline for aggregate memory usage.
- Temporal Baseline of Query Concurrency: Summarizes query concurrency over time. The data is sliced into one-second intervals, and the number of queries in each interval is counted. The statistics are computed over the resulting series of query concurrency across time intervals.
- Top Time Intervals by Query Concurrency: Lists the top five intervals per resource pool by query concurrency, to help interpret the temporal baseline for query concurrency.
- Temporal Baseline of Host Peak Mem Usage: Summarizes host-level peak memory usage over time. The data is sliced into one-second intervals, and the maximum per-host peak memory usage across all the hosts is taken in each interval. The statistics are computed over the resulting series of per-host peak memory across time intervals.
- Top Time Intervals by Host Peak Mem Usage: Lists the top five intervals per resource pool by maximum per-host peak memory usage, to help interpret the temporal baseline for per-host peak memory usage.
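
The temporal baselines described above can be sketched roughly as follows. This is only an illustration of the slicing idea, not the report's actual implementation. The record fields (`pool`, `start`, `end`, `aggregate_memory`) and the function names are hypothetical. Start and end times are assumed to be whole epoch seconds, and a query is assumed to hold its aggregate memory for its entire run.

```python
from collections import defaultdict


def temporal_baseline(queries):
    """Build per-pool, per-second series of summed memory and query counts."""
    memory = defaultdict(lambda: defaultdict(int))
    concurrency = defaultdict(lambda: defaultdict(int))
    for q in queries:
        # A query contributes to every one-second interval in [start, end).
        for second in range(q["start"], q["end"]):
            memory[q["pool"]][second] += q["aggregate_memory"]
            concurrency[q["pool"]][second] += 1
    return memory, concurrency


def summarize(series):
    """Basic statistics over one pool's series (seconds with no queries are absent)."""
    values = list(series.values())
    return {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


def top_intervals(series, n=5):
    """Top n (second, value) pairs by value; ties go to the earlier second."""
    return sorted(series.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical sample data: two overlapping queries in one pool.
queries = [
    {"pool": "root.etl", "start": 0, "end": 3, "aggregate_memory": 100},
    {"pool": "root.etl", "start": 2, "end": 5, "aggregate_memory": 50},
]
memory, concurrency = temporal_baseline(queries)
print(summarize(memory["root.etl"]))
print(top_intervals(memory["root.etl"]))
print(top_intervals(concurrency["root.etl"]))
```

The host peak memory baseline follows the same pattern, except that each interval takes the maximum across hosts instead of a sum across queries.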

Viewing metrics and insights

Open the report and click any app link from the Signature column.
The Metrics tab displays the trend charts for the Duration, IO, MemorySeconds, and CPU metrics. The dots plotted on the trend charts are the app runs. The app info is provided in the right panel.
You can select any of the following features to view the metric trends:
- status
- cluster id
- cluster UID
Click any dot on any of the Metric trend charts. This becomes the app target run. The details of the app target run appear in the App Run info panel on the right.
In the App Run info panel, click the ID link. This is the ID link of the app target run. The Metrics tab is shown with the app target run you have chosen highlighted in a different color. The details of the app target run are displayed in the App Target Run panel on the right. Click the View in Unravel link to view the app's details in Unravel.
Select any dot (other than the dot for the app target run) on any of the Metric trend charts. This becomes the app baseline run and is highlighted in a different color. The details of the app baseline run are displayed in the App Baseline Run panel on the right. Click the View in Unravel link to view the app's details in Unravel.
In the App Baseline Run panel, click Compare.
1. Select an app type from the App Type options:
  - Hive on Tez
  - Spark
  - Impala
2. From the Cluster option, select a cluster.
3. Specify the app runs that you want to compare.
  - Baseline App Run: Specify the ID of the baseline app.
  - Target App Run: Specify the ID of the target app with which you want to compare the baseline app.
4. Click Compare. The following app details are given for the baseline and target app run:
  - App details
  - Cluster resource usage when these apps were running
  - App statistics comparison details
  - App config comparison details
  - Query string comparison details
  - Query plan comparison details
In the App Baseline Run panel, click Refresh Insights. The insights are provided in the following sections.
- Baseline comparison
  The results of the comparison are displayed.
- Concurrent heavy hitter apps
  This section appears only when the information is available.
  Use this section to view the top five applications in a specific category.
  1. Click the Concurrent heavy hitter apps option.
  2. Select any option from the Top-5 Applications option.
    The corresponding details are displayed.