v4.6.1.0 Release notes
Software version
Release Date: 05/18/2020
See v4.6.1.0 for download information
Software upgrade support
Only an RPM upgrade is required. The following upgrade paths are supported for v4.6.1.0:
4.5.4.x to 4.6.1.0
4.5.5.x to 4.6.1.0
4.5.3.x to 4.6.1.0
4.3.1.x to 4.5.0.x to 4.6.1.0
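The supported paths above can be encoded as a small lookup, which may help when scripting a pre-upgrade check. This is a minimal sketch, not part of the Unravel tooling: the function name `upgrade_hops` and the table `SUPPORTED_PATHS` are hypothetical, and the table contains only the paths listed in these notes.

```python
import re

# Upgrade paths listed in these release notes, keyed by the starting
# version family (the first three components of the version string).
# Each value is the ordered list of versions to upgrade through.
SUPPORTED_PATHS = {
    "4.5.3": ["4.6.1.0"],
    "4.5.4": ["4.6.1.0"],
    "4.5.5": ["4.6.1.0"],
    "4.3.1": ["4.5.0.x", "4.6.1.0"],
}


def upgrade_hops(current_version):
    """Return the ordered upgrade hops for current_version, or None if unlisted."""
    # Accept four-part versions such as "4.5.4.2" or "4.5.4.x".
    match = re.fullmatch(r"(\d+\.\d+\.\d+)\.\w+", current_version.strip())
    if match is None:
        return None
    hops = SUPPORTED_PATHS.get(match.group(1))
    # Return a copy so callers cannot mutate the lookup table.
    return list(hops) if hops is not None else None
```

A version family that is not listed returns `None`, which a script can treat as "check with support before upgrading".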
Sensor upgrade
A sensor upgrade is mandatory.
Certified platforms
You must review your platform's compatibility matrix before you upgrade or install.
Updates to Unravel's configuration properties
Refer to v4.6.x - Updates to Unravel properties.
Unsupported
AutoAction
Databricks jobs orchestration via services like ADF
Notebooks on interactive clusters
Spark Program / Query Graph for Notebook and Python tasks
Chargeback view by custom tags
Cost and Instance Recommendations for Jobs on AWS Databricks
Unravel's APIs
Sessions
Role-based Access Control (RBAC)
HDFS path support in the Spark source code display feature.
Data Insights for:
Workload
Spark
Reports
Small files
Cluster optimization
Notebooks
Top-X
Forecasting
Migration planning
Queue analysis
Datapage
Size created of the table
Total size
Accessed partitions
Size created of the partitions
Datapage
Size created of the table.
Total size
Size created of the partitions.
Oozie on EMR
Missing table and column statistics events.
Migration Planning is not supported for the following regions for Azure Data Lake:
US DoD East
US DoD Central
Germany Central (Sovereign)
Germany Northeast (Sovereign)
New features
Unravel for Databricks (Azure and AWS)
Support for jobs running on interactive clusters.
Visibility into costs incurred by applications on Azure Databricks.
New recommendations on cost and instance sizing on Azure Databricks.
Support for Spark Streaming apps.
Support for Unravel tagging to add Custom Application Tags.
Reporting
Added property com.unraveldata.report.excluded.queues that allows you to exclude some queues from metrics collection and reports in the Queue Analysis report.
Spark Data Pipeline
Stage timeline data is displayed live.
Added properties that allow skipping event log loading and executor log loading.
Support for loading the Spark program zip from an HDFS path with a new Spark property: spark.unravel.program.path.
Spark Insights
Added Timings tab in Spark APM, which shows the time breakdown of apps and tasks.
Added Bottleneck Analyzer, which detects the application area that is consuming the most time and provides recommendations.
Data Insights
Spark implementation for FSImage (Small Files) processing.
Datapage
Register table accesses from Spark and SparkSQL apps in the Data Insights framework.
Missing table and column statistics events.
Register table accesses from Hive queries in the Data Insights framework.
Provide insights on compression in the inefficient table format event.
Data insights framework and generate small files event.
Get table and partition sizes from Fsimage to dashboard_summaries and table__info.
Provide insights/recommendations on file formats in case of an inefficient format event.
User Interface (UI)
Added interface to the Manage section to manage API tokens. An administrator can:
View the list of tokens that are authorized for third parties.
Generate the API tokens.
Delete the API tokens.
Sensor
Added support in the MR and Tez sensors to read the cluster ID from extraJavaOptions and env.
Provided a way to set the cluster ID used in the LR URL to which BTrace sensors POST data.
Operations
Unravel now detects inefficient and badly formed tables. This information is shown in the Inefficient Tables tile on Operations > Dashboard. An app-level version of the table events is also shown in the Hive-MR, Hive-Tez, Impala, and Spark APMs.
Limitation: Supported only for:
On-premise (CDH, HDP)
HDFS file system
Hive metastore
Improvements and Enhancements
Data page
Added paging for scalable retrieval of Impala queries when building the data page.
Added an index on column d2 of table dashboard_summaries.
Use the start_time field instead of created_at in retrieveImpalaQueryInfoByTimeRange().
Group table events by table.
Support External Tables/Partitions and Nested Partitions in Fsimage size calculation.
Minimized exception logging when getting the DFS path, plus miscellaneous changes.
Reporting
Improved Capacity Forecasting graph UX. Clear demarcation of actual vs predicted; capacity stays flat rather than predicting capacity increases.
Enable updating of sizes to table-info.
Implement Fsimage (aka Small Files) using Spark instead of Hive.
Impala
Added support for tagging Impala queries with workflows for CDH versions that are later than 5.13.
Impala Operator status and KPIs are captured correctly and are more consistent.
Various improvements in Impala events.
Certified for CDP.
Change skew duration threshold default for time skew event from 0.5s to 3s.
Recommend using LIMIT only for a large number of rows.
Consider absolute time in addition to the ratio when identifying “long row fetch”.
AutoAction
Provision of AutoAction violation badge for Hive.
Spark Data Pipeline
Store the processed SQL data from the event log if not all live data is received by SW.
Improvements to fetch query plan.
Job group is displayed with the job ID if present.
Added a sensor configuration property to add a shutdown delay (spark.unravel.shutdown.delay.ms).
Application end time and duration are updated with the live BTrace sensor data.
Spark pipeline improvements to process executor logs with a new Kafka message.
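As an illustration of the shutdown delay property listed above, the following sketch builds a spark-submit command line that sets it. It assumes the property is read like any other Spark configuration property and can therefore be passed with spark-submit's `--conf` option; these notes do not confirm where the sensor expects it to be set. The helper name, the application path, and the delay value are hypothetical.

```python
def spark_submit_with_shutdown_delay(app_path, delay_ms):
    """Build a spark-submit argument list that sets the sensor shutdown delay.

    Assumption: spark.unravel.shutdown.delay.ms is honored when passed
    through --conf, like a standard Spark property.
    """
    if not isinstance(delay_ms, int) or delay_ms < 0:
        raise ValueError("delay_ms must be a non-negative integer")
    return [
        "spark-submit",
        "--conf", f"spark.unravel.shutdown.delay.ms={delay_ms}",
        app_path,
    ]


# Hypothetical usage: the application path and the 300 ms value are
# placeholders, not recommended settings.
args = spark_submit_with_shutdown_delay("my_app.py", 300)
```

The resulting list can be passed to `subprocess.run(args)` on a host where spark-submit is on the PATH.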
Kafka
Add version tag in RequestMetrics:RequestsPerSec.
Support for Kafka 2.x.
Workflow
Stability fixes for Workflow.
# OF APPS is displayed correctly for tagged and Oozie workflows.
Improved the workflow status updating logic (Stale workflow).
Support for Oozie 5.0+.
Added the Stale Oozie Workflow Task to update the Workflows that are stuck in RUNNING state.
Data Insights
Data page skips calculating the size of non-existing paths and prevents exceptions as a result.
Paging retrieval of Impala queries gets the app access info.
Reduce the frequency of checking Kerberos authentication in size calculation.
Spark Insights
Improved number of cores and executor recommendations.
Migration Planning
Added new properties for Migration Planning Workload fit report and heat map generation.
com.unraveldata.migrationplanning.workloadfit.max.apps
Configure the limit on the maximum number of apps that can be shown in a single slice of workload fit report and heatmap generation.
com.unraveldata.migrationplanning.workloadfit.timeout
Configure the timeout for workload fit report and heat map generation.
Customer fixes
API Call Consolidation: Bulk endpoint for apps with the summary. (CUSTOMER-676)
Users are not seeing all Impala applications that are based on the tagging script. (EAR-52)
Distinguish which AA triggered an event for a given application not working. (CUSTOMER-942)
AutoAction send email action with & in the email address results in "The email address is not valid". (CUSTOMER-1143)
AutoAction Templates - Change the Elapsed time to Seconds. (CUSTOMER-841)
Add queue dropdown in AA filters. (CUSTOMER-954)
AutoAction for failed Spark jobs is not working. (CUSTOMER-1160)
Connecting HDI Clusters across Azure subscriptions to a single Unravel. (CUSTOMER-676)
HitdocLoader does not start due to: [ERROR:ELASTIC] Elastic daemon mismatched indexes. (CUSTOMER-1265)
Data Insight's Detail page is blank within the customer large production cluster. (CUSTOMER-185)
Data Insights are not working appropriately. (CUSTOMER-754)
Additional metrics to the data insights - detailed report. (CUSTOMER-178)
Add support for Oracle Hive Metastore (Hive libs do more operations than just read-only). (CUSTOMER-265)
Data Insights not fully showing metrics against Oracle Hive Metastore seeing lots of ORA-01031: insufficient permissions. (CUSTOMER-479)
Data Insights report - feature requests. (CUSTOMER-546)
Data insight details page is not showing any data. (CUSTOMER-1153)
Sorting by partitions in DataInsights Details is unusable. (CUSTOMER-1164)
Data insights column sort in playground 3 broken (Partitions and RP) (CUSTOMER-914)
Data page: Display each table's data storage format (ORC/Parquet etc.) (CUSTOMER-391)
Complete documentation for setting configs around cluster type, and name vs id. (CUSTOMER-610)
Race condition while processing the hive queries. (CUSTOMER-1196)
Unravel attempts to write a DELETEME table to Hive Metastore (CUSTOMER-466)
The dependency issue on HiveHook jar is causing sqoop job to fail. (CUSTOMER-1196)
Need to support ADLS Gen 2. (CUSTOMER-888)
Filter Impala queries by cluster in multi-cluster CM deployments. (CUSTOMER-1216)
Impala apps ignored due to error: DocFieldValue of "counters" is too large <= 32766 (CUSTOMER-1007)
On-demand Install fails because SUDO does not exist and ondemand_quick_install.sh and ondemand_install.sh contain SUDO. (CUSTOMER-1303)
Missing Swagger entries and documentation about Unravel APIs for Kafka. (CUSTOMER-906)
Display latest cumulative value for the metrics across all Kafka brokers. (CUSTOMER-767)
Merged forever_ngui.log with unravel_ngui.log. (CUSTOMER-1246)
LDAP configuration is correct yet users cannot see Admin/Manage functionality. (EAR-51)
MapReduce Application detail page does not load for non-admin users. (CUSTOMER-1101)
Unravel sensor script assumes Spark is installed. (CUSTOMER-404)
SENSOR: Unravel Spark sensor appears to be causing java.lang.ClassCircularityError errors when firing against Data Profiler Agent spark jobs. (EAR-18)
Restrict TLS protocols for Log Receiver communication. (CUSTOMER-669)
Pushlogs.sh fails to create a tar file for log file. (CUSTOMER-1126)
LDAP: Slow group queries (CUSTOMER-1176)
OpenJDK8U-jre_x64_linux_hotspot_8u232b09 in 4601 branch. (CUSTOMER-1244, CUSTOMER-1245)
Support cron job that updates Unravel keytab without requiring Unravel restart for password rotation. (CUSTOMER-828)
Streamline Installation Process for On-Premise Hadoop. (CUSTOMER-87)
Tagging based on Database Names in SQL coming from Hive, Impala, and SparkSQL. (CUSTOMER-1005)
Change connector log entries for missing/incorrect configs to ERROR. (CUSTOMER-1030)
Ability to tag based on tables, databases used by applications. (CUSTOMER-1080)
Realuser tagging configured during installation. (CUSTOMER-116)
Many applications (Spark, MapReduce) are missing logs and metrics within the cluster. (CUSTOMER-152)
Unravel showing MapReduce job in RUNNING state, days after the job actually completed. (CUSTOMER-228)
Add the ability to retain ES data longer and ensure it is accessible from UI and is responsive. (CUSTOMER-266)
Unravel logs are filling up the disk. (CUSTOMER-982)
Fix for benchmarks to work with HDInsight, EMR platforms. (CUSTOMER-1106)
UnravelListener throws an exception. (CUSTOMER-1118)
No Hive jobs are showing up within the cluster and errors are seen within unravel_lr (CUSTOMER-506)
Support custom location for Kerberos client configuration. (CUSTOMER-774)
SmallFiles Report Failed with unravel-udf-0.2.jar missing error. (CUSTOMER-676)
Added Cluster Name to identify Cluster KPI reports. (CUSTOMER-1079)
Small files: Concerns on configuring Small Files / Files Report. (CUSTOMER-783)
Provide reporting on I/O usage. (CUSTOMER-1336)
In the Cluster Compare tab, the Time Range and the Compare With Range are both set to a default of 7 days. (CUSTOMER-928)
Small Files - Recommendations. (CUSTOMER-818)
Queue Analysis tab throws an SQL error if queue names contain quotes or special characters. (CUSTOMER-657)
Migration Planning reports fail to generate when RM HA is enabled (success or failure depends on which RM is active). (CUSTOMER-1304)
Hive / Spark analysis notebooks: CLI report generation fails with cryptic error message there were no matching apps in the filter criteria. (CUSTOMER-1339)
Spark Workload Analysis Notebook: allow custom processing of app names to handle a specific app name convention to group runs together. (CUSTOMER-1340)
Capacity Forecasting Report showing incorrect total HDFS Capacity. (EAR-57)
Reports load slowly in Anthem environment. (CUSTOMER-181)
Incorporate Hive table to a path in small files report (CUSTOMER-461)
0 apps in the heatmap is too red in the playground 3, which misleads the user to assume that the cluster is hot (CUSTOMER-908)
Creating/analyzing sessions periodically fails with the message: UnravelLogger is not callable. (CUSTOMER-1233)
Spark application hangs and fails to exit after Unravel Sensor is installed. (CUSTOMER-1201)
Timeline for Spark Stages does not render. (CUSTOMER-1200)
The user is unable to go to spark-shell when Unravel-Kafka is down. (CUSTOMER-1152)
Spark recommendations default to spark.default.parallelism. (CUSTOMER-1177)
Informatica Integration: Spark job run by Informatica is showing no data in Unravel. (CUSTOMER-1053)
Do not show Athena Preview from the Applications List if Athena is not set up for that Unravel deployment. (CUSTOMER-930)
The Result section displays the default web page for the Web server. (CUSTOMER-1287)
Need for a property that disables the Kill/Move feature in UI. (CUSTOMER-889)
Usability concerns with date pickers. (CUSTOMER-474)
Pig jobs show 0 as the duration for all MR jobs. However, the Gantt chart view shows the duration. (CUSTOMER-1235)
Added error handling for generate_app_token.sh. (CUSTOMER-1306)
Enabled RBAC for Cluster summary and compare for user role. (CUSTOMER-165)
Auto-refresh should save state and be configurable (New UX Preview). (CUSTOMER-174)
Selecting timeframe clears previously selected parameters in the Usage Details Report (New UX Preview). (CUSTOMER-206)
More advanced search capabilities and consistent behavior across all search boxes (New UX Preview). (CUSTOMER-341)
Keep user preferences in UI (New UX Preview). (CUSTOMER-744)
Save page filters for the duration of a session (New UX Preview). (CUSTOMER-489)
The Chargeback report's CSV download only gets 1000 pages. (CUSTOMER-748)
Running and submitted applications are both blue in the UI. A better color can be chosen instead. (CUSTOMER-823)
Updates on the Applications page remove any customizations to the table (New UX Preview). (CUSTOMER-84)
Ability to list all events in application screen for recommended settings or run report vs clicking on individual jobs. (New UX Preview) (CUSTOMER-898)
Spark - Job-id search inside Spark Navigation tab is not working. (CUSTOMER-1327)
Better ergonomics within the UI when viewing Spark applications (pySpark). (CUSTOMER-299)
Ability to search inefficient applications list by app name, user, tables, etc. (CUSTOMER-581)
Application filter ignored when FILTER BY APP NAME is also used. (CUSTOMER-807)
Persist the selected time period when navigating between tabs within Unravel. (CUSTOMER-86)
Data Correctness - Unravel UI doesn't match the resource manager. (CUSTOMER-395)
Support API Tokens when using SAML Authentication for UI. (CUSTOMER-908)
Improved information when the sensor metrics are missing, especially when the sensor data is not live or sensor configuration was overwritten. (CUSTOMER-831)
Workflow search still not working. (CUSTOMER-997)
Many completed workflows are stuck in RUNNING state, and the workflow duration statistics are incorrect. (CUSTOMER-1251)
Spark and Tez apps not showing in Oozie workflows. (CUSTOMER-1049)
Workflow tagging for Impala queries does not work post CDH 5.13. (EAR-38)
Bug fixes
Data page
Added paging for scalable retrieval of Impala queries when building the data page, and an index on column d2 of table dashboard_summaries. The start_time field is now used instead of created_at in retrieveImpalaQueryInfoByTimeRange().
Reduced thread sleep time in unravel_tw.
Reporting
Implement retention for values in Master Fsimage. (REPORT-1270)
Memory and CPU data are missing for Hive on MR apps in the TopX report. (REPORT-1165)
get_hive_query_status(): internal error message not found in unravel_ondemand.out. (REPORT-732)
One of the small files reports shows "No Data Found" when small files reports are run in parallel (race condition). (REPORT-660)
Queue analysis now receives the correct metrics when the secondary resource manager is active in the HA configuration. (REPORT-393)
Queue analysis failed with list indices error when a specific cluster is selected. (REPORT-370)
Queue analysis report graphs are not legible. (REPORT-342)
Analyzing queues for multiple clusters may cause overlapping of metrics. (REPORT-297)
Capacity forecasting, cluster discovery, and migration planning reports failed if the value of the com.unraveldata.cluster.name property did not match the actual cluster name/ID. A new property, unravel.python.reporting.cluster.name, indicates the cluster for which these reports should be run. (REPORT-1424)
Queues missing in Queue analysis report in HDP environments. (REPORT-294)
Impala
A loop is executed while generating events for failed Impala queries. (IMPALA-209)
Killed Impala queries are incorrectly classified as failed. (IMPALA-206)
Platform
The tagged workflow shows the non-tagged Tez application. (PLATFORM-2158)
Exception while downloading event log on clusters where Kerberos is not enabled. (PLATFORM-1646)
Queue Metrics Sensor stops polling after some time when higher polling rates are set. (PLATFORM-1563)
Operations dashboards do not support multi-cluster and have incorrect aggregations.
Operations Nodes Dashboard does not capture cluster inactivity in graphs.
Spark applications with the same application ID are captured as one.
Spark Program / Query graph for Notebook and Python tasks is not supported.
Spark default Databricks extraJavaOptions are overwritten by Unravel for spark-submit tasks.
DriverOOME and ExecutorOOME events are not generated for the Databricks notebook task.
Recommended Azure instances available in Cluster page but not at run time.
Recommended Azure instances could be in Beta mode only.
Instance recommendation is missing when EMDB is used.
The Violation Badge functionality for AutoAction is not working for Impala queries (Running, Killed). (AA-44)
EMR: Hive metrics are not published in RUNNING state. (HIVE-135)
Latency in fetching the data for MR jobs. (PLATFORM-1613)
API connection error while Polling impalad metrics from CM. (PLATFORM-1567)
'conflicted ephemeral node' or 'Corrupt index found'. (PLATFORM-702)
The GC load metric sensor for MR applications does not load on EMR.
For PySpark applications, the processCPUTime and the processCPULoad are not captured properly. (USPARK-626)
Partition size 0 is shown in the insight message on the timings tab. (USPARK-647)
# of Apps is incorrect (PLATFORM-2403)