v4.7.4.0 Release notes
Software version
Release Date: 27/April/2022
See 4.7.4.0 for download information.
Software upgrade support
Fresh installations are supported along with the following upgrade path:
v4.7.0.x, v4.7.1.x, v4.7.2.x, v4.7.3.x → v4.7.4.0
v4.6.2.x → v4.7.4.0
v4.6.1.9 → v4.7.4.0
v4.6.1.8 or earlier → v4.6.1.9 → v4.7.4.0
Refer to Upgrading Unravel server for instructions on upgrading to Unravel v4.6.1.9.
Refer to Upgrading Unravel for instructions on upgrading to Unravel v4.7.4.0.
Refer to Installing Unravel for fresh installations.
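The upgrade paths listed above can be sketched as a small lookup. This is only an illustration of the rules in this section, not an Unravel tool; the function name upgrade_path and the constants are hypothetical.

```python
from typing import List

TARGET = "4.7.4.0"
INTERMEDIATE = "4.6.1.9"


def upgrade_path(current: str) -> List[str]:
    """Return the versions to install, in order, to reach TARGET."""
    version = tuple(int(part) for part in current.lstrip("v").split("."))
    if len(version) != 4:
        raise ValueError(f"expected a four-part version, got {current!r}")

    series = version[:3]
    # v4.7.0.x through v4.7.3.x upgrade directly.
    if (4, 7, 0) <= series <= (4, 7, 3):
        return [TARGET]
    # v4.6.2.x and v4.6.1.9 also upgrade directly.
    if series == (4, 6, 2) or version == (4, 6, 1, 9):
        return [TARGET]
    # v4.6.1.8 or earlier must go through v4.6.1.9 first.
    if version <= (4, 6, 1, 8):
        return [INTERMEDIATE, TARGET]
    raise ValueError(f"{current} is not covered by the listed upgrade paths")


print(upgrade_path("v4.6.1.5"))
print(upgrade_path("v4.7.2.1"))
```

Versions that the list does not mention (including v4.7.4.0 itself) raise an error rather than guessing a path.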
Sensor upgrade
Sensor upgrade is mandatory.
Refer to Upgrading Sensors.
Certified platforms
The following platforms are tested and certified in this release:
Cloudera Distribution of Apache Hadoop (CDH)
Cloudera Data Platform (CDP)
Hortonworks Data Platform (HDP)
Amazon Elastic MapReduce (EMR)
Databricks (Azure and AWS)
Google Cloud Platform (Dataproc, BigQuery)
Review your platform's compatibility matrix before you install Unravel.
Updates to Unravel's configuration properties
Refer to 4.7.x - Updates to Unravel properties.
New features
Ansible
Ansible upgrade from 4.6.1.9 to 4.7.x for on-prem and cloud platforms.
LR Authentication
Support is provided for HTTP basic authentication with TLS for the Log Receiver (LR). This is currently supported only on Databricks.
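As a general illustration of HTTP basic authentication over TLS, the sketch below builds (but does not send) an HTTPS request with an Authorization header. The endpoint URL, port, and credentials are hypothetical and are not taken from Unravel's Log Receiver configuration.

```python
import base64
import urllib.request


def basic_auth_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build an HTTPS request that carries HTTP basic credentials."""
    if not url.lower().startswith("https://"):
        # Basic credentials are only base64-encoded, so refuse plain HTTP.
        raise ValueError("basic authentication should only be sent over TLS")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical endpoint and credentials, for illustration only.
req = basic_auth_request("https://lr.example.com:4443/logs", "user", "pass")
print(req.get_header("Authorization"))
```

Because the credentials are merely encoded, not encrypted, the sketch rejects any URL that is not HTTPS.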
Improvements and enhancements
Billing
Billing alert is raised only during the month of expiry. (CDI-466)
Support for capturing discounts on VM and DBU prices as a percentage of public prices. (DT-1118)
Customers can download the latest price information from Unravel. (DT-1117)
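The percentage discount on public prices mentioned above amounts to simple arithmetic. A minimal sketch, assuming the discount is applied uniformly to a public unit price; the function name and the sample values are hypothetical and are not real VM or DBU prices.

```python
from decimal import Decimal


def discounted_price(public_price: Decimal, discount_percent: Decimal) -> Decimal:
    """Apply a discount, given as a percentage of the public price."""
    if not Decimal(0) <= discount_percent <= Decimal(100):
        raise ValueError("discount_percent must be between 0 and 100")
    return public_price * (Decimal(100) - discount_percent) / Decimal(100)


# Hypothetical public price of 0.50 per unit with a 20% discount.
print(discounted_price(Decimal("0.50"), Decimal("20")))
```

Decimal is used instead of float so that currency values do not pick up binary rounding noise.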
BigQuery
Support is provided for cached queries.
Support for insights for BigQuery.
Databricks
Support for Global Init script. (DT-807)
Redact property values containing passwords in the Cluster Configuration tab on Unravel UI. (DT-900)
Process Databricks Run Annotation metrics in real time. (DT-1097)
Skip storing Spark Job IDs in Databricks Run Annotation. (DT-1099)
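The password redaction item above (DT-900) can be pictured with a small sketch. The matching rule used here (a property key containing "password", case-insensitive) and the mask string are assumptions for illustration; these notes do not state how Unravel decides which values to redact.

```python
from typing import Dict

REDACTED = "*****"  # assumed mask, not necessarily what the UI shows


def redact_properties(properties: Dict[str, str]) -> Dict[str, str]:
    """Return a copy with values masked for keys that look like passwords."""
    return {
        key: REDACTED if "password" in key.lower() else value
        for key, value in properties.items()
    }


# Hypothetical cluster configuration entries.
config = {
    "spark.executor.memory": "4g",
    "javax.jdo.option.ConnectionPassword": "s3cr3t",
}
print(redact_properties(config))
```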
Data page
Support for a mix of metastore types (for example, BigQuery and Dataproc in GCP).
EMR
Added new IAM permissions needed for AWS account usage. (EMR-269)
Node downsizing event recommendation accounts for task nodes and fleet instances. (EMR-253)
Hive
Store viz_json as a compressed field in the hive_queries table in the database instead of text to save space. (ASP-1399)
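To show why a compressed field saves space compared with text, the sketch below compresses a JSON document with zlib and restores it. The codec, the function names, and the sample document are assumptions; the notes do not say which compression the hive_queries table uses.

```python
import json
import zlib
from typing import Any


def compress_json(document: Any) -> bytes:
    """Serialize to compact JSON and compress the UTF-8 bytes."""
    text = json.dumps(document, separators=(",", ":"))
    return zlib.compress(text.encode("utf-8"))


def decompress_json(blob: bytes) -> Any:
    """Reverse compress_json."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical, repetitive visualization payload.
viz = {"nodes": [{"id": i, "type": "MAP", "status": "SUCCEEDED"} for i in range(200)]}
raw_size = len(json.dumps(viz, separators=(",", ":")).encode("utf-8"))
blob = compress_json(viz)
print(raw_size, len(blob))
```

Repetitive JSON such as plan or graph data tends to compress well, which is where the space saving comes from.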
Impala
The Impala pipeline is improved to scale for large Impala workloads.
Install
Added show commands (config tls show and config tls trust show) for TLS/Trust manager command. (INSTALL-2444)
Usability improvement for interactive precheck and manual configuration.
Added capability to automatically accept TLS certificate and chain via the manager command. (INSTALL-2019)
Migration
Update resource files for AWS to include the following: (REPORT-2008)
New regions for ec2, emr, s3, ebs
New instances for ec2 and emr
New services for emr
Instances and storage prices for AWS
Update resource files for Azure to include the following: (REPORT-2007)
New regions for Azure
New VM instances for Azure
New services for Azure HDInsight
Updated Instances and storage prices for Azure
Enable diagnostics for migration reports to improve debuggability. (REPORT-2025)
Monitoring
Improve database monitoring with the MyBatis plugin that monitors query duration and collects statistics. (CDI-432)
Log pipeline metrics for Spark and event worker. (ASP-1404)
Security
Kafka and ZooKeeper upgraded to address the Log4j vulnerability.
Kafka upgraded to 3.1.0
ZooKeeper upgraded to 3.6.3
Java upgraded to 8u322.
Python upgraded to 3.8.12.
Capability to generate a token from UI. (UIX-4476)
Spark
A new property (com.unraveldata.spark.query.size.max) is added to truncate a query that exceeds the configured length.
Add protobuf support for Spark Btrace sensor data. (ASP-1362)
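The truncation behavior described for com.unraveldata.spark.query.size.max can be sketched as follows. This assumes the limit counts characters and that the excess is simply cut off; the notes do not specify the unit or whether a marker is appended, and truncate_query is a hypothetical helper.

```python
def truncate_query(query: str, max_length: int) -> str:
    """Return the query unchanged if it fits, else its first max_length characters."""
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    return query if len(query) <= max_length else query[:max_length]


print(truncate_query("SELECT * FROM sales WHERE region = 'EMEA'", 19))
```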
UI
Cost tabs - Update date range to support broader options. (UIX-4492)
Remove the Close button for the Compute Details page. (UIX-4475)
Multiple UI fixes for Cluster Insights page.
Generate token for API from UI. (UIX-4476)
Unsupported
Unravel does not support Billing on on-prem platforms.
On the Data page, File Reports, Small File reports, and file size information are not supported for MapR and cloud (EMR, Databricks, GCP) clusters.
Impala jobs are not supported on the HDP platform.
Monitoring the expiration of SSL certificates and Kerberos principals is not supported in Unravel multi-cluster deployments.
LR authentication is not supported for on-prem platforms. For cloud platforms, it is supported only for Databricks.
The Program tab does not get populated for a notebook job that is attached to an interactive cluster. (ASP-1432)
The following features are not supported for MapR:
Impala applications
Kerberos
The following features are supported on the Data page:
Forecasting
Small Files
File Reports
The following reports are not supported on MapR:
File Reports
Small Files Report
Capacity Forecasting
Migration Planning
The Tuning report is supported only for MR jobs.
Migration Planning
AutoAction is not supported for Impala applications
Migration
Billing
Insights Overview
Migration Planning is not supported for the following regions for Azure Data Lake:
Germany Central (Sovereign)
Germany Northeast (Sovereign)
Forecasting and Migration: In a multi-cluster environment, you can configure only a single cluster at a time. Hence reports are generated only for that single cluster.
Migration Planning is not supported for MapR.
Unravel does not support multi-cluster management of combined on-prem and cloud clusters.
In a multi-cluster environment, Unravel does not support apps that belong to the same pipeline but are sourced from different clusters. A pipeline can only contain apps that belong to the same cluster.
Except for the TopX report, no reports are supported on Databricks and EMR.
Memory and CPU usage metrics are not supported for TopX reports on Databricks.
In Jobs > Sessions, the feature of applying recommendations and then running the newly configured app is not supported.
Pig and Cascading applications are not supported.
Bug fixes
AppStore
AppStore fails to start when there are special characters in the database username/password. (APP-567)
AutoAction
AutoAction (AA) policy scope is set as app instead of apps when AA is created without a template. (ASP-1419)
Healthcheck
Exceptions are shown while running AppStore healthcheck. (INSTALL-2291)
Insights
Ignore the join operator if a valid column ID is not found in the previous scan operators. (INSIGHTS-280)
Change of impact for events in BigQuery applications. (INSIGHTS-301)
The applications that show Total app time as zero are not analyzed for insights. (INSIGHTS-303)
Platform
Group filtering is ignored when member-of search method is used for LDAP groups. (CDI-438)
Report
Error while running Queue Analysis report. (REPORT-2022)
Top X Report fails with HTTPError: HTTP Error 404: Not Found error. (REPORT-2014, REPORT-1979)
UI
Unnecessary horizontal scrolling when you click the full log in Databricks playground for a Spark app. (UIX-4510)
On the Manage page, the DB Stats are not displayed for untracked clusters. (UIX-4171)
The Workflow/Jobs page is empty for the Analysis, Resources, Daggraph, and Errors tabs. (DT-1093)
Event logs and YARN logs are not loaded for some applications in Google Dataproc clusters. (PG-170)
Mark duplicate insights as stale for BigQuery. (INSIGHTS-305)
Incorrect data is displayed in the Number of Queries KPI/Trend graph on the Overview page. (DATAPAGE-502)
The create time of partitions is not captured in the Hive metastore if the partition is created dynamically. This limits Unravel's ability to show Last Day KPIs for the partition section.
Wrong data is displayed for the Number of Partitions Created KPI/trend graph under the Partitions KPIs - Last Day section on the Data page. (DATAPAGE-473)
When a job fails to submit a Spark application, the failed Databricks job is missing from the Unravel UI. (ASP-1427)
In Databricks, when a job in a workflow fails and a new job is launched instead, as a new attempt, the new job will not be part of the same workflow. (PG-269)
Cyclical dependency error is seen in the event_worker_1.err.log file. (DT-1127)
Workaround: Set the com.unravel.workflows.compareupdate.disable property to true.
Databricks jobs are being missed intermittently in Unravel. (PG-232)
In the Databricks view, the application is shown in a running state, even though the corresponding Spark application is marked as finished. (ASP-1436)
Google Cloud Dataproc: Executor logs are not loaded for Spark applications. (PG-229)
Exception: Problem when retrieving bootstrap actions for cluster is seen in the aws_worker daemon logs.
Workaround: While creating an AWS account for the EMR Chargeback/Insights Overview feature, you must include an additional entry for "elasticmapreduce:ListBootstrapActions" in the Policy JSON file, as follows:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "pricing:GetProducts",
        "elasticmapreduce:ListClusters",
        "elasticmapreduce:DescribeCluster",
        "elasticmapreduce:ListInstanceFleets",
        "elasticmapreduce:ListInstanceGroups",
        "elasticmapreduce:ListBootstrapActions",
        "elasticmapreduce:ListInstances",
        "ec2:DescribeSpotPriceHistory"
      ],
      "Resource": "*"
    }
  ]
}
Even if the AWS account was already created without this entry (elasticmapreduce:ListBootstrapActions), you can always include this policy later.
The workflow of multiple transient clusters (EMR) is not supported. (ASP-1424)
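For the ListBootstrapActions workaround above, a small offline check can confirm whether an existing policy document already allows the action and add it if not. This is a sketch that only edits a JSON document in memory; it does not call AWS, it does not evaluate wildcard actions such as elasticmapreduce:*, and the helper name ensure_action is hypothetical.

```python
import copy
from typing import Any, Dict

REQUIRED_ACTION = "elasticmapreduce:ListBootstrapActions"


def _as_list(value: Any) -> list:
    """IAM allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, (str, dict)) else list(value)


def ensure_action(policy: Dict[str, Any], action: str = REQUIRED_ACTION) -> Dict[str, Any]:
    """Return a copy of the policy whose Allow statements include the action."""
    updated = copy.deepcopy(policy)
    statements = _as_list(updated["Statement"])
    updated["Statement"] = statements
    allow = [s for s in statements if s.get("Effect") == "Allow"]
    if not allow:
        raise ValueError("policy has no Allow statement to extend")
    if any(action in _as_list(s.get("Action", [])) for s in allow):
        return updated  # already present, nothing to change
    # Assumption: appending to the first Allow statement is acceptable.
    allow[0]["Action"] = _as_list(allow[0].get("Action", [])) + [action]
    return updated


old_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["elasticmapreduce:ListClusters"], "Resource": "*"}
    ],
}
print(ensure_action(old_policy)["Statement"][0]["Action"])
```

The input policy is left untouched, so the result can be reviewed before it is applied to the AWS account.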
Unravel node fails to send email notifications. (INSTALL-1694)
The Insights Overview tab uses UTC as the timezone, while other pages use local time. Hence, the date and time that are shown on the Insights Overview tab and the other pages after redirection can be different. (UIX-4176)
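The UTC versus local time difference described above can be reproduced with plain datetime arithmetic. The UTC-7 offset and the sample timestamp are hypothetical; the point is only that one instant can fall on different calendar dates depending on the timezone used for display.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local timezone at UTC-7.
local_zone = timezone(timedelta(hours=-7))

event_local = datetime(2022, 4, 27, 23, 30, tzinfo=local_zone)
event_utc = event_local.astimezone(timezone.utc)

print(event_local.isoformat())  # shown on pages that use local time
print(event_utc.isoformat())    # shown on a page that uses UTC
```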
Kerberos can only be disabled manually from the unravel.yaml file:
kerberos:
  enabled: False
WorkloadFit report
A large number of tags can cause the Workload Fit report to fail. (PG-265)
WorkloadFit report > Heatmap: The job count has data but Vcore and memory are empty. (MIG-262)
Cluster discovery
If the metric retrieval for a host fails, then the CPU and memory capacity/usage graphs and heatmaps are not displayed.
This happens on a CDH cluster when the Cloudera Manager agent of a host does not send any heartbeats to the Cloudera Manager server. Such a host is shown as Bad Health in Cloudera Manager. (REPORT-1706)
Workaround: Ensure that the Cloudera Manager agent sends heartbeats to the Cloudera Manager on all hosts and that none of the hosts are shown as Bad Health.
The On-prem Cluster Identity may show an incorrect Spark version on CDH. The report may incorrectly show Spark 1 when Spark 2 is installed on the CDH cluster. (REPORT-1702)
When using PostgreSQL, the % sign is duplicated and displayed in the Workload Fit report > Map to single cluster tab. (MIG-42)
Cloud Mapping Per Host report scheduled in v4.6.1.x will not work in v4.7.1.0. Users must schedule a new report. (REPORT-1886)
The TopX report email contains a link to the Unravel TopX report instead of showing the report content in the email as in the old reports.
Queue analysis: The log file name (unravel_us_1.log) displayed in the error message is incorrect. The correct name of the log file is unravel_sensor.log. (REPORT-1663)
There is a lag seen for SQL Streaming applications. (PLATFORM-2764)
If the customer uses Active Directory for Kerberos and the samAccountName and principal do not match, errors can occur when accessing HDFS. (DOC-755)
In AAD login mode, when an external logout happens, the user still has access to the currently logged-in UI. (UIX-4125)
For PySpark applications, the processCPUTime and the processCPULoad are not captured properly. (ASP-626)
The SQL events generator generates a SQL Like clause event whenever the query contains a like pattern, even when the pattern appears only inside a literal. (TEZLLAP-349)
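The kind of false positive described in the SQL Like clause issue above can be illustrated with two hypothetical detectors. Neither is Unravel's implementation; the regular expressions are assumptions chosen to show how a keyword check that ignores string literals differs from one that blanks them out first.

```python
import re

LIKE_KEYWORD = re.compile(r"\blike\b", re.IGNORECASE)
# Single-quoted SQL string literal, with '' as an escaped quote.
STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def naive_has_like(sql: str) -> bool:
    """Flag any occurrence of the word LIKE, including inside literals."""
    return bool(LIKE_KEYWORD.search(sql))


def literal_aware_has_like(sql: str) -> bool:
    """Blank out string literals before looking for the LIKE keyword."""
    return bool(LIKE_KEYWORD.search(STRING_LITERAL.sub("''", sql)))


in_literal = "SELECT * FROM t WHERE note = 'I like tea'"
real_clause = "SELECT * FROM t WHERE name LIKE 'a%'"
print(naive_has_like(in_literal), literal_aware_has_like(in_literal))
print(naive_has_like(real_clause), literal_aware_has_like(real_clause))
```

The first query is flagged only by the naive check, which mirrors the reported behavior; the second is flagged by both.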
Notebooks will not work after upgrading to v4.7.1.0. You can configure them separately. (REPORT-1895)
If you have configured a single-cluster deployment for Unravel and the cluster name is not default, the Data page feature may not work properly.
In this case, you must explicitly set the following property after upgrading. (INSTALL-2151)
<Unravel installation directory>/unravel/manager stop
<Unravel installation directory>/unravel/manager config properties set hive.metastore.cluster.ids=<cluster-name>
<Unravel installation directory>/unravel/manager apply
<Unravel installation directory>/unravel/manager start
After upgrading from v4.6.x to v4.7.1.0, the Tez application details page does not initially show DAG data. The DAG data is visible only after you refresh the page. (ASP-1126)
The new user interface (UI) can be accessed only from Chrome.
On the App summary page for Impala, the Query > Operator view is visible only after scrolling down. (UIX-3536)
Jobs are falsely labeled as Tez apps for Oozie Sqoop and Shell actions. (PLATFORM-2403)