v4.7.9.0 Release notes

Software version

Release date: 08/May/2023

See 4.7.9.0 for download information.

Software upgrade support

The following upgrade paths are supported:

4.7.x.x → 4.7.9.0
4.6.1.9 → 4.7.9.0
4.6.1.8 or earlier → 4.6.1.9 → 4.7.9.0

For instructions to upgrade to Unravel v4.6.1.9, see Upgrading Unravel server.

For instructions to upgrade to Unravel v4.7.9.x, see Upgrading Unravel.

For fresh installations, see Installing Unravel.
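
The supported paths above can be expressed as a small lookup. The following is a minimal sketch, not an Unravel tool: the `upgrade_path` function name is hypothetical, it assumes four-part version strings like those listed in these notes, and it does not check that a 4.7.x.x version is older than 4.7.9.0.

```shell
# Hypothetical helper (not part of Unravel): print the upgrade hops needed to
# reach 4.7.9.0 from a given installed version, per the paths listed above.
upgrade_path() {
  case "$1" in
    4.7.9.0)
      echo "$1 is already the target version" ;;
    4.7.*)
      echo "$1 -> 4.7.9.0" ;;
    4.6.1.9)
      echo "4.6.1.9 -> 4.7.9.0" ;;
    4.6.1.[0-8] | 4.6.0.* | 4.[0-5].* | [0-3].*)
      # 4.6.1.8 or earlier: upgrade to 4.6.1.9 first.
      echo "$1 -> 4.6.1.9 -> 4.7.9.0" ;;
    *)
      # Anything else is not covered by the paths in these notes.
      echo "no upgrade path listed for $1" >&2
      return 1 ;;
  esac
}

upgrade_path 4.6.1.5
```

Versions that match none of the listed paths return a non-zero status rather than a guessed path.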

Certified platforms

The following platforms are tested and certified in this release:

On-premise platforms

Cloudera Distribution of Apache Hadoop (CDH)
Cloudera Data Platform (CDP)
Hortonworks Data Platform (HDP)

Review your platform's compatibility matrix before you install Unravel.

Updates to Unravel's configuration properties

See 4.7.x - Updates to Unravel properties.

Updates to upgrading Unravel to v4.7.9.0

Obtain License file

An existing license for any previous version does not work with the newer version of Unravel. Therefore, before upgrading Unravel, you must obtain a license file from Unravel Customer Support. For information about setting the license, see the Upgrading Unravel from version 4.7.x to 4.7.9.x section in Upgrading Unravel.

Re-configure the custom ports after upgrading from versions before v4.6.1.9

If you customized the UI port (for example, to 4000) in an Unravel version before v4.6.1.9, the port reverts to the default, 3000, after the upgrade. Hence, after you upgrade to 4.7.9.0, you must reconfigure the custom port.

Stop Unravel.

<Unravel installation directory>/unravel/manager stop

Display the list of ports and keys.

<Unravel installation directory>/unravel/manager config ports show

Set the port using the following command with the port and port key.

<Unravel installation directory>/unravel/manager config ports set <port key> <port>

For example, if you run the following command, NGUI listens on port 1234.

<Unravel installation directory>/unravel/manager config ports set /hosts/host_main/instances/ngui_1/config/network/port 1234

Note

Run the manager config ports unset <port key> command to return a port to its default value.

<Unravel installation directory>/unravel/manager config ports unset /hosts/host_main/instances/appstore_1/config/flask/port

Apply the changes.

<Unravel installation directory>/unravel/manager config apply

Start Unravel.

<Unravel installation directory>/unravel/manager start
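
The stop, set, apply, and start steps above can be chained in a small wrapper. This is a sketch under stated assumptions, not an Unravel utility: the `run` and `set_custom_port` function names and the `MANAGER` and `DRY_RUN` variables are hypothetical, and `/opt/unravel/manager` is only the example path used elsewhere in these notes. By default the script prints the commands instead of running them.

```shell
# Assumptions (not from the release notes): MANAGER defaults to the example
# path /opt/unravel/manager, and DRY_RUN=1 prints commands instead of running them.
MANAGER="${MANAGER:-/opt/unravel/manager}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: stop Unravel, set one port, apply, and start again.
set_custom_port() {
  port_key=$1
  port=$2
  # Reject anything that is not a TCP port number (1-65535).
  case "$port" in
    '' | *[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  # Each step runs only if the previous one succeeded.
  run "$MANAGER" stop &&
    run "$MANAGER" config ports set "$port_key" "$port" &&
    run "$MANAGER" config apply &&
    run "$MANAGER" start
}

set_custom_port /hosts/host_main/instances/ngui_1/config/network/port 1234
```

Review the printed commands first, then set `DRY_RUN=0` to execute them. The port key is the NGUI example from above; use `manager config ports show` to find the key for your own service.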

Unset/set properties on the HDP to CDP migrated cluster

After you have migrated from a Hortonworks Data Platform (HDP) cluster to a Cloudera Data Platform (CDP) cluster, you must unset some properties and set new ones on an edge node of the migrated cluster.

Single-cluster (HDP to CDP migration)

On the edge node, set the following property:

<Unravel installation directory>/unravel/manager config properties set com.unraveldata.cluster.type CDP

If HBase was configured earlier, set the following property:

<Unravel installation directory>/unravel/manager config properties set com.unraveldata.hbase.source.type=CDP

Unset the following properties:

<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.url
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.username
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.password
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.yarn.timeline-service.webapp.address
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.yarn.timeline-service.port

Run the manager config auto command to automatically pull in all the Hadoop configurations. You are prompted to provide the location and credentials for the CDP Cloudera Manager URL.
```
<Unravel installation directory>/unravel/manager config auto
```
If the CDP Cloudera Manager handles more than one cluster, you are prompted to enable the cluster that you want to monitor. Run the following command to enable a cluster:
```
<Unravel installation directory>/unravel/manager config cluster enable <CLUSTER KEY>
```
Example: /opt/unravel/manager config cluster enable cluster1

Apply the changes.

<Unravel installation directory>/unravel/manager config apply

Start Unravel.

<Unravel installation directory>/unravel/manager start

Make the following changes in Cloudera Manager:
- HDP and CDH have the Hive hooks under the HIVE service, but CDP has them under HIVE_ON_TEZ. Hence, you must update the properties accordingly.
- Under Parcel Repository & Network in Cloudera Manager, update the CDP version for the sensor parcel to https://xxx.unraveldata.com:3000/parcels/cdh7.1/
- Check for new parcels and distribute them.
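
The property changes in the single-cluster procedure above can be generated as a batch for review. This is a minimal sketch: the `single_cluster_commands` function name and the `MANAGER` variable are hypothetical, `/opt/unravel/manager` is only an example path, and the optional HBase property is left out because it applies only when HBase was configured.

```shell
# Assumption: MANAGER points at the manager utility; /opt/unravel/manager is
# only the example path used in these notes.
MANAGER="${MANAGER:-/opt/unravel/manager}"

# Print (do not run) the set and unset commands for a single-cluster
# HDP to CDP migration, one command per line.
single_cluster_commands() {
  echo "$MANAGER config properties set com.unraveldata.cluster.type CDP"
  for prop in \
    com.unraveldata.ambari.manager.url \
    com.unraveldata.ambari.manager.username \
    com.unraveldata.ambari.manager.password \
    com.unraveldata.yarn.timeline-service.webapp.address \
    com.unraveldata.yarn.timeline-service.port
  do
    echo "$MANAGER config properties unset $prop"
  done
}

single_cluster_commands
```

After checking the output, the same list can be executed with `single_cluster_commands | sh`. The remaining steps (manager config auto, config apply, start) are interactive or order-sensitive and are best run by hand.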

Multi-cluster (HDP to CDP migration)

In case of a multi-cluster environment, do the following:

Unset the following properties on the core node:

<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.cluster.type
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.url
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.username
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.password
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.<EDGE KEY>.url
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.<EDGE KEY>.username
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.<EDGE KEY>.password
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.ambari.manager.list

Tip

Run the following commands to obtain the <EDGE KEY>.

<unravel_installation_directory>/unravel/config edge show

Run the following commands to obtain the <CLUSTER KEY>.

<unravel_installation_directory>/unravel/manager support show cluster_access_id

Note

For a multi-cluster environment, if your cluster name changes after the migration from HDP to CDP, you must unset the following properties on the core node:

<Unravel installation directory>/unravel/manager config properties unset javax.jdo.option.<EDGE KEY>_<CLUSTER KEY>_HIVE.ConnectionURL
<Unravel installation directory>/unravel/manager config properties unset javax.jdo.option.<EDGE KEY>_<CLUSTER KEY>_HIVE.ConnectionDriverName
<Unravel installation directory>/unravel/manager config properties unset javax.jdo.option.<EDGE KEY>_<CLUSTER KEY>_HIVE.ConnectionUserName
<Unravel installation directory>/unravel/manager config properties unset javax.jdo.option.<EDGE KEY>_<CLUSTER KEY>_HIVE.ConnectionPassword
<Unravel installation directory>/unravel/manager config properties unset hive.metastore.<EDGE KEY>_<CLUSTER KEY>_HIVE.cluster.ids
<Unravel installation directory>/unravel/manager config properties unset com.unraveldata.hive.metastore.list

Run the following command:

<Unravel installation directory>/unravel/manager config edge auto <EDGE KEY>

Apply the changes.

<Unravel installation directory>/unravel/manager config apply

Start Unravel.

<Unravel installation directory>/unravel/manager start
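
The key-dependent unset commands in the multi-cluster procedure above can be generated by substituting the keys once. This is a sketch only: the function names and the `MANAGER` variable are hypothetical, `/opt/unravel/manager` is an example path, and `edge1` and `cluster1` are placeholder keys, not values from a real deployment.

```shell
# Assumption: MANAGER points at the manager utility on the core node.
MANAGER="${MANAGER:-/opt/unravel/manager}"

# Print the unset commands for the edge-specific Ambari properties.
edge_unset_commands() {
  edge_key=$1
  if [ -z "$edge_key" ]; then
    echo "usage: edge_unset_commands <EDGE KEY>" >&2
    return 1
  fi
  for suffix in url username password; do
    echo "$MANAGER config properties unset com.unraveldata.ambari.manager.$edge_key.$suffix"
  done
}

# Print the unset commands for the Hive metastore properties. These apply
# only when the cluster name changed after the migration.
hive_unset_commands() {
  prefix="${1}_${2}_HIVE"
  for name in ConnectionURL ConnectionDriverName ConnectionUserName ConnectionPassword; do
    echo "$MANAGER config properties unset javax.jdo.option.$prefix.$name"
  done
  echo "$MANAGER config properties unset hive.metastore.$prefix.cluster.ids"
  echo "$MANAGER config properties unset com.unraveldata.hive.metastore.list"
}

edge_unset_commands edge1
hive_unset_commands edge1 cluster1
```

Both functions only print commands, so the output can be reviewed before anything is changed. The properties that do not depend on a key (for example, com.unraveldata.cluster.type) still need to be unset as listed above.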

New features

Horizontal scaling of Log Receiver

Unravel now supports deploying Log Receiver on a dedicated node, which allows Log Receiver to process many more requests. In addition, you can deploy multiple instances of Log Receiver behind a load balancer to distribute the workload efficiently.

For information, see the following topics:

Configuration Guide
- New topic: Log Receiver (LR) Load Balancer
- New topic: Configuring load balancing for Log Receiver (LR)
- New topic: Moving Log Receiver (LR) from core node to worker node
- New topic: Log Receiver (LR) performance statistics

Reference Guide
- New topic: Load balancer FAQs
- Updated topic: Log Receiver (LR) properties

Improvements and enhancements

Redesign of the user interface for AutoActions
The user interface of the AutoActions page has been improved to align with the AutoActions page of the cloud platform.
For more information, refer to the AutoActions > AutoActions topic in the User Guide.
API Token
- The user interface of the API token page is enhanced for consistency. (UIX-5612)
- API tokens can be generated for a user role. Moreover, you can select tags to associate with the API tokens generated for the user roles. (UIX-5854, UIX-5739)
Other enhancements
- Support for downloading as a CSV option has been added on multiple pages of the Unravel UI. (CUSTOMER-2121, CUSTOMER-2139, CUSTOMER-2069)
  You can now download the following details:
  - The cost summary, cost of clusters, and VM cost details from the Workload fit report.
  - Inefficient jobs from the Jobs > Applications page.
  - Topic summary from the Kafka > Metrics tab.
  - Details of the resources used for running jobs from the Clusters > Job Trends and Clusters > Resources pages.
- AutoAction support is provided for Impala-tagged workflows. (ASI-735)
- Node.js has been upgraded to version 16.19.1. (UIX-5751)
- Multi-cluster support is enabled (only for CDH and CDP platforms) for migration reports. (CUSTOMER-2402, CUSTOMER-2399)
- To replicate AutoActions, a new Duplicate option has been provided on the AutoAction list and AutoAction Details pages. This option replaces the existing Expert Rule functionality. (CUSTOMER-2359)
- In a multi-cluster setup, you can choose the cluster from where you want to generate the cloud migration reports. (CUSTOMER-2179)
- On the Clusters > Overview, Clusters > Resources, and Clusters > Job Trends pages, for certain selected time ranges, you can now select an interval that you want for the data points from a drop-down list.
- The Tez App details page has been revamped for the Hive pipeline improvements. For failed or killed apps, errors are displayed on the Errors tab instead of the Diagnostics tab. (PIPELINE-1764)

Unsupported

App store

Appstore does not support PostgreSQL over SSL.

Billing

Unravel does not support Billing for on-prem platforms.

Data

On the Data page, File Reports, Small File reports, and file size information are not supported for EMR clusters.
On the Data page, File Reports, Small File reports, and file size information are not supported for Dataproc clusters.

Healthcheck

Monitoring the expiration of SSL certificates and Kerberos principals is not supported in Unravel multi-cluster deployments.

Platforms

AutoActions

Sustained Violation is not supported for Databricks AutoAction.

MapR

The following features are not supported for MapR:

Impala applications
Kerberos
The following features are not supported on the Data page:
- Forecasting
- Small Files Report
- File Reports
The following reports are not supported on MapR:
- File Reports
- Small Files Report
- Capacity Forecasting
- Cloud Migration reports
AutoAction is not supported for Impala applications.
Billing
Insights Overview

Amazon EMR

Unravel does not support the Insights Overview tab on the UI for the Amazon EMR platform.

Migration planning

Migration planning is not supported for the following regions for Azure Data Lake:
- Germany Central (Sovereign)
- Germany Northeast (Sovereign)
Forecasting and Migration: In a multi-cluster environment, you can configure only a single cluster at a time. Hence, reports are generated only for that single cluster.
Migration planning is not supported for MapR.

Multi-cluster deployment

Unravel does not support multi-cluster management of combined on-prem and cloud clusters.

Pipeline

In a multi-cluster environment, Unravel does not support pipelines whose apps are sourced from different clusters. A pipeline can only contain apps that belong to the same cluster.

Reports

Except for the TopX report, reports are not supported on Databricks and EMR.

Data

In GCP - BigQuery, for the Data page, a count of more than 100 projects is not supported.

BigQuery pricing

For BigQuery pricing, Unravel only supports On-demand analysis pricing. Flat-rate analysis pricing and Storage pricing (Active and Long Term storage) are not supported.

Bug fixes

Report
- The sensor log file name in the AppStore log needs to be updated for accuracy. (REPORT-2103)
UI
- On the Clusters > Resources page, the selected cluster value is modified if the value in the Resource Usage/Resource Type list changes (from Impala to Yarn) or if the tabs are switched. (CUSTOMER-2161)
- On the Clusters > Resources page, the page loading time is unusually long when a cluster is selected from the Cluster drop-down list. (CUSTOMER-2163)

Known issues

Applications

The Workflow/Job page displays empty Analysis, Resources, Daggraph, and Errors tabs. (DT-1093)
Event logs and YARN logs are not loaded for some applications in Google Dataproc clusters. (ASP-1372)
On the Table Details page under the Applications tab, inaccurate data is displayed for a table. This issue occurs if a table is deleted and recreated multiple times and applications access the table before the next cycle of the table worker. (PG-156)

AutoActions

AutoActions stop responding due to an invalid or unsupported HTTP URL or webhook. (AA-575)

App store

App store tasks fail to start with SSL enabled on MySQL database. (APP-614)
Workaround
To resolve this issue, do the following:
1. Stop Unravel.
```
<Unravel installation directory>/unravel/manager stop
```
2. Use an editor to open <Installation_directory>/unravel/data/conf/unravel.yaml file.
3. In the unravel.yaml file, under the database > advanced > python_flags block, enter the path to the trusted certificates. For example, if Unravel is installed at /opt/unravel, you must edit the unravel.yaml file as follows:
```
unravel:
...snip...
  database:
...snip...
    advanced:
      python_flags:
        ssl_ca: /opt/unravel/data/certificates/trusted_certs.pem
```
4. Use the manager utility to upload the certificates.
```
<Unravel installation directory>/manager config tls trust add --pem /path/to/certificate
```
  For example: /opt/unravel/manager config tls trust add --pem /path/to/certificate
5. Enable the Truststore.
```
<Unravel installation directory>/manager config tls trust enable
```
6. Apply the changes and restart Unravel.
```
<Unravel installation directory>/unravel/manager config apply --restart
```
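
Before uploading a certificate in step 4, it can help to confirm that the file is in PEM format. The check below is a rough sketch, not an Unravel feature: `has_pem_certificate` is a hypothetical function name, and the test only looks for matching BEGIN and END certificate markers. It does not validate the certificate itself.

```shell
# Hypothetical pre-flight check: succeed only if the file is readable and
# contains at least one PEM certificate block with matching markers.
has_pem_certificate() {
  [ -r "$1" ] || return 1
  # grep -c exits non-zero on zero matches, so "|| true" keeps the count.
  begin=$(grep -c -e '-----BEGIN CERTIFICATE-----' "$1" || true)
  end=$(grep -c -e '-----END CERTIFICATE-----' "$1" || true)
  [ "$begin" -ge 1 ] && [ "$begin" -eq "$end" ]
}

# Usage (the path is the example from the unravel.yaml snippet above):
#   has_pem_certificate /opt/unravel/data/certificates/trusted_certs.pem \
#     && echo "looks like a PEM bundle"
```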

BigQuery

On the Application details page, the original query link is missing for some cached queries due to the parallel processing of original and cached queries. (BIGQ-61)
Issue: Sometimes, when you process a large number of BigQuery projects with the manager config bigquery integrate command, you may see the following error:
Provider produced inconsistent result after apply
Workaround: Wait for a few minutes and re-run the command. (INSTALL-2860, INSTALL-2934)
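
The wait-and-re-run workaround can be scripted as a generic retry loop. This is a sketch, not an Unravel feature: the `retry` function is hypothetical, and the attempt count and delay in the usage comment are arbitrary example values.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$retry_n" -ge "$retry_attempts" ]; then
      echo "giving up after $retry_n attempts: $*" >&2
      return 1
    fi
    echo "attempt $retry_n failed; retrying in ${retry_delay}s" >&2
    sleep "$retry_delay"
    retry_n=$((retry_n + 1))
  done
}

# Usage (not run here; append the arguments you originally passed):
#   retry 3 300 /opt/unravel/manager config bigquery integrate
```

The loop retries on any non-zero exit status, so it cannot tell this particular error apart from other failures. Check the command output before relying on it.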

Data page

If tables are created with the same name and are accessed, deleted, and re-created, and if those tables are re-accessed, then their query and app counts do not match. (DATAPAGE-502)
For Hive metastore 3.1.0 or earlier versions, the creation time of partitions is not captured if a partition is created dynamically. Therefore, in Unravel, the Last Day KPI for the partition section is not shown. (DATAPAGE-473)
On the Data page, size data is missing for certain tables in databases, although the partition size is correctly displayed in the Partition Detail section. (DATAPAGE-695)

Databricks

The Job Run details page displays a duplicate entry for tasks executed during the job. (DT-1461)
Issue: Jobs created for PySpark application with UDF on a JobCluster fail after applying the recommendations for node downsizing. (DT-1404)
Workaround:
1. In your Databricks workspace, go to Configure Cluster > Advanced Options > Spark config.
2. Add the following property, set to true, to the spark.driver.extraJavaOptions and spark.executor.extraJavaOptions Spark configurations:
  - -Dcom.unraveldata.metrics.proctree.enable=true
  For example:
```
spark.executor.extraJavaOptions -Dcom.unraveldata.metrics.proctree.enable=true -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=executor,libs=spark-3.0
spark.driver.extraJavaOptions -Dcom.unraveldata.metrics.proctree.enable=true -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=driver,script=StreamingProbe.btclass,libs=spark-3.0
```
Data is not displayed when you click the Optimize button corresponding to OTHERS for the Cost > Chargeback results shown in the table. (UIX-5624)
On the Job Runs page, the Cost slider stops working when the cost for the selected user is less than $1. (UIX-5508)
The job run count displayed on the Chargeback page differs from the job count shown on the Workflow page. (UIX-5581)
Errors and Logs data are missing in specific scenarios for a failed Spark application, such as an application failing with OutOfMemoryError. (ASP-1624)
On the Cost > Trends and Cost > Chargeback pages, the tooltip for the Last <number of> days field includes more days than the displayed days. (UIX-5042)
Clusters with an Unknown status are excluded from the dashboards often used for monitoring systems. (DT-1445)
When the Interactive cluster is restarted, the cluster count is increased on the Databricks Cost > Trends page. (DT-1275)
After navigating from the Trends and Chargeback pages with Tag filters, the No data available message is displayed on the Compute page. (DT-1094)
Inconsistent data is displayed for the cluster Duration and Start Time on the Compute page. (ASP-1636)
The DriverOOME and ExecutorOOME events are not generated for the Databricks notebook task. (DT-533)
When a job fails to submit a Spark application, the failed Databricks job is missing from the Unravel UI. (ASP-1427)
In the Databricks view, the application is shown in a running state, even though the corresponding Spark application is marked as finished. (ASP-1436)
On the Workflows > Jobs page, you can view only up to 100 records or jobs. (ASI-695)
The email template for the AutoAction policy contains an unformatted table. The issue occurs when the threshold value is zero (0). (ASI-688)
Azure Databricks jobs are randomly missing on the Unravel UI due to Azure Databricks File System (DBFS) mount issues. (PIPELINE-1626)

Dataproc

Google Cloud Dataproc: Executor Logs are not loaded for Spark applications. (ASP-1371)

EMR

When you click the View Clusters link on the Cost-based pages and navigate to the Clusters page, the cluster numbers shown can vary. Sometimes fewer clusters are listed, and at times no clusters are shown. This is a known limitation due to differences in the definition of the time range selector for these pages. (UIX-5328)
- Cost page
  Shows all the clusters that have accrued cost in the selected period, which may be running or terminated in the selected period, irrespective of their start date.
- Cluster page
  Shows only those clusters that have started in the selected period.
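
The difference between the two page definitions can be illustrated with made-up data. In this sketch, the three clusters and their dates are hypothetical, and the two filter rules are a reading of the descriptions above, not Unravel's actual queries.

```shell
# Hypothetical clusters: name, start date, end date.
# ISO dates compare correctly as plain strings.
clusters='c1 2023-04-20 2023-05-03
c2 2023-05-02 2023-05-04
c3 2023-03-01 2023-03-15'

# Cost page rule: the cluster's lifetime overlaps the selected period,
# irrespective of its start date.
cost_page() {
  echo "$clusters" | awk -v s="$1" -v e="$2" \
    '($2 "") <= (e "") && ($3 "") >= (s "") { print $1 }'
}

# Clusters page rule: the cluster started inside the selected period.
clusters_page() {
  echo "$clusters" | awk -v s="$1" -v e="$2" \
    '($2 "") >= (s "") && ($2 "") <= (e "") { print $1 }'
}

echo "Cost page:     $(cost_page 2023-05-01 2023-05-07 | xargs)"
echo "Clusters page: $(clusters_page 2023-05-01 2023-05-07 | xargs)"
```

For the period 2023-05-01 to 2023-05-07, cluster c1 started before the period but was still running during it, so it appears under the cost rule and not under the clusters rule.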
Due to known limitations, node downsizing recommendations are not provided in the following scenarios: (EMR-519, EMR-513, EMR-424)
- When only cluster recommendations are applied without applying Application recommendations.
- When the workloads require high IO and partitioning.
- When the Spark configuration spark.dynamicAllocation.enabled is true.
- When the AWS EMR autoscaling is enabled.
- When the workload requires parallelism (multiple CPU cores).
On the Clusters details page, the Insights chart is not synchronized with all other graphs displayed on the Cost tab. (EMR-618)
On the Cost > Trends page, users with readonly permission can create a budget. (EMR-576)
The workflow of multiple transient clusters (EMR) is not supported. (ASP-1424, EMR-460)
Unable to run Spark applications on all the master nodes after Unravel bootstrap for high availability clusters. (EMR-49)
Support for high availability EMR master nodes. (EMR-31)
For the MapReduce failed job, error details are missing on the Errors tab. (UIX-5416)

Installation

Issue: You can encounter a NoIndexFound exception for fresh installations of Unravel on GCP-BigQuery. (BIGQ-104)
Workaround: Run the following curl command on the Unravel node after the installation.
```
curl -XPUT http://localhost:4171/app-19700101_07
```
An exception occurs when installing Unravel version 4.7.6.0 with the Azure MySQL database (SSL Enabled). (INSTALL-2799)
During precheck and healthcheck, the Hadoop check fails for the MapR cluster. You can ignore the messages. (INSTALL-2603)

Insights Overview

The Insights Overview tab uses UTC as the timezone, while other pages use local time. Hence, the date and time shown on the Insights Overview tab and the other pages after redirection can differ. (UIX-4176)

Kerberos

Kerberos can only be disabled manually in the unravel.yaml file.
```
 kerberos:
      enabled: False
```

Migration

WorkloadFit report
- A large number of tags can cause the Workload Fit report to fail. (PG-265, CUSTOMER-2084)
- WorkloadFit report > Heatmap: The job count has data, but Vcore and memory are empty. (MIG-262)
Inconsistency between the regions displayed on the Unravel user interface and the ones included in AWS EMR. (MIG-280, MIG-281)
The Cloud Mapping Per Host migration report fails for some regions. (MIG-303)

Reports

Cluster discovery
- The On-prem Cluster Identity might show an incorrect Spark version on CDH. The report may incorrectly show Spark 1 when Spark 2 is installed on the CDH cluster. (REPORT-1702)
TopX report
- The TopX report email links to the Unravel TopX report instead of showing the report content in the email as in the old reports.
Queue analysis:
- The log file name (unravel_us_1.log) displayed in the error message is incorrect. The correct name of the log file is unravel_sensor.log. (REPORT-1663)
A Cloud Mapping Per Host report scheduled in v4.6.1.x does not work in v4.7.1.0. Users must schedule a new report. (REPORT-1886)
When using PostgreSQL, the percentage (%) sign is duplicated and displayed in the Workload Fit report > Map to single cluster tab. (MIG-42)

Spark

If the Spark job is not running for Databricks, the values for the Duration and End time fields are not updated on the Databricks Run Details page. (ASP-1616)
You can see a lag for SQL Streaming applications. (PLATFORM-2764)
On the Spark application details page, the timeline histogram is not generated correctly. (UX-632)

Security

If the customer uses an active directory for Kerberos and the samAccountName and principal do not match, this can cause errors when accessing HDFS. (DOC-755)

Spark insights

For PySpark applications, the processCPUTime and processCPULoad values are not captured properly. (ASP-626)

Tez

The SQL events generator generates a SQL Like clause event if the query contains a like pattern, even when the pattern appears only in literals. (TEZLLAP-349)

Upgrade

After upgrading from v4.7.1.1 to v4.7.5.0, the Hive jobs running with the Tez application as an execution engine are not linked. (EMR-406)
After upgrading to v4.7.1.0, Notebooks do not work. You can configure them separately. (REPORT-1895)
After upgrading from v4.6.x to v4.7.1.0, the Tez application details page does not initially show DAG data. The DAG data is visible only after you refresh the page. (ASP-1126)

UI

On the App summary page for Impala, the Query > Operator view is visible only after scrolling down. (UIX-3536)
When you click a Hive query that was executed as part of a Hive on Spark application, a blank page is shown. (UIX-6037)
On the Clusters > Resources page, in the Group By drop-down list, the Application Type, User, and Queue options are duplicated for the YARN/IMPALA resource job type. The issue occurs if identical user-defined tags are used. (UIX-5898)
On the Clusters > Workload page, after modifying the Display items per page setting from 10 to 50, selecting any insight (such as Policy Violation) resets the setting to the default of 10 items per page. (UIX-6033)

Workflow

Jobs are falsely labeled as a Tez App for Oozie Sqoop and Shell actions. (PLATFORM-2403)

Deprecated features

Expert rule (AutoActions)
Creating an AutoActions policy in the JSON data format and then using it through the Expert rule option has been deprecated. This functionality has been replaced with the Duplicate option on the AutoActions list and detail view pages. You can select the Duplicate option to replicate the selected AutoAction policy.
Tuning report
The Tuning report is deprecated.
Sessions feature
The Sessions feature is deprecated.

Support

For support issues, contact Unravel Support.
