v4.7.8.0 Release notes
Software version
Release date: 20/February/2023
See 4.7.8.0 for download information.
Software upgrade support
The following upgrade paths are supported:
4.7.x.x → 4.7.8.0
4.6.1.9 → 4.7.8.0
4.6.1.8 or earlier → 4.6.1.9 → 4.7.8.0
For instructions to upgrade to Unravel v4.6.1.9, see Upgrading Unravel server.
For instructions to upgrade to Unravel v4.7.8.x, see Upgrading Unravel.
For fresh installations, see Installing Unravel.
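The supported upgrade paths listed above can be sketched as a small lookup. This is an illustrative sketch only, not an Unravel tool: the function names `parse_version` and `upgrade_path` are hypothetical, and the sketch assumes every version string has four numeric components. Versions between 4.6.1.9 and 4.7.0.0 are not covered by these release notes, so the sketch raises an error for them rather than guessing.

```python
TARGET = "4.7.8.0"
INTERMEDIATE = "4.6.1.9"


def parse_version(text):
    """Convert a dotted version such as '4.6.1.8' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def upgrade_path(current):
    """Return the ordered list of versions to upgrade through to reach TARGET."""
    version = parse_version(current)
    if version >= parse_version(TARGET):
        return []  # already at (or beyond) the target; nothing to do
    if version[:2] == (4, 7):
        return [TARGET]  # 4.7.x.x -> 4.7.8.0
    if version == parse_version(INTERMEDIATE):
        return [TARGET]  # 4.6.1.9 -> 4.7.8.0
    if version < parse_version(INTERMEDIATE):
        return [INTERMEDIATE, TARGET]  # 4.6.1.8 or earlier -> 4.6.1.9 -> 4.7.8.0
    # Versions between 4.6.1.9 and 4.7.0.0 are not listed in the release notes.
    raise ValueError(f"No documented upgrade path from {current}")


print(upgrade_path("4.6.1.8"))  # ['4.6.1.9', '4.7.8.0']
```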
Certified platforms
The following platforms are tested and certified in this release:
Amazon EMR
Databricks (Azure, AWS)
Review your platform's compatibility matrix before you install Unravel.
Updates to Unravel's configuration properties
See 4.7.x - Updates to Unravel properties.
Updates to upgrading Unravel to v4.7.8.0
An existing license for any previous version (before 4.7.7.x) does not work with the newer version of Unravel. Therefore, before upgrading Unravel, you must obtain a license file from Unravel Customer Support. For information about setting the license, see the Upgrading Unravel from version 4.7.x to 4.7.8.x section in Upgrading Unravel.
Optionally, you can regroup multiple Spark worker instances for enhanced performance after upgrading to v4.7.8.0.
Caution
This task requires planning and can be performed only in collaboration with the Unravel support team. This is a one-time task.
New features
Data quality integration with Great Expectations
Great Expectations is a data quality tool that lets you validate a data asset by running an Expectation Suite (a set of quality assertions) against it. When integrated with Unravel, Great Expectations extends the measure of data quality into Unravel, and Unravel provides unified visibility of the expectations validated while running the Expectation Suite. This adds data quality insights to Unravel's existing single-pane data monitoring.
You can view the Data Quality insights of Great Expectations from Unravel UI > Data > Tables detail page > Analysis tab and from Unravel UI > Jobs > Job details page > Analysis tab.
Multi-node deployment of Spark workers for high-volume data processing
You can deploy additional Spark workers on a separate server, other than the server where Unravel is installed, with services to process high-volume data.
Notification channels
A new Notification channels option has been added to the Manage menu. Use it to set up notification channels that send alerts when certain conditions are triggered. Alerts can be sent to users or user groups by email or as Slack messages.
For information about the Notification channel, see the following topics:
Topics
Guide name
New topics
Notification channels
Creating a notification channel
Modifying the existing notification channel
Viewing notification channels
Updates to the existing topics
Cost Budget (EMR)
Creating a budget
Viewing a budget and its details
Updates to the existing topics
Cost Budget (Databricks)
Setting a budget
AutoActions support EMR apps and clusters to optimize cost
AutoActions can now monitor EMR apps and clusters. You can set the AutoAction policy to generate alerts for EMR apps and clusters. AutoActions can monitor EMR clusters based on cost, duration, and idle checks and send alerts.
For more information, refer to the AutoActions > AutoActions (EMR) topic in the User Guide.
Improvements and enhancements
Databricks enhancements
A Databricks Job can be associated with multiple clusters. Each job entry now corresponds to a Databricks job. The following enhancements have been made to the Databricks Workflows > Jobs page (DT-1187):
Workflows > Jobs page:
Removed the Clusters Name column.
Workflows > Job Runs page:
Removed the Clusters Name and Cluster Type columns.
Removed the Job name link from the Run Name / ID column.
Renamed the Run Name / ID column to Job name / ID.
Provided a link to the Run ID. After clicking the Run ID, the job run detail page is displayed.
Updated the Search by ID, Keyword field to Search by keyword. You can search for the job name by typing the keyword.
Changed the Filter by Cluster Name search to Filter by Job name or ID.
The following enhancement has been made to the Resources tab on the Spark details page (DT-1456):
Compute > Spark > Resources > Host Metrics and Workflow > Job > Task > Resources > Host Metrics pages:
The following new metrics are added to Host metrics:
Total memory
Free memory
You can use these metrics to evaluate the memory spent on processes other than those of Spark.
For more information, see the User Guide.
Other enhancements
Node count and duration values are provided for the aggregated cost savings for each recommended node type. (EMR-620)
A new Account Id column has been added to the AWS Account Settings page so that you can view the configured AWS account ID in the Unravel UI. (UIX-5469)
On the Clusters page, the ID filter has been relocated to the top and is separate. You cannot combine other filters (such as Date and time range) with an ID search. (UIX-5332)
For information, see the Monitor EMR clusters section in the User Guide.
The MySQL client library has been updated to the 12.0 version on the user interface. (UIX-5383)
Enhanced performance by reducing the lag in the Impala pipeline. (ASP-1677)
Added support for downloading data as CSV from the EMR Clusters and EMR AutoAction pages. (UIX-4853)
For information, see Viewing AutoAction and its details and Monitor EMR clusters sections in the User Guide.
Support for the EMR cluster idle state (EMR-465)
Unravel now supports AutoAction for the idle state of the cluster. You can set AutoAction when the EMR cluster exceeds the idle duration threshold. For information, see Creating AutoActions in User Guide.
Unsupported
Appstore does not support PostgreSQL over SSL.
Unravel does not support Billing for on-prem platforms.
On the Data page, File Reports, Small File reports, and file size information are not supported for EMR clusters.
On the Data page, File Reports, Small File reports, and file size information are not supported for Dataproc clusters.
Impala jobs are not supported on the HDP platform.
Monitoring the expiration of SSL certificates and Kerberos principals is not supported in Unravel multi-cluster deployments.
Sustained Violation is not supported for Databricks AutoAction.
The following features are not supported for MapR:
Impala applications
Kerberos
The following features are supported on the Data page:
Forecasting
Small Files
File Reports
The following reports are not supported on MapR:
File Reports
Small Files Report
Capacity Forecasting
Migration Planning
The Tuning report is supported only for MR jobs.
Migration Planning
AutoAction is not supported for Impala applications.
Migration
Billing
Insights Overview
Unravel does not support the Insights Overview tab on the UI for the Amazon EMR platform.
Migration planning is not supported for the following regions for Azure Data Lake:
Germany Central (Sovereign)
Germany Northeast (Sovereign)
Forecasting and Migration: In a multi-cluster environment, you can configure only a single cluster at a time. Hence, reports are generated only for that single cluster.
Migration planning is not supported for MapR.
Unravel does not support multi-cluster management of combined on-prem and cloud clusters.
Unravel does not support pipelines whose apps are sourced from different clusters in a multi-cluster environment. A pipeline can only contain apps that belong to the same cluster.
All the reports, except for the TopX report, are not supported on Databricks and EMR.
In Jobs > Sessions, the feature of applying recommendations and running the newly configured app is not supported.
In GCP - BigQuery, for the Data page, a count of more than 100 projects is not supported.
For BigQuery pricing, Unravel only supports On-demand analysis pricing. Flat-rate analysis pricing and Storage pricing (Active and Long Term storage) are not supported.
Bug fixes
AutoActions
When multiple AutoAction policies are created with overlapping rulesets and scopes, only one of the AutoAction policies is triggered. (AA-498)
Databricks
Duplicate job runs (with the same run IDs) are generated on the Job Runs page. (DT-1190)
On the Compute page, inaccurate information is displayed for clusters in the Inefficient category. (UIX-5064)
The downloaded TopX Report (in JSON format) lists the incorrect type of Spark app. (REPORT-2094)
In Databricks, when a job in a workflow fails and a new job is launched instead of a new attempt, the new job cannot be part of the same workflow. (PG-269)
On the Chargeback page, when you group by clusters, Unravel can group a maximum of 1000 clusters. (SUPPORT-1570)
EMR
After clicking the Hive Query link on a cluster using the bootstrap script, the No apps found with the Id message is displayed. (CLOUD-532)
On the Clusters page, search by cluster name returns incomplete search results. (UIX-5345)
On the Clusters page, the Name and Cluster tags filters return incomplete search results. (EMR-595)
On the Clusters page, the following issues are observed (EMR-588):
The cluster list omits clusters with a zero cost when the custom date range is selected
The cluster list omits the latest cluster cost when the custom date range is selected
If clusters terminate with errors without generating NodeDownSizingEvent, then such clusters are displayed in the Inefficient category on the Clusters page. (EMR-542)
The Spark sensor fails to start. (EMR-485)
On the Clusters page, the cluster IDs displayed in the ID drop-down list do not match the selected cluster category in the left panel. (EMR-435)
For clusters terminated with errors, the node downsizing recommendations are shown. (EMR-422)
Insights
Clicking the links for operators and stages in the SQLTooManyGroupByEvent does not result in any action. (INSIGHTS-355)
An exception occurs when generating memory insights for a Spark application. (INSIGHTS-363)
Installation
Databricks Healthcheck App Store celery daemon fails to start. (INSTALL-2945)
Installing Unravel fails when connecting with SSL-enabled MariaDB. (INSTALL-3071)
Spark
A blank page is displayed on the Databricks Run Details page for Spark structured streaming applications. (ASP-1629, UIX-5124)
UI
On the Clusters page, a discrepancy exists between the cost of clusters and the minimum and maximum cost displayed in the left pane. (UIX-5270)
From the Clusters page, after clicking the Spark action, refreshing the Spark details page takes longer than expected. (UIX-5247)
When you return from the application details > SQL tab> Stage page to the application details > Attempt page, the Duration, Data I/O, and Jobs Count fields are not displayed. (UIX-5048)
The Workflow/Job page displays empty Analysis, Resources, Daggraph, and Errors tabs. (DT-1093)
Event logs and YARN logs are not loaded for some applications in Google Dataproc clusters. (ASP-1372)
AutoActions stop responding due to an invalid or unsupported HTTP URL or webhook. (AA-575)
App store tasks fail to start with SSL. (APP-614)
Workaround
To resolve this issue, do the following:
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Use an editor to open the <Installation_directory>/unravel/data/conf/unravel.yaml file.
In the unravel.yaml file, under the database > advanced > python_flags block, enter the path to the trusted certificates. For example, if Unravel is installed at /opt/unravel, you must edit the unravel.yaml file as follows:
unravel:
  ...snip...
  database:
    ...snip...
    advanced:
      python_flags:
        ssl_ca: /opt/unravel/data/certificates/trusted_certs.pem
Use the manager utility to upload the certificates.
<Unravel installation directory>/manager config tls trust add --pem /path/to/certificate
For example: /opt/unravel/manager config tls trust add --pem /path/to/certificate
Enable the Truststore.
<Unravel installation directory>/manager config tls trust enable
Apply the changes and restart Unravel.
<Unravel installation directory>/unravel/manager config apply --restart
On the Application details page, the original query link is missing for some cached queries due to the parallel processing of original and cached queries. (BIGQ-61)
Issue: Sometimes, when you process a large number of BigQuery projects with the manager config bigquery integrate command, you may see the following error:
Provider produced inconsistent result after apply
Workaround: Wait for a few minutes and re-run the command. (INSTALL-2860, INSTALL-2934)
If tables with the same name are created, accessed, deleted, and re-created, and those tables are then accessed again, their query and app counts do not match. (DATAPAGE-502)
For Hive metastore 3.1.0 or earlier versions, the create time of partitions is not captured if a partition is created dynamically. Therefore, in Unravel, the Last Day KPI for the partition section is not shown. (DATAPAGE-473)
The Job Run details page displays a duplicate entry for tasks executed during the job. (DT-1461)
Issue: Jobs created for a PySpark application with a UDF on a JobCluster fail after applying the recommendations for node downsizing. (DT-1404)
Workaround:
In your Databricks workspace, go to Configure Cluster > Advanced Options > Spark config
Add the following property, set to true, to the spark.driver.extraJavaOptions and spark.executor.extraJavaOptions Spark configurations:
-Dcom.unraveldata.metrics.proctree.enable=true
For example:
spark.executor.extraJavaOptions -Dcom.unraveldata.metrics.proctree.enable=true -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=executor,libs=spark-3.0
spark.driver.extraJavaOptions -Dcom.unraveldata.metrics.proctree.enable=true -javaagent:/dbfs/databricks/unravel/unravel-agent-pack-bin/btrace-agent.jar=config=driver,script=StreamingProbe.btclass,libs=spark-3.0
Data is not displayed when you click the Optimize button corresponding to OTHERS for the Cost > Chargeback results shown in the table. (UIX-5624)
On the Job Runs page, the Cost slider stops working when the cost for the selected user is less than $1. (UIX-5508)
The job run count displayed on the Chargeback page differs from the job count shown on the Workflow page. (UIX-5581)
Errors and Logs data are missing in specific scenarios for a failed Spark application, such as an application failing with OutOfMemoryError. (ASP-1624)
On the Cost > Trends and Cost > Chargeback pages, the tooltip for the Last <number of> days field includes more days than the displayed days. (UIX-5042)
Clusters with an Unknown status are excluded from the dashboards often used for monitoring systems. (DT-1445)
When the Interactive cluster is restarted, the cluster count is increased on the Databricks Cost > Trends page. (DT-1275)
After navigating from the Trends and Chargeback pages with Tag filters, the No data available message is displayed on the Compute page. (DT-1094)
Inconsistent data is displayed for the cluster Duration and Start Time on the Compute page. (ASP-1636)
The DriverOOME and ExecutorOOME events are not generated for the Databricks notebook task. (DT-533)
When a job fails to submit a Spark application, the failed Databricks job is missing from the Unravel UI. (ASP-1427)
In the Databricks view, the application is shown in a running state, even though the corresponding Spark application is marked as finished. (ASP-1436)
On the Workflows > Jobs page, you can view only up to 100 records or jobs. (ASI-695)
The email template for the AutoAction policy contains an unformatted table. The issue occurs when the threshold value is zero (0). (ASI-688)
Azure Databricks jobs are randomly missing on the Unravel UI due to Azure Databricks File System (DBFS) mount issues. (PIPELINE-1626)
Google Cloud Dataproc: Executor Logs are not loaded for Spark applications. (ASP-1371)
When you click the View Clusters link on the Cost-based pages and navigate to the Clusters page, the cluster numbers shown can vary. Sometimes fewer clusters are listed, and at times no clusters are shown. This is a known limitation due to differences in the definition of the time range selector for these pages. (UIX-5328)
Cost page
Shows all the clusters that have accrued cost in the selected period, which may be running or terminated in the selected period, irrespective of their start date.
Clusters page
Shows only those clusters that have started in the selected period.
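The difference between the two time range definitions above can be illustrated with a short sketch. It is not Unravel's implementation: the cluster records, field names, and both filter functions are hypothetical, and "accrued cost in the selected period" is approximated here as "the cluster was alive at some point during the period".

```python
from datetime import datetime


def on_cost_page(cluster, period_start, period_end):
    """Cluster was alive at some point in the period, whatever its start date."""
    ended = cluster["end"] or datetime.max  # None means still running
    return cluster["start"] < period_end and ended > period_start


def on_clusters_page(cluster, period_start, period_end):
    """Cluster started inside the period."""
    return period_start <= cluster["start"] < period_end


period = (datetime(2023, 2, 1), datetime(2023, 2, 8))
clusters = [
    # Started before the period, terminated inside it.
    {"id": "c-1", "start": datetime(2023, 1, 30), "end": datetime(2023, 2, 3)},
    # Started inside the period, still running.
    {"id": "c-2", "start": datetime(2023, 2, 2), "end": None},
]
cost_ids = [c["id"] for c in clusters if on_cost_page(c, *period)]
cluster_ids = [c["id"] for c in clusters if on_clusters_page(c, *period)]
print(cost_ids)     # ['c-1', 'c-2']
print(cluster_ids)  # ['c-2']
```

In this sketch, cluster c-1 counts toward the cost-based pages but is absent from the Clusters page for the same period, which is the kind of mismatch described in UIX-5328.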
Due to the known limitations, node downsizing recommendations are not suggested for the following scenarios. (EMR-519, EMR-513, EMR-424)
When only cluster recommendations are applied without applying Application recommendations.
When the workloads require high IO and partitioning.
When the Spark configuration spark.dynamicAllocation.enabled is true.
When the AWS EMR autoscaling is enabled.
When the workload requires parallelism (multiple CPU cores).
On the Clusters details page, the Insights chart is not synchronized with all other graphs displayed on the Cost tab. (EMR-618)
On the Cost > Trends page, users with read-only permission can create a budget. (EMR-576)
The workflow of multiple transient clusters (EMR) is not supported. (ASP-1424, EMR-460)
Unable to run Spark applications on all the master nodes after Unravel bootstrap for high availability clusters. (EMR-49)
Support for high availability EMR master nodes. (EMR-31)
For a failed MapReduce job, error details are missing on the Errors tab. (UIX-5416)
Issue: You can encounter a NoIndexFound exception for fresh installations of Unravel on GCP-BigQuery. (BIGQ-104)
Workaround: Run the following curl command on the Unravel node after the installation.
curl -XPUT http://localhost:4171/app-19700101_07
An exception occurs when installing Unravel version 4.7.6.0 with the Azure MySQL database (SSL Enabled). (INSTALL-2799)
During precheck and healthcheck, the Hadoop check fails for the MapR cluster. You can ignore the messages. (INSTALL-2603)
The Insights Overview tab uses UTC as the timezone, while other pages use local time. Hence, the date and time shown on the Insights Overview tab and the other pages after redirection can differ. (UIX-4176)
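As a minimal illustration of why the displayed date and time can differ, the sketch below renders one instant in UTC and in a local timezone. The timestamp and the +05:30 offset are hypothetical examples, and `to_local` is a helper invented for this sketch, not part of Unravel.

```python
from datetime import datetime, timedelta, timezone


def to_local(event_utc, utc_offset):
    """Convert a UTC timestamp to a fixed-offset local time."""
    return event_utc.astimezone(timezone(utc_offset))


# The same instant, as a UTC-based page and a local-time page would render it.
event_utc = datetime(2023, 2, 20, 23, 30, tzinfo=timezone.utc)
event_local = to_local(event_utc, timedelta(hours=5, minutes=30))

print(event_utc.isoformat())    # 2023-02-20T23:30:00+00:00
print(event_local.isoformat())  # 2023-02-21T05:00:00+05:30
```

Both values describe the same moment, but the calendar date differs, so an event can appear under different days depending on which page you view it from.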
Kerberos can only be disabled manually from the unravel.yaml file.
kerberos:
  enabled: False
WorkloadFit report
A large number of tags can cause the Workload Fit report to fail. (PG-265, CUSTOMER-2084)
WorkloadFit report > Heatmap: The job count has data, but Vcore and memory are empty. (MIG-262)
Cluster discovery
The On-prem Cluster Identity might show an incorrect Spark version on CDH. The report may incorrectly show Spark 1 when Spark 2 is installed on the CDH cluster. (REPORT-1702)
TopX report
The TopX report email links to the Unravel TopX report instead of showing the report content in the email as in the old reports.
Queue analysis:
The log file name (unravel_us_1.log) displayed in the error message is incorrect. The correct name of the log file is unravel_sensor.log. (REPORT-1663)
The Cloud Mapping Per Host report scheduled in v4.6.1.x does not work in v4.7.1.0. Users must schedule a new report. (REPORT-1886)
When using PostgreSQL, the percentage (%) sign is duplicated and displayed in the Workload Fit report > Map to single cluster tab. (MIG-42)
If the Spark job is not running for Databricks, the values for the Duration and End time fields are not updated on the Databricks Run Details page. (ASP-1616)
You can see a lag for SQL Streaming applications. (PLATFORM-2764)
On the Spark application details page, the timeline histogram is not generated correctly. (UX-632)
If the customer uses an active directory for Kerberos and the samAccountName and principal do not match, this can cause errors when accessing HDFS. (DOC-755)
For PySpark applications, the processCPUTime and processCPULoad values are not captured properly. (ASP-626)
The SQL events generator generates a SQL Like clause event even when the like pattern appears only inside a literal in the query. (TEZLLAP-349)
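The kind of false positive described in TEZLLAP-349 can be reproduced with a small sketch. This is not Unravel's event generator: both functions are hypothetical, and the literal handling assumes single-quoted SQL strings with `''` as the escaped quote. The first check matches the word anywhere in the query text; the second blanks out string literals before checking.

```python
import re

LIKE_WORD = re.compile(r"\blike\b", re.IGNORECASE)
# A single-quoted SQL literal, with '' treated as an escaped quote.
STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def naive_has_like(sql):
    """Flag the query if the word LIKE appears anywhere, even inside a literal."""
    return LIKE_WORD.search(sql) is not None


def has_like_outside_literals(sql):
    """Flag the query only if LIKE appears outside string literals."""
    without_literals = STRING_LITERAL.sub("''", sql)
    return LIKE_WORD.search(without_literals) is not None


literal_only = "SELECT * FROM t WHERE note = 'looks like rain'"
real_clause = "SELECT * FROM t WHERE name LIKE 'a%'"

print(naive_has_like(literal_only))             # True
print(has_like_outside_literals(literal_only))  # False
print(has_like_outside_literals(real_clause))   # True
```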
After upgrading from v4.7.1.1 to v4.7.5.0, the Hive jobs running with the Tez application as an execution engine are not linked. (EMR-406)
After upgrading to v4.7.1.0, Notebooks do not work. You can configure them separately. (REPORT-1895)
After upgrading from v4.6.x to v4.7.1.0, the Tez application details page does not initially show DAG data. The DAG data is visible only after you refresh the page. (ASP-1126)
On the App summary page for Impala, the Query > Operator view is visible after scrolling down. (UIX-3536)
Jobs are falsely labeled as a Tez App for Oozie Sqoop and Shell actions. (PLATFORM-2403)