Multi-cluster configurations

The multi-cluster feature lets you manage multiple independent clusters from a single Unravel installation. You can add or remove clusters dynamically.

Unravel can manage one default cluster together with either multiple on-prem clusters or multiple cloud clusters. It does not support managing a combination of on-prem and cloud clusters.

Note

Unravel multi-cluster support is available only for fresh installs.

A multi-cluster deployment consists of installing Unravel on the Core node and the Edge node. The following image depicts the basic layout of a multi-cluster deployment.

Configuring components accessed from the Core node

The following components are accessed from the Core node, where Unravel is installed.

Cloudera Manager (CM)
Ambari
Hive Metastore
Kafka
Pipeline (Workflows)

The following properties must be configured on the Core node in a multi-cluster setup.

Cloudera Manager - Multi-cluster

Property/Description	Set by user	Unit	Default
com.unraveldata.cloudera.manager.list Comma-delimited list of variables to designate one or more CM instances.	Required	CSL	-
com.unraveldata.cloudera.manager.X.url Cloudera Manager URL for variable `X`.	Required	URL	-
com.unraveldata.cloudera.manager.X.username Cloudera Manager user name for variable `X`.	Required	string	-
com.unraveldata.cloudera.manager.X.password Cloudera Manager password for variable `X`.	Required	string	-
com.unraveldata.cloudera.manager.X.unravel.cluster.ids Cloudera Manager UUID and display name.	Optional	CSL	-
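
As an illustration of how the variable `X` is substituted, the following sketch configures two CM instances. The labels `prod` and `stage`, the host names, and the credential placeholders are assumptions made for this example and are not values from an actual deployment. Port 7180 follows the CM API example shown elsewhere on this page.

```properties
# Hypothetical example: two Cloudera Manager instances labeled "prod" and "stage".
# Host names and credentials are placeholders; replace them with your own values.
com.unraveldata.cloudera.manager.list=prod,stage
com.unraveldata.cloudera.manager.prod.url=http://cm-prod.example.com:7180
com.unraveldata.cloudera.manager.prod.username=<CM_USER>
com.unraveldata.cloudera.manager.prod.password=<CM_PASSWORD>
com.unraveldata.cloudera.manager.stage.url=http://cm-stage.example.com:7180
com.unraveldata.cloudera.manager.stage.username=<CM_USER>
com.unraveldata.cloudera.manager.stage.password=<CM_PASSWORD>
```

Each label in the list needs its own url, username, and password entries.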

Ambari

Property/Description	Set by user	Unit	Default
com.unraveldata.ambari.manager.list Comma-delimited list of variables to designate one or more Ambari instances. This is used for configuration purposes only and does not need to correspond to Ambari hostnames or cluster ID/name.	Required	CSL	-
com.unraveldata.ambari.X.url Ambari URL for variable `X`.	Required	URL	-
com.unraveldata.ambari.X.username Ambari user name for variable `X`.	Required	string	-
com.unraveldata.ambari.X.password Ambari password for variable `X`.	Required	string	-

Examples

com.unraveldata.ambari.manager.list=prod,stage
com.unraveldata.ambari.manager.prod.url=https://unravel-prod:8443
com.unraveldata.ambari.manager.prod.username=admin
com.unraveldata.ambari.manager.prod.password=ENC(a3b20c0da)
com.unraveldata.ambari.manager.stage.url=http://unravel-stage:8080
com.unraveldata.ambari.manager.stage.username=admin
com.unraveldata.ambari.manager.stage.password=ENC(92f48a8cf)

Hive Metastore

Property/Description	Set by user	Unit	Default
com.unraveldata.hive.metastore.list Comma-delimited list of variables to designate one or more Hive metastore instances.	Required	CSL	-
com.unraveldata.cloudera.manager.X.unravel.cluster.ids Replace `X` in the property with one of the metastore variables listed in com.unraveldata.hive.metastore.list. The value is a comma-delimited list of Unravel cluster IDs and, in the case of Impala, the Cloudera Manager (CM) cluster IDs. Refer to Obtaining CM cluster IDs.	Required	CSL	-
hive.metastore.X.cluster.ids Replace `X` in the property with one of the metastore variables listed in com.unraveldata.hive.metastore.list. The value is a comma-delimited list of Unravel cluster IDs.	Required	CSL	-
javax.jdo.option.X.ConnectionURL Hive metastore URL for variable `X` listed in com.unraveldata.hive.metastore.list.	Required	URL	-
javax.jdo.option.X.ConnectionDriverName Hive metastore JDBC driver name for variable `X` listed in com.unraveldata.hive.metastore.list.	Required	string	-
javax.jdo.option.X.ConnectionUserName Hive metastore JDBC user name for variable `X` listed in com.unraveldata.hive.metastore.list.	Required	string	-
javax.jdo.option.X.ConnectionPassword Hive metastore JDBC password for variable `X` listed in com.unraveldata.hive.metastore.list.	Required	string	-
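
The table above can be read as the following sketch for a single metastore. The label `prod`, the MySQL-backed metastore, the host name, the database name, and all placeholder values are assumptions chosen for illustration only. Use the JDBC URL and driver that match your own metastore database.

```properties
# Hypothetical example: one Hive metastore labeled "prod", assumed to be backed by MySQL.
com.unraveldata.hive.metastore.list=prod
# Cluster ID values are placeholders; use the IDs of your own clusters.
com.unraveldata.cloudera.manager.prod.unravel.cluster.ids=<UNRAVEL_CLUSTER_ID>,<CM_CLUSTER_UUID>
hive.metastore.prod.cluster.ids=<UNRAVEL_CLUSTER_ID>
# JDBC connection details for the metastore database (placeholder host and credentials).
javax.jdo.option.prod.ConnectionURL=jdbc:mysql://metastore-db.example.com:3306/hive
javax.jdo.option.prod.ConnectionDriverName=com.mysql.jdbc.Driver
javax.jdo.option.prod.ConnectionUserName=<METASTORE_USER>
javax.jdo.option.prod.ConnectionPassword=<METASTORE_PASSWORD>
```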

You can obtain the Cloudera Manager cluster IDs by running the following curl command on a node that has access to CM:

curl -u <CM_USER>:<CM_PASSWORD> <CM_HOST>:<PORT>/api/<API_VERSION>/clusters

The command returns a list of clusters with their properties. The uuid field denotes the CM cluster ID. For example:

> curl -u admin:admin "http://123.45.6.789:7180/api/v17/clusters"

{
  "items" : [ {
    "name" : "cluster",
    "displayName" : "Sample Cluster",
    "version" : "CDH5",
    "fullVersion" : "5.12.2",
    "maintenanceMode" : false,
    "maintenanceOwners" : [ ],
    "clusterUrl" : "http://xyz.unraveldata.com:7180/cmf/clusterRedirect/cluster",
    "hostsUrl" : "http://xyz.unraveldata.com:7180/cmf/clusterRedirect/cluster/hosts",
    "entityStatus" : "CONCERNING_HEALTH",
    "uuid" : "94562003-ery8-4418-914c-f5738fc1133d"
  } ]
}

Kafka

Property/Description	Set by user	Unit	Default
com.unraveldata.ext.kafka.clusters.list Comma-delimited list of variables to designate one or more Kafka instances.	Required	CSL	-
com.unraveldata.ext.kafka.X.bootstrap_servers Comma-separated list of Kafka brokers for variable `X`.	Required	CSL	-
com.unraveldata.ext.kafka.<server>.jmx_servers Comma-separated list of JMX servers.	Required	CSL	-
com.unraveldata.ext.kafka.<server>.jmx.kafka1.host Host name of the JMX server.	Required	CSL	-
com.unraveldata.ext.kafka.<server>.jmx.kafka1.port Port of the JMX server.	Required	CSL	-
com.unraveldata.ext.kafka.<server>.jmx.kafka1.run_period Collection interval for JMX metrics.	Required	string	-
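
A possible reading of the Kafka properties is sketched below for one Kafka cluster. The label `kafka_prod`, the broker host names, and the port numbers are hypothetical. The sketch also assumes that the `kafka1` segment in the JMX property names corresponds to an entry in the jmx_servers list, and it leaves run_period as a placeholder because the value format is not stated here.

```properties
# Hypothetical example: one Kafka cluster labeled "kafka_prod" with one JMX server "kafka1".
com.unraveldata.ext.kafka.clusters.list=kafka_prod
# Broker host names and ports are placeholders.
com.unraveldata.ext.kafka.kafka_prod.bootstrap_servers=broker1.example.com:9092,broker2.example.com:9092
# Assumption: each name listed here is reused in the jmx.<name>.* properties below.
com.unraveldata.ext.kafka.kafka_prod.jmx_servers=kafka1
com.unraveldata.ext.kafka.kafka_prod.jmx.kafka1.host=broker1.example.com
com.unraveldata.ext.kafka.kafka_prod.jmx.kafka1.port=9999
# The value format for the collection interval is not specified in this section.
com.unraveldata.ext.kafka.kafka_prod.jmx.kafka1.run_period=<RUN_PERIOD>
```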

Pipelines

Property/Description	Unit	Default
com.unraveldata.workflow.num.past.instances The number of past workflow instances to use for the compare operation.	count	5
com.unraveldata.workflow.num.future.instances The number of future workflow instances to use for the compare operation.	count	5
com.unraveldata.analytics.max.past.samples The number of workflows to use during an SLA analysis.	count	5
com.unravel.workflows.compareupdate.disable In a multi-cluster setup, this property must be set to true on the Core node.	boolean	-
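
As a minimal sketch, the pipeline settings on the Core node could look like the following. The three count values simply restate the defaults from the table, so only the last line differs from a default configuration.

```properties
# Counts restate the documented defaults; adjust them only if needed.
com.unraveldata.workflow.num.past.instances=5
com.unraveldata.workflow.num.future.instances=5
com.unraveldata.analytics.max.past.samples=5
# Must be true on the Core node in a multi-cluster setup.
com.unravel.workflows.compareupdate.disable=true
```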

Configuring components accessed from the Edge node

In a multi-cluster deployment for on-prem platforms, add the following properties to the Edge node server so that Unravel can load the job history (jhist) files and logs of MR jobs. The properties set the HDFS paths for the jhist/conf files and the YARN logs:

Property/Description	Default
com.unraveldata.min.job.duration.for.attempt.log Minimum duration of a successful application for which executor logs are processed (in milliseconds).	600000 (10 mins)
com.unraveldata.job.collector.log.aggregation.base HDFS path to the aggregated container logs (logs to process). Do not include the hdfs:// prefix. The log format defaults to TFile. You can specify multiple logs and log formats (TFile or IndexedFormat). Example: com.unraveldata.job.collector.log.aggregation.base=TFile:/tmp/logs/*/logs/,IndexedFormat:/tmp/logs/*/logs-ifile/	/tmp/logs/*/logs/
com.unraveldata.job.collector.done.log.base HDFS path to the done directory of MR logs as per the cluster configuration. Do not include the hdfs:// prefix. Example: com.unraveldata.job.collector.done.log.base=/mr-history/done	/user/history/done
com.unraveldata.spark.eventlog.location All the possible locations of the event log files. Multiple locations are supported as a comma-separated list of values. This property is used only when the Unravel sensor is not enabled. When the sensor is enabled, the event log path is taken from the application configuration at runtime.	hdfs:///user/spark/applicationHistory/
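
Put together, an Edge node configuration might resemble the sketch below. The values are the defaults listed in the table, except for the done directory, which uses the /mr-history/done path given as an example in its description. Your cluster's paths may differ.

```properties
# Process executor logs only for successful applications that ran at least 10 minutes (in ms).
com.unraveldata.min.job.duration.for.attempt.log=600000
# Aggregated container logs in TFile format; no hdfs:// prefix.
com.unraveldata.job.collector.log.aggregation.base=TFile:/tmp/logs/*/logs/
# Done directory of MR logs; no hdfs:// prefix.
com.unraveldata.job.collector.done.log.base=/mr-history/done
# Event log location, used only when the Unravel sensor is not enabled.
com.unraveldata.spark.eventlog.location=hdfs:///user/spark/applicationHistory/
```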

For Tez, the following properties must be added to the Edge node server in a multi-cluster deployment for on-prem platforms.

Property/Description	Set by user	Unit	Default
com.unraveldata.yarn.timeline-service.webapp.address The HTTP address of the Timeline service web application.	Optional	string (URL)	-
com.unraveldata.yarn.timeline-service.port Timeline service port.		number	8188
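
A sketch of the Tez-related settings follows. The host name is hypothetical, and the example assumes the address is given as a URL without the port, with the port supplied separately. Verify the expected form against your Timeline service configuration.

```properties
# Hypothetical Timeline service host; replace it with your own.
com.unraveldata.yarn.timeline-service.webapp.address=http://timeline.example.com
# 8188 is the documented default port.
com.unraveldata.yarn.timeline-service.port=8188
```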

Note

In a multi-cluster environment, you must add these properties to the Edge node.