Cluster Wide Report
This topic explains how to set up and run Unravel's Cluster Wide Report.
Cluster Wide Report is a Java application that helps you fine-tune cluster-wide parameters so that the cluster runs its typical workload as efficiently as possible.
Cluster Wide Report:
Collects performance data of prior completed jobs.
Analyzes the jobs relative to the cluster's current configuration.
Generates recommended cluster parameter changes.
Predicts and quantifies the impact the changes will have on future runs of the jobs.
Most of the recommendations revolve around the following parameters:
MapSplitSizeParams
HiveExecReducersBytesParam
HiveExecParallelParam
MapReduceSlowStartParam
MapReduceMemoryParams
You can choose to implement some or all of the recommended settings.
Step-by-step guide
Download the report tarball from the Unravel preview site.
curl -v https://preview.unraveldata.com/img/ClusterReportSetup.tar.gz -o ClusterReportSetup.tar.gz
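Before unpacking, it can help to confirm that the download is a readable gzip tarball, since curl saves whatever the server returns (including an HTML error page) under the requested filename. The following is a minimal sketch; the helper name check_tarball is hypothetical and is not part of the Unravel package.

```shell
# check_tarball FILE: hypothetical helper, not shipped with Cluster Wide Report.
# Succeeds only if FILE exists and tar can list it as a gzip-compressed archive.
check_tarball() {
    archive="$1"
    if [ ! -f "$archive" ]; then
        echo "missing: $archive" >&2
        return 1
    fi
    # "tar tzf" lists the archive contents without extracting anything.
    if ! tar tzf "$archive" >/dev/null 2>&1; then
        echo "not a valid gzip tarball: $archive" >&2
        return 1
    fi
    echo "ok: $archive"
}

# Usage:
#   check_tarball ClusterReportSetup.tar.gz && tar zxvf ClusterReportSetup.tar.gz
```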
Unpack the tarball and run the setup script to install the app in /usr/local/unravel/install_bin.
tar zxvf ClusterReportSetup.tar.gz
cd ClusterReportSetup
sudo ./setup.sh /usr/local/unravel/install_bin
The app is now installed in /usr/local/unravel/install_bin/ClusterReport. cd to the installation directory.
ls
dbin/ etc/ origJars/ dlib/ logs/ origJars.tar.gz
cd to dbin and edit Input.txt.
cd /usr/local/unravel/install_bin/ClusterReport/dbin
vi Input.txt
Configure Input.txt for your cluster and report parameters.
cluster_id =
queue =
start_date=2018-01-01
end_date=2018-03-28
mapreduce.map.memory.mb=2048
mapreduce.reduce.memory.mb=2048
hive.exec.reducers.bytes.per.reducer=268435456
mapreduce.input.fileinputformat.split.maxsize=256000000
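Because several of these values are raw byte counts and the date range is typed by hand, a quick sanity check before running the report can catch typos. The sketch below assumes Input.txt is a plain key=value file as shown in this step; the helper names get_param, bytes_to_mib and check_dates are hypothetical and are not part of the report tooling.

```shell
# Hypothetical helpers for sanity-checking Input.txt; not shipped with the report.

# get_param FILE KEY: print the value of KEY, tolerating spaces around "=".
get_param() {
    awk -F= -v key="$2" '
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }
        k == key { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v; exit }
    ' "$1"
}

# bytes_to_mib N: convert a byte count to whole mebibytes (rounded down).
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# check_dates FILE: succeed if both dates are set and start_date is not after end_date.
# YYYY-MM-DD dates compare correctly as integers once the dashes are removed.
check_dates() {
    start=$(get_param "$1" start_date | sed 's/-//g')
    end=$(get_param "$1" end_date | sed 's/-//g')
    [ -n "$start" ] && [ -n "$end" ] && [ "$start" -le "$end" ]
}

# Usage:
#   check_dates Input.txt || echo "check start_date and end_date" >&2
#   bytes_to_mib "$(get_param Input.txt hive.exec.reducers.bytes.per.reducer)"
```

With the sample values above, 268435456 bytes is exactly 256 MiB, while 256000000 bytes is 256 decimal megabytes (about 244 MiB), so the two settings are close but not identical.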
Run the report.
su - hdfs
./cluster_report.sh
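Note that su - hdfs starts a login shell in the hdfs user's home directory, so the relative path ./cluster_report.sh resolves only after you cd back to dbin. One way to do both in a single command is sketched below; the helper name report_command is hypothetical, and the sketch assumes the hdfs user can read and execute files under the install directory.

```shell
# report_command DIR: hypothetical helper that prints the command line to hand to su.
# It changes into DIR first so that ./cluster_report.sh resolves regardless of
# the directory in which the login shell starts.
report_command() {
    printf "cd '%s' && ./cluster_report.sh\n" "$1"
}

# Usage (assumes the install path from the steps above; run as root or via sudo):
#   su - hdfs -c "$(report_command /usr/local/unravel/install_bin/ClusterReport/dbin)"
```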