Troubleshooting
This section provides information for troubleshooting and recovery.
Upgrading from 4.6.2x, the Precheck fails for Hadoop when you activate the 4.7x version
Issue:
When you upgrade a multi-cluster environment from Unravel version 4.6.2x and activate version 4.7x, the Precheck fails with the following Hadoop error:
Solution:
The failure is caused by the com.unraveldata.multicluster.default_cluster.enabled property, which indicates whether the core node directly monitors the Hadoop cluster. By default, this property is set to true in Unravel 4.6.2x.
If you are not using the core node for Hadoop monitoring, you must manually set this property to false before you upgrade a multi-cluster environment from Unravel version 4.6.2x to 4.7x. Doing so eliminates the Hadoop error in Precheck.
Before you upgrade to v4.7x, do the following:
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Set the com.unraveldata.multicluster.default_cluster.enabled property to false.
<Unravel installation directory>/unravel/manager config properties set com.unraveldata.multicluster.default_cluster.enabled false
Apply the changes.
<Unravel installation directory>/unravel/manager refresh files
Start Unravel.
<Unravel installation directory>/unravel/manager start
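The four steps above can be wrapped in a small function so that they run in order and stop at the first failure. This is a minimal sketch, not part of the Unravel distribution: the function name disable_default_cluster is hypothetical, and its single argument stands for <Unravel installation directory>.

```shell
#!/bin/sh
# Sketch: disable default-cluster monitoring before upgrading to v4.7x.
# Assumption: $1 is the <Unravel installation directory>, so the manager
# is found at $1/unravel/manager, as in the steps above.
disable_default_cluster() {
    manager="$1/unravel/manager"
    prop="com.unraveldata.multicluster.default_cluster.enabled"

    # Each step runs only if the previous one succeeded.
    "$manager" stop &&
    "$manager" config properties set "$prop" false &&
    "$manager" refresh files &&
    "$manager" start
}

# Example call (uncomment and adjust the path for your installation):
# disable_default_cluster /opt
```

Chaining the commands with && means a failed stop or a failed property change is not followed by a start, so the environment is left in a state you can inspect.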
Diagnosing issues from log files
Whenever you face any issues during installation, you should first check the following log files to diagnose the issues:
The installation process is broken
Issue:
The installation process gets broken.
Solution:
Whenever the installation process gets broken, do the following:
Stop Unravel.
manager stop
If the manager does not work, open the services directory. Each service has a stop.sh script. Stop the service monitor (monit) first, and then run the stop.sh script for each service. If you do not have the stop.sh scripts, send SIGTERM to all the services, starting with the service monitor (monit).
Caution
Avoid using SIGKILL since that may cause some file corruption.
Reinstall Unravel using the content in the data directory.
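For the SIGTERM fallback described above, the following sketch sends SIGTERM to one process and waits for it to exit, without ever escalating to SIGKILL. The function name term_and_wait and its timeout parameter are hypothetical. How you find the PIDs of monit and of the individual services is not covered here and depends on your installation.

```shell
#!/bin/sh
# Sketch: stop one process with SIGTERM and wait for it to exit.
# Never escalates to SIGKILL, which may cause file corruption.
# $1 = PID to stop
# $2 = seconds to wait (illustrative parameter, defaults to 30)
term_and_wait() {
    pid="$1"
    timeout="${2:-30}"

    # kill -0 sends no signal; it only tests whether the process exists.
    kill -0 "$pid" 2>/dev/null || return 0

    kill -TERM "$pid" 2>/dev/null
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done

    echo "Process $pid is still running after SIGTERM" >&2
    return 1
}
```

Call the function for the monit PID first so that the service monitor cannot restart the services, and then for each remaining service PID.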
Files got deleted or corrupted
Issue:
The files got deleted or corrupted.
Solution:
Stop Unravel.
Assuming that you have installed Unravel in /opt, run the following command:
/opt/unravel/manager refresh files
This regenerates all the scripts and configuration files.
If the refresh command did not regenerate the files, or the manager is broken, check <Unravel installation directory>/data/conf/current.yaml, which shows the version that is currently installed. Then run the following, replacing X.Y.Z with that version:
<Unravel installation directory>/versions/X.Y.Z/setup --config=<Unravel installation directory>/data/conf/unravel.yaml
Start Unravel.
<Unravel installation directory>/unravel/manager start
Unravel software got deleted
Issue:
Unravel software got deleted.
Solution:
Stop Unravel.
Check <Unravel installation directory>/data/conf/current.yaml for the current version that is installed.
Unpack that same version in the exact location where it was deployed earlier.
tar zxf unravel-SAME-VERSION.tar.gz -C /opt
Run the following:
<Unravel installation directory>/versions/X.Y.Z/setup --config=<Unravel installation directory>/unravel/data/conf/unravel.yaml
Start the manager.
<Unravel installation directory>/unravel/manager start
Restoring Unravel from a backup
Issue:
How to restore Unravel from a backup?
Solution:
Stop Unravel.
Restore the backup of the data directory.
Open data/conf/unravel.effective.yaml and check for the following key paths:
base: <Unravel installation directory>
data: <Unravel installation directory>/data
Make sure that the data directory is restored to the right location.
Make sure the unravel user has full access and ownership of the base location and everything in it.
Check <Unravel installation directory>/data/conf/current.yaml for the current version that is installed.
Unpack that same version in the exact location where it was deployed earlier.
tar zxf unravel-SAME-VERSION.tar.gz -C /opt
Run the following:
<Unravel installation directory>/versions/X.Y.Z/setup --config=<Unravel installation directory>/data/conf/unravel.yaml
Start Unravel.
<Unravel installation directory>/manager start
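To verify the ownership requirement from the restore steps above, a check along these lines may help. It is a minimal sketch: the function name list_not_owned_by and the BASE_DIR variable are hypothetical, and the check covers ownership only, not access permissions.

```shell
#!/bin/sh
# Sketch: list every path under a directory that is NOT owned by the
# expected user. An empty result means ownership is consistent.
# $1 = directory to check (for example, the base location)
# $2 = expected owner, as a user name or numeric UID (for example, unravel)
list_not_owned_by() {
    find "$1" ! -user "$2" -print
}

# Example (BASE_DIR is a placeholder for the value of the base: key
# in unravel.effective.yaml):
# list_not_owned_by "$BASE_DIR" unravel
```

If the function prints any paths, correct their ownership before you start Unravel.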
Troubleshooting Cloudera Distribution of Apache Hadoop (CDH) issues
Symptom | Problem | Remedy |
---|---|---|
|
| Install Unravel RPM on Unravel host. or Verify that user |
| Unravel hive hook JAR was not found in in | Confirm that the or Put the Unravel hive-hook JAR corresponding to cd /usr/local/unravel/hive-hook/; cp unravel-hive- |
Oozie shell action fails with ClassNotFoundException on Hcat call after Unravel Hive Hooks were added to the cluster
HCatalog is part of Apache Hive. In this case, the Hive Hook configuration is found, but the libraries that execute Hive Hook are missing.
Because this is a shell action, the libraries must exist locally on every node so that the Sqoop command can locate them during execution. You can add the Unravel Hive Hook JAR to /var/lib/sqoop or to wherever the hive-hcatalog JARs are located in the cluster.
Unravel stop and start fails with an error
Issue:
When Unravel is stopped and restarted immediately, the following error is displayed:
[Errno 1] Operation not permitted
[Errno 1] Operation not permitted
INS00160: Process '3366' is not owned by unravel
INS00161: Process '3366' is not owned by unravel, this can come from a stale pid file '/opt/unravel/run/mysql.pid'
Solution:
After an ungraceful shutdown, the PID files remain in place. If one of the recorded PIDs is later reused by another process, it can cause this error. Ensure that Unravel is stopped (it will be if the server was just restarted), and then delete the PID files in /opt/unravel/run.
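The cleanup can be sketched as follows. This is an illustration under stated assumptions, not an Unravel tool: the function names are hypothetical, the PID files are assumed to be named *.pid and to contain a single PID (as mysql.pid in the error suggests), and the ownership check relies on /proc, so it works on Linux only. As an extra precaution beyond the instructions above, the sketch keeps any PID file whose process is still running under the expected UID, because that would indicate Unravel has not fully stopped.

```shell
#!/bin/sh
# Sketch (Linux only, relies on /proc): remove stale PID files.
# Run this only after Unravel has been stopped.

# Succeeds if PID $1 is a running process owned by numeric UID $2.
is_live_and_owned() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    [ -d "/proc/$1" ] && [ "$(stat -c %u "/proc/$1" 2>/dev/null)" = "$2" ]
}

# $1 = run directory (for example, /opt/unravel/run)
# $2 = numeric UID of the unravel user (for example, "$(id -u unravel)")
remove_stale_pid_files() {
    run_dir="$1"
    expected_uid="$2"

    for pid_file in "$run_dir"/*.pid; do
        [ -f "$pid_file" ] || continue   # glob matched nothing
        pid=$(tr -d '[:space:]' < "$pid_file")

        if is_live_and_owned "$pid" "$expected_uid"; then
            echo "keeping $pid_file: process $pid runs as UID $expected_uid"
        else
            echo "removing stale $pid_file"
            rm -f "$pid_file"
        fi
    done
}

# Example:
# remove_stale_pid_files /opt/unravel/run "$(id -u unravel)"
```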
Amazon EMR: Unravel sensor properties are overwritten when a configuration is supplied for an Instance group on a running cluster
Issue:
When you supply a configuration for an Instance group in a running cluster, the Unravel sensor properties added by the bootstrap script get overwritten.
Solution:
Add the Unravel sensor properties along with the new configurations that you supply for the Instance group.