The report provides a detailed analysis of HDFS file usage.
Click Generate Reports > New.
In the New Report dialog box, enter the following details.
Click OK. The generated reports are listed under Reports on the App UI.
Select the generated report and then click Run. After the report has run successfully, the run details are listed in the Run box on the right.
Click the following:
HTML files link to view the report details.
Input parameters link to view the parameters you chose to run the report.
Log file link to view the logs of the report.
The HDFS Usage Analysis report displays the following sections:
This section provides the total HDFS file size.

This section provides the breakdown of hot, cold, and warm files based on their corresponding accesses.
Hot: Files that are accessed often are considered hot.
Warm: Files that have not been accessed for a while are considered warm.
Cold: Files that are no longer used, or that need to be archived, are considered cold.
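The hot, warm, and cold categories above can be illustrated with a minimal sketch. It assumes that classification depends only on the time since a file was last accessed. The thresholds (30 and 90 days) and the function name classify_file are hypothetical and are not taken from the report, whose actual cut-offs are not stated here.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration only; the report's real
# cut-offs between hot, warm, and cold are not documented here.
HOT_MAX_AGE = timedelta(days=30)
WARM_MAX_AGE = timedelta(days=90)


def classify_file(last_access: datetime, now: datetime) -> str:
    """Label a file hot, warm, or cold from the time since its last access."""
    age = now - last_access
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"


now = datetime(2024, 1, 1)
recent = classify_file(datetime(2023, 12, 25), now)   # accessed 7 days ago
stale = classify_file(datetime(2023, 11, 1), now)     # accessed 61 days ago
unused = classify_file(datetime(2023, 1, 1), now)     # accessed 365 days ago
```

With these assumed thresholds, a file accessed within the last 30 days counts as hot, one accessed within 90 days counts as warm, and anything older counts as cold.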


You can do the following in this section:
View the pie chart and table details, which analyze the type of usage (hot, warm, and cold) of the HDFS files.
In the Search box of the table, enter part or all of the file path to find the usage details of a specific HDFS file in a cluster.
Click the Filter Columns button and select the columns that you want to be listed in the Usage analysis table.
Click the Download CSV button to export the HDFS file usage analysis details in CSV format.
This section provides the details of the top K storage-intensive users. The details are also plotted on a bar graph.


You can do the following in this section:
View the bar chart, which plots the top storage-intensive users, and the table with details of each user and their corresponding usage of the HDFS files.
In the Search box of the table, enter part or all of the username to find the usage details of a specific user.
Click the Filter Columns button and select the columns that you want to be listed in the TopK table.
Click the Download CSV button to export the TopK details of file usage by user in CSV format.
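As a rough illustration of what a "top K storage-intensive users" ranking means, the sketch below sums file sizes per owner and keeps the K largest totals. The function name top_k_users, the (owner, size) input shape, and the sample data are assumptions made for this example; they do not describe how the report itself computes the ranking.

```python
from collections import Counter


def top_k_users(files, k):
    """Return the k owners with the largest total file size, largest first.

    files is an iterable of (owner, size_in_bytes) pairs (hypothetical shape).
    """
    usage = Counter()
    for owner, size in files:
        usage[owner] += size
    return usage.most_common(k)


# Made-up sample data for illustration only.
sample = [
    ("alice", 500),
    ("bob", 200),
    ("alice", 300),
    ("carol", 900),
    ("bob", 50),
]
top2 = top_k_users(sample, 2)
```

Here alice owns 800 bytes in total, bob 250, and carol 900, so the two most storage-intensive users are carol and alice, in that order.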
This section provides the details of the top K storage-intensive files. The details are also plotted on a bar graph.


You can do the following in this section:
View the bar chart, which plots the top storage-intensive files, and the table with details of the HDFS files.
In the Search box of the table, enter part or all of the file path to find the file size of a specific file.
Click the Filter Columns button and select the columns that you want to be listed in the TopK table.
Click the Download CSV button to export the TopK details of the files in CSV format.
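The per-file ranking can be sketched in a similar way. The example below selects the K largest files from a list of (path, size) pairs without sorting the whole list. The function name top_k_files, the paths, and the sizes are hypothetical and serve only to show the idea.

```python
import heapq


def top_k_files(files, k):
    """Return the k largest (path, size_in_bytes) pairs, largest first."""
    return heapq.nlargest(k, files, key=lambda item: item[1])


# Made-up sample data for illustration only.
sample_files = [
    ("/data/logs/a.log", 120),
    ("/data/raw/b.parquet", 4096),
    ("/tmp/c.tmp", 8),
    ("/data/raw/d.parquet", 2048),
]
largest = top_k_files(sample_files, 2)
```

Using heapq.nlargest avoids a full sort, which matters when K is small relative to the number of files.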