Install Unravel for GCP BigQuery - VM Authentication method
Unravel can be installed on a GCP instance and configured to monitor jobs, datasets, tables, and data with a service account located outside the Google Cloud environment. This section outlines the steps for installing Unravel for GCP BigQuery using the VM authentication method.
Follow the instructions to install and set up Unravel to receive BigQuery data.
Installing Unravel on the GCP BigQuery VM instance
Do the following to set up Unravel on the GCP VM.
From your GCP console, go to the Compute Engine dashboard and click Create Instance.
Select the following options based on Unravel's instance requirements:
Base OS
Instance type and size
Ports
Networking
The instance must be HTTPS and publicly accessible.
Firewall rules or policies
Sample inbound rule
Type              Protocol   Port range   Source
All traffic       All        All          For example, 10.10.0.0/16
SSH               TCP        22           0.0.0.0/0 or trusted public IP for SSH access
Custom TCP Rule   TCP        3000
Custom TCP Rule   TCP        4043
TLS               TCP        4443
Sample outbound rule
Type              Protocol   Port range   Source
All traffic       All        All          0.0.0.0/0
Note
The GCP VM should have all TCP access to the BigQuery cluster (server/parent or worker) nodes. You can grant access by adding firewall rules for the BigQuery server/parent and worker nodes that allow all TCP traffic on all port ranges.
While creating the GCP VM, add the firewall properties: enable HTTP and HTTPS traffic, then go to the Network tab and add network tags. (This is the firewall rule that is already created.)
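The inbound TCP ports from the sample rule can also be opened from the command line instead of the console. The following is a minimal sketch that only prints a `gcloud compute firewall-rules create` command for review. The function name and the rule name, network, source range, and network tag passed to it are placeholder assumptions, not values supplied by Unravel.

```shell
# Sketch: print a gcloud command that opens the sample inbound TCP ports
# (3000, 4043, 4443). All argument values are hypothetical placeholders.
firewall_cmd() {
  # $1: rule name, $2: network, $3: source range, $4: network tag
  echo "gcloud compute firewall-rules create $1" \
    "--network=$2 --direction=INGRESS" \
    "--allow=tcp:3000,tcp:4043,tcp:4443" \
    "--source-ranges=$3 --target-tags=$4"
}

# Print the command so that it can be reviewed before it is run.
firewall_cmd unravel-inbound default 10.10.0.0/16 unravel-vm
```

The SSH rule (port 22) is left out of this sketch on purpose, because its source range usually differs from that of the application ports.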
Disable SELinux.
sudo setenforce Permissive
Edit /etc/selinux/config and ensure that SELINUX=permissive is set, so that the setting persists after reboot.
sudo vi /etc/selinux/config
Install libaio.x86_64, lzop.x86_64, and ntp.x86_64.
sudo yum install -y libaio.x86_64
sudo yum install -y lzop.x86_64
sudo yum install -y ntp.x86_64
Start ntpd and check the system time.
sudo service ntpd start
sudo ntpq -p
Create a new Unravel user named unravel.
sudo useradd unravel
Download Unravel onto the VM instance that you have created.
Deploy Unravel on the GCP instance that you have created.
You can also manually install Unravel; refer to Run setup.
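The SELinux step in the procedure above edits /etc/selinux/config by hand. For scripted installs, the same change can be made non-interactively. The following is a minimal sketch, assuming the file already contains a line that starts with SELINUX=. The function name is illustrative.

```shell
# Sketch: set SELINUX=permissive in an SELinux config file without an editor.
set_selinux_permissive() {
  # $1: path to the config file (defaults to /etc/selinux/config)
  cfg="${1:-/etc/selinux/config}"
  # Rewrite the SELINUX= line; SELINUXTYPE= lines do not match and are kept.
  # A backup copy is written next to the file with a .bak suffix.
  sed -i.bak 's/^SELINUX=.*/SELINUX=permissive/' "$cfg"
  # Show the resulting setting for confirmation.
  grep '^SELINUX=' "$cfg"
}
```

Run the function from a root shell, because the default file is writable only by root. It prints the resulting SELINUX= line.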
Note
The HTTPS load balancer for Unravel endpoint must be configured only when using the Push model.
The Unravel LR endpoint must be available over a publicly accessible HTTPS endpoint to receive messages from BigQuery Pub/Sub. A load balancer is an easier and more secure way to push log messages from Google Cloud Platform (GCP) to Unravel. Use the following instructions to configure an HTTPS load balancer for Unravel with a public endpoint and SSL termination.
You must have the following information handy before you configure the Load Balancer:
Region and Zone where the Unravel VM is running.
Network and Subnet-network where the Unravel VM is running.
A valid SSL certificate in GCP.
Do the following to create a load balancer:
Create an instance group. Refer to Create a managed instance group for detailed instructions.
In the New unmanaged instance group page, keep the following items the same as those of the Unravel VM:
Location > Region
Location > Zone
Network and Instances > Network
Network and Instances > SubNetwork
Under Port Mapping, enter the following:
Port Name: http4043
Port Number: 4043
Set up an HTTPS Load Balancer. Refer to Set up an HTTPS Load Balancer for detailed instructions. Ensure to do the following:
Under Name, update the name as unravel-loadbalancer.
In Backends > New Backend > Instance groups, select the Unravel instance group that you created in Step 1.
Under Health check, do the following:
Select Create a health check, and then add the name as unravel-4043-hc
Update the Protocol as HTTP and Port as 4043.
Update the Request Path as /lr/status.
Ensure that Port is set to 443 to allow HTTPS traffic.
After the load balancer is created, find its public IP address, which is listed under the Frontend section of the load balancer. Map a valid DNS name to this IP address.
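Before relying on the load balancer health check, it can help to probe the same port and request path directly on the Unravel VM. The following is a minimal sketch using curl. The function names and the host argument are illustrative, and it assumes that a healthy LR endpoint answers /lr/status with HTTP 200, which is the response a GCP HTTP health check looks for.

```shell
# Sketch: probe the port and path used by the load balancer health check.
lr_status_code() {
  # $1: host name or IP address of the Unravel VM (placeholder)
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "http://$1:4043/lr/status"
}

lr_is_healthy() {
  # Succeeds only when the endpoint answers with HTTP 200.
  [ "$(lr_status_code "$1")" = "200" ]
}
```

For example, `lr_is_healthy 10.10.0.5 && echo healthy` prints "healthy" only when the probe returns 200. The address shown is a placeholder.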
Setting Unravel to receive BigQuery data
Unravel can be set up to automatically create and configure resources in more than 100 projects at a time. You can add projects either with customer-supplied credentials or with Unravel-generated credentials for Unravel monitoring. These can be single projects or multiple projects.
To integrate the BigQuery projects in Unravel, you must create a service account in only the project in which Unravel is hosted. This service account must be attached to the Unravel VM. Custom roles are created for the monitoring and administrator projects with required permissions. IAM binding of the role is done with the principal, which is the service account, in each of the projects.
Unravel ships the following resources, which are required to automatically set up Unravel to receive BigQuery data.
Terraform
This is open-source software for infrastructure provisioning. Terraform creates resources on the GCP account and facilitates the smooth integration of Unravel with your cloud platform.
You can either use the Terraform that is bundled with the Unravel installer, or edit and use the external Terraform, which is provided separately from the installer.
gcloud CLI
Set of tools to create and manage Google Cloud resources.
BigQuery projects can be set up with either of the following credential types. Make sure to complete the prerequisites before you set up BigQuery projects.
Customer-supplied credentials: For the customer-supplied credentials, the customer must create and provide all the resources. You can either do this manually from the GCP console, or you can use the external Terraform.
Unravel-generated credentials: For projects added with Unravel-generated credentials, all the resources are created and handled by Unravel.
You can configure the BigQuery projects using one of the following methods:
Manually from the GCP console. Refer to Add BigQuery projects manually from the GCP console.
Using external Terraform. Refer to Add BigQuery projects using external Terraform. All the steps that you perform manually to configure BigQuery projects from the GCP console are automatically handled by the external Terraform.
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Configure the BigQuery projects.
You can use one of the following methods to configure the BigQuery projects manually:
Add projects.
Attach the service account to Unravel VM manually. To execute this step, you must shut down the VM and restart the VM after attaching the service account to that VM.
On the GCP, go to VM instances and select the VM instance where Unravel is installed.
From More actions, click Stop and confirm. The VM is stopped.
Click the same VM link and then click Edit. The Edit <VM-pagename> page is displayed.
Scroll down to Service accounts section and select the service account that you want to attach to the VM.
Click Save.
From the VM instances page, restart the VM.
Run the following command to verify:
<Unravel installation directory>/unravel/manager config bigquery show
The following output is shown:
/opt/unravel/manager config bigquery show
-- Running: config bigquery show
BigQuery support: Enabled
LR endpoint: Default
Mode: pull
Polling: Default
Billing data location: Not configured
Authentication mode: vm
Project: unravel-prj-4810
Terraform integration: Enabled
Project: prj3-394305
  Subscription: unravel-bigquery-sub-final
  Is admin: False
  Is monitoring: True
  Integration: True
Project: prj4
  Subscription: unravel-bigquery-sub-final
  Is admin: False
  Is monitoring: True
  Integration: True
Project: prj5-394305
  Subscription: unravel-bigquery-sub-final
  Is admin: False
  Is monitoring: True
  Integration: True
Apply the changes.
<Unravel installation directory>/unravel/manager config apply
Start Unravel.
<Unravel installation directory>/unravel/manager start
Verify BigQuery integration
On the GCP console, run test queries from the project integrated with Unravel.
Using a supported web browser, navigate to the Unravel URL (for example, https://<unravel-host>:3000) and log in to the Unravel UI using your credentials.
Navigate to the Jobs tab and click All in the left panel. The details of the queries run from the GCP console are listed.
To verify the integration of administrator projects, check the Reservation column from the Projects tab.
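The `config bigquery show` output from the verification step can also be checked from a script. The following is a minimal sketch that counts how many projects are reported with "Is monitoring: True". It reads the output from standard input and relies only on that label as it appears in the sample output. The function name is illustrative.

```shell
# Sketch: count projects reported as monitored in "config bigquery show" output.
count_monitored() {
  # grep -o prints one line per match, so fields that share a line still count.
  grep -o 'Is monitoring: True' | wc -l | tr -d ' '
}
```

For example, `/opt/unravel/manager config bigquery show | count_monitored` prints the number of monitored projects, which can be compared with the number of projects you added.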
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Configure the BigQuery projects.
Integrate BigQuery projects for Unravel monitoring. This is a mandatory step for Unravel-generated credentials. The following command configures all the projects added to Unravel.
Notice
Do not run the following command if you have used external Terraform to automatically create resources.
<Unravel installation directory>/unravel/manager config bigquery integrate
A URL will be provided in the output.
Note
If you want to skip the interactive gcloud authentication by Unravel and handle the gcloud authentication on your own, then run the command as follows:
<Unravel installation directory>/unravel/manager config bigquery integrate --skip-authorization
Authenticate gcloud CLI.
Copy the URL provided in the output, open it in a Google Chrome browser, and in the sign-in dialog box, click Allow. Make sure to sign in to the gcloud CLI with an account that has the required permissions.
From the Sign in to the gcloud CLI box, click the Copy button to copy the authorization code.
Go back to the terminal and paste the authorization code in the Enter authorization code field, and press ENTER. This will run the following actions in the background:
Authenticate the user with Google Cloud.
Configure the required resources on the GCP.
Encrypt the credentials (service account keys) and then integrate them with Unravel.
Integrate all the added BigQuery projects with Unravel.
Securely sign out the end user from the gcloud session.
Attach the service account to Unravel VM manually. To execute this step, you must shut down the VM and restart the VM after attaching the service account to that VM.
On the GCP, go to VM instances and select the VM instance where Unravel is installed.
From More actions, click Stop and confirm. The VM is stopped.
Click the same VM link and then click Edit. The Edit <VM-pagename> page is displayed.
Scroll down to Service accounts section and select the service account that you want to attach to the VM.
Click Save.
From the VM instances page, restart the VM.
Start Unravel. Unravel is stopped as a part of VM restart.
<Unravel installation directory>/unravel/manager start
Run the following command to verify:
<Unravel installation directory>/unravel/manager config bigquery show
The following output is shown:
/opt/unravel/manager config bigquery show
-- Running: config bigquery show
BigQuery support: Enabled
LR endpoint: Default
Mode: pull
Polling: Default
Billing data location: Not configured
Authentication mode: vm
Project: unravel-prj-4810
Terraform integration: Enabled
Project: prj3-394305
  Subscription: unravel-bigquery-sub-final
  Is admin: False
  Is monitoring: True
  Integration: True
Project: prj4
  Subscription: unravel-bigquery-sub-final
  Is admin: False
  Is monitoring: True
  Integration: True
Project: prj5-394305
  Subscription: unravel-bigquery-sub-final
  Is admin: False
  Is monitoring: True
  Integration: True
Apply the changes.
<Unravel installation directory>/unravel/manager config apply
Start Unravel.
<Unravel installation directory>/unravel/manager start
Verify BigQuery integration
On the GCP console, run test queries from the project integrated with Unravel.
Using a supported web browser, navigate to the Unravel URL (for example, https://<unravel-host>:3000) and log in to the Unravel UI using your credentials.
Navigate to the Jobs tab and click All in the left panel. The details of the queries run from the GCP console are listed.
To verify the integration of administrator projects, check the Reservation column from the Projects tab.
Note
Only a single billing account is supported.
You must configure the billing data separately. To do this, enable GCP Cloud Billing export to a BigQuery table.
The following information is required for integrating the billing data:
Project ID in which billing data is exported. This project ID should be integrated with Unravel for monitoring.
Dataset in which the billing data is exported.
The table in which billing data is exported.
To export billing data for BigQuery monitoring, do the following:
Export Billing Data to a BigQuery DataSet. Refer to Set up Cloud Billing data export to BigQuery for detailed instructions.
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Run the following command and set the billing information.
<Unravel installation directory>/unravel/manager config bigquery set-billing-data
You are prompted to enter the following:
Project ID where billing export is enabled
Dataset ID where data is getting exported
Name of the table where data is getting exported
Based on your chosen method of resource creation (Manual, external terraform, Unravel managed resources), do the following. Refer to the BigQuery installation document based on the polling and authentication method you have selected for more details.
Manual creation of resources
If you have manually created resources, you should modify the IAM role by adding bigquery.tables.getData permissions to the role.
Creation of resources using external terraform
For the resources created using an external terraform, you should add the project IDs to the input.tfvars file as elements in a list under the billing_project_id keyword and run the terraform apply command.
Unravel managed resources
For Unravel managed resources, run the integrate command. When you run this command, the bigquery.tables.getData permissions and other required permissions are correctly provisioned for the IAM role.
<Unravel installation directory>/unravel/manager config bigquery integrate
For example: /opt/unravel/manager config bigquery integrate
Note
An error may be shown when running Integrate or terraform apply command with external Terraform scripts. Refer to Error shown when running Integrate or terraform apply command with external Terraform scripts for the solution.
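For the external Terraform option, the project IDs go into input.tfvars as list elements. The following is a minimal sketch that prints such a line. It assumes the variable is a plain HCL list of strings, because the exact layout of the shipped input.tfvars file is not shown in this section. The function name and the project IDs in the usage example are hypothetical.

```shell
# Sketch: print an HCL list assignment for input.tfvars.
tfvars_list() {
  # $1: variable name; remaining arguments: project IDs
  key="$1"
  shift
  items=""
  for id in "$@"; do
    # Append each ID in double quotes, separated by ", ".
    items="${items:+$items, }\"$id\""
  done
  printf '%s = [%s]\n' "$key" "$items"
}
```

For example, `tfvars_list billing_project_id my-billing-project` prints `billing_project_id = ["my-billing-project"]`. Review the generated line against the existing file before you run terraform apply.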
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Run the following command from the Unravel installation directory.
<Unravel installation directory>/unravel/manager config bigquery unset-billing-data
Based on your chosen method of resource creation (Manual, external terraform, Unravel managed resources), do the following:
Manual creation of resources
If you have manually created resources, you should modify the IAM role by deleting the bigquery.tables.getData permissions from the role.
Creation of resources using external terraform
For the resources created using an external terraform, you should remove the project IDs added to the input.tfvars file as elements in a list under the billing_project_id keyword and run the terraform apply command.
Unravel managed resources
For Unravel managed resources, run the integrate command. The bigquery.tables.getData permission, which is required to set the billing data, is removed from the existing permission list.
<Unravel installation directory>/unravel/manager config bigquery integrate
For example: /opt/unravel/manager config bigquery integrate
Note
An error may be shown when running Integrate or terraform apply command with external Terraform scripts. Refer to Error shown when running Integrate or terraform apply command with external Terraform scripts for the solution.
Apply the changes.
<Unravel installation directory>/unravel/manager config apply
Start Unravel.
<Unravel installation directory>/unravel/manager start
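Most procedures on this page follow the same pattern: stop Unravel, run one `manager config` command, apply the changes, and start Unravel. The following is a minimal sketch of a wrapper for that pattern. UNRAVEL_MANAGER and the function name are hypothetical, and the default path is the one used in this page's examples. The wrapper covers only the manager commands; any IAM role or Terraform change that a procedure requires must still be made separately.

```shell
# Sketch: run one "manager config ..." change between stop and apply/start.
# UNRAVEL_MANAGER is a hypothetical variable pointing at the manager script.
UNRAVEL_MANAGER="${UNRAVEL_MANAGER:-/opt/unravel/manager}"

reconfigure() {
  # Arguments: the words that follow "manager config",
  # for example: bigquery unset-billing-data
  # Each step runs only if the previous one succeeded.
  "$UNRAVEL_MANAGER" stop &&
    "$UNRAVEL_MANAGER" config "$@" &&
    "$UNRAVEL_MANAGER" config apply &&
    "$UNRAVEL_MANAGER" start
}
```

For example, `reconfigure bigquery unset-billing-data` runs the four commands in order. If a step fails, the remaining steps are skipped and Unravel may be left stopped, so check the output before retrying.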
To enable the Data page, include the BigQuery projects you wish to monitor from the Data page. The Data page on Unravel UI can show data for up to 100 BigQuery projects. You can add single projects as well as multiple projects to the Data page.
Also, refer to Remove BigQuery projects from the Data page on Unravel UI.
Add projects to the Data page.
To add single projects to the Data page, run the following:
<Unravel installation directory>/unravel/manager config bigquery enable-datapage <project-id>
For example: /opt/unravel/manager config bigquery enable-datapage myproject
To add multiple projects to the Data page, run the following:
<Unravel installation directory>/unravel/manager config bigquery enable-datapage --batch </path/to/project-id-file>
For example: /opt/unravel/manager config bigquery enable-datapage --batch /opt/unravel/project-id-file
Based on your chosen method of resource creation (Manual, external terraform, Unravel managed resources), do the following:
Manual creation of resources
If you have manually created resources, you should modify the IAM role by adding bigquery.tables.getData permissions to the role.
Creation of resources using external terraform
For the resources created using an external terraform, you should add the project IDs to the input.tfvars file as elements in a list under the datapage_project_ids keyword and run the terraform apply command.
Unravel managed resources
For Unravel managed resources, run the integrate command. The bigquery.tables.getData permission, which is required to enable the Data page, is added to the existing permission list.
<Unravel installation directory>/unravel/manager config bigquery integrate
For example: /opt/unravel/manager config bigquery integrate
Note
An error may be shown when running Integrate or terraform apply command with external Terraform scripts. Refer to Error shown when running Integrate or terraform apply command with external Terraform scripts for the solution.
Run the following command to verify that the project IDs are enabled for the Data page:
<Unravel installation directory>/unravel/manager config bigquery show
The following sample output is shown:
API Pull/Push polling method
/opt/unravel/manager config bigquery show
-- Running: config bigquery show
BigQuery support: Enabled
Data access:
  Mode: pull
  Polling (seconds): Default
Billing data location:
  Project id: unravel-flat-rate-test
  Dataset id: all_billing_data
  Table name: gcp_billing_export_resource_v1_016A85_0733E3_979331
Authentication:
  Mode: multi
  Project: bq-test-project
Terraform integration: Enabled
Projects:
  test-admin:
    Is admin: True
    Is monitoring: False
    Last modified: 2023-11-10 12:21:49
  test-mon1:
    Subscription: unravel-bigquery-sub
    Is admin: False
    Is monitoring: True
    Last modified: 2023-11-10 12:22:05
  test-mon2:
    Subscription: unravel-bigquery-sub
    Is admin: False
    Is monitoring: True
    Last modified: 2023-11-10 12:22:12
  test-res:
    Subscription: unravel-bigquery-sub
    Is admin: True
    Is monitoring: True
    Last modified: 2023-11-10 12:21:57
  unravel-flat-rate-test:
    Subscription: mallik-test
    Is admin: False
    Is monitoring: True
    Last modified: 2023-11-10 12:22:30
Datapage projects: 1 out of 100.
-- OK
INFORMATION_SCHEMA based polling method
/opt/unravel/manager config bigquery show
-- Running: config bigquery show
BigQuery support: Enabled
Data access:
  Mode: schema
  Polling (seconds): 300
  Polling threads: 5
  Lookback days: 90
  Max delay (seconds): 1800
  Locations: US
Billing data location:
  Project id: unravel-flat-rate-test
  Dataset id: all_billing_data
  Table name: gcp_billing_export_resource_v1_016A85_0733E3_979331
Authentication:
  Mode: vm
  Project: bq-test-project
Terraform integration: Enabled
Projects:
  test-admin:
    Is admin: True
    Is monitoring: False
    Last modified: 2023-11-10 10:31:56
  test-mon1:
    Is admin: False
    Is monitoring: True
    Last modified: 2023-11-10 10:32:14
  test-mon2:
    Is admin: False
    Is monitoring: True
    Last modified: 2023-11-10 10:32:22
  test-res:
    Is admin: True
    Is monitoring: True
    Last modified: 2023-11-10 10:32:07
  unravel-flat-rate-test:
    Is admin: False
    Is monitoring: True
    Last modified: 2023-11-10 10:32:27
Datapage projects: 1 out of 100.
-- OK
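Before passing a file to the --batch option, a quick sanity check of the file can catch problems early. The following is a minimal sketch, assuming the file lists one project ID per line; the file layout is not specified in this section, so treat that as an assumption. It rejects blank or whitespace-containing lines and duplicates, and it checks the file against the 100-project limit of the Data page. The function name is illustrative.

```shell
# Sketch: sanity-check a project ID file before using it with --batch.
# Assumes one project ID per line (an assumption, not a documented format).
check_project_file() {
  # $1: path to the project ID file
  file="$1"
  # Report blank lines and lines that contain whitespace, with line numbers.
  if grep -n -E '^$|[[:space:]]' "$file" >&2; then
    echo "blank or malformed lines found" >&2
    return 1
  fi
  dupes="$(sort "$file" | uniq -d)"
  if [ -n "$dupes" ]; then
    echo "duplicate project IDs: $dupes" >&2
    return 1
  fi
  # Count lines, including a last line without a trailing newline.
  count="$(grep -c '' "$file" || true)"
  if [ "$count" -gt 100 ]; then
    echo "too many projects: $count (Data page limit is 100)" >&2
    return 1
  fi
  echo "$count"
}
```

On success, the function prints the number of project IDs in the file. The limit applies to all projects on the Data page, so projects that are already enabled also count toward it.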
Note
When you remove a BigQuery project from the Data page, then the associated data also gets deleted from OpenSearch.
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Run the following command from the Unravel installation directory.
For single project
<Unravel installation directory>/unravel/manager config bigquery delete-datapage <project-ID>
For example: /opt/unravel/manager config bigquery delete-datapage my-project
For multiple projects
<Unravel installation directory>/unravel/manager config bigquery delete-datapage --batch </path/to/project-id-file>
For example: /opt/unravel/manager config bigquery delete-datapage --batch /opt/unravel/my-projects.txt
Based on your chosen method of resource creation (Manual, external terraform, Unravel managed resources), do the following:
Manual creation of resources
If you have manually created resources, you should modify the IAM role by deleting the bigquery.tables.getData permissions from the role.
Creation of resources using external terraform
For the resources created using an external terraform, you should remove the project IDs added to the input.tfvars file as elements in a list under the datapage_project_ids keyword and run the terraform apply command.
Unravel managed resources
For Unravel managed resources, run the integrate command. The bigquery.tables.getData permission, which is required to enable the Data page, is removed from the existing permission list.
<Unravel installation directory>/unravel/manager config bigquery integrate
For example: /opt/unravel/manager config bigquery integrate
Note
An error may be shown when running Integrate or terraform apply command with external Terraform scripts. Refer to Error shown when running Integrate or terraform apply command with external Terraform scripts for the solution.
Apply the changes.
<Unravel installation directory>/unravel/manager config apply
Start Unravel.
<Unravel installation directory>/unravel/manager start
Note
When you remove a BigQuery project from Unravel, then the associated data also gets deleted from OpenSearch.
You can perform the following steps to remove BigQuery projects from Unravel. If you have integrated BigQuery projects using Terraform, refer to Disintegrating GCP projects with Unravel.
Stop Unravel.
<Unravel installation directory>/unravel/manager stop
Run the following command from the Unravel installation directory.
For single project
<Unravel installation directory>/unravel/manager config bigquery remove <project-ID>
For example: /opt/unravel/manager config bigquery remove my-project
For multiple projects
<Unravel installation directory>/unravel/manager config bigquery remove --batch </path/to/project-id-file>
For example: /opt/unravel/manager config bigquery remove --batch /opt/unravel/my-projects.txt
Integrate BigQuery projects for Unravel monitoring. This is a mandatory step for Unravel-generated credentials. The following command configures all the projects added for Unravel monitoring at once, that is, projects with customer-supplied credentials as well as projects with Unravel-managed credentials.
Notice
Do not run the following command if you have used external Terraform to automatically create resources.
<Unravel installation directory>/unravel/manager config bigquery integrate
A URL will be provided in the output.
Note
If you want to skip the interactive gcloud authentication by Unravel and handle the gcloud authentication on your own, then run the command as follows:
<Unravel installation directory>/unravel/manager config bigquery integrate --skip-authorization
Authenticate gcloud CLI.
Copy the URL provided in the output, open it in a Google Chrome browser, and in the sign-in dialog box, click Allow. Make sure to sign in to the gcloud CLI with an account that has the required permissions.
From the Sign in to the gcloud CLI box, click the Copy button to copy the authorization code.
Go back to the terminal and paste the authorization code in the Enter authorization code field, and press ENTER. This will run the following actions in the background:
Authenticate the user with Google Cloud.
Configure the required resources on the GCP.
Encrypt the credentials (service account keys) and then integrate them with Unravel.
Integrate all the added BigQuery projects with Unravel.
Securely sign out the end user from the gcloud session.
Apply the changes.
<Unravel installation directory>/unravel/manager config apply
Start Unravel.
<Unravel installation directory>/unravel/manager start