This article gives a high-level overview of the Cloud Shell commands used to communicate with BigQuery, Google's cloud data warehouse.
Requirements: a project must exist in GCP, and you must either initialize Cloud Shell or install the Cloud SDK.
You can run the commands either from the Cloud SDK installed on your machine or from Cloud Shell.
bq is a Python-based command-line tool for BigQuery.
bq help gives you the complete list of commands that you can use with bq.
bq help <command> gives you a description of the given command.
USAGE: bq [--global_flags] <command> [--command_flags] [args]
Creating datasets:
You can create datasets and tables using the bq mk command:
bq --location=[LOCATION] mk --dataset --default_table_expiration [INTEGER] --description [DESCRIPTION] [PROJECT_ID]:[DATASET]
ex: bq mk new_dataset.new_table
bq --location=EU mk -d --default_table_expiration 3600 --description "This is my new dataset." learning-platform-project:testing_dataset
Listing the objects under a dataset:
ex:
satya_apiiit@learning-platform-project:~$ bq ls learning-platform-project:new_dataset
tableId Type Labels Time Partitioning
------------ ------- -------- -------------------
firsttable TABLE
new_table TABLE
bq show gives you metadata-level details about a dataset or table.
ex:
satya_apiiit@learning-platform-project:~$ bq show learning-platform-project:new_dataset
Dataset learning-platform-project:new_dataset
Last modified ACLs Labels
----------------- -------------------------- --------
08 Sep 19:52:15 Owners:
projectOwners,
satya.apiiit@gmail.com
Writers:
projectWriters
Readers:
projectReaders
Creating a Table:
We can create a table in the following ways:
1. Creating an empty table with a schema definition
satya_apiiit@learning-platform-project:~$ bq mk new_dataset.firsttable name:string,value:integer
Table 'learning-platform-project:new_dataset.firsttable' successfully created.
satya_apiiit@learning-platform-project:~$ bq show new_dataset.firsttable
Table learning-platform-project:new_dataset.firsttable
Last modified Schema Total Rows Total Bytes Expiration Time Partitioning Labels
----------------- ------------------- ------------ ------------- ------------ ------------------- --------
08 Sep 20:05:11 |- name: string 0 0
|- value: integer
2. Creating a table from a query result
To create a table from a query result, write the results to a destination table.
bq --location=[LOCATION] query --destination_table [PROJECT_ID]:[DATASET].[TABLE] --use_legacy_sql=false '[QUERY]'
ex:
bq --location=US query --destination_table new_dataset.mytable --use_legacy_sql=false 'SELECT name,number
FROM `bigquery-public-data.usa_names.usa_1910_current` WHERE gender = "M" ORDER BY number DESC'
3. Loading data into a table from a CSV file
In the Cloud Shell, create an empty CSV file.
ex: touch customer_transactions.csv
Open the CSV file and enter the data into it:
ex: cloudshell edit customer_transactions.csv
The above command opens a new browser window with a code editor and a Cloud Shell panel.
Enter some comma-separated values to load into BigQuery, as below.
ID,Zipcode,Timestamp,Amount,Feedback,SKU
c123,78757,2018-02-14 17:01:39Z,1.20,4.7,he4rt5
c456,10012,2018-03-14 15:09:26Z,53.60,3.1,ppiieee
c123,78741,2018-04-01 05:59:47Z,5.98,2.0,ch0c0
Save the file in CSV format and run the command below to load the data from the CSV file into a table named customer_transactions_new:
ex:
satya_apiiit@learning-platform-project:~$ bq load --skip_leading_rows=1 new_dataset.customer_transactions_new ./customer_transactions.csv id:string,zip:string,ttime:timestamp,amount:numeric,fdbk:float,sku:string
Upload complete. Waiting on bqjob_r5bd883f92d4db78d_00000165bd68d959_1 ... (0s) Current status: DONE
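As an alternative to the editor, the sample file can be created non-interactively with a heredoc, and the schema can live in a JSON file instead of the inline field list (bq load accepts a schema file path in place of the inline string). The file names here are illustrative, and the bq call is guarded so the sketch only attempts the load when an authenticated bq is actually available:

```shell
#!/bin/sh
# Create the sample CSV without opening the Cloud Shell editor.
cat > customer_transactions.csv <<'EOF'
ID,Zipcode,Timestamp,Amount,Feedback,SKU
c123,78757,2018-02-14 17:01:39Z,1.20,4.7,he4rt5
c456,10012,2018-03-14 15:09:26Z,53.60,3.1,ppiieee
c123,78741,2018-04-01 05:59:47Z,5.98,2.0,ch0c0
EOF

# Keep the schema in a JSON file rather than an inline field list.
cat > transactions_schema.json <<'EOF'
[
  {"name": "id",     "type": "STRING"},
  {"name": "zip",    "type": "STRING"},
  {"name": "ttime",  "type": "TIMESTAMP"},
  {"name": "amount", "type": "NUMERIC"},
  {"name": "fdbk",   "type": "FLOAT"},
  {"name": "sku",    "type": "STRING"}
]
EOF

# Run the load only when bq is installed and authenticated.
if command -v bq >/dev/null 2>&1; then
  bq load --skip_leading_rows=1 new_dataset.customer_transactions_new \
      ./customer_transactions.csv ./transactions_schema.json
fi
```

A schema file is easier to reuse and review than a long inline string, especially once tables grow past a handful of columns.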
Querying the table:
satya_apiiit@learning-platform-project:~$ bq query 'select count(*) from [bigquery-public-data.samples.shakespeare]'
Waiting on bqjob_r4863e503ca3d0eca_00000165b9c09bf8_1 ... (0s) Current status: DONE
Limiting the row counts:
Command: bq head --max_rows [INTEGER1] --start_row [INTEGER2] --selected_fields "[FIELDS]" [PROJECT_ID]:[DATASET].[TABLE]
The following command lists all fields in the first 5 rows:
bq head --max_rows=5 bigquery-public-data:medicare.inpatient_charges_2011
The following commands list two fields in the first 3 rows (-n is shorthand for --max_rows):
bq head --max_rows=3 --selected_fields "provider_id,provider_name" bigquery-public-data:medicare.inpatient_charges_2011
bq head -n 3 --selected_fields "provider_id,provider_name" bigquery-public-data:medicare.inpatient_charges_2011
+-------------+----------------------------------+
| provider_id |           provider_name          |
+-------------+----------------------------------+
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
+-------------+----------------------------------+
The following command lists two fields starting from the 11th row (skipping the first 10 rows):
bq head --start_row=10 --selected_fields "provider_id,provider_name" bigquery-public-data:medicare.inpatient_charges_2011
The following command lists two fields for five rows, starting from the 11th row:
bq head --start_row=10 --max_rows=5 --selected_fields "provider_id,provider_name" bigquery-public-data:medicare.inpatient_charges_2011
+-------------+----------------------------------+
| provider_id |           provider_name          |
+-------------+----------------------------------+
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
|       20001 | PROVIDENCE ALASKA MEDICAL CENTER |
+-------------+----------------------------------+
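Because --start_row and --max_rows compose, you can page through a table. A minimal sketch of the paging arithmetic, assuming page N of size S starts at row (N-1)*S; the bq call itself is guarded since it needs an authenticated session:

```shell
#!/bin/sh
# Compute the bq head --start_row value for a given page.
page=3
page_size=5

start_row=$(( (page - 1) * page_size ))
echo "start_row=$start_row"   # -> start_row=10

# Fetch that page when bq is available.
if command -v bq >/dev/null 2>&1; then
  bq head --start_row="$start_row" --max_rows="$page_size" \
      --selected_fields "provider_id,provider_name" \
      bigquery-public-data:medicare.inpatient_charges_2011
fi
```

With page=3 and page_size=5 this reproduces the --start_row=10 --max_rows=5 example above.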
Updating table properties:
ex:
satya_apiiit@learning-platform-project:~$ bq update --description "This is the first table in the dataset" new_dataset.firsttable
Table 'learning-platform-project:new_dataset.firsttable' successfully updated.
Getting dataset metadata using meta-tables:
BigQuery offers some special tables whose contents represent metadata, such as the list of tables and views in a dataset. The "meta-tables" are read-only. To access metadata about the tables and views in a dataset, use the __TABLES_SUMMARY__ meta-table in a query's SELECT statement.
ex:
satya_apiiit@learning-platform-project:~$ bq query 'select * from new_dataset.__TABLES_SUMMARY__'
Waiting on bqjob_r63a4891ae31e2519_00000165ba5d96bd_1 ... (0s) Current status: DONE
+---------------------------+-------------+------------+---------------+------+
|        project_id         | dataset_id  |  table_id  | creation_time | type |
+---------------------------+-------------+------------+---------------+------+
| learning-platform-project | new_dataset | firsttable | 1536417311869 |    1 |
| learning-platform-project | new_dataset | newview    | 1536417475416 |    2 |
+---------------------------+-------------+------------+---------------+------+
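The creation_time column is a Unix epoch in milliseconds. Assuming GNU date (as in Cloud Shell), truncating to seconds makes it readable; the value below is taken from the firsttable row above:

```shell
#!/bin/sh
# creation_time from __TABLES_SUMMARY__ is epoch milliseconds.
creation_time_ms=1536417311869

# Truncate to seconds, then format as UTC.
creation_time_s=$(( creation_time_ms / 1000 ))
date -u -d "@$creation_time_s" '+%Y-%m-%d %H:%M:%S'   # -> 2018-09-08 14:35:11
```

This UTC value corresponds to the local-time creation of firsttable shown by bq show earlier.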
Copying tables:
Copying a single table:
satya_apiiit@learning-platform-project:~$ bq cp new_dataset.mytable new_dataset.newtable1
Waiting on bqjob_r6208d23cc5e24855_00000165bd01499d_1 ... (0s) Current status: DONE
Tables 'learning-platform-project:new_dataset.mytable' successfully copied to 'learning-platform-project:new_dataset.newtable1'
The -a shortcut is used to append to the destination table.
satya_apiiit@learning-platform-project:~$ bq cp -a new_dataset.mytable new_dataset.firsttable
Waiting on bqjob_r747380c9a4ea12ca_00000165bd0bfc01_1 ... (0s) Current status: DONE
Tables 'learning-platform-project:new_dataset.mytable' successfully copied to 'learning-platform-project:new_dataset.firsttable'
The -n shortcut is used to prevent overwriting a table with the same name.
satya_apiiit@learning-platform-project:~$ bq cp -n new_dataset.mytable new_dataset.firsttable
Table 'learning-platform-project:new_dataset.firsttable' already exists, skipping
Copying multiple tables:
satya_apiiit@learning-platform-project:~$ bq cp new_dataset.mytable,new_dataset.mytable1 new_dataset.copytable
Waiting on bqjob_r6adee27a4f495748_00000165bd14738b_1 ... (0s) Current status: DONE
Tables 'learning-platform-project:new_dataset.mytable, learning-platform-project:new_dataset.mytable1' successfully copied to 'learning-platform-project:new_dataset.copytable'
Deleting the tables:
satya_apiiit@learning-platform-project:~$ bq rm new_dataset.mytable1
rm: remove table 'learning-platform-project:new_dataset.mytable1'? (y/N) y
The command uses the -f shortcut to bypass confirmation.
satya_apiiit@learning-platform-project:~$ bq rm -f new_dataset.mytable1
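To delete every table in a dataset, you can script bq rm over the bq ls output. This sketch assumes the ls layout shown earlier (one header line, one dashed separator line, then one table per row) and only runs the destructive loop when bq is actually available:

```shell
#!/bin/sh
# Extract tableIds from `bq ls` output: skip the header and the dashed
# separator line, then take the first column of each remaining row.
list_tables() {
  awk 'NR > 2 && NF > 0 { print $1 }'
}

# Local demonstration on the sample output shown earlier in the article.
printf '  tableId     Type\n ----------- ------\n  firsttable  TABLE\n  new_table   TABLE\n' \
  | list_tables   # prints "firsttable" and "new_table"

# Destructive part: remove each table, skipping the confirmation prompt.
if command -v bq >/dev/null 2>&1; then
  for t in $(bq ls learning-platform-project:new_dataset | list_tables); do
    bq rm -f "learning-platform-project:new_dataset.$t"
  done
fi
```

For a whole dataset, bq rm also takes -r to remove the dataset together with all of its tables in one command.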
Creating a View:
satya_apiiit@learning-platform-project:~$ bq mk --view='select 1 as num' new_dataset.newview
View 'learning-platform-project:new_dataset.newview' successfully created.
satya_apiiit@learning-platform-project:~$ bq show new_dataset.newview
Table learning-platform-project:new_dataset.newview
Last modified Schema Type Expiration Labels
----------------- ---------------------------- ------ ------------ --------
08 Sep 20:07:55 |- num: integer (required) VIEW
satya_apiiit@learning-platform-project:~$ bq mk --use_legacy_sql=false --expiration 3600 --description "This is my view" --label organization:development --view 'SELECT name,number FROM `bigquery-public-data.usa_names.usa_1910_current` WHERE gender = "M" ORDER BY number DESC' new_dataset.myview
View 'learning-platform-project:new_dataset.myview' successfully created.
Loading data into a BigQuery table from a Cloud Storage bucket:
bq load --source_format=CSV new_dataset.names_2011 gs://bucket_command/babynames/yob2011.txt name:string,gender:string,count:integer
Here we load the yob2011.txt data into the table names_2011 with the schema given in the command.
We can load multiple files in one go using a wildcard, as below:
bq load --source_format=CSV new_dataset.all_names gs://bucket_command/babynames/yob*.txt name:string,gender:string,count:integer
Exporting a BigQuery table to a CSV file in a Cloud Storage bucket:
bq extract bigquery-public-data:samples.shakespeare gs://bucket_command/export/all_names.csv
Waiting on bqjob_r3eaa1d58a72e2b67_00000165b9d92e8c_1 ... (0s) Current status: DONE
Working in interactive mode:
You can run commands in interactive mode using 'bq shell'.
Advantage: you don't need to prefix the commands with bq.
ex:
satya_apiiit@learning-platform-project:~$ bq shell
Welcome to BigQuery! (Type help for more information.)
learning-platform-project> show new_dataset
Dataset learning-platform-project:new_dataset
Last modified ACLs Labels
----------------- -------------------------- --------
08 Sep 19:52:15 Owners:
projectOwners,
satya.apiiit@gmail.com
Writers:
projectWriters
Readers:
projectReaders
To come out of interactive mode, use the exit command:
learning-platform-project> exit
Thanks for reading! Keep watching for more topics on Google Cloud Platform!