Gcloud::Bigquery::Table

Table

A named resource representing a BigQuery table that holds zero or more records. Every table is defined by a schema that may contain nested and repeated fields. (For more information about nested and repeated fields, see Preparing Data for BigQuery.)

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

schema = {
  "fields" => [
    {
      "name" => "first_name",
      "type" => "STRING",
      "mode" => "REQUIRED"
    },
    {
      "name" => "cities_lived",
      "type" => "RECORD",
      "mode" => "REPEATED",
      "fields" => [
        {
          "name" => "place",
          "type" => "STRING",
          "mode" => "REQUIRED"
        },
        {
          "name" => "number_of_years",
          "type" => "INTEGER",
          "mode" => "REQUIRED"
        }
      ]
    }
  ]
}
table.schema = schema

row = {
  "first_name" => "Alice",
  "cities_lived" => [
    {
      "place": "Seattle",
      "number_of_years": 5
    },
    {
      "place": "Stockholm",
      "number_of_years": 6
    }
  ]
}
table.insert row

Attribute Methods

Public Instance Methods

created_at()

The time when this table was created.

dataset_id()

The ID of the Dataset containing this table.

description()

The description of the table.

description=(new_description)

Updates the description of the table.
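
For instance, a minimal sketch of reading and updating the description (assuming the dataset and table already exist):

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

# Update and read back the table description
table.description = "My table of data"
puts table.description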

etag()

A string hash of the table.

expires_at()

The time when this table expires. If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.

fields()

The fields of the table.

headers()

The names of the columns in the table.
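
A brief sketch of listing the column names, assuming the table already has a schema defined:

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

# headers returns just the column names; fields returns the full field definitions
table.headers.each do |name|
  puts name
end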

location()

The geographic location where the table should reside. Possible values include EU and US. The default value is US.

modified_at()

The time when this table was last modified.

name()

The name of the table.

name=(new_name)

Updates the name of the table.

project_id()

The ID of the Project containing this table.

schema()

The schema of the table.

schema=(new_schema)

Updates the schema of the table.

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

schema = {
  "fields" => [
    {
      "name" => "first_name",
      "type" => "STRING",
      "mode" => "REQUIRED"
    },
    {
      "name" => "age",
      "type" => "INTEGER",
      "mode" => "REQUIRED"
    }
  ]
}
table.schema = schema

table?()

Checks if the table's type is “TABLE”.

table_id()

A unique ID for this table. The ID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length is 1,024 characters.

url()

A URL that can be used to access the table using the REST API.

view?()

Checks if the table's type is “VIEW”.
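
For example, a sketch of branching on the table's type using the methods above:

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

# A Table object may represent either a standard table or a view
if table.table?
  puts "#{table.table_id} is a table"
elsif table.view?
  puts "#{table.table_id} is a view"
end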

Data Methods

Public Instance Methods

bytes_count()

The number of bytes in the table.

copy(destination_table, options = {})

Copies the data from the table to another table.

Parameters

destination_table

The destination for the copied data. (Table)

options

An optional Hash for controlling additional behavior. (Hash)

options[:create]

Specifies whether the job is allowed to create new tables. (String)

The following values are supported:

  • needed - Create the table if it does not exist.

  • never - The table must already exist. A 'notFound' error is raised if the table does not exist.

options[:write]

Specifies how to handle data already present in the destination table. The default value is empty. (String)

The following values are supported:

  • truncate - BigQuery overwrites the table data.

  • append - BigQuery appends the data to the table.

  • empty - An error will be returned if the destination table already contains data.

Returns

Gcloud::Bigquery::CopyJob

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
destination_table = dataset.table "my_destination_table"

copy_job = table.copy destination_table
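
The :create and :write options described above can also be passed; a hedged sketch, reusing the table and destination_table from the example:

# Create the destination table if needed, and overwrite any existing data
copy_job = table.copy destination_table,
                      create: "needed",
                      write: "truncate"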

data(options = {})

Retrieves data from the table.

Parameters

options

An optional Hash for controlling additional behavior. (Hash)

options[:token]

Page token, returned by a previous call, identifying the result set. (String)

options[:max]

Maximum number of results to return. (Integer)

options[:start]

Zero-based index of the starting row to read. (Integer)

Returns

Gcloud::Bigquery::Data

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

data = table.data
data.each do |row|
  puts row["first_name"]
end
more_data = table.data token: data.token
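
The :max and :start options can be combined to read a specific window of rows; a short sketch, reusing the table from the example:

# Read up to 100 rows, starting at row 200
paged_data = table.data max: 100, start: 200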

extract(extract_url, options = {})

Extracts the data from the table to a Google Cloud Storage file. For more information, see Exporting Data From BigQuery.

Parameters

extract_url

The Google Cloud Storage file or file URI pattern(s) to which BigQuery should extract the table data. (Gcloud::Storage::File or String or Array)

options

An optional Hash for controlling additional behavior. (Hash)

options[:format]

The exported file format. The default value is csv. (String)

The following values are supported:

  • csv - CSV

  • json - Newline-delimited JSON

  • avro - Avro

Returns

Gcloud::Bigquery::ExtractJob

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract "gs://my-bucket/file-name.json",
                            format: "json"

insert(rows, options = {})

Inserts data into the table for near-immediate querying, without the need to complete a load operation before the data can appear in query results. For more information, see Streaming Data Into BigQuery.

Parameters

rows

A hash object or array of hash objects containing the data. (Array or Hash)

options

An optional Hash for controlling additional behavior. (Hash)

options[:skip_invalid]

Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist. (Boolean)

options[:ignore_unknown]

Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false, which treats unknown values as errors. (Boolean)

Returns

Gcloud::Bigquery::InsertResponse

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob", "age" => 22 }
]
table.insert rows
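
A hedged sketch of passing the :skip_invalid and :ignore_unknown options and checking the returned InsertResponse (success? is assumed here to report whether every row was inserted without error):

# Insert valid rows even if some rows are invalid, ignoring unknown values
response = table.insert rows, skip_invalid: true, ignore_unknown: true
puts "all rows inserted" if response.success?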

load(file, options = {})

Loads data into the table.

Parameters

file

A file or the URI of a Google Cloud Storage file containing data to load into the table. (File or Gcloud::Storage::File or String)

options

An optional Hash for controlling additional behavior. (Hash)

options[:format]

The format of the data file to load. The default value is csv. (String)

The following values are supported:

  • csv - CSV

  • json - Newline-delimited JSON

  • avro - Avro

  • datastore_backup - Cloud Datastore backup

options[:create]

Specifies whether the job is allowed to create new tables. (String)

The following values are supported:

  • needed - Create the table if it does not exist.

  • never - The table must already exist. A 'notFound' error is raised if the table does not exist.

options[:write]

Specifies how to handle data already present in the table. The default value is empty. (String)

The following values are supported:

  • truncate - BigQuery overwrites the table data.

  • append - BigQuery appends the data to the table.

  • empty - An error will be returned if the table already contains data.

Returns

Gcloud::Bigquery::LoadJob

Examples

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

load_job = table.load "gs://my-bucket/file-name.csv"

You can also pass a Gcloud::Storage::File instance.

require "gcloud"
require "gcloud/storage"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

storage = gcloud.storage
bucket = storage.bucket "my-bucket"
file = bucket.file "file-name.csv"
load_job = table.load file

Or, you can upload a smaller file directly. See Loading Data with a POST Request (https://cloud.google.com/bigquery/loading-data-post-request#multipart).

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

file = File.open "my_data.csv"
load_job = table.load file
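
The :format, :create, and :write options described above can be combined with any of these examples; a hedged sketch using a Cloud Storage URI:

# Load newline-delimited JSON, creating the table if needed and replacing its data
load_job = table.load "gs://my-bucket/file-name.json",
                      format: "json",
                      create: "needed",
                      write: "truncate"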

rows_count()

The number of rows in the table.

Lifecycle Methods

Public Instance Methods

delete()

Permanently deletes the table.

Returns

true if the table was deleted.

Example

require "gcloud"

gcloud = Gcloud.new
bigquery = gcloud.bigquery
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

table.delete