Storage#
Cloud Storage in 10 seconds#
Install the library#
The source code for the library (and demo code) lives on GitHub. You can install the library quickly with pip:
$ pip install gcloud
Run the demo#
To run the demo, you need to have registered an actual gcloud project, and you’ll need to provide some environment variables to facilitate authentication to your project:
- GCLOUD_TESTS_PROJECT_ID: Developers Console project ID (e.g. bamboo-shift-455).
- GCLOUD_TESTS_DATASET_ID: The name of the dataset your tests connect to. This is typically the same as GCLOUD_TESTS_PROJECT_ID.
- GOOGLE_APPLICATION_CREDENTIALS: The path to a JSON key file; see regression/app_credentials.json.sample as an example. Such a file can be downloaded directly from the Developers Console by clicking “Generate new JSON key”. See the private key docs for more details.
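The variables above can be exported in your shell before running the demo. The values here are placeholders for illustration; substitute your own project ID and key file path:

```shell
# Placeholder values -- substitute your own project ID and key file path.
export GCLOUD_TESTS_PROJECT_ID="bamboo-shift-455"
export GCLOUD_TESTS_DATASET_ID="$GCLOUD_TESTS_PROJECT_ID"
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/app_credentials.json"
```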
Run the example script included in the package:
$ python -m gcloud.storage.demo
And that’s it! You should be walking through a demonstration of using gcloud.storage to read and write data to Google Cloud Storage.
Try it yourself#
You can interact with a demo dataset in a Python interactive shell.
Start by importing the demo module and instantiating the demo connection:
>>> from gcloud.storage import demo
>>> connection = demo.get_connection()
Once you have the connection, you can create buckets and blobs:
>>> connection.get_all_buckets()
[<Bucket: ...>, ...]
>>> bucket = connection.create_bucket('my-new-bucket')
>>> print bucket
<Bucket: my-new-bucket>
>>> blob = bucket.new_blob('my-test-file.txt')
>>> print blob
<Blob: my-new-bucket, my-test-file.txt>
>>> blob = blob.upload_from_string('this is test content!')
>>> print blob.download_as_string()
'this is test content!'
>>> print bucket.get_all_blobs()
[<Blob: my-new-bucket, my-test-file.txt>]
>>> blob.delete()
>>> bucket.delete()
Note
The get_connection method is just a shortcut for:
>>> from gcloud import storage
>>> from gcloud.storage import demo
>>> connection = storage.get_connection(demo.PROJECT_ID)
gcloud.storage#
Shortcut methods for getting set up with Google Cloud Storage.
You’ll typically use these to get started with the API:
>>> import gcloud.storage
>>> bucket = gcloud.storage.get_bucket('bucket-id-here', 'project-id')
>>> # Then do other things...
>>> blob = bucket.get_blob('/remote/path/to/file.txt')
>>> print blob.download_as_string()
>>> blob.upload_from_string('New contents!')
>>> bucket.upload_file('/local/path.txt', '/remote/path/storage.txt')
The main concepts with this API are:
- gcloud.storage.connection.Connection which represents a connection between your machine and the Cloud Storage API.
- gcloud.storage.bucket.Bucket which represents a particular bucket (akin to a mounted disk on a computer).
- gcloud.storage.blob.Blob which represents a pointer to a particular entity in Cloud Storage (akin to a file path on a remote machine).
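The relationship between these three concepts can be sketched with plain Python classes. This is a toy, in-memory model for orientation only, not the library’s real implementation; the class and method names mirror the gcloud.storage API, but the bodies are invented:

```python
# A toy model of the Connection/Bucket/Blob hierarchy, using in-memory
# dicts in place of real API calls. Illustrative only.

class Connection(object):
    """Holds project-level state; buckets are looked up through it."""
    def __init__(self, project):
        self.project = project
        self._buckets = {}

    def create_bucket(self, name):
        bucket = Bucket(self, name)
        self._buckets[name] = bucket
        return bucket

    def get_bucket(self, name):
        return self._buckets[name]


class Bucket(object):
    """Akin to a mounted disk: a named container for blobs."""
    def __init__(self, connection, name):
        self.connection = connection
        self.name = name
        self._blobs = {}

    def new_blob(self, blob_name):
        blob = Blob(self, blob_name)
        self._blobs[blob_name] = blob
        return blob


class Blob(object):
    """Akin to a file path: a pointer to one entity in a bucket."""
    def __init__(self, bucket, name):
        self.bucket = bucket
        self.name = name
        self._data = None

    def upload_from_string(self, data):
        self._data = data
        return self

    def download_as_string(self):
        return self._data
```

The point of the sketch is the ownership chain: a Connection hands out Buckets, and a Bucket hands out Blobs, so every Blob can reach the Connection that will ultimately carry its requests.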
- gcloud.storage.__init__.get_bucket(bucket_name, project)[source]#
- Shortcut method to establish a connection to a particular bucket.
- You’ll generally use this as the first call when working with the API:
  >>> from gcloud import storage
  >>> bucket = storage.get_bucket(bucket_name, project)
  >>> # Now you can do things with the bucket.
  >>> bucket.exists('/path/to/file.txt')
  False
- Parameters:
  - bucket_name (string) – The id of the bucket you want to use. This is akin to a disk name on a file system.
  - project (string) – The name of the project to connect to.
- Return type: gcloud.storage.bucket.Bucket
- Returns: A bucket with a connection using the provided credentials.
- gcloud.storage.__init__.get_connection(project)[source]#
- Shortcut method to establish a connection to Cloud Storage.
- Use this if you are going to access several buckets with the same set of credentials:
  >>> from gcloud import storage
  >>> connection = storage.get_connection(project)
  >>> bucket1 = connection.get_bucket('bucket1')
  >>> bucket2 = connection.get_bucket('bucket2')
- Parameters:
  - project (string) – The name of the project to connect to.
- Return type: gcloud.storage.connection.Connection
- Returns: A connection defined with the proper credentials.
- gcloud.storage.__init__.set_default_bucket(bucket=None)[source]#
- Set the default bucket, either explicitly or implicitly as a fall-back.
- In the implicit case, this currently only supports an environment variable, but will support App Engine, Compute Engine, and other environments in the future.
- In the implicit case, it relies on an implicit connection in addition to the implicit bucket name.
- Local environment variable used: GCLOUD_BUCKET_NAME
- Parameters:
  - bucket (gcloud.storage.bucket.Bucket) – Optional. The bucket to use as default.
- gcloud.storage.__init__.set_default_connection(project=None, connection=None)[source]#
- Set the default connection, either explicitly or implicitly as a fall-back.
- Parameters:
  - project (string) – Optional. The name of the project to connect to.
  - connection (gcloud.storage.connection.Connection) – A connection provided to be the default.
- gcloud.storage.__init__.set_default_project(project=None)[source]#
- Set the default project name, either explicitly or implicitly as a fall-back.
- In the implicit case, this currently only supports an environment variable, but will support App Engine, Compute Engine, and other environments in the future.
- Local environment variable used: GCLOUD_PROJECT
- Parameters:
  - project (string) – Optional. The project name to use as default.
- gcloud.storage.__init__.set_defaults(bucket=None, project=None, connection=None)[source]#
- Set defaults, either explicitly or implicitly as a fall-back.
- Uses the arguments to call the individual default methods.
- Parameters:
  - bucket (gcloud.storage.bucket.Bucket) – Optional. The bucket to use as default.
  - project (string) – Optional. The name of the project to connect to.
  - connection (gcloud.storage.connection.Connection) – Optional. A connection provided to be the default.
Connections#
Create / interact with gcloud storage connections.
- class gcloud.storage.connection.Connection(project, *args, **kwargs)[source]#
- Bases: gcloud.connection.Connection
- A connection to Google Cloud Storage via the JSON REST API.
- This defines Connection.api_request() for making a generic JSON API request; most API requests are created elsewhere (e.g. in gcloud.storage.bucket.Bucket and gcloud.storage.blob.Blob).
- Methods for getting, creating, and deleting individual buckets, as well as listing buckets associated with a project, are defined here. This corresponds to the “storage.buckets” resource in the API.
- See gcloud.connection.Connection for a full list of parameters. This subclass differs only in needing a project name (which you specify when creating a project in the Cloud Console).
- A typical use is to operate on gcloud.storage.bucket.Bucket objects:
  >>> from gcloud import storage
  >>> connection = storage.get_connection(project)
  >>> bucket = connection.create_bucket('my-bucket-name')
- You can then delete this bucket:
  >>> bucket.delete()
  >>> # or
  >>> connection.delete_bucket(bucket.name)
- If you want to access an existing bucket:
  >>> bucket = connection.get_bucket('my-bucket-name')
- You can also iterate through all gcloud.storage.bucket.Bucket objects inside the project:
  >>> for bucket in connection.get_all_buckets():
  >>>   print bucket
  <Bucket: my-bucket-name>
- Parameters:
  - project (string) – The project name to connect to.
- API_URL_TEMPLATE = '{api_base_url}/storage/{api_version}{path}'#
  - A template for the URL of a particular API call.
 - API_VERSION = 'v1'#
- The version of the API, used in building the API call’s URL. 
 - api_request(method, path, query_params=None, data=None, content_type=None, api_base_url=None, api_version=None, expect_json=True)[source]#
- Make a request over the HTTP transport to the Cloud Storage API.
- You shouldn’t need to use this method, but if you plan to interact with the API using these primitives, this is the correct one to use.
- Parameters:
  - method (string) – The HTTP method name (i.e. GET, POST, etc.). Required.
- path (string) – The path to the resource (i.e. '/b/bucket-name'). Required.
- query_params (dict) – A dictionary of keys and values to insert into the query string of the URL. Default is empty dict.
- data (string) – The data to send as the body of the request. Default is the empty string.
- content_type (string) – The proper MIME type of the data provided. Default is None.
- api_base_url (string) – The base URL for the API endpoint. Typically you won’t have to provide this. Default is the standard API base URL.
- api_version (string) – The version of the API to call. Typically you shouldn’t provide this and instead use the default for the library. Default is the latest API version supported by gcloud-python.
- expect_json (boolean) – If True, this method will try to parse the response as JSON and raise an exception if that cannot be done. Default is True.
- Raises: Exception if the response code is not 200 OK.
 - build_api_url(path, query_params=None, api_base_url=None, api_version=None, upload=False)[source]#
- Construct an API URL given a few components, some optional.
- Typically, you shouldn’t need to use this method.
- Parameters:
  - path (string) – The path to the resource (i.e. '/b/bucket-name').
- query_params (dict) – A dictionary of keys and values to insert into the query string of the URL.
- api_base_url (string) – The base URL for the API endpoint. Typically you won’t have to provide this.
- api_version (string) – The version of the API to call. Typically you shouldn’t provide this and instead use the default for the library.
- upload (boolean) – True if the URL is for uploading purposes.
- Return type: string
- Returns: The URL assembled from the pieces provided.
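Conceptually, build_api_url amounts to filling in the API_URL_TEMPLATE class attribute and appending any query string. The sketch below is a stand-alone illustration, not the library’s code; the template and 'v1' version string come from the class attributes above, while the base URL is an assumption:

```python
try:
    from urllib import urlencode            # Python 2
except ImportError:
    from urllib.parse import urlencode      # Python 3

# Template copied from Connection.API_URL_TEMPLATE; the base URL default
# below is an assumption for illustration.
API_URL_TEMPLATE = '{api_base_url}/storage/{api_version}{path}'

def build_api_url(path, query_params=None,
                  api_base_url='https://www.googleapis.com',
                  api_version='v1'):
    """Assemble an API URL from its components."""
    url = API_URL_TEMPLATE.format(api_base_url=api_base_url,
                                  api_version=api_version,
                                  path=path)
    if query_params:
        url += '?' + urlencode(query_params)
    return url
```

For example, a path of '/b/bucket-name' with no query parameters expands to 'https://www.googleapis.com/storage/v1/b/bucket-name' under these assumptions.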
 - create_bucket(bucket_name)[source]#
- Create a new bucket.
- For example:
  >>> from gcloud import storage
  >>> connection = storage.get_connection(project)
  >>> bucket = connection.create_bucket('my-bucket')
  >>> print bucket
  <Bucket: my-bucket>
- This implements “storage.buckets.insert”.
- Parameters:
  - bucket_name (string) – The bucket name to create.
- Return type: gcloud.storage.bucket.Bucket
- Returns: The newly created bucket.
- Raises: gcloud.exceptions.Conflict if there is a conflict (bucket already exists, invalid name, etc.)
 - delete_bucket(bucket_name)[source]#
- Delete a bucket.
- You can use this method to delete a bucket by name:
  >>> from gcloud import storage
  >>> connection = storage.get_connection(project)
  >>> connection.delete_bucket('my-bucket')
- If the bucket doesn’t exist, this will raise a gcloud.exceptions.NotFound:
  >>> from gcloud.exceptions import NotFound
  >>> try:
  >>>   connection.delete_bucket('my-bucket')
  >>> except NotFound:
  >>>   print 'That bucket does not exist!'
- If the bucket still has objects in it, this will raise a gcloud.exceptions.Conflict:
  >>> from gcloud.exceptions import Conflict
  >>> try:
  >>>   connection.delete_bucket('my-bucket')
  >>> except Conflict:
  >>>   print 'That bucket is not empty!'
- This implements “storage.buckets.delete”.
- Parameters:
  - bucket_name (string) – The bucket name to delete.
 - get_all_buckets()[source]#
- Get all buckets in the project.
- This will not populate the list of blobs available in each bucket.
- You can also iterate over the connection object, so these two operations are identical:
  >>> from gcloud import storage
  >>> connection = storage.get_connection(project)
  >>> for bucket in connection.get_all_buckets():
  >>>   print bucket
- This implements “storage.buckets.list”.
- Return type: list of gcloud.storage.bucket.Bucket objects.
- Returns: All buckets belonging to this project.
 - get_bucket(bucket_name)[source]#
- Get a bucket by name.
- If the bucket isn’t found, this will raise a gcloud.exceptions.NotFound:
  >>> from gcloud import storage
  >>> from gcloud.exceptions import NotFound
  >>> connection = storage.get_connection(project)
  >>> try:
  >>>   bucket = connection.get_bucket('my-bucket')
  >>> except NotFound:
  >>>   print 'Sorry, that bucket does not exist!'
- This implements “storage.buckets.get”.
- Parameters:
  - bucket_name (string) – The name of the bucket to get.
- Return type: gcloud.storage.bucket.Bucket
- Returns: The bucket matching the name provided.
- Raises: gcloud.exceptions.NotFound
 
Iterators#
Iterators for paging through API responses.
These iterators simplify the process of paging through API responses where the response is a list of results with a nextPageToken.
To make an iterator work, just override the get_items_from_response method so that given a response (containing a page of results) it parses those results into an iterable of the actual objects you want:
class MyIterator(Iterator):
  def get_items_from_response(self, response):
    items = response.get('items', [])
    for item in items:
      yield MyItemClass(properties=item, other_arg=True)
You then can use this to get all the results from a resource:
>>> iterator = MyIterator(...)
>>> list(iterator)  # Convert to a list (consumes all values).
Or you can walk your way through items and call off the search early if you find what you’re looking for (resulting in possibly fewer requests):
>>> for item in MyIterator(...):
>>>   print item.name
>>>   if not item.is_valid:
>>>     break
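The whole paging loop can be seen end to end in a runnable sketch, with a stub connection standing in for real API calls. The stub and its canned responses are invented for illustration; the subclassing pattern is the one described for Iterator:

```python
class FakeConnection(object):
    """Stub returning two canned pages, linked by nextPageToken."""
    _PAGES = {
        None: {'items': [{'name': 'a'}, {'name': 'b'}],
               'nextPageToken': 'page-2'},
        'page-2': {'items': [{'name': 'c'}]},
    }

    def api_request(self, method, path, query_params=None):
        token = (query_params or {}).get('pageToken')
        return self._PAGES[token]


class NameIterator(object):
    """Minimal re-implementation of the paging pattern: keep requesting
    pages, passing the previous nextPageToken, until no token remains."""

    def __init__(self, connection, path):
        self.connection = connection
        self.path = path
        self.next_page_token = None
        self._started = False

    def __iter__(self):
        while not self._started or self.next_page_token is not None:
            self._started = True
            response = self.connection.api_request(
                method='GET', path=self.path,
                query_params={'pageToken': self.next_page_token})
            self.next_page_token = response.get('nextPageToken')
            for item in self.get_items_from_response(response):
                yield item

    def get_items_from_response(self, response):
        # The override point: turn raw JSON items into objects.
        for item in response.get('items', []):
            yield item['name']
```

Here `list(NameIterator(FakeConnection(), '/b/my-bucket/o'))` drains both pages with two requests, while breaking out of the loop early would avoid the request for the second page.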
- class gcloud.storage.iterator.Iterator(connection, path, extra_params=None)[source]#
- Bases: object
- A generic class for iterating through Cloud Storage list responses.
- Parameters:
  - connection (gcloud.storage.connection.Connection) – The connection to use to make requests.
  - path (string) – The path to query for the list of items.
 - PAGE_TOKEN = 'pageToken'#
 - RESERVED_PARAMS = frozenset(['pageToken'])#
 - get_items_from_response(response)[source]#
- Factory method called while iterating.
- This method should be overridden by a subclass. It should accept the API response of a request for the next page of items, and return a list (or other iterable) of items.
- Typically this method will construct a Bucket or a Blob from the page of results in the response.
- Parameters:
  - response (dict) – The response of asking for the next page of items.
- Return type: iterable
- Returns: Items that the iterator should yield.
 - get_next_page_response()[source]#
- Requests the next page from the path provided.
- Return type: dict
- Returns: The parsed JSON response of the next page’s contents.
 - get_query_params()[source]#
- Getter for query parameters for the next request.
- Return type: dict
- Returns: A dictionary of query parameters.