
I'm working on a module that saves and restores TensorFlow checkpoints to Google Cloud Storage from Colaboratory (see: https://github.com/mixuala/colab_utils). My code works from a notebook cell using IPython magics and shell commands, but I discovered that you cannot import these methods from a Python module (d'oh!). So now I'm trying to convert the code to native Python.

How do I get stdout from `get_ipython().system_raw()`? I want to get the same value as:

# ipython shell command
!gsutil ls $bucket_path

I tried to use get_ipython().system_raw() but I am not getting a value from stdout.

  bucket = "my-bucket"
  bucket_path = "gs://{}/".format(bucket)
  retval = get_ipython().system_raw("gsutil ls {}".format(bucket_path))
  print(bucket_path, retval)
  # BUG: get_ipython().system_raw() returns None, so
  #     retval != the output of `!gsutil ls $bucket_path`
  # and the check below fails on retval[0]
  if "BucketNotFoundException" in retval[0]:
    raise ValueError("ERROR: GCS bucket not found, path={}".format(bucket_path))

Is there a better way to do this?

[SOLVED]

Here is a better way to do it, based on the answer below:

from google.cloud import storage

def gsutil_ls(bucket_name, project_id):
  # list every object in the bucket as a full gs:// path
  client = storage.Client(project=project_id)
  bucket_path = "gs://{}/".format(bucket_name)

  bucket = client.get_bucket(bucket_name)
  files = ["{}{}".format(bucket_path, f.name) for f in bucket.list_blobs()]
  return files


bucket_name = "my-bucket"
gsutil_ls(bucket_name, "my-project")
# same as `!gsutil ls "gs://my-bucket/" -p "my-project"`

2 Answers


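Found it. getoutput() runs the command and returns its output; with split=True the result is a list of lines: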

result = get_ipython().getoutput(cmd, split=True)

see: https://github.com/ipython/ipython/blob/master/IPython/core/interactiveshell.py
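
For code that is imported from a module, where get_ipython() may not be available, a similar result can probably be had with the standard library alone. The sketch below is not taken from the IPython source: run_and_capture is a hypothetical helper name, and folding stderr into stdout is an assumption, made so that error text such as BucketNotFoundException ends up in the returned lines.

```python
import shlex
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command without a shell and return its output as a list of lines.

    Hypothetical helper: `cmd` may be a string (split with shlex)
    or an already-split list of arguments.
    """
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: fold error messages into the output
        universal_newlines=True,   # decode bytes to str
    )
    return completed.stdout.splitlines()


# Demo with the current Python interpreter, so it does not depend on gsutil.
lines = run_and_capture([sys.executable, "-c", "print('hello')"])
print(lines)
```

For the case in the question, the call would be run_and_capture("gsutil ls {}".format(bucket_path)), assuming gsutil is on the PATH.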

Answered 2018-02-11T05:33:43.597

I recommend using the Google Cloud Python Client Libraries for Cloud Storage. These libraries are used to interact with Google Cloud Platform services and are available for several programming languages. Detailed documentation for the Cloud Storage client library is available online, but I also wrote a small sample for you that returns the same content as the gsutil ls <YOUR_BUCKET> command you are trying to work with.

from google.cloud import storage

client = storage.Client()
bucket_name = "<YOUR_BUCKET_NAME>"
bucket_path = "gs://{}/".format(bucket_name)

bucket = client.get_bucket(bucket_name)
blobs = list(bucket.list_blobs())
for blob in blobs:
    print("{}{}".format(bucket_path,blob.name))

The output of running this code is the following:

gs://<YOUR_BUCKET_NAME>/file_1.png
gs://<YOUR_BUCKET_NAME>/file_2.png
gs://<YOUR_BUCKET_NAME>/file_3.png

This matches the output of gsutil ls <YOUR_BUCKET> (for a bucket with no nested prefixes; list_blobs() returns every object, while gsutil ls stops at the top level), so you can build on it. In any case, I strongly recommend the Cloud Storage client libraries: all (or most) functionality is available through them, and they make API calls from a script much easier.
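
One way to develop from that point is to restore the bucket-not-found check from the question. The following is only a sketch: list_gcs_paths is a hypothetical helper, and client is assumed to be a storage.Client instance created by the caller (for example storage.Client(project=project_id)). It relies on Client.lookup_bucket, which returns None when the bucket does not exist, instead of searching gsutil output for BucketNotFoundException.

```python
def list_gcs_paths(client, bucket_name, prefix=None):
    """Return gs:// paths for the objects in a bucket.

    `client` is assumed to be a google.cloud.storage.Client
    created by the caller; nothing is imported here.
    """
    bucket = client.lookup_bucket(bucket_name)  # None if the bucket is missing
    if bucket is None:
        raise ValueError(
            "ERROR: GCS bucket not found, path=gs://{}/".format(bucket_name)
        )
    # prefix=None lists every object; a prefix such as "checkpoints/" narrows it
    return [
        "gs://{}/{}".format(bucket_name, blob.name)
        for blob in bucket.list_blobs(prefix=prefix)
    ]
```

Passing the client in, rather than creating it inside the function, keeps the credentials and project choice with the caller. Other failures, such as missing permissions on the bucket, would still surface as exceptions from the library.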

Answered 2018-02-05T14:47:20.433