google-api - Google Drive API: listing files that have no parents

Question

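The files in the Google domain I administer have gotten into a bad state; there are thousands of files sitting at the root. I want to identify these files and move them into a folder under "My Drive".

When I use the API to list the parents of one of these orphaned files, the result is an empty array. To determine whether a file is orphaned, I can iterate over every file in the domain and request the list of parents for each one. If the list is empty, I know the file is orphaned.

But this is very slow.

Is there any way to use the Drive API to search for files that have no parents?

The "parents" field of the q parameter doesn't seem useful for this, because it can only specify that the parents list contains some ID.

Update:

I'm trying to find a fast way to locate items that are truly at the root of the document hierarchy. That is, they are siblings of "My Drive", not children of "My Drive".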

score 6 · Accepted Answer

In Java:

// Uses the Drive API v2 Java client (getItems and getTitle are v2 accessors).
// Assumes `drive` is an already-authorized Drive service instance.
Files.List request = drive.files().list();
request.setQ("'root' in parents");

// Fetch the first page of results.
FileList files = request.execute();

for (com.google.api.services.drive.model.File element : files.getItems()) {
    System.out.println(element.getTitle());
}

'root' is the parent folder when a file or folder sits directly in the root directory.
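For comparison, here is a minimal Python sketch of the same query against Drive API v3, where the file title field is called `name`. The helper names `parent_query` and `list_children` are made up for illustration, and `service` is assumed to be an authorized client built with `googleapiclient.discovery.build('drive', 'v3', ...)`. Only the first page of results is read.

```python
def parent_query(folder_id="root"):
    """Build a Drive `q` expression matching items whose parents include folder_id."""
    # Drive query strings require backslashes and single quotes to be escaped.
    escaped = folder_id.replace("\\", "\\\\").replace("'", "\\'")
    return "'{0}' in parents".format(escaped)


def list_children(service, folder_id="root"):
    """Return the names of items directly inside folder_id (first page only)."""
    response = service.files().list(
        q=parent_query(folder_id),
        fields="files(id, name)",
    ).execute()
    return [f["name"] for f in response.get("files", [])]
```

Note that this finds children of the root folder, which is not the same thing as files with an empty parents list.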

score 1 · Accepted Answer

Crude, but simple, and it works.

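    // Assumes `drive` is an authorized Drive API v2 client.
    Files.List request = drive.files().list();
    List<File> orphans = new ArrayList<File>();

    do {
        try {
            FileList files = request.execute();

            for (File f : files.getItems()) {
                // A file with no parent references is an orphan.
                if (f.getParents() == null || f.getParents().isEmpty()) {
                    System.out.println("Orphan found:\t" + f.getTitle());
                    orphans.add(f);
                }
            }

            request.setPageToken(files.getNextPageToken());
        } catch (IOException e) {
            System.out.println("An error occurred: " + e);
            // Clear the token so the loop stops after an error.
            request.setPageToken(null);
        }
    } while (request.getPageToken() != null
            && request.getPageToken().length() > 0);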
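The same paging loop could be sketched in Python along these lines. `iter_orphans` and its `list_page` argument are hypothetical names: `list_page` is any callable that takes a page token (or None) and returns one files.list response as a dict, which keeps the loop independent of the client library.

```python
def iter_orphans(list_page):
    """Yield files that have no parents, following nextPageToken until it runs out."""
    token = None
    while True:
        page = list_page(token)
        for f in page.get("files", []):
            # A missing or empty 'parents' list marks the file as an orphan.
            if not f.get("parents"):
                yield f
        token = page.get("nextPageToken")
        if not token:
            break
```

With an authorized v3 `service` object (an assumption here), `list_page` could be `lambda token: service.files().list(pageToken=token, fields="nextPageToken, files(id, name, parents)").execute()`. Asking for `parents` in the list call avoids a separate parents request per file.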

score 0 · Accepted Answer

Try using this in your query:

'root' in parents

Answered 2016-10-28T15:26:03.977

score 0 · Accepted Answer

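The premise is:

List all files.
If a file has no "parents" field, it is an orphaned file.
The script therefore deletes it.

Before you start, you need to:

Create an OAuth ID.
Then add the "../auth/drive" permission to your OAuth ID and verify your application with Google, so that you have delete permission.

Copy-paste-ready demo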

from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive']

def callback(request_id, response, exception):
    if exception:
        print("Exception:", exception)

def main():
    """
    Description:
    Shows basic usage of the Drive v3 API to delete orphan files.
    """

    """ --- CHECK CREDENTIALS --- """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    """ --- OPEN CONNECTION --- """
    service = build('drive', 'v3', credentials=creds)

    page_token = None  # None requests the first page
    files = None
    orphans = []
    page_size = 100
    batch_counter = 0

    print("LISTING ORPHAN FILES")
    print("-----------------------------")
    while (True):
        # List
        r = service.files().list(pageToken=page_token,
                                 pageSize=page_size,
                                 fields="nextPageToken, files"
                                 ).execute()
        page_token = r.get('nextPageToken')
        files = r.get('files', [])

        # Filter orphans
        # NOTE: a file whose 'parents' field is missing or empty is an orphan.
        for file in files:
            if file.get('parents'):
                print("File with a parent found.")
            else:
                print("Orphan file found.")
                orphans.append(file['id'])

        # Exit condition
        if page_token is None:
            break

    print("DELETING ORPHAN FILES")
    print("-----------------------------")
    while len(orphans) > 0:
        # Recompute the size for every batch; a fixed size would raise an
        # IndexError when the last batch holds fewer than 100 files.
        batch_size = min(len(orphans), 100)
        batch = service.new_batch_http_request(callback=callback)
        for i in range(batch_size):
            print("File with id {0} queued for deletion.".format(orphans[0]))
            batch.add(service.files().delete(fileId=orphans[0]))
            del orphans[0]
        batch.execute()
        batch_counter += 1
        print("BATCH {0} DELETED - {1} FILES DELETED".format(batch_counter,
                                                             batch_size))


if __name__ == '__main__':
    main()

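This method does not delete files in the root directory, because their 'parents' field contains the root folder. If not all of your orphaned files are listed, it means Google is already deleting them automatically. That process can take up to 24 hours.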
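Since the question asks about moving the orphans rather than deleting them, here is a hedged sketch of that variant. It assumes the v3 `files().update` call with its `addParents` parameter will accept a file that currently has no parent; `build_move_requests` and `folder_id` are illustrative names, not part of the script above.

```python
def build_move_requests(service, orphan_ids, folder_id):
    """Build (but do not execute) one files.update request per orphan.

    addParents attaches each file to folder_id; nothing is deleted.
    """
    return [
        service.files().update(fileId=file_id,
                               addParents=folder_id,
                               fields="id, parents")
        for file_id in orphan_ids
    ]
```

Each returned request can be run with `.execute()`, or queued with `batch.add(...)` in the same way as the delete calls in the script.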

score 0 · Accepted Answer

The documentation suggests using the following query: is:unorganized owner:me.

Answered 2018-02-24T01:36:38.403

score 0 · Accepted Answer

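Adreian Lopez, thank you for your script. It really saved me a lot of manual work. Here are the steps I followed to run it:

1. Created the folder c:\temp\pythonscript\
2. Created an OAuth 2.0 client ID at https://console.cloud.google.com/apis/credentials and downloaded the credentials file into the c:\temp\pythonscript\ folder.
3. Renamed that file from client_secret_#######-#############.apps.googleusercontent.com.json to credentials.json
4. Copied Adreian Lopez's Python script and saved it as c:\temp\pythonscript\deleteGoogleDriveOrphanFiles.py
5. Went to the "Microsoft Store" on Windows 10 and installed Python 3.8
6. Opened a command prompt and typed: cd c:\temp\pythonscript\
7. Ran pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
8. Ran python deleteGoogleDriveOrphanFiles.py and followed the on-screen steps, which create the c:\temp\pythonscript\token.pickle file and start deleting the orphaned files. This step can take quite a long time.
9. Verified the result at https://one.google.com/u/1/storage
10. Ran step 8 again as needed.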
