
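I am fairly new to Python programming. Until now I have either been reverse-engineering code written by previous developers or cobbling together some functions on my own.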

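The script itself works. Long story short, it is designed to parse a CSV and (a) create and/or update the contacts found in the CSV, and (b) assign those contacts to their associated companies, all using the HubSpot API. To achieve this, I am also importing requests and csvmapper.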

I have the following questions:

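  1. How can I improve this script to make it more Pythonic?
  2. What is the best way to make this script run on a remote server, bearing in mind that Requests and CSVMapper may not be installed there and that I most likely won't have permission to install them? Should I "package" this script somehow, or upload Requests and CSVMapper to the server?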

Any advice is greatly appreciated.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function
import sys, os.path, requests, json, csv, csvmapper, glob, shutil
from time import sleep
major, minor, micro, release_level, serial =  sys.version_info

# Client Portal ID
portal = "XXXXXX"

# Client API Key

hapikey = "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"

# This attempts to find any file in the directory that starts with "note" and ends with ".CSV"
# Server Version
# findCSV = glob.glob('/home/accountName/public_html/clientFolder/contact*.CSV')

# Local Testing Version
findCSV = glob.glob('contact*.CSV')

for i in findCSV:

    theCSV = i

    csvfileexists = os.path.isfile(theCSV)

    # Prints a confirmation if file exists, prints instructions if it doesn't.

    if csvfileexists:
        print ("\nThe \"{csvPath}\" file was found ({csvSize} bytes); proceeding with sync ...\n".format(csvSize=os.path.getsize(theCSV), csvPath=os.path.basename(theCSV)))
    else:
        print ("File not found; check the file name to make sure it is in the same directory as this script. Exiting ...")
        sys.exit()

    # Begin the CSVmapper mapping... This creates a virtual "header" row - the CSV therefore does not need a header row.

    mapper = csvmapper.DictMapper([
      [
        {'name':'account'}, #"Org. Code"
        {'name':'id'}, #"Hubspot Ref"
        {'name':'company'}, #"Company Name"
        {'name':'firstname'}, #"Contact First Name"
        {'name':'lastname'}, #"Contact Last Name"
        {'name':'job_title'}, #"Job Title"
        {'name':'address'}, #"Address"
        {'name':'city'}, #"City"
        {'name':'phone'}, #"Phone"
        {'name':'email'}, #"Email"
        {'name':'date_added'} #"Last Update"
      ]
    ])

    # Parse the CSV using the mapper
    parser = csvmapper.CSVParser(os.path.basename(theCSV), mapper)

    # Build the parsed object
    obj = parser.buildObject()

    def contactCompanyUpdate():

        # Open the CSV, use commas as delimiters, store it in a list called "data", then find the length of that list.
        with open(os.path.basename(theCSV),"r") as f:
            reader = csv.reader(f, delimiter = ",", quotechar="\"")
            data = list(reader)

            # For every row in the CSV ...
            for row in range(0, len(data)):
                # Set up the JSON payload ...

                payload = {
                            "properties": [
                                {
                                    "name": "account",
                                    "value": obj[row].account
                                },
                                {
                                    "name": "id",
                                    "value": obj[row].id
                                },
                                {
                                    "name": "company",
                                    "value": obj[row].company
                                },
                                {
                                    "property": "firstname",
                                    "value": obj[row].firstname
                                },
                                {
                                    "property": "lastname",
                                    "value": obj[row].lastname
                                },
                                {
                                    "property": "job_title",
                                    "value": obj[row].job_title
                                },
                                {
                                    "property": "address",
                                    "value": obj[row].address
                                },
                                {
                                    "property": "city",
                                    "value": obj[row].city
                                },
                                {
                                    "property": "phone",
                                    "value": obj[row].phone
                                },
                                {
                                    "property": "email",
                                    "value": obj[row].email
                                },
                                {
                                    "property": "date_added",
                                    "value": obj[row].date_added
                                }
                            ]
                        }

                nameQuery = "{first} {last}".format(first=obj[row].firstname, last=obj[row].lastname)

                # Get a list of all contacts for a certain company.
                contactCheck = "https://api.hubapi.com/contacts/v1/search/query?q={query}&hapikey={hapikey}".format(hapikey=hapikey, query=nameQuery)

                # Convert the payload to JSON and assign it to a variable called "data"
                data = json.dumps(payload)

                # Defined the headers content-type as 'application/json'
                headers = {'content-type': 'application/json'}

                contactExistCheck = requests.get(contactCheck, headers=headers)

                for i in contactExistCheck.json()[u'contacts']:

                    # ... Get the canonical VIDs
                    canonicalVid = i[u'canonical-vid']

                    if canonicalVid:
                        print ("{theContact} exists! Their VID is \"{vid}\"".format(theContact=obj[row].firstname, vid=canonicalVid))
                        print ("Attempting to update their company...")
                        contactCompanyUpdate = "https://api.hubapi.com/companies/v2/companies/{companyID}/contacts/{vid}?hapikey={hapikey}".format(hapikey=hapikey, vid=canonicalVid, companyID=obj[row].id)
                        doTheUpdate = requests.put(contactCompanyUpdate, headers=headers)
                        if doTheUpdate.status_code == 200:
                            print ("Attempt Successful! {theContact}'s has an updated company.\n".format(theContact=obj[row].firstname))
                            break
                        else:
                            print ("Attempt Failed. Status Code: {status}. Company or Contact not found.\n".format(status=doTheUpdate.status_code))

    def createOrUpdateClient():

        # Open the CSV, use commas as delimiters, store it in a list called "data", then find the length of that list.
        with open(os.path.basename(theCSV),"r") as f:
            reader = csv.reader(f, delimiter = ",", quotechar="\"")
            data = list(reader)

            # For every row in the CSV ...
            for row in range(0, len(data)):
                # Set up the JSON payload ...

                payloadTest = {
                            "properties": [
                                {
                                    "property": "email",
                                    "value": obj[row].email
                                },
                                {
                                    "property": "firstname",
                                    "value": obj[row].firstname
                                },
                                {
                                    "property": "lastname",
                                    "value": obj[row].lastname
                                },
                                {
                                    "property": "website",
                                    "value": None
                                },
                                {
                                    "property": "company",
                                    "value": obj[row].company
                                },
                                {
                                    "property": "phone",
                                    "value": obj[row].phone
                                },
                                {
                                    "property": "address",
                                    "value": obj[row].address
                                },
                                {
                                    "property": "city",
                                    "value": obj[row].city
                                },
                                {
                                    "property": "state",
                                    "value": None
                                },
                                {
                                    "property": "zip",
                                    "value": None
                                }
                            ]
                        }

                # Convert the payload to JSON and assign it to a variable called "data"
                dataTest = json.dumps(payloadTest)

                # Defined the headers content-type as 'application/json'
                headers = {'content-type': 'application/json'}

                #print ("{theContact} does not exist!".format(theContact=obj[row].firstname))
                print ("Attempting to add {theContact} as a contact...".format(theContact=obj[row].firstname))
                createOrUpdateURL = 'http://api.hubapi.com/contacts/v1/contact/createOrUpdate/email/{email}/?hapikey={hapikey}'.format(email=obj[row].email,hapikey=hapikey)

                r = requests.post(createOrUpdateURL, data=dataTest, headers=headers)

                if r.status_code == 409:
                    print ("This contact already exists.\n")
                elif (r.status_code == 200) or (r.status_code == 202):
                    print ("Success! {firstName} {lastName} has been added.\n".format(firstName=obj[row].firstname,lastName=obj[row].lastname, response=r.status_code))
                elif r.status_code == 204:
                    print ("Success! {firstName} {lastName} has been updated.\n".format(firstName=obj[row].firstname,lastName=obj[row].lastname, response=r.status_code))
                elif r.status_code == 400:
                    print ("Bad request. You might get this response if you pass an invalid email address, if a property in your request doesn't exist, or if you pass an invalid property value.\n")
                else:
                    print ("Contact Marko for assistance.\n")

    if __name__ == "__main__":
        # Run the Create or Update function
        createOrUpdateClient()

        # Give the previous function 5 seconds to take effect.
        sleep(5.0)

        # Run the Company Update function
        contactCompanyUpdate()
        print("Sync complete.")

        print("Moving \"{something}\" to the archive folder...".format(something=theCSV))

        # Cron version
        #shutil.move( i, "/home/accountName/public_html/clientFolder/archive/" + os.path.basename(i))

        # Local version
        movePath = "archive/{thefile}".format(thefile=theCSV)
        shutil.move( i, movePath )

        print("Move successful! Exiting...\n")

sys.exit()

1 Answer


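I'll just go from top to bottom. The first rule is: do what PEP 8 says. It's not the ultimate style guide, but it is certainly the reference baseline for Python coders, and that matters more, especially when you're just starting out. The second rule is: make it maintainable. A couple of years from now, when some other new kid shows up, it should be easy for her to figure out what you were doing. Sometimes that means doing things the long way, to reduce errors. Sometimes it means doing things the short way, to reduce errors. :-)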

#!/usr/bin/env python
# -*- coding: utf-8 -*-

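Two things. Per PEP 8, you got the encoding right.

The conventions for writing good documentation strings (a.k.a. "docstrings") are immortalized in PEP 257.

You have a program that does something. But you don't document what.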
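As a minimal sketch of what that could look like, here is a module docstring plus a function docstring in the PEP 257 style. The wording is guessed from the question's description, and sync_contacts is a hypothetical name, not something from the original script.

```python
"""Sync contacts from CSV files to HubSpot.

For every matching CSV file, create or update each contact found in it,
then associate each contact with its company.
"""


def sync_contacts(csv_name):
    """Sync the contacts listed in one CSV file.

    Hypothetical placeholder: the real body would call the HubSpot API.
    Returns the name of the file that was processed.
    """
    return csv_name
```

Anyone who opens the file, or runs help() on it, now learns what the program is for before reading a single line of logic.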

from __future__ import print_function
import sys, os.path, requests, json, csv, csvmapper, glob, shutil
from time import sleep
major, minor, micro, release_level, serial =  sys.version_info

Per PEP 8: put each import module statement on its own line.

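Per Austin: give each of your paragraphs its own topic. You have some imports sitting right next to some version information. Insert a blank line. Also, do something with that data! Or do you not need it here at all?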
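A rough sketch of both points: one import per line, a blank line between the import paragraph and the version paragraph, and the version data actually used for something. The minimum version (2, 7) and the helper name python_is_supported are assumptions made for illustration; only standard-library imports are shown, and the third-party ones (requests, csvmapper) would sit in their own group below them.

```python
import csv
import glob
import json
import os.path
import shutil
import sys

# Assumed minimum version, purely for illustration.
MINIMUM_PYTHON = (2, 7)


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) part of version_info is new enough."""
    return tuple(version_info[:2]) >= minimum
```

If the version information is never consulted like this, the simpler fix is to delete that line.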

# Client Portal ID
portal = "XXXXXX"

# Client API Key

hapikey = "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"

You've obscured these in more ways than one. WTF is a hapikey? I think you mean Hubspot_API_key. And what does portal do?

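A word of advice: the more "global" a thing is, the more "formal" it should be. If you have a for loop, you can call one of its variables i. If you have a piece of data used throughout a function, call it obj or portal. But if you have a piece of data that is used globally, or is a class variable, put it in a tie and jacket so that everyone recognizes it: use Hubspot_api_key rather than client_api_key. Maybe even Hubspot_client_api_key, if there is more than one API. Do the same for portal.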
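Something along these lines, as a sketch. The upper-case spelling follows PEP 8's convention for module-level constants, and api_key_param is a hypothetical helper; the short name hapikey survives only in the one place where the URLs in your script need it as a query parameter.

```python
# Module-level data gets a formal, descriptive name.
HUBSPOT_PORTAL_ID = "XXXXXX"
HUBSPOT_API_KEY = "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"


def api_key_param(api_key=HUBSPOT_API_KEY):
    """Return the query parameter that carries the API key.

    "hapikey" is the parameter name used in the script's URLs, so the
    terse spelling is confined to this single spot.
    """
    return {"hapikey": api_key}
```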

# This attempts to find any file in the directory that starts with "note" and ends with ".CSV"
# Server Version
# findCSV = glob.glob('/home/accountName/public_html/clientFolder/contact*.CSV')

It doesn't take long for comments to turn into lies. If they aren't true, delete them.

# Local Testing Version
findCSV = glob.glob('contact*.CSV')

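This is the kind of thing you should write a function for. Just create a simple function called "get_csv_files" or whatever, and have it return a list of file names. This decouples you from glob, which means you can make your test code data-driven (pass a list of file names to a function, or pass a single file to a function, rather than asking it to go search for them). Also, these glob patterns are exactly the sort of thing that shows up in a config file, in a global variable, or as a command-line argument.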
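A minimal sketch of such a function. The constant name CONTACT_CSV_PATTERN and the decision to sort the result are my own choices, not something your script requires.

```python
import glob

# The pattern from the script, lifted into a named constant.
CONTACT_CSV_PATTERN = "contact*.CSV"


def get_csv_files(pattern=CONTACT_CSV_PATTERN):
    """Return a sorted list of the file names that match pattern."""
    return sorted(glob.glob(pattern))
```

The rest of the program then works on whatever list of names it is handed, so a test can pass in a hand-written list and never touch the file system.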

for i in findCSV:

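I'll bet it's a pain to type CSV in capitals all the time. What does findCSV mean? Read that line and figure out what the variable should be called. Maybe csv_files? Or new_contact_files? Something that indicates there is a collection of things.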

    theCSV = i
    csvfileexists = os.path.isfile(theCSV)

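Now what does i do? You have this nice tiny variable name in a BiiiiiiG loop. That's a mistake, because if you can't see the entire scope of a variable on one page, it probably needs a slightly longer name. But then you create an alias for it. Both i and theCSV refer to the same thing. And... I don't see you using i again. So maybe your loop variable should be theCSV. Or maybe it should be the_csv, to make it easier to type. Or just csvname.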

    # Prints a confirmation if file exists, prints instructions if it doesn't.

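This seems a bit unnecessary. If you use glob to get the file names, they will pretty much all exist. (If they don't, it's because they were deleted between your call to glob and your attempt to open them. That's possible, but rare. Just continue or raise an exception, depending.)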

    if csvfileexists:
        print ("\nThe \"{csvPath}\" file was found ({csvSize} bytes); proceeding with sync ...\n".format(csvSize=os.path.getsize(theCSV), csvPath=os.path.basename(theCSV)))

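In this code you use the value of csvfileexists. But this is the only place you use it. In that case, you can probably move the call to os.path.isfile() into the if statement and get rid of the variable.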

    else:
        print ("File not found; check the file name to make sure it is in the same directory as this script. Exiting ...")
        sys.exit()

Notice that in this case, when there is an actual problem, you don't print the file name? How helpful is that?

Also, remember that you're going to be on a remote server? You should consider using Python's logging module to record these messages in a useful way.
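Here is a sketch of the same check done with logging, with the file name included in both messages. The logger name "contact_sync" and the function name check_csv_file are placeholders I made up.

```python
import logging
import os.path

# Hypothetical logger name; use whatever fits your project.
logger = logging.getLogger("contact_sync")


def check_csv_file(csv_name):
    """Log whether csv_name exists and return True if it does."""
    if not os.path.isfile(csv_name):
        # The file name is part of the message, so the log is actionable.
        logger.error("CSV file %r not found; skipping it", csv_name)
        return False
    logger.info("Found %r (%d bytes); proceeding with sync",
                csv_name, os.path.getsize(csv_name))
    return True
```

A single call such as logging.basicConfig(filename="sync.log", level=logging.INFO) at startup (the file name is again just an example) would send all of this to a file that you can read after a cron run.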

# Begin the CSVmapper mapping... This creates a virtual "header" row - the CSV therefore does not need a header row.

mapper = csvmapper.DictMapper([
  [
    {'name':'account'}, #"Org. Code"
    {'name':'id'}, #"Hubspot Ref"
    {'name':'company'}, #"Company Name"
    {'name':'firstname'}, #"Contact First Name"
    {'name':'lastname'}, #"Contact Last Name"
    {'name':'job_title'}, #"Job Title"
    {'name':'address'}, #"Address"
    {'name':'city'}, #"City"
    {'name':'phone'}, #"Phone"
    {'name':'email'}, #"Email"
    {'name':'date_added'} #"Last Update"
  ]
])

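You're creating an object out of a bunch of data. This would be a good place for a function. Define a make_csvmapper() function to do all of this for you, and move it out of line.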

Also, note that the standard csv module has most of the functionality you're using. I don't think you actually need csvmapper.
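For instance, csv.DictReader accepts a fieldnames argument, which gives you the same "virtual header row" without the extra dependency. A sketch, assuming the column order shown in your mapper; read_contacts and CONTACT_FIELDS are names I invented.

```python
import csv

# Column order copied from the mapper in the question.
CONTACT_FIELDS = [
    "account", "id", "company", "firstname", "lastname", "job_title",
    "address", "city", "phone", "email", "date_added",
]


def read_contacts(csv_file):
    """Return one dict per row of a header-less contact CSV file object."""
    return list(csv.DictReader(csv_file, fieldnames=CONTACT_FIELDS))
```

Each row is then a mapping, so row["email"] replaces obj[row].email, and one fewer package has to be installed on the server.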

# Parse the CSV using the mapper
parser = csvmapper.CSVParser(os.path.basename(theCSV), mapper)

# Build the parsed object
obj = parser.buildObject()

This is another opportunity for a function. Maybe instead of making a csv mapper, you could just return obj?

def contactCompanyUpdate():

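At this point things get suspicious. You have these function definitions indented, but I don't think you need them to be. Is this a Stack Overflow formatting issue, or does your code really look like this?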

    # Open the CSV, use commas as delimiters, store it in a list called "data", then find the length of that list.

    with open(os.path.basename(theCSV),"r") as f:

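No, apparently it really does look like this, because you're using theCSV inside this function when you really don't need to. Please consider using formal function parameters rather than just grabbing objects from the outer scope. Also, why are you using basename on the csv file? If you got it from glob, doesn't it already have the path you want?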

        reader = csv.reader(f, delimiter = ",", quotechar="\"")
        data = list(reader)

        # For every row in the CSV ...
        for row in range(0, len(data)):

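Here you force data to be a list of the rows obtained from reader, and then start iterating over it. Just iterate over reader directly, e.g.: for row in reader: But wait! You're actually iterating over the already-parsed CSV file in your obj variable. Just pick one, and iterate over it. You don't need to open the file twice for this.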
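A sketch of the "pick one and iterate over it" version, with the file name passed in as a formal parameter instead of being grabbed from the outer scope. It assumes Python 3 (the newline="" argument to open() is not available in Python 2), and iter_contact_rows is a made-up name.

```python
import csv


def iter_contact_rows(csv_path):
    """Yield each row of the CSV file at csv_path as a list of strings."""
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(csv_path, newline="") as csv_file:
        for row in csv.reader(csv_file, delimiter=",", quotechar='"'):
            yield row
```

The caller writes for row in iter_contact_rows(csvname): and the file is opened exactly once, with no range(len(...)) bookkeeping.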

            # Set up the JSON payload ...
            payload = {
                        "properties": [
                            {
                                "name": "account",
                                "value": obj[row].account
                            },
                            {
                                "name": "id",
                                "value": obj[row].id
                            },
                            {
                                "name": "company",
                                "value": obj[row].company
                            },
                            {
                                "property": "firstname",
                                "value": obj[row].firstname
                            },
                            {
                                "property": "lastname",
                                "value": obj[row].lastname
                            },
                            {
                                "property": "job_title",
                                "value": obj[row].job_title
                            },
                            {
                                "property": "address",
                                "value": obj[row].address
                            },
                            {
                                "property": "city",
                                "value": obj[row].city
                            },
                            {
                                "property": "phone",
                                "value": obj[row].phone
                            },
                            {
                                "property": "email",
                                "value": obj[row].email
                            },
                            {
                                "property": "date_added",
                                "value": obj[row].date_added
                            }
                        ]
                    }

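Well, that's a long stretch of code that doesn't do much. At the very least, tighten those inner dicts up to one line each. But better yet, write a function that creates your dictionary in the format you want. You can use getattr to pull the data out of obj by name.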
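Roughly like this. The sketch uses the "property" key for every entry, whereas your first three entries use "name"; I am assuming that was unintentional, so check it against the API documentation. make_properties_payload is a hypothetical name.

```python
def make_properties_payload(record, property_names):
    """Build a {"properties": [...]} dict from attributes of record.

    Assumes every entry uses the "property" key; the original payload
    mixes "name" and "property", which may or may not be intended.
    """
    return {
        "properties": [
            {"property": name, "value": getattr(record, name)}
            for name in property_names
        ]
    }
```

The 45-line literal then shrinks to one call, for example make_properties_payload(obj[row], ["email", "firstname", "lastname"]), and adding a field means adding one string to a list.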

            nameQuery = "{first} {last}".format(first=obj[row].firstname, last=obj[row].lastname)

            # Get a list of all contacts for a certain company.
            contactCheck = "https://api.hubapi.com/contacts/v1/search/query?q={query}&hapikey={hapikey}".format(hapikey=hapikey, query=nameQuery)
            # Convert the payload to JSON and assign it to a variable called "data"
            data = json.dumps(payload)

            # Defined the headers content-type as 'application/json'
            headers = {'content-type': 'application/json'}

            contactExistCheck = requests.get(contactCheck, headers=headers)

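Here you are encoding details of the API into your code. Consider pulling them out into functions. (That way you can come back later and build a module out of them, for reuse in your next program.) Also, watch out for comments that don't actually tell you anything. And feel free to group these lines into a single paragraph, since they all serve the same key purpose: making the API call.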
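One way to pull the URL-building detail out, as a sketch. The endpoint path is the one from your script; the function and constant names are invented, and urlencode (Python 3's urllib.parse) is swapped in for the hand-rolled format string so that names containing spaces or ampersands are escaped properly.

```python
from urllib.parse import urlencode

HUBSPOT_API_BASE = "https://api.hubapi.com"


def contact_search_url(query, api_key):
    """Return the contact-search URL for query, with parameters escaped."""
    params = urlencode({"q": query, "hapikey": api_key})
    return "{base}/contacts/v1/search/query?{params}".format(
        base=HUBSPOT_API_BASE, params=params)
```

The loop body then reads requests.get(contact_search_url(nameQuery, hapikey), headers=headers), and the URL layout lives in exactly one place.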

            for i in contactExistCheck.json()[u'contacts']:

                # ... Get the canonical VIDs
                canonicalVid = i[u'canonical-vid']

                if canonicalVid:
                    print ("{theContact} exists! Their VID is \"{vid}\"".format(theContact=obj[row].firstname, vid=canonicalVid))
                    print ("Attempting to update their company...")
                    contactCompanyUpdate = "https://api.hubapi.com/companies/v2/companies/{companyID}/contacts/{vid}?hapikey={hapikey}".format(hapikey=hapikey, vid=canonicalVid, companyID=obj[row].id)
                    doTheUpdate = requests.put(contactCompanyUpdate, headers=headers)
                    if doTheUpdate.status_code == 200:
                        print ("Attempt Successful! {theContact}'s has an updated company.\n".format(theContact=obj[row].firstname))
                        break
                    else:
                        print ("Attempt Failed. Status Code: {status}. Company or Contact not found.\n".format(status=doTheUpdate.status_code))

I'm not sure whether that last bit should be an exception. Is "Attempt Failed" normal behavior, or does it mean something is broken?

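Either way, look at the API you're using. I'll bet there is more information available for minor failures. (Major failures would be things like the internet being down or their server being offline.) For example, they might provide an "error" or "errors" field in the returned JSON. These should be logged or printed along with your failure message.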
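A sketch of how that could look. The field names "message", "error" and "errors" are guesses on my part, not confirmed parts of the HubSpot response format, and describe_failure is a made-up helper that takes the status code and the raw response text.

```python
import json


def describe_failure(status_code, body_text):
    """Build a failure message that includes any error details in the body."""
    try:
        details = json.loads(body_text)
    except ValueError:
        # The body was not JSON at all (for example an HTML error page).
        details = None
    if isinstance(details, dict):
        # Guessed field names; check the API documentation for the real ones.
        for key in ("message", "error", "errors"):
            if key in details:
                return "Attempt failed (status {0}): {1}".format(
                    status_code, details[key])
    return "Attempt failed (status {0}); no further details in response".format(
        status_code)
```

With requests you would call it as describe_failure(doTheUpdate.status_code, doTheUpdate.text) and log or print the result.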

def createOrUpdateClient():

For the most part, this function has the same problems as the previous one.

            else:
                print ("Contact Marko for assistance.\n")

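Except here. Never put your name in a place like this. Otherwise, 10 years from now, you'll still be getting phone calls about this code. Put in your department name ("IT Operations") or a support number. The people who need to know already know. And the people who don't need to know can just notify the people who do.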

if __name__ == "__main__":
    # Run the Create or Update function
    createOrUpdateClient()

    # Give the previous function 5 seconds to take effect.
    sleep(5.0)

    # Run the Company Update function
    contactCompanyUpdate()
    print("Sync complete.")

    print("Moving \"{something}\" to the archive folder...".format(something=theCSV))

    # Cron version
    #shutil.move( i, "/home/accountName/public_html/clientFolder/archive/" + os.path.basename(i))

    # Local version
    movePath = "archive/{thefile}".format(thefile=theCSV)
    shutil.move( i, movePath )

    print("Move successful! Exiting...\n")

This is awkward. You might consider taking some command-line arguments and using them to determine your behavior.
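For example, the "Cron version" and "Local version" comments could become options instead of lines you comment in and out. A sketch with argparse; the option names --pattern and --archive-dir are invented for illustration, and their defaults are the local values from your script.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Sync contacts from CSV files to HubSpot.")
    parser.add_argument("--pattern", default="contact*.CSV",
                        help="glob pattern for the CSV files to process")
    parser.add_argument("--archive-dir", default="archive",
                        help="directory that processed files are moved to")
    return parser.parse_args(argv)
```

The cron job would then pass --pattern and --archive-dir with the server paths, while local testing runs the script with no arguments at all.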

sys.exit()

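Don't do this. Never put exit() at module scope, because it means nobody can import this code. Maybe someone wants to import it to parse the docstrings. Or maybe they want to borrow some of the API functions you wrote. Too bad! A sys.exit() at module scope means always having to say "Oh, sorry" to anyone who tries. Put it at the bottom of your actual code, inside the if __name__ == "__main__": block. Or, since you aren't actually passing a value, just remove it entirely.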

Answered 2016-02-29T02:04:22.907