I am using IBM Watson Studio to set up a Jupyter Notebook for a machine learning project, and whenever I try to add data from my PostgreSQL database table I keep getting "TypeError: ... is not JSON serializable".

Full error output:

TypeError                                 Traceback (most recent call last)
<ipython-input-16-e72fac39b809> in <module>()
      1 classes = natural_language_classifier.classify('998520s521-nlc-1398', data_df_1.to_json())
----> 2 print(json.dumps(classes, indent=2))

/opt/conda/envs/DSX-Python35/lib/python3.5/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    235         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    236         separators=separators, default=default, sort_keys=sort_keys,
--> 237         **kw).encode(obj)
    238 
    239 

/opt/conda/envs/DSX-Python35/lib/python3.5/json/encoder.py in encode(self, o)
    198         chunks = self.iterencode(o, _one_shot=True)
    199         if not isinstance(chunks, (list, tuple)):
--> 200             chunks = list(chunks)
    201         return ''.join(chunks)
    202 

/opt/conda/envs/DSX-Python35/lib/python3.5/json/encoder.py in _iterencode(o, _current_indent_level)
    434                     raise ValueError("Circular reference detected")
    435                 markers[markerid] = o
--> 436             o = _default(o)
    437             yield from _iterencode(o, _current_indent_level)
    438             if markers is not None:

/opt/conda/envs/DSX-Python35/lib/python3.5/json/encoder.py in default(self, o)
    177 
    178         """
--> 179         raise TypeError(repr(o) + " is not JSON serializable")
    180 
    181     def encode(self, o):

TypeError: <watson_developer_cloud.watson_service.DetailedResponse object at 0x7f64ee350240> is not JSON serializable

Here is the Python code I use in the notebook to deploy the AI model that analyzes this data:

import json

from watson_developer_cloud import NaturalLanguageClassifierV1
import pandas as pd
import psycopg2


# Connecting to my database.
conn_string = 'host={} port={}  dbname={}  user={}  password={}'.format('159.***.20.***', 5432, 'searchdb', 'lcq09', 'Mys3cr3tPass')
conn_cbedce9523454e8e9fd3fb55d4c1a52e = psycopg2.connect(conn_string)
data_df_1 = pd.read_sql('SELECT description from public."search_product"', con=conn_cbedce9523454e8e9fd3fb55d4c1a52e)

# Connecting to the ML model.
natural_language_classifier = NaturalLanguageClassifierV1(
    iam_apikey='TB97dFv8Dgug6rfi945F3***************'
)

# Apply the ML model to the database data.
classes = natural_language_classifier.classify('9841d0z5a1-ncc-9076', data_df_1.to_json())
print(json.dumps(classes, indent=2))

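I ran print(data_df_1.to_json()) to make sure the output is well-formed JSON, and it is, as shown below. (P.S. the data below is random Lorem sentences, but after testing it will be product descriptions.)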

{"description":{"0":"Lorem ipsum sjvh  hcx bftiyf,  hufcil, igfgvjuoigv gvj ifcil ,ghn fgbcggtc   yfctgg h vgchbvju.","1":"Lorem ajjgvc wiufcfboitf iujcvbnb hjnkjc  ivjhn oikgjvn uhnhgv 09iuvhb  oiuvh boiuhb mkjhv mkiuhygv m,khbgv mkjhgv mkjhgv.","2":"Lorem aiv ibveikb jvk igvcib ok blnb v  hb b hb bnjb bhb bhn bn vf vbgfc vbgv nbhgv bb nb nbh nj mjhbv mkjhbv nmjhgbv nmkn","3":"Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx","4":"Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx","5":"Lorem jsvc smc cbd ciecdbbc d vd bcvdvbj obcvb vcibs j dvx"}}

Also, I can classify a single sentence with the code below, but I want to classify the whole description table from the database:

classes = natural_language_classifier.classify('998260x551-nlc-1018', 'How hot will it be today?')
print(json.dumps(classes.result, indent=2))

That is why I pass the DataFrame named data_df_1 instead.

But when I do it as described above, I get a TypeError.

So what should I do to fix this error?


1 Answer


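Your problem is that there is a watson_developer_cloud.watson_service.DetailedResponse object (in your traceback, the classes value passed to json.dumps) that Python's json serializer module does not know how to handle.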
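The failure can be reproduced without the Watson SDK. The sketch below is only an illustration: FakeDetailedResponse is a hypothetical stand-in written for this example, not the real SDK class, and it assumes (as the single-sentence example in the question suggests) that the decoded body is exposed as a plain dict on a result attribute.

```python
import json


class FakeDetailedResponse:
    """Hypothetical stand-in for the SDK's DetailedResponse (not the real class)."""

    def __init__(self, result):
        self.result = result  # the decoded response body, a plain dict


response = FakeDetailedResponse({"top_class": "temperature"})

# json.dumps only knows built-in types (dict, list, str, numbers, bool, None),
# so passing the wrapper object itself raises TypeError.
try:
    json.dumps(response, indent=2)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# The wrapped dict consists of built-in types, so it serializes without trouble.
serialized = json.dumps(response.result, indent=2)
print(serialized)
```

The exact wording of the TypeError message differs between Python versions, but in each case the cause is the same: the wrapper object, not the data inside it, is what json.dumps rejects.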

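Looking at the API, it looks like you can either call the detailed_response._to_dict() instance method (discouraged, since it is a private method) or call the detailed_response.get_result() method to get a dictionary holding the data from the object. Your own single-sentence example does the equivalent by reading classes.result.

Ideally, you apply one of those two methods to every row of the dataframe column that contains such an object, so that the column holds plain dictionaries. After that, .to_json should no longer raise a TypeError for that column.

col = 'column_with_unserializable_type'
# Call the method (note the parentheses) so each cell holds the result dict, not a bound method.
data_df_1[col] = data_df_1[col].map(lambda x: x.get_result())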
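For the stated goal of classifying every description in the table, one possible approach is to classify the texts one at a time and keep only the plain dict from each response. This is a rough sketch under assumptions, not the SDK's documented workflow: StubResponse and classify_text are hypothetical placeholders standing in for the real response object and for natural_language_classifier.classify(classifier_id, text), and table_json is a shortened stand-in for the output of data_df_1.to_json().

```python
import json


class StubResponse:
    """Hypothetical stand-in for the SDK's DetailedResponse; only mimics `.result`."""

    def __init__(self, result):
        self.result = result


def classify_text(text):
    """Hypothetical placeholder for natural_language_classifier.classify(classifier_id, text)."""
    return StubResponse({"text": text, "top_class": "placeholder"})


# Same shape as the output of data_df_1.to_json() shown in the question.
table_json = '{"description": {"0": "Lorem ipsum one.", "1": "Lorem ipsum two."}}'
descriptions = json.loads(table_json)["description"]

# Classify one description at a time, keeping only the plain dict from each response.
results = {row_id: classify_text(text).result for row_id, text in descriptions.items()}

output = json.dumps(results, indent=2)
print(output)
```

With the real DataFrame you could iterate over data_df_1['description'] directly instead of round-tripping through JSON. The point of the sketch is that each call receives a single text and that only the result dicts, never the response wrappers, reach json.dumps.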
answered 2019-05-22T19:34:26.573