
Running the following command produces this error:

user = sc.cassandraTable("DB NAME", "TABLE NAME").toDF()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/src/spark/spark-1.4.1/python/pyspark/sql/context.py", line 60, in toDF
    return sqlContext.createDataFrame(self, schema, sampleRatio)
  File "/usr/local/src/spark/spark-1.4.1/python/pyspark/sql/context.py", line 333, in createDataFrame
    schema = self._inferSchema(rdd, samplingRatio)
  File "/usr/local/src/spark/spark-1.4.1/python/pyspark/sql/context.py", line 220, in _inferSchema
    raise ValueError("Some of types cannot be determined by the "
ValueError: Some of types cannot be determined by the first 100 rows, please try again with sampling
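For context, this ValueError comes from Spark's Python-side schema inference, which samples only the first rows of the RDD: if a column is null in every sampled row, its type cannot be determined. A toy pure-Python sketch of that failure mode (the `infer_type` helper is illustrative only, not Spark's actual code):

```python
def infer_type(values):
    """Return the type of the first non-None value, or None if all values are None."""
    for v in values:
        if v is not None:
            return type(v)
    return None

# A column that is null throughout the sampled window defeats inference:
rows = [None] * 100 + ["hello"]
sample = rows[:100]            # inference inspects only the sample
print(infer_type(sample))      # None -> this is where Spark raises the ValueError
print(infer_type(rows))        # <class 'str'> -> the type existed past the sample
```

This is why the error message suggests raising the sampling ratio: a larger sample is more likely to hit a non-null value for every column.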

1 Answer


Load the data directly into a DataFrame; this also bypasses any Python-level code for inferring the types.

sqlContext.read.format("org.apache.spark.sql.cassandra").options(keyspace="ks", table="tb").load()
answered 2015-08-26T20:08:44.460