
The following Stack Overflow question, "How to run script in Pyspark and drop into IPython shell when done?", explains how to launch a pyspark script:

 %run -d myscript.py

But how do we access the existing Spark context?

Simply creating a new one does not work:

 ---->  sc = SparkContext("local", 1)

 ValueError: Cannot run multiple SparkContexts at once; existing 
 SparkContext(app=PySparkShell, master=local) created by <module> at 
 /Library/Python/2.7/site-packages/IPython/utils/py3compat.py:204

So the existing one has to be used. But where is it?

In [50]: for s in filter(lambda x: 'SparkContext' in repr(x[1]) and len(repr(x[1])) < 150, locals().iteritems()):
    print s
('SparkContext', <class 'pyspark.context.SparkContext'>)

That is, no variable holds a SparkContext instance.
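
The search above is written for Python 2 (`iteritems`, the `print` statement). A Python 3 sketch of the same idea, restricted to instances so that the class itself is skipped, might look like the following. The function name is hypothetical, and it matches on the class name only so that it runs without pyspark installed; with pyspark available, `isinstance(value, SparkContext)` would be the more robust check.

```python
def names_bound_to_instances(namespace, class_name="SparkContext"):
    """List names in `namespace` whose value is an instance of the named class."""
    return sorted(
        name
        for name, value in namespace.items()
        # Skip classes such as pyspark.context.SparkContext itself.
        if not isinstance(value, type) and type(value).__name__ == class_name
    )
```

In an interactive session it could be called as `names_bound_to_instances(dict(globals()))`; an empty list means no name in that namespace is bound to a context.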


4 Answers


Add the following import:

from pyspark.context import SparkContext

Then call the static method getOrCreate on SparkContext:

sc = SparkContext.getOrCreate()
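
A minimal sketch of how this behaves, assuming pyspark is installed and a JVM is available. The helper names `shared_context` and `is_same_context` are hypothetical. Because `getOrCreate()` hands back the already-running context when there is one, two calls are expected to return the same object instead of raising the ValueError shown in the question.

```python
def shared_context():
    """Return the running SparkContext, creating one only if none exists."""
    # Imported lazily so this module can be loaded without Spark present.
    from pyspark.context import SparkContext

    return SparkContext.getOrCreate()


def is_same_context():
    """Check whether two successive calls return the identical context object."""
    return shared_context() is shared_context()
```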
Answered 2017-01-01T07:32:46.200

A standalone Python script for word count, using contextmanager to write a reusable Spark context:

"""SimpleApp.py"""
from contextlib import contextmanager
from pyspark import SparkContext
from pyspark import SparkConf


SPARK_MASTER='local'
SPARK_APP_NAME='Word Count'
SPARK_EXECUTOR_MEMORY='200m'

@contextmanager
def spark_manager():
    conf = SparkConf().setMaster(SPARK_MASTER) \
                      .setAppName(SPARK_APP_NAME) \
                      .set("spark.executor.memory", SPARK_EXECUTOR_MEMORY)
    spark_context = SparkContext(conf=conf)

    try:
        yield spark_context
    finally:
        spark_context.stop()

with spark_manager() as context:
    input_path = "/home/ramisetty/sparkex/README.md"  # Should be some file on your system
    lines = context.textFile(input_path)
    # Split each line into words, pair each word with 1, then sum the counts per word.
    word_counts = (lines.flatMap(lambda line: line.split())
                        .map(lambda word: (word, 1))
                        .reduceByKey(lambda a, b: a + b))
    word_counts.saveAsTextFile("output")  # Fails if the "output" directory already exists

print("WordCount - Done")

Run it with:

/bin/spark-submit SimpleApp.py
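
The try/finally inside spark_manager is what makes the context reusable safely: stop() runs even when the job raises. The sketch below illustrates that with `FakeContext`, a hypothetical stand-in that is not a PySpark class, so it runs without Spark installed. `managed` and `run_failing_job` are likewise illustrative names.

```python
from contextlib import contextmanager


class FakeContext:
    """Hypothetical stand-in for SparkContext; only tracks whether stop() ran."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


@contextmanager
def managed(context):
    # Same shape as spark_manager: hand out the context, always stop it.
    try:
        yield context
    finally:
        context.stop()


def run_failing_job():
    """Simulate a job that raises, then report whether the context was stopped."""
    ctx = FakeContext()
    try:
        with managed(ctx):
            raise RuntimeError("job failed")
    except RuntimeError:
        pass
    return ctx.stopped
```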
Answered 2015-06-22T03:43:31.927

If you have already created a SparkSession:

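from pyspark.sql import SparkSession

# Reuses the active session if one exists; otherwise starts a new one.
spark = SparkSession \
    .builder \
    .appName("StreamKafka_Test") \
    .getOrCreate()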

Then you can access the "existing" SparkContext like this:

sc = spark.sparkContext
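
As a hedged sketch, assuming pyspark is installed and the session is a classic (non Spark Connect) one, the context exposed by the session should be the same object that SparkContext.getOrCreate() returns. The function name `session_and_context_agree` is hypothetical.

```python
def session_and_context_agree(app_name="StreamKafka_Test"):
    """Compare the session's context with the one returned by getOrCreate()."""
    # Imported lazily so this module can be loaded without Spark present.
    from pyspark.context import SparkContext
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName(app_name).getOrCreate()
    return spark.sparkContext is SparkContext.getOrCreate()
```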
Answered 2020-12-04T07:31:52.230

When you type pyspark in the terminal, the shell automatically creates a Spark context and binds it to the variable sc.

Answered 2015-06-22T02:47:29.400