java - JVM crashes when implementing Python-Boilerpipe in a Flask application

Question

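I am writing a Flask application that uses boilerpipe to extract content. Initially I wrote the boilerpipe extraction as a standalone script to extract website content, but when I tried to integrate it with my API, the JVM crashed while executing the boilerpipe extractor. This is the error I get: https://github.com/misja/python-boilerpipe/issues/17 (I have also raised the issue on GitHub).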

import json
import unicodedata

from boilerpipe.extract import Extractor


class ExtractingContent:

    @classmethod
    def processingContent(cls, sourceUrl, extractorType="DefaultExtractor"):
        # Fetch the page and run the chosen boilerpipe extractor on it.
        extractor = Extractor(extractor=extractorType, url=sourceUrl)
        extractedText = extractor.getText()
        if extractedText:
            # Decompose the text (NFKD) and drop anything that is not ASCII.
            toNormalString = unicodedata.normalize('NFKD', extractedText).encode('ascii', 'ignore')
            # json.loads() expects a string, so serialize the dict first.
            json_data = json.dumps({"content": toNormalString, "url": sourceUrl, "status": "success", "publisher_id": "XXXXX", "content_count": str(len(toNormalString))})
            return json.loads(json_data)
        else:
            json_data = json.dumps({"response": {"message": "No data found", "url": sourceUrl, "status": "success", "content_count": "empty"}})
            return json.loads(json_data)

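The script above is what I am trying to integrate into a Flask API that uses flask-restful, SQLAlchemy, and psql. I also updated my Java installation, but that did not solve the problem. Java version: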

java version "1.7.0_45" 
javac 1.7.0_45

Any help would be appreciated.

Thanks

score 4 · Accepted Answer

(A copy of what I wrote in https://github.com/misja/python-boilerpipe/issues/17)

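OK, I have reproduced the error: the thread that calls into the JVM is not attached to it, so the calls into the JVM fail. The bug comes from boilerpipe (see below).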

First, a monkey patch: in the code you posted on Stack Overflow, you just need to add the following code before creating the extractor:

import jpype

class ExtractingContent:
    @classmethod
    def processingContent(cls, sourceUrl, extractorType="DefaultExtractor"):
        print("State=%s" % jpype.isThreadAttachedToJVM())

        # Attach the calling thread to the JVM before any boilerpipe call.
        if not jpype.isThreadAttachedToJVM():
            print("Needs to attach...")
            jpype.attachThreadToJVM()
            print("Check Attached=%s" % jpype.isThreadAttachedToJVM())

        extractor = Extractor(extractor=extractorType, url=sourceUrl)

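About boilerpipe: the check if threading.activeCount() > 1 on line 50 of boilerpipe/extractor/__init__.py is wrong. The calling thread must always be attached to the JVM, even if there is only one thread.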
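The attach-before-call guard can also be factored into a decorator, so that every entry point checks the current thread itself rather than relying on a thread count. The following is only a minimal sketch: jvm_attached and FakeJVM are hypothetical names that do not come from boilerpipe or JPype, and the decorator takes the JVM bridge as a parameter so the example runs without a JVM. It assumes only the two calls already used in the answer, isThreadAttachedToJVM() and attachThreadToJVM().

```python
import functools
import threading


def jvm_attached(jvm):
    """Decorator factory: attach the calling thread to the JVM before each call.

    `jvm` is any object exposing isThreadAttachedToJVM() and
    attachThreadToJVM(); in real use this would be the jpype module.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Always check the current thread; never rely on the thread count.
            if not jvm.isThreadAttachedToJVM():
                jvm.attachThreadToJVM()
            return func(*args, **kwargs)
        return wrapper
    return decorator


class FakeJVM(object):
    """Hypothetical stand-in for jpype that tracks attachment per thread."""

    def __init__(self):
        self._state = threading.local()

    def isThreadAttachedToJVM(self):
        return getattr(self._state, "attached", False)

    def attachThreadToJVM(self):
        self._state.attached = True


fake = FakeJVM()


@jvm_attached(fake)
def extract(url):
    # A real implementation would call Extractor(...).getText() here.
    if not fake.isThreadAttachedToJVM():
        raise RuntimeError("thread not attached to the JVM")
    return "extracted:" + url


# Call from a worker thread, as a web server would for an incoming request.
results = []
worker = threading.Thread(target=lambda: results.append(extract("http://example.com")))
worker.start()
worker.join()
```

In the Flask application this would presumably be applied as @jvm_attached(jpype) directly beneath @classmethod on processingContent, so that the attachment happens on whichever thread serves the request.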
