Questions tagged [streamsets]

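0 votes
1 answer
421 views

http - UUID appended to the file name when streaming through StreamSets Data Collector

I am using the HTTP Client origin to stream a file from an HTTP URL to a Hadoop destination, but the file name at the destination has a random UUID appended to it. I want the file name to be the same as the source file's.

Example: the source file name is README.txt, and the destination file name is README_112e5d4b-4d85-4764-ab81-1d7b6e0237b2.txt

I want the destination file name to be README.txt

I will show you my configuration.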

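0 votes
1 answer
823 views

python - AttributeError: 'module' object has no attribute '_Condition' in a script

I am trying to access AWS S3 objects using boto3 in Python.

I have provided the AWS credentials. However, the place where I access the S3 resource through the boto3 API call boto3.client('S3') throws an AttributeError. Here is the code snippet:

Can you help me resolve this error?

Please find the stack trace below: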

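0 votes
1 answer
994 views

streamsets - StreamSets gives this error when trying to parse valid JSON

I am setting up StreamSets for a project. Its origin is a Kafka Consumer. It works fine for smaller messages, but it throws this error when the message size is large.

I have already set Max Object Length (chars) to 1000000 and the parser.limit property to 10335040. I can't figure out the issue.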
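One way to narrow this down is to check, outside StreamSets, whether the failing message really is valid JSON and how its length in characters compares with the configured limit. The helper below is a minimal sketch for that check only. `check_message` is a hypothetical name, and it assumes the limit is counted in characters, as the "Max Object Length (chars)" label suggests.

```python
import json


def check_message(raw, max_object_length):
    """Report whether `raw` parses as JSON and whether it fits the limit.

    Assumption: the limit is measured in characters, not bytes.
    """
    try:
        json.loads(raw)
        valid = True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        valid = False
    return {
        "valid_json": valid,
        "length_chars": len(raw),
        "within_limit": len(raw) <= max_object_length,
    }


if __name__ == "__main__":
    sample = '{"payload": {"retryCount": 3}}'
    print(check_message(sample, 1000000))
```

If the message is valid and well under the limit, the size limit being hit is probably a different one from the one that was raised.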

Not applicable

The full stack trace is

This JSON fails:

{"payload":{"data":{"aIndex":"application0502","aType":"application","pIndex":"profile000","pType":"profile","da":{"clientId ":"168613","clientType":"1","statusDataList":{"68348":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68348","CURR_STATUS":"1949","CURR_SUB_STATUS": null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05 -21 17:18:59","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68349":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68349","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null," STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:18:59","REQ_EMPLOYERID":"4103 ","REQ_POSTED_BY":"76866550"},"68351":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68351","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949, "SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68365 ":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68365","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0," OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:18:59","REQ_EMPLOYERID":"4103","REQ_POSTED_BY": "76866550"},"68366":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68366","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[]," CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68367":{"PAYMENT_STATUS": 
1,"UNIQUE_KEY":"168613_68367","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[]," ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68369":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68367 ","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY ":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68370":{"PAYMENT_STATUS":1 ,"UNIQUE_KEY":"168613_68367","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68371":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68367","CURR_STATUS":"1949" ,"CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE" :"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68372":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68367","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE ":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00"," REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"}},"recruiterId":"76866550","isActivity":false},"ignoreParamsForIndexing":{"statusDetailsForAsyncActions":{"clientId":"168613", "statusId":"1949","subStatusId":null,"assessmentTestId":"","feedbackFormIds":[],"hiring 
manager":[],"isBillingEnabled":null,"isOfferGenerationEnabled":null,"statusDataJson":{"assessment":{"action":1," sendToNew":false,"resendToAll":false,"statusId":"1949","subStatusId":null},"CURR_STATUS_DATE":"2019-05-21 17:18:59"}},"projectDetailsForAsyncActions":{ "projectId":"15463"}},"optn":{"_routing":"168613"},"action":22,"activityField":"STATUS_CHANGED"},"dataArray":null,"retryCount":3 ,"additionalHeaders":{},"routingKey":"168613","topic":"rms-search-data"},"headers":{"AppId":123,"SystemId":"1234","X-TRANSACTION-ID":"27108593751"}}

This JSON succeeds:

{"payload":{"data":{"aIndex":"application0502","aType":"application","pIndex":"profile000","pType":"profile","da":{"clientId ":"168613","clientType":"1","statusDataList":{"68348":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68348","CURR_STATUS":"1949","CURR_SUB_STATUS": null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05 -21 17:18:59","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68349":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68349","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null," STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:18:59","REQ_EMPLOYERID":"4103 ","REQ_POSTED_BY":"76866550"},"68351":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68351","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949, "SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68365 ":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68365","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0," OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:18:59","REQ_EMPLOYERID":"4103","REQ_POSTED_BY": "76866550"},"68366":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68366","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[]," CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68367":{"PAYMENT_STATUS": 
1,"UNIQUE_KEY":"168613_68367","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[]," ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68369":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68367 ","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY ":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68370":{"PAYMENT_STATUS":1 ,"UNIQUE_KEY":"168613_68367","CURR_STATUS":"1949","CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE":"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"},"68371":{"PAYMENT_STATUS":1,"UNIQUE_KEY":"168613_68367","CURR_STATUS":"1949" ,"CURR_SUB_STATUS":null,"STATUS_VALUE":1949,"SUB_STATUS_VALUE":null,"STATUS_STATE":0,"OWNERS_BY_CURR_STATUS":[],"ADDITIONAL_OWNERS":[],"CURR_STATUS_UPDATEDBY":"76866550","CURR_STATUS_DATE" :"2019-05-21 17:19:00","REQ_EMPLOYERID":"4103","REQ_POSTED_BY":"76866550"}},"recruiterId":"76866550","isActivity":false},"ignoreParamsForIndexing":{"statusDetailsForAsyncActions":{"clientId":"168613","statusId":"1949" ,"subStatusId":null,"assessmentTestId":"","feedbackFormIds":[],"招聘经理":[],"isBillingEnabled":null,"isOfferGenerationEnabled":null,"statusDataJson":{"assessment": {"action":1,"sendToNew":false,"resendToAll":false,"statusId":"1949","subStatusId":null},"CURR_STATUS_DATE":"2019-05-21 17:18:59" }},"projectDetailsForAsyncActions":{"projectId":"15463"}},"optn":{"_routing":"168613"},"action":22,"activityField":"STATUS_CHANGED"},"dataArray":null,"retryCount":3,"additionalHeaders":{},"routingKey":"168613 
","topic":"rms-search-data"},"headers":{"AppId":123,"SystemId":"1234","X-TRANSACTION-ID":"27108593751"}}

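0 votes
3 answers
3218 views

ssis - Kafka vs. StreamSets

I have been reading articles about Kafka and StreamSets, and my understanding is:

  1. Kafka acts as a broker between producer systems and subscribers. Producers push data to the Kafka cluster, and subscribers pull data from Kafka.

  2. StreamSets is a technology that moves data from one source to another through pipelines.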
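The two roles described above can be illustrated with a toy, in-memory sketch. This is not how Kafka or StreamSets are implemented. `MiniBroker` and `run_pipeline` are hypothetical names used only to contrast "store messages and let subscribers pull them" with "read, transform, and write somewhere else".

```python
from collections import defaultdict


class MiniBroker:
    """Toy stand-in for a broker: keeps messages per topic and lets each
    subscriber pull from its own position."""

    def __init__(self):
        self._log = defaultdict(list)     # topic -> ordered messages
        self._offsets = defaultdict(int)  # (topic, subscriber) -> next index

    def publish(self, topic, message):
        # Producers push; the broker only appends to the topic's log.
        self._log[topic].append(message)

    def poll(self, topic, subscriber):
        # Subscribers pull; each one keeps an independent read position.
        start = self._offsets[(topic, subscriber)]
        messages = self._log[topic][start:]
        self._offsets[(topic, subscriber)] = start + len(messages)
        return messages


def run_pipeline(broker, topic, transform, sink, subscriber="pipeline"):
    """Toy pipeline: read from an origin, transform, write to a destination."""
    for message in broker.poll(topic, subscriber):
        sink.append(transform(message))
```

In this picture the pipeline is just one more subscriber of the broker, which is why the two are often used together rather than as alternatives.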

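Now, here are my questions; please help clarify:

  1. What is the fundamental difference between Kafka and StreamSets? Is it that Kafka does not move data, but StreamSets does?

  2. If Kafka does not move data, what is it used for? If it moves data like an ETL solution, how is it different from SSIS, Informatica, and so on?

  3. How is StreamSets different from SSIS, Informatica, and so on?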

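0 votes
2 answers
285 views

json - Modify time:now() to be one hour earlier

First of all, I am not quite sure what its core language is. I followed a tutorial.

My code looks like this:

This gives me the following:

What I want to do is subtract 1 hour from the first date. Here is what I tried:

Before:

After:

Although this seems to be valid code, it says it is invalid. Could someone please help me understand how to modify the "before" code so that it is 1 hour before now?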
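The arithmetic being asked for is "the current time minus 3,600,000 milliseconds". The Python sketch below illustrates only that calculation. It is not StreamSets expression language, and the function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def one_hour_before(moment):
    """Return the instant exactly one hour before `moment`."""
    return moment - timedelta(hours=1)


def epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    return int(moment.timestamp() * 1000)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(now.isoformat(), "->", one_hour_before(now).isoformat())
```

Working in epoch milliseconds is often the simplest route in expression languages that cannot subtract durations from date objects directly: convert to milliseconds, subtract 3600 * 1000, and convert back.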

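0 votes
1 answer
184 views

docker - StreamSets Data Collector container cannot read files from a Windows directory (d:/file)

I created a container with Docker and have it running on localhost. All I want to do is pick up an Excel file from the D:/file/ directory, but when I enter that directory in Files Directory, I get an error saying that no such directory exists. Please help.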
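A container has its own filesystem, so a host path such as D:/file is only visible inside it if it is mounted when the container is started. The sketch below builds a `docker run` argument list with a bind mount. The image name `streamsets/datacollector` and the in-container path `/data/file` are assumptions for illustration; with such a mount, the Files Directory setting would point at the in-container path rather than the Windows one.

```python
def docker_run_args(host_dir, container_dir, image):
    """Build a `docker run` argument list that bind-mounts host_dir
    at container_dir inside the container."""
    mount = "type=bind,source={},target={}".format(host_dir, container_dir)
    return ["docker", "run", "--mount", mount, image]


if __name__ == "__main__":
    # Hypothetical values: adjust the paths and image to the actual setup.
    args = docker_run_args("D:/file", "/data/file", "streamsets/datacollector")
    print(" ".join(args))
```

The list can be passed to `subprocess.run` as is, or the printed command can be typed into a terminal.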

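0 votes
1 answer
369 views

streamsets - Error: com.streamsets.pipeline.api.StageException: JDBC_52 - Error starting LogMiner

I am getting the following error when running Oracle CDC. It had been running fine, but the error started appearing this morning.

What is the exact cause of this error?

Pipeline cdc_test stopped at 2019-06-15 13:37:46 due to the following error: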

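0 votes
0 answers
104 views

jython - Delete records from a file using Jython

How can I use Jython to delete n lines from the top (top=2) and bottom (bottom=1) of a file saved as sample.txt? My file may be megabytes or gigabytes in size.

sample.txt contains the following lines

Expected output: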
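One way to approach the question above without loading the whole file into memory is to stream it line by line, skip the first lines, and hold back only as many lines as need to be dropped from the end. This is a minimal sketch; `trim_lines` and its parameter names are hypothetical, and it writes to a second file rather than editing sample.txt in place.

```python
from collections import deque
import io


def trim_lines(src_path, dst_path, skip_top=2, skip_bottom=1):
    """Copy src to dst, dropping the first skip_top and last skip_bottom lines.

    At most skip_bottom lines are buffered at any time, so the file size
    does not matter.
    """
    pending = deque()
    with io.open(src_path, "r", encoding="utf-8") as src, \
            io.open(dst_path, "w", encoding="utf-8") as dst:
        for index, line in enumerate(src):
            if index < skip_top:
                continue  # still inside the header lines to drop
            pending.append(line)
            if len(pending) > skip_bottom:
                # The oldest buffered line cannot be one of the last
                # skip_bottom lines, so it is safe to write it out.
                dst.write(pending.popleft())
```

Whatever is left in the buffer when the input ends is exactly the trailing lines to discard, so nothing more needs to be written.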

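0 votes
1 answer
283 views

streamsets - HADOOPFS - Could not verify the base directory in StreamSets

I am having a problem running a pipeline in StreamSets; I can see the following error:

For more details, see: https://cwiki.apache.org/confluence/display/HADOOP2/ConnectionRefused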

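0 votes
1 answer
417 views

streamsets - Is it possible to use record fields as URL parameters in the StreamSets HTTP Client processor?

I am new to StreamSets; thanks in advance for your help.

In my pipeline records (JSON) I have a field with geographic coordinates (latitude, longitude), and I am trying to add more metadata to them. I would like to know whether it is possible to use the HTTP Client processor to perform the operation described at https://nominatim.org/release-docs/develop/api/Reverse/ using the lat and lon values of my record. If so, could you point me to some documentation or an article that describes how to do it?

I have been able to use HTTP Client as an origin on other occasions, but I don't know how to use record values in the URL.

For example, if my record values are

The URL should look like this: https://nominatim.openstreetmap.org/reverse?format=jsonv2&lat=41.226599&lon=-8.709737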
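The substitution being asked for, shown outside StreamSets, is just filling the query string from the record's fields. The Python sketch below illustrates that step only. `reverse_geocode_url` is a hypothetical helper, and the record is assumed to carry the coordinates in fields named `lat` and `lon`.

```python
from urllib.parse import urlencode

BASE_URL = "https://nominatim.openstreetmap.org/reverse"


def reverse_geocode_url(lat, lon):
    """Build the reverse-geocoding URL for one record's coordinates."""
    params = [("format", "jsonv2"), ("lat", lat), ("lon", lon)]
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    record = {"lat": 41.226599, "lon": -8.709737}  # assumed field names
    print(reverse_geocode_url(record["lat"], record["lon"]))
```

Inside a pipeline, the same effect would presumably come from referencing the record fields in the processor's URL with the expression language (for example the `record:value` function), rather than from external code.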