sql - Subquery in a Spring Batch item reader

Question

Suppose I have a person table. A person can have one to three products, and the table looks like this.

id  person_id  product  price
1   person1    product3   1
2   person1    product2   2
3   person1    product1   10
4   person2    product1   11
5   person2    product2   14

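For each person I need to fetch all products, process them (map over the prices, apply some logic), and then write the calculated data to a final table that contains only two fields (person_id and the calculated value, where person_id is the key).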

person_id  calculated_value
person1        100   
person2        111   
person3        93

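What is the best way to implement the item reader in this case (fetching all products for each person and processing them)? Can it be done with a single query in the item reader, or should I run an additional query per person in the item processor?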

score 0 · Accepted Answer

The chunk-oriented processing model is not well suited to this kind of aggregation. As Gordon Linoff suggested in the comments, it would be easier and more efficient to store this data in a table and let the database do the calculation. If you really want to do it with Spring Batch, you can proceed in two steps:

step 1: a tasklet that does a select distinct(person_id) from table to get the distinct values for person IDs. This list is passed to the second step.
step 2: a chunk-oriented step with a reader that iterates over person IDs and a processor that does an additional query to get products and do the calculation. The writer of this step can write aggregate values as needed.
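
The two steps above can be sketched in plain Java, without any Spring Batch classes, to show the data flow only. Everything named here (`TwoStepSketch`, `ProductRow`, `distinctPersonIds`, `calculate`) is hypothetical, the table is an in-memory list rather than a real database, and summing the prices is only a placeholder for the unspecified "some logic" in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

class TwoStepSketch {

    /** One row of the source table: id, person_id, product, price. */
    static final class ProductRow {
        final int id;
        final String personId;
        final String product;
        final int price;

        ProductRow(int id, String personId, String product, int price) {
            this.id = id;
            this.personId = personId;
            this.product = product;
            this.price = price;
        }
    }

    /** In-memory stand-in for the source table shown in the question. */
    static List<ProductRow> sampleTable() {
        List<ProductRow> rows = new ArrayList<>();
        rows.add(new ProductRow(1, "person1", "product3", 1));
        rows.add(new ProductRow(2, "person1", "product2", 2));
        rows.add(new ProductRow(3, "person1", "product1", 10));
        rows.add(new ProductRow(4, "person2", "product1", 11));
        rows.add(new ProductRow(5, "person2", "product2", 14));
        return rows;
    }

    /** Step 1 (the tasklet): equivalent of selecting the distinct person IDs. */
    static List<String> distinctPersonIds(List<ProductRow> table) {
        LinkedHashSet<String> ids = new LinkedHashSet<>();
        for (ProductRow row : table) {
            ids.add(row.personId);
        }
        return new ArrayList<>(ids);
    }

    /**
     * Step 2 (the processor): one extra lookup per person, then the calculation.
     * Assumption: the calculation is a plain sum of prices.
     */
    static int calculate(String personId, List<ProductRow> table) {
        int total = 0;
        for (ProductRow row : table) {
            if (row.personId.equals(personId)) {
                total += row.price;
            }
        }
        return total;
    }

    /** Runs both steps; the returned map plays the role of the final table. */
    static Map<String, Integer> run(List<ProductRow> table) {
        Map<String, Integer> finalTable = new LinkedHashMap<>();
        for (String personId : distinctPersonIds(table)) {
            finalTable.put(personId, calculate(personId, table));
        }
        return finalTable;
    }

    public static void main(String[] args) {
        System.out.println(run(sampleTable()));
    }
}
```

In a real job the lookup in `calculate` would be a query against the database, so this design issues one query per person on top of the initial distinct query.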

score 0 · Accepted Answer

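This is more of a design question, so it may not be a good fit for SO, but I'll try to help. In general, you should aim for the step's processor not to retrieve data itself but only to process it. Instead, the reader should collect all the data, perform any mapping as it reads, and present a computation-ready unit of work, which can then be processed and subsequently written wherever it needs to go.

With that in mind...

Can it be done with a single query in the item reader? ^-- If everything can be done in the query, then yes, you should do that.

In addition, if you need programmatic intervention/lookups/mapping based on the data returned by the query, you can build a composite reader from the programmatic part and an existing SQL reader: read the fields you need from the SQL reader, then transform them into a programmatically enriched unit of work that is passed to the processor. If you need to combine multiple rows into a single unit of work, look at the aggregate item reader approach. Here is an SO question that discusses it: spring batch aggregate records from db as a single record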
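
As a rough illustration of the aggregate item reader idea, the sketch below wraps a row-level reader and groups consecutive rows with the same person_id into one unit of work. It deliberately avoids Spring Batch classes: `SimpleReader` is a hypothetical stand-in that mirrors the `ItemReader` contract of returning null when the input is exhausted, and `AggregatingReader`, `PersonProducts`, and the sum in `calculate` are illustrative assumptions. It also assumes the underlying query orders its rows by person_id; without that ordering the grouping would split a person across several units.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class AggregatingReaderSketch {

    /** Hypothetical stand-in for an item reader: returns null when exhausted. */
    interface SimpleReader<T> {
        T read();
    }

    /** One row of the source table (only the fields needed here). */
    static final class ProductRow {
        final String personId;
        final int price;

        ProductRow(String personId, int price) {
            this.personId = personId;
            this.price = price;
        }
    }

    /** Unit of work handed to the processor: one person with all their rows. */
    static final class PersonProducts {
        final String personId;
        final List<ProductRow> products = new ArrayList<>();

        PersonProducts(String personId) {
            this.personId = personId;
        }
    }

    /** Groups consecutive rows that share a personId (input must be sorted by it). */
    static final class AggregatingReader implements SimpleReader<PersonProducts> {
        private final SimpleReader<ProductRow> delegate;
        private ProductRow pending; // row read ahead that belongs to the next group

        AggregatingReader(SimpleReader<ProductRow> delegate) {
            this.delegate = delegate;
        }

        @Override
        public PersonProducts read() {
            ProductRow first = (pending != null) ? pending : delegate.read();
            pending = null;
            if (first == null) {
                return null; // no more input
            }
            PersonProducts group = new PersonProducts(first.personId);
            group.products.add(first);
            ProductRow next;
            while ((next = delegate.read()) != null) {
                if (!next.personId.equals(group.personId)) {
                    pending = next; // keep it for the next call
                    break;
                }
                group.products.add(next);
            }
            return group;
        }
    }

    /** Row-level reader over an in-memory list, standing in for a SQL reader. */
    static SimpleReader<ProductRow> listReader(List<ProductRow> rows) {
        Iterator<ProductRow> it = rows.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Stateless processor logic. Assumption: the calculation is a sum of prices. */
    static int calculate(PersonProducts unit) {
        int total = 0;
        for (ProductRow row : unit.products) {
            total += row.price;
        }
        return total;
    }

    /** Sample rows from the question, already ordered by person_id. */
    static List<ProductRow> sampleRows() {
        List<ProductRow> rows = new ArrayList<>();
        rows.add(new ProductRow("person1", 1));
        rows.add(new ProductRow("person1", 2));
        rows.add(new ProductRow("person1", 10));
        rows.add(new ProductRow("person2", 11));
        rows.add(new ProductRow("person2", 14));
        return rows;
    }

    /** Drives reader and processor; each string stands for one written record. */
    static List<String> runAll(List<ProductRow> rows) {
        AggregatingReader reader = new AggregatingReader(listReader(rows));
        List<String> written = new ArrayList<>();
        PersonProducts unit;
        while ((unit = reader.read()) != null) {
            written.add(unit.personId + "=" + calculate(unit));
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(runAll(sampleRows()));
    }
}
```

With this shape the processor never touches the database: all reading happens in the reader, and a single ordered query feeds every group.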

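...or should I run an additional query per person in the item processor? ^-- You should avoid gathering/querying more data in the processor. Although you can do it, and it is very convenient, the processor is not the place the framework designates for fetching more data. It is better to design the processor as a stateless engine and leave all data gathering to the reader.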
