DISTINCT 在 JPA 中使用什么列,是否可以更改它?
这是使用 DISTINCT 的示例 JPA 查询:
select DISTINCT c from Customer c
哪个没有多大意义——不同的列基于什么?它是否在实体上指定为注释,因为我找不到?
我想指定要区分的列,例如:
select DISTINCT(c.name) c from Customer c
我正在使用 MySQL 和 Hibernate。
你很亲密。
select DISTINCT(c.name) from Customer c
根据底层的 JPQL 或 Criteria API 查询类型,DISTINCT
在 JPA 中有两个含义。
对于返回标量投影的标量查询,如以下查询:
List<Integer> publicationYears = entityManager
.createQuery(
"select distinct year(p.createdOn) " +
"from Post p " +
"order by year(p.createdOn)", Integer.class)
.getResultList();
LOGGER.info("Publication years: {}", publicationYears);
应该将DISTINCT
关键字传递给底层 SQL 语句,因为我们希望数据库引擎在返回结果集之前过滤重复项:
SELECT DISTINCT
extract(YEAR FROM p.created_on) AS col_0_0_
FROM
post p
ORDER BY
extract(YEAR FROM p.created_on)
-- Publication years: [2016, 2018]
对于实体查询,DISTINCT
有不同的含义。
不使用DISTINCT
,查询如下:
List<Post> posts = entityManager
.createQuery(
"select p " +
"from Post p " +
"left join fetch p.comments " +
"where p.title = :title", Post.class)
.setParameter(
"title",
"High-Performance Java Persistence eBook has been released!"
)
.getResultList();
LOGGER.info(
"Fetched the following Post entity identifiers: {}",
posts.stream().map(Post::getId).collect(Collectors.toList())
);
将加入这样post
的post_comment
表格:
SELECT p.id AS id1_0_0_,
pc.id AS id1_1_1_,
p.created_on AS created_2_0_0_,
p.title AS title3_0_0_,
pc.post_id AS post_id3_1_1_,
pc.review AS review2_1_1_,
pc.post_id AS post_id3_1_0__
FROM post p
LEFT OUTER JOIN
post_comment pc ON p.id=pc.post_id
WHERE
p.title='High-Performance Java Persistence eBook has been released!'
-- Fetched the following Post entity identifiers: [1, 1]
但是父记录在每个关联行post
的结果集中重复。post_comment
因此,List
实体的Post
将包含重复的Post
实体引用。
为了消除Post
实体引用,我们需要使用DISTINCT
:
List<Post> posts = entityManager
.createQuery(
"select distinct p " +
"from Post p " +
"left join fetch p.comments " +
"where p.title = :title", Post.class)
.setParameter(
"title",
"High-Performance Java Persistence eBook has been released!"
)
.getResultList();
LOGGER.info(
"Fetched the following Post entity identifiers: {}",
posts.stream().map(Post::getId).collect(Collectors.toList())
);
但是 thenDISTINCT
也被传递给 SQL 查询,这根本不可取:
SELECT DISTINCT
p.id AS id1_0_0_,
pc.id AS id1_1_1_,
p.created_on AS created_2_0_0_,
p.title AS title3_0_0_,
pc.post_id AS post_id3_1_1_,
pc.review AS review2_1_1_,
pc.post_id AS post_id3_1_0__
FROM post p
LEFT OUTER JOIN
post_comment pc ON p.id=pc.post_id
WHERE
p.title='High-Performance Java Persistence eBook has been released!'
-- Fetched the following Post entity identifiers: [1]
通过传递DISTINCT
给 SQL 查询,EXECUTION PLAN 将执行一个额外的排序阶段,这会增加开销而不会带来任何价值,因为由于子 PK 列,父子组合总是返回唯一记录:
Unique (cost=23.71..23.72 rows=1 width=1068) (actual time=0.131..0.132 rows=2 loops=1)
-> Sort (cost=23.71..23.71 rows=1 width=1068) (actual time=0.131..0.131 rows=2 loops=1)
Sort Key: p.id, pc.id, p.created_on, pc.post_id, pc.review
Sort Method: quicksort Memory: 25kB
-> Hash Right Join (cost=11.76..23.70 rows=1 width=1068) (actual time=0.054..0.058 rows=2 loops=1)
Hash Cond: (pc.post_id = p.id)
-> Seq Scan on post_comment pc (cost=0.00..11.40 rows=140 width=532) (actual time=0.010..0.010 rows=2 loops=1)
-> Hash (cost=11.75..11.75 rows=1 width=528) (actual time=0.027..0.027 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on post p (cost=0.00..11.75 rows=1 width=528) (actual time=0.017..0.018 rows=1 loops=1)
Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
Rows Removed by Filter: 3
Planning time: 0.227 ms
Execution time: 0.179 ms
为了从执行计划中消除排序阶段,我们需要使用HINT_PASS_DISTINCT_THROUGH
JPA 查询提示:
List<Post> posts = entityManager
.createQuery(
"select distinct p " +
"from Post p " +
"left join fetch p.comments " +
"where p.title = :title", Post.class)
.setParameter(
"title",
"High-Performance Java Persistence eBook has been released!"
)
.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false)
.getResultList();
LOGGER.info(
"Fetched the following Post entity identifiers: {}",
posts.stream().map(Post::getId).collect(Collectors.toList())
);
现在,SQL 查询将不包含DISTINCT
但Post
实体引用重复项将被删除:
SELECT
p.id AS id1_0_0_,
pc.id AS id1_1_1_,
p.created_on AS created_2_0_0_,
p.title AS title3_0_0_,
pc.post_id AS post_id3_1_1_,
pc.review AS review2_1_1_,
pc.post_id AS post_id3_1_0__
FROM post p
LEFT OUTER JOIN
post_comment pc ON p.id=pc.post_id
WHERE
p.title='High-Performance Java Persistence eBook has been released!'
-- Fetched the following Post entity identifiers: [1]
执行计划将确认这次我们不再有额外的排序阶段:
Hash Right Join (cost=11.76..23.70 rows=1 width=1068) (actual time=0.066..0.069 rows=2 loops=1)
Hash Cond: (pc.post_id = p.id)
-> Seq Scan on post_comment pc (cost=0.00..11.40 rows=140 width=532) (actual time=0.011..0.011 rows=2 loops=1)
-> Hash (cost=11.75..11.75 rows=1 width=528) (actual time=0.041..0.041 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on post p (cost=0.00..11.75 rows=1 width=528) (actual time=0.036..0.037 rows=1 loops=1)
Filter: ((title)::text = 'High-Performance Java Persistence eBook has been released!'::text)
Rows Removed by Filter: 3
Planning time: 1.184 ms
Execution time: 0.160 ms
@Entity
@NamedQuery(name = "Customer.listUniqueNames",
query = "SELECT DISTINCT c.name FROM Customer c")
public class Customer {
...
private String name;
public static List<String> listUniqueNames() {
return = getEntityManager().createNamedQuery(
"Customer.listUniqueNames", String.class)
.getResultList();
}
}
更新:请查看投票最多的答案。
我自己的现在已经过时了。仅出于历史原因保留在这里。
连接中通常需要 HQL 中的不同,而不是像您自己这样的简单示例。
我同意kazanaki的回答,它帮助了我。我想选择整个实体,所以我用
select DISTINCT(c) from Customer c
就我而言,我有多对多的关系,我想在一个查询中加载带有集合的实体。
我使用了 LEFT JOIN FETCH,最后我必须使结果与众不同。
我会使用 JPA 的构造函数表达式功能。另请参阅以下答案:
JPQL 构造函数表达式 - org.hibernate.hql.ast.QuerySyntaxException:表未映射
按照问题中的示例,它将是这样的。
SELECT DISTINCT new com.mypackage.MyNameType(c.name) from Customer c