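Suppose I have a query that returns the attribute R:a with the second-highest number of distinct values (it returns the table name and the attribute name). My task is:
For each distinct value of that same attribute R:a, return the distinct value as val and the number of tuples in R that have this value as count. Show the results only for those distinct values for which 20% of the tuples in R have an equal or lower R:a value and 20% of the tuples in R have a higher or equal R:a value.
Propose a modification to the storage so that the query runs as efficiently as possible.
I wrote a query that returns the distinct values and their counts, but how do I show the 20% results, and what does a storage modification mean in this context?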
SELECT s.tablename,
       s.attname,
       UNNEST(s.most_common_vals::text::varchar[]) AS val,
       UNNEST(s.most_common_freqs) * pc.reltuples AS count
FROM information_schema.columns AS c,
     pg_stats AS s,
     pg_class AS pc,
     (SELECT T.tablename, T.attname
      FROM (SELECT s2.tablename, s2.attname, s2.n_distinct AS distval
            FROM information_schema.columns AS c2, pg_stats AS s2
            WHERE s2.tablename = c2.table_name
              AND s2.attname = c2.column_name
              AND NOT (c2.table_schema LIKE 'pg_%' OR c2.table_schema = 'information_schema')
              AND c2.data_type = 'character varying'
              AND c2.table_name NOT IN (SELECT viewname FROM pg_views)
            LIMIT 5) AS T
      ORDER BY T.distval DESC
      LIMIT 1 OFFSET 1) AS temp
WHERE s.tablename = c.table_name
  AND s.attname = c.column_name
  AND NOT (c.table_schema LIKE 'pg_%' OR c.table_schema = 'information_schema')
  AND c.data_type = 'character varying'
  AND c.table_name NOT IN (SELECT viewname FROM pg_views)
  AND s.tablename = pc.relname
  AND temp.tablename = s.tablename
  AND temp.attname = s.attname;
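
For the 20% part, here is a minimal sketch of one possible reading of the task: keep only the distinct values for which at least 20% of the tuples have an equal or lower value and at least 20% have an equal or higher value. It counts directly on the table rather than on pg_stats, since most_common_vals only lists the most frequent values and its frequencies are estimates. The table name r, the column name a, the helper middle_values and the sample data are hypothetical. Python's built-in sqlite3 is used only to keep the sketch self-contained; the query relies on standard window functions, which PostgreSQL also supports, and :p is a driver placeholder rather than SQL syntax.

```python
import sqlite3

# Hypothetical stand-ins: table r and column a play the role of R and R:a.
# The innermost query computes the count per distinct value; the window
# functions turn those counts into cumulative fractions from both ends.
QUERY = """
SELECT val, cnt
FROM (
    SELECT val,
           cnt,
           SUM(cnt) OVER (ORDER BY val)      * 1.0 / SUM(cnt) OVER () AS frac_le,
           SUM(cnt) OVER (ORDER BY val DESC) * 1.0 / SUM(cnt) OVER () AS frac_ge
    FROM (SELECT a AS val, COUNT(*) AS cnt FROM r GROUP BY a) AS counts
) AS ranked
WHERE frac_le >= :p AND frac_ge >= :p
ORDER BY val
"""


def middle_values(rows, p=0.2):
    """Return (val, cnt) pairs whose lower and upper cumulative fractions are both >= p."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (a TEXT)")
    conn.executemany("INSERT INTO r (a) VALUES (?)", [(v,) for v in rows])
    result = conn.execute(QUERY, {"p": p}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Ten tuples: a x1, b x2, c x4, d x2, e x1
    for val, cnt in middle_values(list("abbccccdde")):
        print(val, cnt)
```

With the sample data, 'a' is dropped because only 10% of the tuples are less than or equal to it, and 'e' is dropped because only 10% are greater than or equal to it. As for storage, one common reading of "storage modification" is adding an index on the attribute (for example CREATE INDEX ON r (a)) so that grouping and ordering by a need not scan and sort the whole table, though the assignment may intend something else.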