SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (SELECT AVG (run_time) FROM movie) *1.1 AND m.run_time > (SELECT AVG (run_time) FROM movie) *0.9;

In PostgreSQL, EXPLAIN gives it a cost of 4.6 to 8.

Basically, it selects the title and run time of movies whose run time is within 10% of the average. The movie table is defined like this:

CREATE TABLE MOVIE
(
title             varchar(40)                             NOT NULL,
production_year        smallint                             NOT NULL,
country         varchar(20)                            NOT NULL,
run_time        smallint                            NOT NULL,
major_genre         varchar(15)                                 ,
CONSTRAINT pk_movie PRIMARY KEY(title,production_year)
);

and has 101 entries.

Since "SELECT AVG (run_time) FROM movie" is used twice, I thought of storing the average in a variable and referring to that variable in a second query. The MySQL version looks like this; it runs, and the combined time of the two commands is shorter than that of the reference query above.

SET @average = (SELECT AVG (run_time) FROM movie);
SELECT m.title , m.run_time FROM movie m WHERE m.run_time < @average *1.1 AND m.run_time > @average *0.9;

Now, how do I do the equivalent in PostgreSQL? My attempts are listed below.

When I try to create a variable in PostgreSQL, like so:

\set average (SELECT AVG (run_time) FROM movie);

This works, but the next line:

SELECT m.title , m.run_time FROM movie m WHERE m.run_time < :average *1.1 AND m.run_time > :average *0.9;

ERROR:  syntax error at or near "FROMmovie"
LINE 1: ...OM movie m WHERE m.run_time < (SELECTAVG(run_time)FROMmovie)...

This happens, I think, because \set stores my command literally, like a string variable, and doesn't evaluate it, unlike MySQL.
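One possible way around the literal substitution, assuming psql 9.3 or later, is the \gset meta-command, which runs the query and stores each output column in a psql variable named after that column. The alias "average" below is chosen to match the variable name used above. This is a sketch, not a tested fix for this exact setup:

```sql
-- Assumes psql 9.3+; \gset sends the query and stores the result
-- column "average" in the psql variable :average (no semicolon needed).
SELECT AVG(run_time) AS average FROM movie \gset

-- :average is now replaced by the computed number, not by the query text.
SELECT m.title, m.run_time
FROM movie m
WHERE m.run_time < :average * 1.1
  AND m.run_time > :average * 0.9;
```

Like \set, this is a feature of the psql client only, so it does not carry over to queries sent from an application.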

So I tried a temporary table instead:

CREATE TEMP TABLE temptable ( theaverage float );
insert into temptable  ( SELECT AVG (m.run_time) FROM movie m );
SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (Select * from temptable) *1.1 AND m.run_time > (Select * from temptable) *0.9;

These work, but the measured performance is not so good.

explain analyze CREATE TEMP TABLE temptable ( theaverage float ); //cannot analyze this/does not work/syntax error happens.
ERROR:  syntax error at or near "float"
LINE 1: ...in analyze CREATE TEMP TABLE temptable ( theaverage float );

explain insert into temptable  ( SELECT AVG (m.run_time) FROM movie m ); //costs 2.3ish

explain SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (Select theaverage from temptable) *1.1 AND m.run_time > (Select theaverage from temptable) *0.9;

//costs 63 to 66, what? This would make it cost significantly more than the unoptimized query, which costs 4.6 to 8.
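A likely explanation, offered as an assumption rather than a confirmed diagnosis: the cost shown by EXPLAIN is a planner estimate, not a measured time, and a freshly created temp table has no statistics, so the planner falls back on a default size guess that is much larger than one row. Autovacuum does not process temporary tables, so the statistics have to be collected by hand. A minimal sketch of checking this:

```sql
-- Collect statistics for the temp table so the planner knows it holds one row.
ANALYZE temptable;

-- Re-check the estimate; EXPLAIN ANALYZE would also report actual run time.
EXPLAIN
SELECT m.title, m.run_time
FROM movie m
WHERE m.run_time < (SELECT theaverage FROM temptable) * 1.1
  AND m.run_time > (SELECT theaverage FROM temptable) * 0.9;
```

If the estimate drops after ANALYZE, the high number came from missing statistics rather than from the query actually being slower.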

I have also tried SELECT INTO, but I couldn't figure out how to use it correctly for my purposes.

So, to repeat the question: how do I make an optimized version of

"SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (SELECT AVG (run_time) FROM movie) *1.1 AND m.run_time > (SELECT AVG (run_time) FROM movie) *0.9;"

perhaps by using variables, and with performance measurement, in PostgreSQL?


1 Answer


This solution does not use variables and works in both PostgreSQL and MySQL:

SELECT m.title, m.run_time
FROM movie m, 
    (SELECT avg(run_time) AS time FROM movie) a
WHERE m.run_time BETWEEN a.time * 0.9
                     AND a.time * 1.1

Obligatory SQLFiddle
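Since the question also asks about performance measurement, here is a sketch of how the derived-table query could be timed in PostgreSQL with EXPLAIN ANALYZE. The alias avg_time is a hypothetical name used for illustration, and the explicit CROSS JOIN is only a restatement of the comma join:

```sql
-- EXPLAIN ANALYZE executes the query and reports actual timings
-- alongside the planner's cost estimates.
EXPLAIN ANALYZE
SELECT m.title, m.run_time
FROM movie m
CROSS JOIN (SELECT avg(run_time) AS avg_time FROM movie) a  -- average computed once
WHERE m.run_time BETWEEN a.avg_time * 0.9
                     AND a.avg_time * 1.1;
```

Note that BETWEEN includes both endpoints, whereas the query in the question uses strict < and > comparisons, so rows lying exactly on a boundary would be treated differently.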

Note that adding an index on the run_time column should improve the performance of this query (assuming your movie table is large):

CREATE INDEX movie_run_time_idx ON movie(run_time);
answered 2013-08-10T04:38:37.613