SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (SELECT AVG (run_time) FROM movie) *1.1 AND m.run_time > (SELECT AVG (run_time) FROM movie) *0.9;
According to EXPLAIN, it costs 4.6 to 8 in PostgreSQL.
Basically, it selects the title and run time of the movies whose run time is within 10% of the average run time. The movie table is defined like this:
CREATE TABLE MOVIE
(
title varchar(40) NOT NULL,
production_year smallint NOT NULL,
country varchar(20) NOT NULL,
run_time smallint NOT NULL,
major_genre varchar(15) ,
CONSTRAINT pk_movie PRIMARY KEY(title,production_year)
);
It has 101 rows.
Since "SELECT AVG (run_time) FROM movie" is used twice, I thought of storing the average in a variable and referring to that variable in a second query. The MySQL version looks like this; it runs, and the combined time of the two commands is shorter than that of the reference query above.
SET @average = (SELECT AVG (run_time) FROM movie);
SELECT m.title , m.run_time FROM movie m WHERE m.run_time < @average *1.1 AND m.run_time > @average *0.9;
Now, how do I do the equivalent in PostgreSQL? My attempts are listed below.
When I try to define a variable in psql, like so:
\set average (SELECT AVG (run_time) FROM movie);
This works, but the next line fails:
SELECT m.title , m.run_time FROM movie m WHERE m.run_time < :average *1.1 AND m.run_time > :average *0.9;
ERROR: syntax error at or near "FROMmovie"
LINE 1: ...OM movie m WHERE m.run_time < (SELECTAVG(run_time)FROMmovie)...
That happens, I think, because \set stores my command literally, like a string variable, and doesn't evaluate it, unlike MySQL.
So I tried a temporary table instead:
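For reference, psql's \set concatenates its arguments without spaces, which is why the error shows "FROMmovie". A minimal sketch of what I was hoping for, assuming psql 9.3 or later where the \gset meta-command is available: \gset runs the query and stores each column of the single result row in a psql variable named after the column. The alias "average" is just my choice of name.

```sql
-- Run the aggregate once; \gset stores the result column "average"
-- in a psql variable of the same name (psql only, not plain SQL).
SELECT AVG(run_time) AS average FROM movie \gset

-- :average is now substituted as a literal number, not as a subquery.
SELECT m.title, m.run_time
FROM movie m
WHERE m.run_time < :average * 1.1
  AND m.run_time > :average * 0.9;
```

Since the substitution happens in the client, EXPLAIN on the second statement would only see a constant, so the cost of computing the average has to be measured separately.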
CREATE TEMP TABLE temptable ( theaverage float );
insert into temptable ( SELECT AVG (m.run_time) FROM movie m );
SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (Select * from temptable) *1.1 AND m.run_time > (Select * from temptable) *0.9;
These work, but the measured performance is not good:
explain analyze CREATE TEMP TABLE temptable ( theaverage float ); -- this cannot be analyzed; it fails with a syntax error:
ERROR: syntax error at or near "float"
LINE 1: ...in analyze CREATE TEMP TABLE temptable ( theaverage float );
explain insert into temptable ( SELECT AVG (m.run_time) FROM movie m ); -- costs about 2.3
explain SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (Select theaverage from temptable) *1.1 AND m.run_time > (Select theaverage from temptable) *0.9;
-- costs 63 to 66, which surprised me: that is significantly more than the unoptimized query, which costs 4.6 to 8.
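One possible explanation, which is my assumption and not something I have confirmed: a freshly created temp table has no planner statistics (autovacuum does not process temporary tables), so the planner may be guessing at its size instead of knowing it holds one row. Also, plain EXPLAIN only prints estimated costs, while EXPLAIN ANALYZE actually runs the statement and reports real timings. A sketch of how both could be checked:

```sql
-- Collect statistics so the planner knows temptable holds a single row.
ANALYZE temptable;

-- EXPLAIN ANALYZE executes the query and shows actual times next to the estimates.
EXPLAIN ANALYZE
SELECT m.title, m.run_time
FROM movie m
WHERE m.run_time < (SELECT theaverage FROM temptable) * 1.1
  AND m.run_time > (SELECT theaverage FROM temptable) * 0.9;
```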
I have also tried SELECT INTO, but I couldn't figure out how to use it correctly for my purposes.
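As far as I understand it, in plain SQL (outside PL/pgSQL) SELECT INTO creates a new table from the query result instead of filling a variable. A sketch of that form, where temptable2 is a hypothetical name I made up so it does not clash with the table above:

```sql
-- Creates the temp table and fills it with the average in one statement.
SELECT AVG(run_time) AS theaverage
INTO TEMP TABLE temptable2
FROM movie;
```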
So, to repeat the question: how can I write an optimized version of
"SELECT m.title , m.run_time FROM movie m WHERE m.run_time < (SELECT AVG (run_time) FROM movie) *1.1 AND m.run_time > (SELECT AVG (run_time) FROM movie) *0.9;"
in PostgreSQL, perhaps by using variables, and how can I measure its performance?
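To make the goal concrete, here is the kind of single-statement shape I have in mind: a sketch only, assuming a common table expression is an acceptable stand-in for a variable. The names "stats" and "average" are placeholders of mine. I have not established that this is actually cheaper than the original query.

```sql
-- Compute the average once in a CTE, then compare every row against it.
EXPLAIN ANALYZE
WITH stats AS (
    SELECT AVG(run_time) AS average FROM movie
)
SELECT m.title, m.run_time
FROM movie m
CROSS JOIN stats s
WHERE m.run_time < s.average * 1.1
  AND m.run_time > s.average * 0.9;
```

Because everything is in one statement, EXPLAIN ANALYZE covers both the aggregate and the filter, which would make it directly comparable to the reference query.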