I am running a fairly straightforward query in Hive, but I keep getting GC timeout exceeded and OOM errors.
The query is of the form:
    select a.field1 -- selecting about 30 cols!
    from table1 t1
    join table2 t2 on t1.field2 = t2.field2 and t1.date = '20120801'
    join table2 t3 on t1.field7 = t2.field2 and t1.date = '20120801'
I am selecting about 30 fields from this query. table1 is partitioned by date and contains around 300,000 records. table2 contains about 100 records.
Is there some way I can optimise this query?
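One rewrite that may be worth trying is a map-side join, sketched here under a few assumptions: that table2 (about 100 records) fits comfortably in memory, that the second join was meant to compare t1.field7 with t3.field2 rather than t2.field2, and that the selected columns come from t1 (the alias a in the query as posted is not defined anywhere). If the posted form is literal, the second ON clause never references t3, so each matching row would be paired with every row of t3, which could by itself inflate the result. The sketch also moves the partition filter into a single WHERE clause instead of repeating it in both joins.

```sql
-- Sketch only; not verified against the real tables.
-- MAPJOIN hint: ask Hive to load the small table (both aliases) into memory
-- and perform the join on the map side.
select /*+ MAPJOIN(t2, t3) */
       t1.field1                                   -- plus the other ~30 columns
from table1 t1
join table2 t2 on t1.field2 = t2.field2
join table2 t3 on t1.field7 = t3.field2            -- assumption: t3, not t2, was intended
where t1.date = '20120801';                        -- partition filter stated once
```

Depending on the Hive version, setting hive.auto.convert.join=true may achieve the same conversion without the hint.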