I think you need an INNER JOIN with a DISTINCT here:
SELECT distinct uns.*
FROM uniquestructures as uns
INNER JOIN uniqueproteins as unp on uns.ProteinID = unp.ProteinId
where LENGTH(unp.PDBASequence) < 20;
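Also, you would probably be better off creating a separate column on the uniqueproteins table (for example PDBASequenceLength) to hold the length of the PDBASequence column. You can then put an index on PDBASequenceLength and filter on that column instead of calling LENGTH(PDBASequence) in the query, since wrapping the column in a function call keeps MySQL from using an ordinary index on it. If the data is not static, create triggers to populate PDBASequenceLength each time a row is inserted into or updated in the uniqueproteins table. Thus: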
-- BEFORE, not AFTER: MySQL only allows assigning to NEW columns in BEFORE triggers.
CREATE TRIGGER uniqueproteins_length_insert_trg
BEFORE INSERT ON uniqueproteins FOR EACH ROW SET NEW.PDBASequenceLength = LENGTH(NEW.PDBASequence);
CREATE TRIGGER uniqueproteins_length_update_trg
BEFORE UPDATE ON uniqueproteins FOR EACH ROW SET NEW.PDBASequenceLength = LENGTH(NEW.PDBASequence);
alter table uniqueproteins add key `uniqueproteinsIdx2` (PDBASequenceLength);
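The statements above assume the PDBASequenceLength column already exists and is filled in for existing rows. A minimal sketch of that setup, to be run before creating the triggers and the index; the INT UNSIGNED type is my assumption, so pick whatever fits your sequence lengths:

```sql
-- Add the helper column (type is an assumption, not taken from your schema).
ALTER TABLE uniqueproteins
  ADD COLUMN PDBASequenceLength INT UNSIGNED NULL;

-- Backfill rows that were inserted before the triggers existed.
UPDATE uniqueproteins
SET PDBASequenceLength = LENGTH(PDBASequence);
```

Note that LENGTH() counts bytes in MySQL; if PDBASequence could ever hold multi-byte characters, CHAR_LENGTH() would be the one to use throughout.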
Your query could then be:
SELECT uns.*
FROM uniquestructures as uns
INNER JOIN uniqueproteins as unp on uns.ProteinID = unp.ProteinId
where unp.PDBASequenceLength < 20;
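To check whether the new index is actually being considered, you could prefix the query with EXPLAIN. This is only a sketch of how to inspect the plan; whether the optimizer picks uniqueproteinsIdx2 depends on your data and on how selective the `< 20` filter is:

```sql
-- Inspect the execution plan; look at the possible_keys and key columns
-- for the unp row to see whether uniqueproteinsIdx2 is used.
EXPLAIN
SELECT uns.*
FROM uniquestructures AS uns
INNER JOIN uniqueproteins AS unp ON uns.ProteinID = unp.ProteinId
WHERE unp.PDBASequenceLength < 20;
```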
Good luck!