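I have an SQL statement that does almost what I need. What I need is to find the genera that have pindent > 60 and coverage > 60 for both qseqid values.
I think I need some kind of join, perhaps as in this question.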

This is what I have so far. It does not give the result I want.

SELECT qseqid, genus, species, txid, sgi, pindent, coverage 
FROM vmdavis.insecta10000
WHERE pindent > 60
AND coverage > 60
AND qseqid in ("diaci0.9_transcript_99990000013040", "diaci0.9_transcript_99990000022677")
ORDER BY  genus, species, qseqid, coverage, pindent;

Here is an example of why this does not work. Anchon meets the criteria above for qseqid dia...040 but not for dia...677, so I do not want this row selected.

| diaci0.9_transcript_99990000013040 | Anchon           | sp. NYSM 95-02-01-35          |  265052 |   6467730 |   80.93 |  61.7597 |
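The behaviour described above can be reproduced on a tiny in-memory table. This is only a sketch: it assumes SQLite through Python's sqlite3 module instead of MySQL, keeps just four columns, and the Anchon row for the second qseqid is made up for illustration (the question does not show it). The WHERE clause is evaluated one row at a time, so the Anchon row for dia...040 passes even though the genus has no qualifying row for dia...677.

```python
import sqlite3

Q1 = "diaci0.9_transcript_99990000013040"
Q2 = "diaci0.9_transcript_99990000022677"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE insecta10000 "
    "(qseqid TEXT, genus TEXT, pindent REAL, coverage REAL)"
)
conn.executemany(
    "INSERT INTO insecta10000 VALUES (?, ?, ?, ?)",
    [
        (Q1, "Anchon", 80.93, 61.7597),     # row shown in the question
        (Q2, "Anchon", 55.0, 40.0),         # hypothetical row that fails both thresholds
        (Q1, "Agetocera", 82.37, 60.7963),  # row shown in the question
        (Q2, "Agetocera", 77.52, 78.6269),  # row shown in the question
    ],
)

# Same filter as the original query: each row is tested on its own,
# so nothing checks that a genus qualifies for BOTH qseqid values.
rows = conn.execute(
    """
    SELECT qseqid, genus
    FROM insecta10000
    WHERE pindent > 60
      AND coverage > 60
      AND qseqid IN (?, ?)
    ORDER BY genus, qseqid
    """,
    (Q1, Q2),
).fetchall()

for row in rows:
    print(row)
```

The unwanted Anchon row comes back alongside the two Agetocera rows, which is the problem the question is asking how to avoid.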

Here is a sample of the table:

mysql> SELECT qseqid, genus, species, txid, pindent, coverage FROM vmdavis.insecta10000 limit 5;
+------------------------------------+---------+-------------+--------+---------+----------+
| qseqid                             | genus   | species     | txid   | pindent | coverage |
+------------------------------------+---------+-------------+--------+---------+----------+
| diaci0.9_transcript_99990000000055 | Apis    | florea      |   7463 |    97.5 |  2.58107 |
| diaci0.9_transcript_99990000000055 | Bombus  | impatiens   | 132113 |    97.5 |   3.3534 |
| diaci0.9_transcript_99990000000055 | Nasonia | vitripennis |   7425 |    97.5 |  1.58343 |
| diaci0.9_transcript_99990000000055 | Bombus  | terrestris  |  30195 |    97.5 |  3.41207 |
| diaci0.9_transcript_99990000000055 | Apis    | mellifera   |   7460 |    97.5 |  2.88889 |
+------------------------------------+---------+-------------+--------+---------+----------+

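Here is an example of the output I want. In this case the genus Agetocera is listed twice because it meets the pindent and coverage criteria for both qseqid values. If Agetocera did not meet pindent > 60 and coverage > 60 for both qseqid values, then neither of these rows should be listed.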

| qseqid                             | genus     | species  | txid   | sgi       | pindent | coverage |
| diaci0.9_transcript_99990000013040 | Agetocera | mirablis | 715820 | 291191497 |   82.37 |  60.7963 |
| diaci0.9_transcript_99990000022677 | Agetocera | mirablis | 909986 | 309755769 |   77.52 |  78.6269 |

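I am very new to MySQL, and I suspect the answer to this question already exists on Stack Overflow. I just don't know what to search for, or whether I would understand the solution if I found it. If the question could be asked better, or you can suggest a better title, I will update it.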

2 Answers

Try something like this, using a subquery to get only the genera you want:

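SELECT *
FROM insecta10000 i
  JOIN
  (
  -- genera that have a qualifying hit for both qseqid values
  SELECT genus
  FROM insecta10000
  WHERE pindent > 60
    AND coverage > 60
    AND qseqid in ("diaci0.9_transcript_99990000013040", "diaci0.9_transcript_99990000022677")
  GROUP BY genus
  -- count distinct qseqid values; COUNT(*) would also count several species rows for one qseqid
  HAVING COUNT(DISTINCT qseqid) = 2
  ) i2 on i.genus = i2.genus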

Here is the SQL Fiddle.
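For anyone who wants to try the subquery idea without a MySQL server, here is a minimal sketch using SQLite through Python's sqlite3 module. Several things are assumptions rather than part of the answer: the table is cut down to four columns, the rows marked hypothetical are invented for the demonstration, and the outer WHERE clause is an addition so that only rows for the two qseqid values come back. The subquery counts distinct qseqid values, so a genus with several species hits for a single qseqid is not mistaken for a match.

```python
import sqlite3

Q1 = "diaci0.9_transcript_99990000013040"
Q2 = "diaci0.9_transcript_99990000022677"
Q3 = "diaci0.9_transcript_99990000000055"  # a third qseqid, outside the pair of interest

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE insecta10000 "
    "(qseqid TEXT, genus TEXT, pindent REAL, coverage REAL)"
)
conn.executemany(
    "INSERT INTO insecta10000 VALUES (?, ?, ?, ?)",
    [
        (Q1, "Anchon", 80.93, 61.7597),     # row shown in the question
        (Q1, "Anchon", 75.0, 65.0),         # hypothetical second species hit, same qseqid
        (Q2, "Anchon", 55.0, 40.0),         # hypothetical row that fails both thresholds
        (Q1, "Agetocera", 82.37, 60.7963),  # row shown in the question
        (Q2, "Agetocera", 77.52, 78.6269),  # row shown in the question
        (Q3, "Agetocera", 97.5, 70.0),      # hypothetical hit for an unrelated qseqid
    ],
)

rows = conn.execute(
    """
    SELECT i.qseqid, i.genus, i.pindent, i.coverage
    FROM insecta10000 AS i
    JOIN (
        -- genera with at least one qualifying row for each of the two qseqid values
        SELECT genus
        FROM insecta10000
        WHERE pindent > 60
          AND coverage > 60
          AND qseqid IN (?, ?)
        GROUP BY genus
        HAVING COUNT(DISTINCT qseqid) = 2
    ) AS g ON g.genus = i.genus
    -- repeat the filter so only the qualifying rows of those genera are listed
    WHERE i.pindent > 60
      AND i.coverage > 60
      AND i.qseqid IN (?, ?)
    ORDER BY i.genus, i.qseqid
    """,
    (Q1, Q2, Q1, Q2),
).fetchall()

for row in rows:
    print(row)
```

With this sample data only the two Agetocera rows for the requested qseqid values are returned. Anchon is dropped even though it has two qualifying rows, because both belong to the same qseqid.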

Good luck.

Answered 2013-02-04T06:24:35.877

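If you want the records that satisfy both coverage > 60 and pindent > 60 at the same time, your query already does that. However, if you are looking for something like a union, that is, records that satisfy the coverage condition or the pindent condition separately, try the following: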

SELECT * FROM (
SELECT qseqid, genus, species, txid, sgi, pindent, coverage 
FROM vmdavis.insecta10000
WHERE pindent > 60    
UNION
SELECT qseqid, genus, species, txid, sgi, pindent, coverage 
FROM vmdavis.insecta10000
WHERE coverage > 60) x
WHERE x.qseqid in ("diaci0.9_transcript_99990000013040", "diaci0.9_transcript_99990000022677")
ORDER BY  x.genus, x.species, x.qseqid, x.coverage, x.pindent
;
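To make the difference between the two readings concrete, here is a small sketch, again assuming SQLite through Python's sqlite3 module and a cut-down table. Apart from the Agetocera row, the data and the genus names GenusA, GenusB and GenusC are invented placeholders. On this data the UNION form returns the same rows as a single WHERE with OR, while the AND form keeps only the row that passes both thresholds.

```python
import sqlite3

Q1 = "diaci0.9_transcript_99990000013040"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE insecta10000 "
    "(qseqid TEXT, genus TEXT, pindent REAL, coverage REAL)"
)
conn.executemany(
    "INSERT INTO insecta10000 VALUES (?, ?, ?, ?)",
    [
        (Q1, "Agetocera", 82.37, 60.7963),  # row shown in the question: passes both
        (Q1, "GenusA", 97.5, 2.5),          # hypothetical: passes pindent only
        (Q1, "GenusB", 40.0, 90.0),         # hypothetical: passes coverage only
        (Q1, "GenusC", 10.0, 10.0),         # hypothetical: passes neither
    ],
)

# UNION of the two single-condition queries (UNION also removes duplicate rows).
union_rows = conn.execute(
    """
    SELECT * FROM (
        SELECT qseqid, genus FROM insecta10000 WHERE pindent > 60
        UNION
        SELECT qseqid, genus FROM insecta10000 WHERE coverage > 60
    ) AS x
    WHERE x.qseqid IN (?)
    ORDER BY x.genus
    """,
    (Q1,),
).fetchall()

# Either condition, written as one filter.
or_rows = conn.execute(
    "SELECT qseqid, genus FROM insecta10000 "
    "WHERE (pindent > 60 OR coverage > 60) AND qseqid IN (?) ORDER BY genus",
    (Q1,),
).fetchall()

# Both conditions at once, as in the original query.
and_rows = conn.execute(
    "SELECT qseqid, genus FROM insecta10000 "
    "WHERE pindent > 60 AND coverage > 60 AND qseqid IN (?) ORDER BY genus",
    (Q1,),
).fetchall()

print(union_rows)
print(or_rows)
print(and_rows)
```

Neither form checks that a genus qualifies for both qseqid values; they only differ in how the two thresholds are combined per row.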

Now that you have given the expected output (although the columns differ slightly):

http://sqlfiddle.com/#!2/f89ce/4

SELECT qseqid, genus, species, txid, indent, coverage 
FROM demo
WHERE indent > 60
AND coverage > 60
AND qseqid in ("diaci0.9_transcript_99990000013040", "diaci0.9_transcript_99990000022677")
ORDER BY  genus, species, qseqid, coverage, indent;

|                             QSEQID |     GENUS |  SPECIES |   TXID | INDENT | COVERAGE |
------------------------------------------------------------------------------------------
| diaci0.9_transcript_99990000013040 | Agetocera | mirablis | 715820 |  82.37 |  60.7963 |
| diaci0.9_transcript_99990000022677 | Agetocera | mirablis | 909986 |  77.52 |  78.6269 |
Answered 2013-02-04T05:46:04.640