我是 DESeq2 的初学者。目前,我正在尝试使用不同的设计公式来分析来自 Bioconductor package 的数据airway
。
我按照DESeq2
小插图中的步骤:RNA-seq 工作流程来计算统计结果。但是,当我在设计公式中指定交互项时,出现以下错误消息。
Error in checkForExperimentalReplicates(object, modelMatrix) :
The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer supported since v1.22.
我的问题是,当我按照说明example("results")
指定设计公式时会发生错误。为什么会出现这个错误,如何生成具有交互效果的结果??
如果有人可以帮助我解决这个问题,我将非常感激。
- 从加载数据
package(airway)
> # Loading data
> library("airway")
> library("DESeq2")
> data(gse)
> gse
class: RangedSummarizedExperiment
dim: 58294 8
metadata(6): tximetaInfo quantInfo ... txomeInfo txdbInfo
assays(3): counts abundance length
rownames(58294): ENSG00000000003.14 ENSG00000000005.5 ... ENSG00000285993.1
ENSG00000285994.1
rowData names(1): gene_id
colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
colData names(3): names donor condition
- 重命名和重新调整变量
> # rename variable
> (gse$cell <- gse$donor)
[1] N61311 N61311 N052611 N052611 N080611 N080611 N061011 N061011
Levels: N052611 N061011 N080611 N61311
> (gse$dex <- gse$condition)
[1] Untreated Dexamethasone Untreated Dexamethasone Untreated Dexamethasone Untreated
[8] Dexamethasone
Levels: Untreated Dexamethasone
> levels(gse$dex) = c("untrt", "trt")
> levels(gse$dex)
[1] "untrt" "trt"
- 用设计公式构建 DESeqDataSet
~ cell + dex
并进行分析。
> # building DESeqDataSet
> dds <- DESeqDataSet(gse, design = ~ cell + dex)
using counts and average transcript lengths from tximeta
> dds
class: DESeqDataSet
dim: 58294 8
metadata(7): tximetaInfo quantInfo ... txdbInfo version
assays(3): counts abundance avgTxLength
rownames(58294): ENSG00000000003.14 ENSG00000000005.5 ... ENSG00000285993.1
ENSG00000285994.1
rowData names(1): gene_id
colnames(8): SRR1039508 SRR1039509 ... SRR1039520 SRR1039521
colData names(5): names donor condition cell dex
> # Filtering
> keep = rowSums(counts(dds)) > 1
> dds = dds[keep,]
> dim(dds)
[1] 31604 8
> # Defferential analysis
> design(dds)
~cell + dex
> dds = DESeq(dds)
estimating size factors
using 'avgTxLength' from assays(dds), correcting for library size
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
> resultsNames(dds)
[1] "Intercept" "cell_N061011_vs_N052611" "cell_N080611_vs_N052611"
[4] "cell_N61311_vs_N052611" "dex_trt_vs_untrt"
> results(dds, contrast = c("dex", "untrt", "trt"))
log2 fold change (MLE): dex untrt vs trt
Wald test p-value: dex untrt vs trt
DataFrame with 31604 rows and 6 columns
baseMean log2FoldChange lfcSE stat pvalue padj
<numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
ENSG00000000003.14 739.940717 0.3611537 0.106869 3.379419 0.000726392 0.00531137
ENSG00000000419.12 511.735722 -0.2063147 0.128665 -1.603509 0.108822318 0.29318870
ENSG00000000457.13 314.194855 -0.0378308 0.158633 -0.238479 0.811509461 0.92255697
ENSG00000000460.16 79.793622 0.1152590 0.314991 0.365912 0.714430444 0.87298038
ENSG00000000938.12 0.307267 1.3691185 3.503764 0.390757 0.695977205 NA
... ... ... ... ... ... ...
ENSG00000285979.1 38.353886 -0.3423657 0.359511 -0.952310 0.340940 0.600750
ENSG00000285987.1 1.562508 -0.7064145 1.547295 -0.456548 0.647996 NA
ENSG00000285990.1 0.642315 -0.3647333 3.433276 -0.106235 0.915396 NA
ENSG00000285991.1 11.276284 0.1165515 0.748601 0.155692 0.876275 0.952921
ENSG00000285994.1 3.651041 0.0960094 1.068660 0.089841 0.928414 NA
- 用交互项分析数据
~ cell + dex + cell:dex
。
在这一步中,在我用交互项指定设计公式之后~ cell + dex + cell:dex
。当我尝试DESeq()
在数据集上运行函数时发生错误。
我使用~ cell + dex + cell:dex
的设计公式与他们在 中演示的交互设计公式相同example("results")
。
> # Defferential analysis using interaction term
> dds_int = dds
> design(dds_int) = formula(~ cell + dex + cell:dex)
> dds_int = DESeq(dds_int)
using pre-existing normalization factors
estimating dispersions
found already estimated dispersions, replacing these
Error in checkForExperimentalReplicates(object, modelMatrix) :
The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer supported since v1.22.
- 我尝试在构建
DESeqDataSet
. 但是,当我尝试在DESeq()
数据集上运行时发生了同样的错误
> dds_int = DESeqDataSet(gse, design = ~ cell + dex + cell:dex)
using counts and average transcript lengths from tximeta
> dim(dds_int)
[1] 58294 8
>
> keep = rowSums(counts(dds_int)) > 1
> dds_int = dds_int[keep,]
> dim(dds_int)
[1] 31604 8
>
> design(dds_int)
~cell + dex + cell:dex
>
> dds_int = DESeq(dds_int)
estimating size factors
using 'avgTxLength' from assays(dds), correcting for library size
estimating dispersions
Error in checkForExperimentalReplicates(object, modelMatrix) :
The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer supported since v1.22.
- 我试图创建
model.matrix
并使用它来运行DESeq
分析。但是,仍然会发生相同的错误。
> # model formula
> dds_int = dds
> attach(as.data.frame(colData(dds_int)))
The following objects are masked from as_data_frame(colData(dds_int)):
cell, condition, dex, donor, names
>
> mm = model.matrix( ~ cell + dex + cell:dex)
> design(dds_int) = mm
> design(dds_int)
(Intercept) cellN061011 cellN080611 cellN61311 dextrt cellN061011:dextrt cellN080611:dextrt
1 1 0 0 1 0 0 0
2 1 0 0 1 1 0 0
3 1 0 0 0 0 0 0
4 1 0 0 0 1 0 0
5 1 0 1 0 0 0 0
6 1 0 1 0 1 0 1
7 1 1 0 0 0 0 0
8 1 1 0 0 1 1 0
cellN61311:dextrt
1 0
2 1
3 0
4 0
5 0
6 0
7 0
8 0
attr(,"assign")
[1] 0 1 1 1 2 3 3 3
attr(,"contrasts")
attr(,"contrasts")$cell
[1] "contr.treatment"
attr(,"contrasts")$dex
[1] "contr.treatment"
>
> dds_int = DESeq(dds_int, test="Wald", modelMatrixType = "standard")
using supplied model matrix
using pre-existing normalization factors
estimating dispersions
found already estimated dispersions, replacing these
Error in checkForExperimentalReplicates(object, modelMatrix) :
The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer supported since v1.22.
我在上述代码块中创建的交互项的两个设计公式与example(results)
. 我想知道为什么错误不断发生。如何生成具有交互效果的结果?
谢谢大家的时间。