Single-cell analysis error: you cannot scale the data using all genes

Latest recommended article published on 2025-02-14 17:25:23

Charlotteoqq

Article link: https://blog.youkuaiyun.com/Charlotteoqq/article/details/139794812

You cannot scale the data using all genes
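
Why this fails: `ScaleData` stores its result as a dense matrix with one double-precision value (8 bytes) per gene per cell, so passing every gene multiplies the memory needed by the total gene count. The sketch below is a back-of-the-envelope estimate in base R and was not part of the original session. `dense_matrix_gib` is a hypothetical helper. The cell count is the 1,107,476 that appears in the plotting warnings in the session (the object may hold more cells), and 30,000 genes is an assumed figure, because the log does not report the real gene count.

```r
# Hypothetical helper: memory needed for a dense genes x cells matrix of doubles.
dense_matrix_gib <- function(n_genes, n_cells, bytes_per_value = 8) {
  n_genes * n_cells * bytes_per_value / 1024^3
}

n_cells <- 1107476  # taken from the warnings in the log; a lower bound on the cell count

# 2,000 variable features, as selected by FindVariableFeatures in the log
variable_only <- dense_matrix_gib(2000, n_cells)

# 30,000 genes: an assumed total, not a number reported in the log
all_genes <- dense_matrix_gib(30000, n_cells)

cat(sprintf("variable features only: %.1f GiB\n", variable_only))
cat(sprintf("all genes (assumed):    %.1f GiB\n", all_genes))
```

Under these assumptions the variable-feature matrix needs about 16.5 GiB, while the all-gene matrix needs about 248 GiB, which is the same order of magnitude as the failed 457.0 Gb allocation.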

> pbmc_harmony196 <- NormalizeData(harmony196,normalization.method = "LogNormalize",scale.factor = 10000, verbose = F)
> pbmc_harmony196 <- FindVariableFeatures(pbmc_harmony196, selection.method = "vst", nfeatures = 2000, verbose = F)
Warning messages:
1: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
pseudoinverse used at -2.2095
2: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
neighborhood radius 0.49905
3: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
reciprocal condition number 7.34e-16
4: In simpleLoess(y, x, w, span, degree = degree, parametric = parametric, :
There are other near singularities as well. 0.090619
> all.genes <- rownames(pbmc_harmony196)
> pbmc_harmony196 <- ScaleData(pbmc_harmony196, features = all.genes, verbose = F)
Error: cannot allocate vector of size 457.0 Gb
> pbmc_harmony196 <- RunPCA(pbmc_harmony196,features = VariableFeatures(object = pbmc_harmony196), npcs = 30, verbose = F)
Warning in PrepDR5(object = object, features = features, layer = layer, :
The following features were not available: DENND4B, ATP6V0E1, STARD7, SLC39A1, BOD1, PPARA, COPA, TSHZ1, ELAVL1, PHKG2, MLX, GCLC, CRTC2, KIF5B, PSMD4, HSD17B7, SAT1, PPP6R3, BAG6, ZBTB24, TRIM69, XPOT, DBP, IGSF8, COPZ1, TPM2, EZH1, TNFAIP1, EDEM2, ABT1, MTO1, TFPT, IFT20, IRF7, NARF, OXSM, MGST3, AK9, TRAF3IP2, ANKRD13A, INPPL1, CCDC137, ACTB, MMS19, SLC12A9, TMEM127, BLZF1, TMEM191C, MAD2L1BP, PPP1R14A, NPC2, SPAST, UCKL1, CUL9, HAX1, PPIB, STX8, CHMP2B, ASMTL, GAPDH, ILF3, HIF1AN, ALDH2, RAPGEF1, CDIPT, UBE4B, SH3KBP1, DOCK10, BROX, GSTP1, SLC7A7, KMT2D, UFL1, BEX4, CAPN7, CTNNBIP1, PURA, KLF16, UBE2V2, SPTY2D1, NFKBIA, UBL4A, BCL11A, UBN2, RAB6A, CLEC12A, RAB2A, DUSP1, EXOC5, MAP2K1, FGR, C19orf53, KCTD12, MT-ND2, VPS37A, DUT, OTUD5, LINC00667, KLHL24, UTP15, WASL, MED28, CCDC77, SON, PLEK, ZZEF1, MT-ATP8, TRIM22, SF3A3, ARID1A, GMPR, P4HTM, POU2F2, IGFBP7, CEBPB, C19orf38, PHACTR4, EXOC7, SSBP4, SSR3, ZC3H3, UBE2S, MDM4, PRPF18, ORC2, PTPRE, ZNF487, FCGR2B, MGST1, SLC25A1, PRDX6, [... truncated]
Warning message:
In LayerData.Assay5(object = object, layer = layer) :
multiple layers are identified by scale.data.70 scale.data.71 scale.data.72 scale.data.73 scale.data.74 scale.data.75 scale.data.76 scale.data.77 scale.data.78 scale.data.79 scale.data.80 scale.data.81 scale.data.82
only the first layer is used
> pbmc_harmony196 <- RunUMAP(pbmc_harmony196, reduction = "pca", dims = 1:30, verbose = F)
Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
> saveRDS(pbmc_harmony196, file = "harmony198/2/pbmc_harmony196.rds")
> png("1.png")
options(repr.plot.height = 5, repr.plot.width = 12) # adjust the plot size
DimPlot(pbmc_harmony196, reduction = "umap",group.by = "dataset",raster=FALSE) +plot_annotation(title = "Harmony196 before integration") # UMAP, grouped by dataset; umap_naive was used in an earlier version
dev.off()
Warning message:
Removed 1107476 rows containing missing values (`geom_point()`).
null device
1
> png("1.png")
VlnPlot(object =pbmc_harmony196, features = "PC_1", pt.size = 0, group.by = "dataset",raster=FALSE) # PCA, grouped by dataset, violin plot. Other parameters: split.by = ""; pt.size = .1 draws the individual cells on the plot
dev.off()
Error in if (all(data[, feature] == data[, feature][1])) { :
missing value where TRUE/FALSE needed
In addition: Warning message:
Removing 1107476 cells missing data for vars requested
> png("1.png")
DimPlot(object =pbmc_harmony196, reduction = "pca", pt.size = .1, group.by = "dataset",raster=FALSE) # PCA, grouped by dataset, DimPlot
dev.off()
Warning message:
Removed 1107476 rows containing missing values (`geom_point()`).
png
2
> png("1.png")
VlnPlot(object =pbmc_harmony196, features = "PC_1", pt.size = 0, group.by = "dataset",raster=FALSE) # PCA, grouped by dataset, violin plot. Other parameters: split.by = ""; pt.size = .1 draws the individual cells on the plot
dev.off()
Error in if (all(data[, feature] == data[, feature][1])) { :
missing value where TRUE/FALSE needed
In addition: Warning message:
Removing 1107476 cells missing data for vars requested
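
A possible fix, sketched on simulated data because the original object is not available: scale only the variable features, which is also what `ScaleData` does by default when `features` is not supplied, then confirm that every cell received PCA coordinates before plotting. The toy counts, the object name `obj`, and the sizes (200 genes, 60 cells, 50 variable features, 10 PCs) are assumptions for illustration only. Reading the log, the missing values behind the `VlnPlot` error probably come from `RunPCA` using only the first of several `scale.data.*` layers after `ScaleData` failed, but the session does not confirm this.

```r
library(Seurat)

set.seed(1)
n_genes <- 200
n_cells <- 60

# Simulated counts (assumption): each gene gets its own Poisson rate so that
# mean and variance differ across genes.
gene_rates <- seq(0.5, 5, length.out = n_genes)
counts <- matrix(
  rpois(n_genes * n_cells, lambda = rep(gene_rates, times = n_cells)),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), paste0("cell", seq_len(n_cells)))
)
counts <- Matrix::Matrix(counts, sparse = TRUE)

obj <- CreateSeuratObject(counts = counts)
obj <- NormalizeData(obj, normalization.method = "LogNormalize",
                     scale.factor = 10000, verbose = FALSE)
obj <- FindVariableFeatures(obj, selection.method = "vst",
                            nfeatures = 50, verbose = FALSE)

# Scale only the variable features instead of rownames(obj).
obj <- ScaleData(obj, features = VariableFeatures(obj), verbose = FALSE)
obj <- RunPCA(obj, features = VariableFeatures(obj), npcs = 10, verbose = FALSE)

# Sanity check before plotting: one row of PCA coordinates per cell, no NA.
pcs <- Embeddings(obj, reduction = "pca")
stopifnot(nrow(pcs) == ncol(obj), !anyNA(pcs))
```

On the real object the same pattern would be `ScaleData(pbmc_harmony196, features = VariableFeatures(pbmc_harmony196))`. If the object still holds split layers, as the `scale.data.70 ...` warning suggests, joining them with `JoinLayers()` before scaling may also be needed; that step is untested here.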