The short version, straight to the code:
mydataframe <- format.data.frame(mydataframe, digits = 3)
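As a quick check of what that one-liner returns, here is a minimal sketch on a made-up data frame. The column names and values are hypothetical and only serve to illustrate the call.

```r
# Made-up data frame, for illustration only
mydataframe <- data.frame(
  value = c(0.1234, 12.3456, 3.0e-20),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Call the data.frame method directly, as in the one-liner above
formatted <- format.data.frame(mydataframe, digits = 3)

print(class(formatted))          # the result is still a data frame
print(dim(formatted))            # same number of rows and columns
print(sapply(formatted, class))  # but every column is now character
```

The shape of the data frame is preserved, but the numbers have become strings, so this is best done as the last step before writing the table out.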
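The full explanation follows.
Using R's built-in format() function
Note that format() is a generic function. When called on an ordinary atomic vector it dispatches to the default method, format.default(), and returns a character vector. For example: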
> x2 <- c(0, 0.1, 0.12, 0.123, 0.1234)
> format(x2, digits = 3)
[1] "0.000" "0.100" "0.120" "0.123" "0.123"
>
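To make the vector case concrete, the following sketch repeats the example and checks two things: that the result is a character vector, and that the generic produces the same result as calling format.default() directly. It uses only base R.

```r
x2 <- c(0, 0.1, 0.12, 0.123, 0.1234)

# format() returns strings, not numbers
formatted <- format(x2, digits = 3)
print(formatted)
print(is.character(formatted))

# For a plain numeric vector the generic ends up in format.default()
print(identical(formatted, format.default(x2, digits = 3)))

# Converting back gives numbers again, at the displayed precision
print(as.numeric(formatted))
```

Here digits means significant digits, and all elements are padded to a common width, which is why 0 is shown as "0.000".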
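For a data frame, which method runs depends on the object's class. A plain data.frame is handled by format.data.frame(), but an object that carries additional classes can be dispatched to a different method. Judging by the output further down, all.markers here is a grouped tibble, and format() on a tibble returns its printed representation as a character vector: a few header lines followed by one string per row, rather than a data frame with m rows and n columns. For example, suppose we have the following data frame: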
... some code to create all.markers ...
write.csv(all.markers, file = "all.markers.csv", quote = TRUE, row.names = FALSE)
After running the statements above, inspect all.markers.csv in bash:
$ cat all.markers.csv
"p_val","avg_logFC","pct.1","pct.2","p_val_adj","cluster","gene"
0,2.98729133291798,1,0.355,0,"0","APOA1"
0,3.12262292282638,1,0.367,0,"1","FABP6"
0,1.54391599865321,0.994,0.163,0,"2","PRAC1"
0,2.38531660468238,0.999,0.393,0,"3","REG1A"
0,1.29492333281851,0.973,0.596,0,"4","PLA2G2A"
0,1.37730319536905,0.945,0.557,0,"5","LCN2"
3.00232626745486e-209,1.57361792700461,0.891,0.568,5.769570388168e-205,"6","HIST1H4C"
1.51822798006619e-74,2.30200569647446,0.868,0.86,2.9175787092932e-70,"7","ZG16"
0,1.40272942709707,0.973,0.294,0,"8","APOB"
0,2.51971944902175,0.992,0.361,0,"9","WFDC2"
0,2.31627723332725,0.952,0.241,0,"10","CA1"
0,2.34715846724303,0.921,0.175,0,"11","CA7"
0,2.60504795350165,0.964,0.174,0,"12","CA7"
0,2.54344448071181,0.989,0.164,0,"13","CEACAM7"
2.33161393250006e-260,1.48020981160154,1,0.239,4.48066249408536e-256,"14","PRAC1"
2.34915083163248e-40,5.25514235426146,0.536,0.149,4.51436315314814e-36,"15","PYY"
4.61693273127947e-100,4.12566769658387,0.99,0.272,8.87235962969975e-96,"16","REG4"
1.91512152741566e-37,3.84171627844603,0.984,0.589,3.68028903923468e-33,"17","GUCA2B"
If you format it by calling format() directly, for example
... some code to create all.markers ...
all.markers <- format(all.markers, digits = 3)
write.csv(all.markers, file = "all.markers.csv", quote = TRUE, row.names = FALSE)
After running the statements above, inspect all.markers.csv in bash:
$ cat all.markers.csv
"x"
"# A tibble: 18 x 7"
"# Groups: cluster [18]"
" p_val avg_logFC pct.1 pct.2 p_val_adj cluster gene "
" <dbl> <dbl> <dbl> <dbl> <dbl> <fct> <chr> "
" 1 0. 2.99 1 0.355 0. 0 APOA1 "
" 2 0. 3.12 1 0.367 0. 1 FABP6 "
" 3 0. 1.54 0.994 0.163 0. 2 PRAC1 "
" 4 0. 2.39 0.999 0.393 0. 3 REG1A "
" 5 0. 1.29 0.973 0.596 0. 4 PLA2G2A "
" 6 0. 1.38 0.945 0.557 0. 5 LCN2 "
" 7 3.00e-209 1.57 0.891 0.568 5.77e-205 6 HIST1H4C"
" 8 1.52e- 74 2.30 0.868 0.86 2.92e- 70 7 ZG16 "
" 9 0. 1.40 0.973 0.294 0. 8 APOB "
"10 0. 2.52 0.992 0.361 0. 9 WFDC2 "
"11 0. 2.32 0.952 0.241 0. 10 CA1 "
"12 0. 2.35 0.921 0.175 0. 11 CA7 "
"13 0. 2.61 0.964 0.174 0. 12 CA7 "
"14 0. 2.54 0.989 0.164 0. 13 CEACAM7 "
"15 2.33e-260 1.48 1 0.239 4.48e-256 14 PRAC1 "
"16 2.35e- 40 5.26 0.536 0.149 4.51e- 36 15 PYY "
"17 4.62e-100 4.13 0.99 0.272 8.87e- 96 16 REG4 "
"18 1.92e- 37 3.84 0.984 0.589 3.68e- 33 17 GUCA2B "
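If you want all.markers to remain a data.frame, call the data.frame method, format.data.frame(), directly. Note that every column in the result is a character column, and numeric values are padded to a common width: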
... some code to create all.markers ...
all.markers <- format.data.frame(all.markers, digits = 3)
write.csv(all.markers, file = "all.markers.csv", quote = TRUE, row.names = FALSE)
After running the statements above, inspect all.markers.csv in bash:
$ cat all.markers.csv
"p_val","avg_logFC","pct.1","pct.2","p_val_adj","cluster","gene"
" 0.00e+00","2.99","1.000","0.355"," 0.00e+00","0","APOA1"
" 0.00e+00","3.12","1.000","0.367"," 0.00e+00","1","FABP6"
" 0.00e+00","1.54","0.994","0.163"," 0.00e+00","2","PRAC1"
" 0.00e+00","2.39","0.999","0.393"," 0.00e+00","3","REG1A"
" 0.00e+00","1.29","0.973","0.596"," 0.00e+00","4","PLA2G2A"
" 0.00e+00","1.38","0.945","0.557"," 0.00e+00","5","LCN2"
"3.00e-209","1.57","0.891","0.568","5.77e-205","6","HIST1H4C"
" 1.52e-74","2.30","0.868","0.860"," 2.92e-70","7","ZG16"
" 0.00e+00","1.40","0.973","0.294"," 0.00e+00","8","APOB"
" 0.00e+00","2.52","0.992","0.361"," 0.00e+00","9","WFDC2"
" 0.00e+00","2.32","0.952","0.241"," 0.00e+00","10","CA1"
" 0.00e+00","2.35","0.921","0.175"," 0.00e+00","11","CA7"
" 0.00e+00","2.61","0.964","0.174"," 0.00e+00","12","CA7"
" 0.00e+00","2.54","0.989","0.164"," 0.00e+00","13","CEACAM7"
"2.33e-260","1.48","1.000","0.239","4.48e-256","14","PRAC1"
" 2.35e-40","5.26","0.536","0.149"," 4.51e-36","15","PYY"
"4.62e-100","4.13","0.990","0.272"," 8.87e-96","16","REG4"
" 1.92e-37","3.84","0.984","0.589"," 3.68e-33","17","GUCA2B"
For more of format()'s arguments and usage, see the official documentation:
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/format
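The format.data.frame() output shown earlier turns every column into strings and pads numbers with leading spaces. If that is not what you want, two alternatives can be sketched in base R: round with signif() so the columns stay numeric, or format column by column with trim = TRUE to drop the padding. The data frame below is a made-up stand-in for all.markers with only two rows, and the variable names rounded and trimmed are hypothetical.

```r
# Made-up stand-in for all.markers (two rows, three columns)
df <- data.frame(
  p_val     = c(0, 3.00232626745486e-209),
  avg_logFC = c(2.98729133291798, 1.57361792700461),
  gene      = c("APOA1", "HIST1H4C"),
  stringsAsFactors = FALSE
)

# Which columns are numeric?
num_cols <- vapply(df, is.numeric, logical(1))

# Option 1: round to 3 significant digits, columns remain numeric
rounded <- df
rounded[num_cols] <- lapply(df[num_cols], signif, digits = 3)
print(sapply(rounded, class))

# Option 2: format to strings, but without the common-width padding
trimmed <- df
trimmed[num_cols] <- lapply(df[num_cols], format, digits = 3, trim = TRUE)
print(trimmed)

# Assumption: if the object is a tibble rather than a plain data.frame,
# converting it first with as.data.frame() should also let a plain
# format() call reach format.data.frame().
```

Option 1 is usually the better choice when the CSV will be read back for further computation, since the values stay numeric.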




