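RNA-seq analysis: a log of errors and how I solved them

This post records the concrete steps of a bioinformatics analysis on Linux: decompressing .gz files, downloading a GFF3 annotation file, reading htseq-count results, creating files, installing HTSeq, and resolving a common error when processing BAM files.

My platform is 64-bit Linux, so every command below was run on Linux.

1. Decompressing a file.gz file on Linux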


[References]
[1] https://blog.youkuaiyun.com/z69183787/article/details/81739901
[2] https://www.cnblogs.com/wangshouchang/p/7748527.html

[Solution]

gzip -d gencode.v29.annotation.gff3.gz
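
For anyone who prefers to do this step from a script, here is a minimal Python 3 sketch of the same operation using only the standard library. The function name `gunzip` and the `keep_original` parameter are my own choices for illustration, not part of any tool mentioned above.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, keep_original=False):
    """Decompress a .gz file next to the original and return the output path."""
    src = Path(src)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")
    dst = src.with_suffix("")  # drop only the trailing ".gz"
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # stream in chunks; large files stay off the heap
    if not keep_original:
        src.unlink()  # mimic `gzip -d`, which removes the compressed file
    return dst


# Example call (the file name is the one used in this post):
# gunzip("gencode.v29.annotation.gff3.gz")
```

Passing `keep_original=True` keeps the compressed file, much like `gzip -dk` on the command line.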

 

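2. Downloading the GFF file
The GENCODE database provides a human genome GFF3 annotation file whose sequence names start with "chr".
https://www.gencodegenes.org/human/release_29.html
For this experiment I downloaded the "Comprehensive gene annotation" file (Regions: CHR), the annotation for the human chromosomes.
I used it to fix the case where every count from the htseq-count step came out as 0: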

ENSG00000000003    0
ENSG00000000005    0
ENSG00000000419    0
ENSG00000000457    0
ENSG00000000460    0
ENSG00000000938    0
ENSG00000000971    0
ENSG00000001036    0
ENSG00000001084    0
ENSG00000001167    0
ENSG00000001460    0
ENSG00000001461    0
ENSG00000001497    0
ENSG00000001561    0
ENSG00000001617    0
ENSG00000001626    0
ENSG00000001629    0
ENSG00000001630    0
ENSG00000001631    0
ENSG00000002016    0
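
Since switching to the "chr"-prefixed annotation fixed the zeros, my reading is that the sequence names in the annotation and in the alignments did not match (for example `chr1` versus `1`). That is an inference on my part, not something the tools reported. The sketch below is one way to check this before running htseq-count. The function names are hypothetical, and the header text is assumed to come from `samtools view -H your.bam`.

```python
import gzip


def gff_seqnames(path):
    """Collect the sequence names found in column 1 of a GFF3/GTF file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    names = set()
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip comments, directives and blank lines
            names.add(line.split("\t", 1)[0])
    return names


def sam_header_seqnames(header_text):
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def shared_seqnames(annotation_names, alignment_names):
    """Names present on both sides; an empty set means no read can hit a feature."""
    return set(annotation_names) & set(alignment_names)
```

If `shared_seqnames` returns an empty set, every read will fall outside the annotated features, which would be consistent with all-zero counts.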


[Question link] https://www.bioinfo.info/?/question/462
Someone else's summary of four ways to download GFF files is also worth a look:
https://blog.youkuaiyun.com/u011262253/article/details/89363809

 

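3. Reading the htseq-count result file
The result file has two columns: the first is the gene ID (for example ENSMUSG00000000001.4) and the second is the number of reads counted for that gene.
Summary lines appear at the end of the file.
__no_feature 42987809     # reads that could not be assigned to any feature
__ambiguous 183025        # reads that could have been assigned to more than one feature
__too_low_aQual 0         # reads whose mapping quality is below the value set with -a
__not_aligned 0           # reads present in the SAM file but not aligned
__alignment_not_unique 0   # reads aligned to more than one location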
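
To make the two-part layout concrete, here is a small Python 3 sketch that separates the per-gene counts from the `__` summary lines. The function name `split_htseq_counts` is hypothetical. It assumes whitespace-separated columns and ignores anything after the second column.

```python
def split_htseq_counts(lines):
    """Split htseq-count output into (gene counts, summary counters)."""
    genes, summary = {}, {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, value = fields[0], int(fields[1])
        # Summary counters are the names that begin with two underscores.
        if name.startswith("__"):
            summary[name] = value
        else:
            genes[name] = value
    return genes, summary


def all_zero(genes):
    """True when no gene received a single read, the symptom from section 2."""
    return all(count == 0 for count in genes.values())
```

With a real file, a call would look like `genes, summary = split_htseq_counts(open("counts.txt"))`, where `counts.txt` is a placeholder name. A very large `__no_feature` value next to all-zero gene counts points back to the annotation problem described in section 2.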

The next step is to analyze and merge the read counts further.
For details see: https://www.jianshu.com/p/d8d5e0b2e33b

4. Creating a new file on Linux

touch new_file.sh

 

5. Installing HTSeq
[Official installation guide]
https://htseq.readthedocs.io/en/release_0.11.1/install.html#installation-on-linux
My platform is 64-bit Ubuntu
with Python 2.7.1,
so I installed with this command:

sudo apt-get install build-essential python2.7-dev python-numpy python-matplotlib python-pysam python-htseq


Installation succeeded!
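
One quick way to confirm the result is to check that the Python modules can actually be imported. The sketch below is my own helper, not part of HTSeq. The function name `missing_modules` is hypothetical, and I am assuming the import names `HTSeq`, `numpy`, `pysam` and `matplotlib` for the packages installed above. It avoids Python 3 only syntax, so it should also run under Python 2.7.

```python
def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


# Example call, to be run with the same interpreter that htseq-count uses:
# print(missing_modules(["HTSeq", "numpy", "pysam", "matplotlib"]))
```

An empty list means every module was found by that interpreter.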


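Before that, I had followed other people's notes and tried many approaches, none of which worked.
https://www.cnblogs.com/triple-y/p/9338890.html
http://blog.sina.com.cn/s/blog_68ddca510102wts6.html
Those attempts failed with errors such as the following. The log shows that setuptools could not be imported, so setup.py fell back to distutils: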

symlinking folders for python2
Could not import 'setuptools', falling back to 'distutils'.
Traceback (most recent call last):
  File "setup.py", line 200, in <module>
    **kwargs
  File "/usr/lib/python2.7/distutils/core.py", line 111, in setup
    _setup_distribution = dist = klass(attrs)
  File "/usr/lib/python2.7/distutils/dist.py", line 259, in __init__
    getattr(self.metadata, "set_" + key)(val)
  File "/usr/lib/python2.7/distutils/dist.py", line 1220, in set_requires
    distutils.versionpredicate.VersionPredicate(v)
  File "/usr/lib/python2.7/distutils/versionpredicate.py", line 113, in __init__
    raise ValueError("expected parenthesized list: %r" % paren)
ValueError: expected parenthesized list: '>=0.9.0'

 

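I also tried pip on Windows. (HTSeq reportedly cannot be used on Windows. The log below, though, shows connection timeouts to pypi.org rather than a platform error.)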

C:\Users\Administrator>pip install HTSeq
Collecting HTSeq
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x000002074F161A20>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/htseq/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x000002074F161908>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/htseq/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x000002074F161EF0>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/htseq/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x000002074F161518>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/htseq/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x000002074F161550>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/htseq/
  Could not find a version that satisfies the requirement HTSeq (from versions: )
No matching distribution found for HTSeq
You are using pip version 18.0, however version 19.1.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

 

6. The "file may be truncated" error when processing a BAM file

[Error output]

Error occured when reading beginning of SAM/BAM file.

no BGZF EOF marker; file may be truncated

[Checking whether the BAM file is complete]

samtools view 42_align_sorted.bam|tail
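
The error message refers to the BGZF end-of-file marker, a fixed 28-byte empty block that the SAM/BAM specification places at the end of a complete BAM file. As a complement to the samtools command, here is a Python 3 sketch that looks for that marker directly. The function name `has_bgzf_eof` is hypothetical. A missing marker suggests truncation, but finding it does not prove the rest of the file is intact.

```python
import os

# The 28-byte empty BGZF block defined in the SAM/BAM specification.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "42430200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False  # too short to contain the marker at all
    with open(path, "rb") as handle:
        handle.seek(-len(BGZF_EOF), os.SEEK_END)  # jump to the last 28 bytes
        return handle.read() == BGZF_EOF


# Example call (the file name is the one used in this post):
# print(has_bgzf_eof("42_align_sorted.bam"))
```

This reads only the last 28 bytes, so it is fast even for very large BAM files.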


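Reference: https://www.jianshu.com/p/c6dd7edd6e80

[Likely causes]

(1) The command that generated the file was interrupted partway through.

(2) The file was not transferred completely. (In my case the copy onto a USB drive was incomplete.)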
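
For cause (2), comparing checksums of the original and the copy is a simple way to catch an incomplete transfer before starting an analysis. This is a Python 3 sketch using the standard `hashlib` module. The function names `file_md5` and `copies_match` are my own.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """True when both files have identical content according to MD5."""
    return file_md5(original) == file_md5(copy)


# Example call (both paths are placeholders):
# print(copies_match("/data/42_align_sorted.bam", "/media/usb/42_align_sorted.bam"))
```

MD5 is used here only to detect accidental corruption, not for security.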

 

 

 
