Non-overlapping Intervals

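This article walks through an algorithm for removing overlapping intervals: by sorting the intervals and scanning them once, it finds the minimum number of intervals that must be removed so that the remaining ones do not overlap. The idea is similar in spirit to Queue Reconstruction by Height. A C++ implementation shows how comparing each interval's start and end points against the most recently kept interval determines whether the two overlap and which one to remove.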

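1. Problem

The task: remove the fewest intervals so that the remaining ones do not overlap. Note that [1, 2] and [2, 3] do not count as overlapping. They touch at the point 2, but sharing a single endpoint is not treated as an overlap.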
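The endpoint rule can be written as a small predicate. This is a minimal sketch, assuming every interval satisfies start < end; the helper name `overlaps` is hypothetical and does not appear in the original post.

```cpp
#include <vector>

// Hypothetical helper (not from the original post). Two intervals overlap
// only if each one starts strictly before the other ends, so intervals that
// merely touch at one endpoint, such as [1, 2] and [2, 3], do not overlap.
bool overlaps(const std::vector<int>& a, const std::vector<int>& b) {
    return a[0] < b[1] && b[0] < a[1];
}
```

With this definition, `overlaps({1, 2}, {2, 3})` is false, while `overlaps({1, 3}, {2, 3})` is true.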

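2. Analysis

Overall this problem is not hard, and the idea is somewhat similar to Queue Reconstruction by Height. Sort all intervals by start point; when two starts are equal, the interval with the smaller end comes first. For example: [ [1, 2], [2, 3], [3, 4], [1, 3] ]

After sorting: [ [1, 2], [1, 3], [2, 3], [3, 4] ]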
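The ordering described above can be sketched on its own as follows. The function name `sortedByStart` is hypothetical; it takes the intervals by value so the caller's data is left untouched.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns a copy sorted by start point, with ties
// broken by the smaller end point.
std::vector<std::vector<int>> sortedByStart(std::vector<std::vector<int>> intervals) {
    std::sort(intervals.begin(), intervals.end(),
              [](const std::vector<int>& a, const std::vector<int>& b) {
                  return a[0] < b[0] || (a[0] == b[0] && a[1] < b[1]);
              });
    return intervals;
}
```

Calling `sortedByStart({{1, 2}, {2, 3}, {3, 4}, {1, 3}})` gives the order shown above. Since `std::vector` already compares lexicographically, a plain `std::sort` without a comparator would produce the same order for two-element intervals; the explicit comparator just makes the rule visible.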

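Next, scan from the beginning and, for each interval, check whether its start lies inside the most recently kept interval (called the kept interval below):

(1) It does

① The end also lies inside the kept interval: the current interval is a sub-interval of the kept one and covers a smaller range, so remove the kept interval and make the current interval the new kept interval.

② The end lies outside the kept interval: the kept interval ends earlier, so remove the current interval.

In both sub-cases exactly one interval is removed, so the removal count increases by one.

(2) It does not

① The two intervals cannot overlap, so make the current interval the new kept interval.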
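The three cases can be isolated in a small decision function. This is only an illustrative sketch: the enum `Action` and the function `classify` are hypothetical names, and the code assumes the intervals are visited in the sorted order described earlier, so the current start is never smaller than the kept start.

```cpp
#include <vector>

// Hypothetical names for the three cases; the post only numbers them.
enum class Action { ReplaceKept, DropCurrent, KeepBoth };

// `kept` is the most recently kept interval and `cur` is the next interval
// in sorted order (assumption: cur[0] >= kept[0]).
Action classify(const std::vector<int>& kept, const std::vector<int>& cur) {
    if (cur[0] < kept[1]) {  // case (1): start lies inside the kept interval
        // Sub-case 1: cur ends no later than kept, so cur replaces kept.
        // Sub-case 2: cur ends later than kept, so cur is removed.
        return cur[1] <= kept[1] ? Action::ReplaceKept : Action::DropCurrent;
    }
    return Action::KeepBoth;  // case (2): no overlap, cur becomes the kept interval
}
```

For example, `classify({1, 4}, {2, 3})` yields `ReplaceKept`, `classify({1, 4}, {2, 5})` yields `DropCurrent`, and `classify({1, 2}, {2, 3})` yields `KeepBoth`.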

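C++ implementation:

#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

class Solution {
public:
    int eraseOverlapIntervals(vector<vector<int>>& intervals) {
        // Sort by start; for equal starts, the smaller end comes first.
        sort(intervals.begin(), intervals.end(), [](const vector<int>& a, const vector<int>& b){
            return a[0] < b[0] || (a[0] == b[0] && a[1] < b[1]);
        });
        // [left, right] is the most recently kept interval; INT_MAX means none yet.
        int res = 0, left = INT_MAX, right = INT_MAX;
        for (const auto& interval : intervals){
            if (left <= interval[0] && interval[0] < right){
                ++res;  // overlap: exactly one of the two intervals is removed
                if (interval[1] > right)  // key step: not nested, so keep the existing interval and drop the current one
                    continue;
            }
            // Reached for a nested interval, e.g. [1, 4] then [2, 3],
            // or for a non-overlapping one, e.g. [1, 2] then [2, 3]: update.
            left = interval[0];
            right = interval[1];
        }

        return res;
    }
};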
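As a sanity check on the result, the same count can be computed with the more widely used sort-by-end greedy. This is a different formulation, not the method used in this post, and the function name `eraseOverlapByEnd` is hypothetical.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical cross-check (not the post's method): sort by end point and
// keep every interval that starts at or after the end of the last kept one.
int eraseOverlapByEnd(std::vector<std::vector<int>> intervals) {
    std::sort(intervals.begin(), intervals.end(),
              [](const std::vector<int>& a, const std::vector<int>& b) {
                  return a[1] < b[1];
              });
    int kept = 0;
    long long lastEnd = std::numeric_limits<long long>::min();  // nothing kept yet
    for (const auto& iv : intervals) {
        if (iv[0] >= lastEnd) {  // touching at an endpoint is allowed
            ++kept;
            lastEnd = iv[1];
        }
    }
    // Everything that was not kept has to be removed.
    return static_cast<int>(intervals.size()) - kept;
}
```

On the example from the analysis, `eraseOverlapByEnd({{1, 2}, {2, 3}, {3, 4}, {1, 3}})` returns 1, since only [1, 3] has to be removed. Comparing the two functions on a few inputs is a cheap way to catch mistakes in either one.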

Similar problems:

Queue Reconstruction by Height
