Improvising a Jazz Solo with an LSTM Network - 优快云 Blog

Article link: https://blog.youkuaiyun.com/weixin_44334615/article/details/106346181

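This article describes how to build a deep learning model that uses an LSTM network to generate solo parts in jazz. First, audio data consisting of background noise, positive examples of the trigger word, and negative examples is processed to build a labeled training set. Next, the model is designed: a one-dimensional convolution and two layers of GRU units learn the audio features, and a final softmax produces the output, which is used for music generation.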


Building a trigger word detection model (as in Siri)

Building the dataset

We have:
Background audio
Positive clips (containing the trigger word)
Negative clips (not containing the trigger word)

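We insert the positive and negative clips at random positions in the background audio to form the training data, and set the labels as follows: after each positive clip ends, y is set to 1 for a fixed span of time steps (labeling several steps as 1, rather than a single step, gives the model more positive labels to learn from).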
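The labeling rule can be sketched as follows. This is only an illustration under stated assumptions: the function name `label_after_segment`, the 10,000 ms background length, the 1375 output steps, and the run of 50 ones are hypothetical values chosen for the example and are not given in this post.

```python
def label_after_segment(y, segment_end_ms, total_ms=10_000, n_ones=50):
    """Set up to n_ones labels to 1, starting just after the step where a positive clip ends.

    y              -- list of 0/1 labels, one per model output step
    segment_end_ms -- time (ms) at which the positive clip ends
    total_ms       -- assumed length of the background audio in ms
    n_ones         -- assumed number of consecutive steps to label as 1
    """
    n_steps = len(y)
    # Map the end time in milliseconds onto the label grid.
    end_step = int(segment_end_ms * n_steps / total_ms)
    for i in range(end_step + 1, end_step + 1 + n_ones):
        if i < n_steps:  # do not write past the end of the label vector
            y[i] = 1
    return y


y = [0] * 1375                      # assumed number of output steps
y = label_after_segment(y, segment_end_ms=9_000)
print(sum(y))                       # number of steps labeled 1
```

A clip that ends close to the end of the recording simply receives fewer ones, because the loop stops at the last output step.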

Checking whether time segments overlap

# GRADED FUNCTION: is_overlapping

def is_overlapping(segment_time, previous_segments):
    """
    Checks if the time of a segment overlaps with the times of existing segments.
    
    Arguments:
    segment_time -- a tuple of (segment_start, segment_end) for the new segment
    previous_segments -- a list of tuples of (segment_start, segment_end) for the existing segments
    
    Returns:
    True if the time segment overlaps with any of the existing segments, False otherwise
    """
    
    segment_start, segment_end = segment_time
    
    ### START CODE HERE ### (≈ 4 line)
    # Step 1: Initialize overlap as a "False" flag. (≈ 1 line)
    overlap = False
    
    # Step 2: loop over the previous_segments start and end times.
    # Compare start/end times and set the flag to True if there is an overlap (≈ 3 lines)
    for previous_start, previous_end in previous_segments:
        if not(previous_start>segment_end or previous_end<segment_start):
            overlap = True
            break
    ### END CODE HERE ###

    return overlap
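A quick check of the overlap test on a few hand-picked segments may help. To keep the snippet runnable on its own, it restates the function above in compact form; the segment values are made up for illustration.

```python
def is_overlapping(segment_time, previous_segments):
    # Compact restatement of the function above: two closed intervals overlap
    # when each one starts no later than the other one ends.
    segment_start, segment_end = segment_time
    return any(prev_start <= segment_end and prev_end >= segment_start
               for prev_start, prev_end in previous_segments)


existing = [(1000, 1200), (3000, 3400)]          # illustrative (start_ms, end_ms) pairs
print(is_overlapping((950, 1250), existing))     # True: covers the first segment
print(is_overlapping((2305, 2950), existing))    # False: fits in the gap
print(is_overlapping((1200, 1500), existing))    # True: shares the endpoint 1200
```

Note that segments which only touch at an endpoint count as overlapping, since the end times are treated as inclusive.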

Defining the insertion function used to build the training data
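The insertion function relies on a helper, `get_random_time_segment`, whose definition is not included in this post. A minimal sketch of what such a helper might look like is given here, assuming a 10,000 ms background and inclusive end times; the `background_ms` parameter is an addition for illustration, and the original helper may differ.

```python
import random


def get_random_time_segment(segment_ms, background_ms=10_000):
    """Pick a random (start, end) window of segment_ms milliseconds inside the background.

    Assumes segment_ms < background_ms. The end time is inclusive, which matches
    the closed-interval comparison used by is_overlapping.
    """
    # randrange excludes the upper bound, so the window always fits in the background.
    segment_start = random.randrange(0, background_ms - segment_ms)
    segment_end = segment_start + segment_ms - 1
    return (segment_start, segment_end)


print(get_random_time_segment(1000))
```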

# GRADED FUNCTION: insert_audio_clip

def insert_audio_clip(background, audio_clip, previous_segments):
    """
    Insert a new audio segment over the background noise at a random time step, ensuring that the 
    audio segment does not overlap with existing segments.
    
    Arguments:
    background -- a 10 second background audio recording.  
    audio_clip -- the audio clip to be inserted/overlaid. 
    previous_segments -- times where audio segments have already been placed
    
    Returns:
    new_background -- the updated background audio
    """
    
    # Get the duration of the audio clip in ms
    segment_ms = len(audio_clip)
    
    ### START CODE HERE ### 
    # Step 1: Use one of the helper functions to pick a random time segment onto which to insert 
    # the new audio clip. (≈ 1 line)
    segment_time = get_random_time_segment(segment_ms)
    
    # Step 2: Check if the new segment_time overlaps with one of the previous_segments. If so, keep 
    # picking new segment_time at random until it doesn't overlap. (≈ 2 lines)
    while is_overlapping(segment_time, previous_segments):
        segment_time = get_random_time_segment(segment_ms)

    # Step 3: Add the new segment_time to the list of previous_segments