[Beginner Tutorial] Dify from Scratch: A Deep Dive into How Dify Implements Loops and Iteration

I. Overview

Dify is a powerful AI application development platform whose workflow engine supports complex loop and iteration operations. The sections below take a deep look at how loops and iteration are implemented in Dify.

What are loops and iterations?

  • Loop: repeatedly executes a group of operations based on a condition, until an exit condition is met
  • Iteration: performs the same operation on every element of a collection (see the sketch below)
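
In plain Python terms the difference looks like this (a conceptual sketch only, not Dify code):

# Loop: repeat until a break condition is met; the number of rounds is not known in advance
attempt, quality_score = 0, 0.0
while quality_score <= 0.8:                    # break condition
    attempt += 1
    quality_score = min(1.0, attempt * 0.3)    # stand-in for calling a model or API

# Iteration: apply the same operation to each element of a known collection
documents = ["a.txt", "b.txt", "c.txt"]
results = [doc.upper() for doc in documents]   # stand-in for analyzing each document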

Why do we need loops and iteration?

In AI applications, loop and iteration mechanisms make it possible to:

  • Process data in batches
  • Implement complex business logic
  • Make workflows more flexible and reusable
  • Support dynamic data processing

II. Core Concepts

Node types

Dify's workflow engine defines the following node types related to loops and iteration:

class NodeType(StrEnum):
    LOOP = "loop"                        # loop node
    LOOP_START = "loop-start"            # loop start node
    LOOP_END = "loop-end"                # loop end node
    ITERATION = "iteration"              # iteration node
    ITERATION_START = "iteration-start"  # iteration start node

Core component architecture

(architecture diagram omitted in this text version)

III. Execution Flow

Overall execution architecture

(architecture diagram omitted in this text version)

Detailed node execution flow

(flow diagram omitted in this text version)

IV. The Loop Mechanism in Detail

Loop node structure

The loop node (LoopNode) is the core component implementing loop logic in Dify. It carries the following key attributes:

class LoopNodeData(BaseLoopNodeData):
    loop_count: int                         # maximum number of loop rounds
    loop_variables: list[LoopVariable]      # loop variables
    break_conditions: list[Condition]       # break conditions
    logical_operator: Literal["and", "or"]  # how break conditions combine
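
Semantically, loop_count is an upper bound on the number of rounds, and break_conditions can end the loop early. A minimal dependency-free sketch of that contract (plain Python, not Dify's classes; should_break stands in for the combined break conditions):

def run_loop(loop_count: int, should_break) -> int:
    """Run at most loop_count rounds; stop early once should_break(state) is True."""
    state = 0
    for i in range(loop_count):      # loop_count caps the number of rounds
        state += 1                   # stand-in for running the loop-body sub-graph
        if should_break(state):      # break_conditions, combined by logical_operator
            break
    return state

print(run_loop(10, lambda s: s >= 3))   # 3: the break condition fired first
print(run_loop(10, lambda s: False))    # 10: the upper bound was reached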

Loop execution flow

(flow diagram omitted in this text version)

Core code analysis of the loop implementation

1. Main loop execution method
api/core/workflow/nodes/loop/loop_node.py
def _run(self) -> Generator[NodeEvent | InNodeEvent, None, None]:
    loop_count = self.node_data.loop_count
    break_conditions = self.node_data.break_conditions
    logical_operator = self.node_data.logical_operator

    inputs = {"loop_count": loop_count}

    if not self.node_data.start_node_id:
        raise ValueError(f"field start_node_id in loop {self.node_id} not found")

    # Initialize graph
    loop_graph = Graph.init(graph_config=self.graph_config, root_node_id=self.node_data.start_node_id)
    if not loop_graph:
        raise ValueError("loop graph not found")

    # Initialize variable pool
    variable_pool = self.graph_runtime_state.variable_pool
    variable_pool.add([self.node_id, "index"], 0)

    # Initialize loop variables
    loop_variable_selectors = {}
    if self.node_data.loop_variables:
        for loop_variable in self.node_data.loop_variables:
            value_processor = {
                "constant": lambda var=loop_variable: self._get_segment_for_constant(var.var_type, var.value),
                "variable": lambda var=loop_variable: variable_pool.get(var.value),
            }

            if loop_variable.value_type not in value_processor:
                raise ValueError(
                    f"Invalid value type '{loop_variable.value_type}' for loop variable {loop_variable.label}"
                )

            processed_segment = value_processor[loop_variable.value_type]()
            if not processed_segment:
                raise ValueError(f"Invalid value for loop variable {loop_variable.label}")
            variable_selector = [self.node_id, loop_variable.label]
            variable_pool.add(variable_selector, processed_segment.value)
            loop_variable_selectors[loop_variable.label] = variable_selector
            inputs[loop_variable.label] = processed_segment.value

    from core.workflow.graph_engine.graph_engine import GraphEngine

    graph_engine = GraphEngine(
        tenant_id=self.tenant_id,
        app_id=self.app_id,
        workflow_type=self.workflow_type,
        workflow_id=self.workflow_id,
        user_id=self.user_id,
        user_from=self.user_from,
        invoke_from=self.invoke_from,
        call_depth=self.workflow_call_depth,
        graph=loop_graph,
        graph_config=self.graph_config,
        variable_pool=variable_pool,
        max_execution_steps=dify_config.WORKFLOW_MAX_EXECUTION_STEPS,
        max_execution_time=dify_config.WORKFLOW_MAX_EXECUTION_TIME,
        thread_pool_id=self.thread_pool_id,
    )

    start_at = datetime.now(UTC).replace(tzinfo=None)
    condition_processor = ConditionProcessor()

    # Start loop event
    yield LoopRunStartedEvent(
        loop_id=self.id,
        loop_node_id=self.node_id,
        loop_node_type=self.node_type,
        loop_node_data=self.node_data,
        start_at=start_at,
        inputs=inputs,
        metadata={"loop_length": loop_count},
        predecessor_node_id=self.previous_node_id,
    )

    loop_duration_map = {}
    single_loop_variable_map = {}  # per-round snapshot of the loop variables
    try:
        check_break_result = False
        for i in range(loop_count):
            loop_start_time = datetime.now(UTC).replace(tzinfo=None)
            # run a single loop round
            loop_result = yield from self._run_single_loop(
                graph_engine=graph_engine,
                loop_graph=loop_graph,
                variable_pool=variable_pool,
                loop_variable_selectors=loop_variable_selectors,
                break_conditions=break_conditions,
                logical_operator=logical_operator,
                condition_processor=condition_processor,
                current_index=i,
                start_at=start_at,
                inputs=inputs,
            )
            loop_end_time = datetime.now(UTC).replace(tzinfo=None)

            single_loop_variable = {}
            for key, selector in loop_variable_selectors.items():
                item = variable_pool.get(selector)
                if item:
                    single_loop_variable[key] = item.value
                else:
                    single_loop_variable[key] = None

            loop_duration_map[str(i)] = (loop_end_time - loop_start_time).total_seconds()
            single_loop_variable_map[str(i)] = single_loop_variable

            check_break_result = loop_result.get("check_break_result", False)

            if check_break_result:
                break

        # Loop completed successfully
        yield LoopRunSucceededEvent(
            loop_id=self.id,
            loop_node_id=self.node_id,
            loop_node_type=self.node_type,
            loop_node_data=self.node_data,
            start_at=start_at,
            inputs=inputs,
            outputs=self.node_data.outputs,
            steps=loop_count,
            metadata={
                WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: graph_engine.graph_runtime_state.total_tokens,
                "completed_reason": "loop_break" if check_break_result else "loop_completed",
                WorkflowNodeExecutionMetadataKey.LOOP_DURATION_MAP: loop_duration_map,
                WorkflowNodeExecutionMetadataKey.LOOP_VARIABLE_MAP: single_loop_variable_map,
            },
        )

        yield RunCompletedEvent(
            run_result=NodeRunResult(
                status=WorkflowNodeExecutionStatus.SUCCEEDED,
                metadata={
                    WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: graph_engine.graph_runtime_state.total_tokens,
                    WorkflowNodeExecutionMetadataKey.LOOP_DURATION_MAP: loop_duration_map,
                    WorkflowNodeExecutionMetadataKey.LOOP_VARIABLE_MAP: single_loop_variable_map,
                },
                outputs=self.node_data.outputs,
                inputs=inputs,
            )
        )

    except Exception as e:
        # Loop failed
        logger.exception("Loop run failed")
        yield LoopRunFailedEvent(
            loop_id=self.id,
            loop_node_id=self.node_id,
            loop_node_type=self.node_type,
            loop_node_data=self.node_data,
            start_at=start_at,
            inputs=inputs,
            steps=loop_count,
            metadata={
                "total_tokens": graph_engine.graph_runtime_state.total_tokens,
                "completed_reason": "error",
                WorkflowNodeExecutionMetadataKey.LOOP_DURATION_MAP: loop_duration_map,
                WorkflowNodeExecutionMetadataKey.LOOP_VARIABLE_MAP: single_loop_variable_map,
            },
            error=str(e),
        )

        yield RunCompletedEvent(
            run_result=NodeRunResult(
                status=WorkflowNodeExecutionStatus.FAILED,
                error=str(e),
                metadata={
                    WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: graph_engine.graph_runtime_state.total_tokens,
                    WorkflowNodeExecutionMetadataKey.LOOP_DURATION_MAP: loop_duration_map,
                    WorkflowNodeExecutionMetadataKey.LOOP_VARIABLE_MAP: single_loop_variable_map,
                },
            )
        )

    finally:
        # Clean up
        variable_pool.remove([self.node_id, "index"])

2. Single loop-round execution
def _run_single_loop(
    self,
    *,
    graph_engine: "GraphEngine",
    loop_graph: Graph,
    variable_pool: "VariablePool",
    loop_variable_selectors: dict,
    break_conditions: list,
    logical_operator: Literal["and", "or"],
    condition_processor: ConditionProcessor,
    current_index: int,
    start_at: datetime,
    inputs: dict,
) -> Generator[NodeEvent | InNodeEvent, None, dict]:
    """Run a single loop round.

    Returns:
        dict: {'check_break_result': bool}
    """
    # Run the loop-body sub-graph
    rst = graph_engine.run()
    current_index_variable = variable_pool.get([self.node_id, "index"])
    if not isinstance(current_index_variable, IntegerSegment):
        raise ValueError(f"loop {self.node_id} current index not found")
    current_index = current_index_variable.value

    check_break_result = False

    for event in rst:
        if isinstance(event, (BaseNodeEvent | BaseParallelBranchEvent)) and not event.in_loop_id:
            event.in_loop_id = self.node_id

        if (
            isinstance(event, BaseNodeEvent)
            and event.node_type == NodeType.LOOP_START
            and not isinstance(event, NodeRunStreamChunkEvent)
        ):
            continue

        if (
            isinstance(event, NodeRunSucceededEvent)
            and event.node_type == NodeType.LOOP_END
            and not isinstance(event, NodeRunStreamChunkEvent)
        ):
            check_break_result = True
            yield self._handle_event_metadata(event=event, iter_run_index=current_index)
            break

        if isinstance(event, NodeRunSucceededEvent):
            yield self._handle_event_metadata(event=event, iter_run_index=current_index)

            # Check that every variable referenced by the break conditions exists
            exists_variable = False
            for condition in break_conditions:
                if not self.graph_runtime_state.variable_pool.get(condition.variable_selector):
                    exists_variable = False
                    break
                else:
                    exists_variable = True
            if exists_variable:
                input_conditions, group_result, check_break_result = condition_processor.process_conditions(
                    variable_pool=self.graph_runtime_state.variable_pool,
                    conditions=break_conditions,
                    operator=logical_operator,
                )
                if check_break_result:
                    break

        elif isinstance(event, BaseGraphEvent):
            if isinstance(event, GraphRunFailedEvent):
                # Loop run failed
                yield LoopRunFailedEvent(
                    loop_id=self.id,
                    loop_node_id=self.node_id,
                    loop_node_type=self.node_type,
                    loop_node_data=self.node_data,
                    start_at=start_at,
                    inputs=inputs,
                    steps=current_index,
                    metadata={
                        WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: (
                            graph_engine.graph_runtime_state.total_tokens
                        ),
                        "completed_reason": "error",
                    },
                    error=event.error,
                )
                yield RunCompletedEvent(
                    run_result=NodeRunResult(
                        status=WorkflowNodeExecutionStatus.FAILED,
                        error=event.error,
                        metadata={
                            WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: (
                                graph_engine.graph_runtime_state.total_tokens
                            )
                        },
                    )
                )
                return {"check_break_result": True}
        elif isinstance(event, NodeRunFailedEvent):
            # A node inside the loop failed
            yield self._handle_event_metadata(event=event, iter_run_index=current_index)
            yield LoopRunFailedEvent(
                loop_id=self.id,
                loop_node_id=self.node_id,
                loop_node_type=self.node_type,
                loop_node_data=self.node_data,
                start_at=start_at,
                inputs=inputs,
                steps=current_index,
                metadata={
                    WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: graph_engine.graph_runtime_state.total_tokens,
                    "completed_reason": "error",
                },
                error=event.error,
            )
            yield RunCompletedEvent(
                run_result=NodeRunResult(
                    status=WorkflowNodeExecutionStatus.FAILED,
                    error=event.error,
                    metadata={
                        WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: graph_engine.graph_runtime_state.total_tokens
                    },
                )
            )
            return {"check_break_result": True}
        else:
            yield self._handle_event_metadata(event=cast(InNodeEvent, event), iter_run_index=current_index)

    # Remove all node outputs from the variable pool
    for node_id in loop_graph.node_ids:
        variable_pool.remove([node_id])

    _outputs = {}
    for loop_variable_key, loop_variable_selector in loop_variable_selectors.items():
        _loop_variable_segment = variable_pool.get(loop_variable_selector)
        if _loop_variable_segment:
            _outputs[loop_variable_key] = _loop_variable_segment.value
        else:
            _outputs[loop_variable_key] = None

    _outputs["loop_round"] = current_index + 1
    self.node_data.outputs = _outputs

    if check_break_result:
        return {"check_break_result": True}

    # Move on to the next loop round
    next_index = current_index + 1
    variable_pool.add([self.node_id, "index"], next_index)

    yield LoopRunNextEvent(
        loop_id=self.id,
        loop_node_id=self.node_id,
        loop_node_type=self.node_type,
        loop_node_data=self.node_data,
        index=next_index,
        pre_loop_output=self.node_data.outputs,
    )

    return {"check_break_result": False}

Loop condition handling

A loop's break conditions are evaluated by the ConditionProcessor class:

class ConditionProcessor:
    """Condition processor, used to evaluate loop break conditions."""

    def process_conditions(
        self,
        *,
        variable_pool: VariablePool,
        conditions: Sequence[Condition],
        operator: Literal["and", "or"],
    ):
        input_conditions = []
        group_results = []

        for condition in conditions:
            variable = variable_pool.get(condition.variable_selector)
            if variable is None:
                raise ValueError(f"Variable {condition.variable_selector} not found")

            if isinstance(variable, ArrayFileSegment) and condition.comparison_operator in {
                "contains",
                "not contains",
                "all of",
            }:
                # check sub-conditions
                if not condition.sub_variable_condition:
                    raise ValueError("Sub variable is required")
                result = _process_sub_conditions(
                    variable=variable,
                    sub_conditions=condition.sub_variable_condition.conditions,
                    operator=condition.sub_variable_condition.logical_operator,
                )
            elif condition.comparison_operator in {
                "exists",
                "not exists",
            }:
                result = _evaluate_condition(
                    value=variable.value,
                    operator=condition.comparison_operator,
                    expected=None,
                )
            else:
                actual_value = variable.value if variable else None
                expected_value = condition.value
                if isinstance(expected_value, str):
                    expected_value = variable_pool.convert_template(expected_value).text
                input_conditions.append(
                    {
                        "actual_value": actual_value,
                        "expected_value": expected_value,
                        "comparison_operator": condition.comparison_operator,
                    }
                )
                result = _evaluate_condition(
                    value=actual_value,
                    operator=condition.comparison_operator,
                    expected=expected_value,
                )
            group_results.append(result)
            # Short-circuit evaluation for logical conditions
            if (operator == "and" and not result) or (operator == "or" and result):
                final_result = result
                return input_conditions, group_results, final_result

        final_result = all(group_results) if operator == "and" else any(group_results)
        return input_conditions, group_results, final_result
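
The short-circuit rule at the end of process_conditions is easy to verify in isolation. Here is a minimal dependency-free sketch of just that rule ("and" stops at the first False, "or" stops at the first True):

from typing import Literal

def combine(results: list[bool], operator: Literal["and", "or"]) -> bool:
    """Mirror of ConditionProcessor's short-circuit rule."""
    for result in results:
        if operator == "and" and not result:
            return False
        if operator == "or" and result:
            return True
    return all(results) if operator == "and" else any(results)

assert combine([True, False, True], "and") is False   # stops at the first False
assert combine([False, True, False], "or") is True    # stops at the first True
assert combine([], "and") is True                     # vacuous truth, like all([])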

V. The Iteration Mechanism in Detail

Iteration node structure

The iteration node (IterationNode) performs the same operation on every element of a collection:

class IterationNodeData(BaseIterationNodeData):
    iterator_selector: list[str]      # selector of the list to iterate over
    output_selector: list[str]        # selector of each iteration's output
    is_parallel: bool = False         # whether to run iterations in parallel
    parallel_nums: int = 10           # maximum number of parallel workers
    error_handle_mode: ErrorHandleMode = ErrorHandleMode.TERMINATED
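
Conceptually, is_parallel and parallel_nums behave like a bounded thread pool fanned out over the list (the real implementation uses GraphEngineThreadPool, shown later). A minimal standard-library sketch of that behaviour; process_item is a made-up stand-in for running the iteration sub-graph on one element:

from concurrent.futures import ThreadPoolExecutor

def process_item(item: str) -> str:
    # hypothetical stand-in for executing the iteration sub-graph on one item
    return item.upper()

items = ["a", "b", "c", "d"]
parallel_nums = 2  # mirrors IterationNodeData.parallel_nums

with ThreadPoolExecutor(max_workers=parallel_nums) as pool:
    outputs = list(pool.map(process_item, items))  # order-preserving, like outputs[index] = ...

print(outputs)  # ['A', 'B', 'C', 'D']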

Iteration execution flow

(flow diagram omitted in this text version)

Core code analysis of the iteration implementation

1. Main iteration execution method
api/core/workflow/nodes/iteration/iteration_node.py
def _run(self) -> Generator[NodeEvent | InNodeEvent, None, None]:
    """Run the node."""
    variable = self.graph_runtime_state.variable_pool.get(self.node_data.iterator_selector)

    if not variable:
        raise IteratorVariableNotFoundError(f"iterator variable {self.node_data.iterator_selector} not found")

    if not isinstance(variable, ArrayVariable) and not isinstance(variable, NoneVariable):
        raise InvalidIteratorValueError(f"invalid iterator value: {variable}, please provide a list.")

    if isinstance(variable, NoneVariable) or len(variable.value) == 0:
        yield RunCompletedEvent(
            run_result=NodeRunResult(
                status=WorkflowNodeExecutionStatus.SUCCEEDED,
                outputs={"output": []},
            )
        )
        return

    iterator_list_value = variable.to_object()

    if not isinstance(iterator_list_value, list):
        raise InvalidIteratorValueError(f"Invalid iterator value: {iterator_list_value}, please provide a list.")

    inputs = {"iterator_selector": iterator_list_value}

    graph_config = self.graph_config

    if not self.node_data.start_node_id:
        raise StartNodeIdNotFoundError(f"field start_node_id in iteration {self.node_id} not found")

    root_node_id = self.node_data.start_node_id

    # init graph
    iteration_graph = Graph.init(graph_config=graph_config, root_node_id=root_node_id)

    if not iteration_graph:
        raise IterationGraphNotFoundError("iteration graph not found")

    variable_pool = self.graph_runtime_state.variable_pool

    # append the iteration variables (item, index) to the variable pool
    variable_pool.add([self.node_id, "index"], 0)
    variable_pool.add([self.node_id, "item"], iterator_list_value[0])

    # init graph engine
    from core.workflow.graph_engine.graph_engine import GraphEngine, GraphEngineThreadPool

    graph_engine = GraphEngine(
        tenant_id=self.tenant_id,
        app_id=self.app_id,
        workflow_type=self.workflow_type,
        workflow_id=self.workflow_id,
        user_id=self.user_id,
        user_from=self.user_from,
        invoke_from=self.invoke_from,
        call_depth=self.workflow_call_depth,
        graph=iteration_graph,
        graph_config=graph_config,
        variable_pool=variable_pool,
        max_execution_steps=dify_config.WORKFLOW_MAX_EXECUTION_STEPS,
        max_execution_time=dify_config.WORKFLOW_MAX_EXECUTION_TIME,
        thread_pool_id=self.thread_pool_id,
    )

    start_at = datetime.now(UTC).replace(tzinfo=None)

    yield IterationRunStartedEvent(
        iteration_id=self.id,
        iteration_node_id=self.node_id,
        iteration_node_type=self.node_type,
        iteration_node_data=self.node_data,
        start_at=start_at,
        inputs=inputs,
        metadata={"iterator_length": len(iterator_list_value)},
        predecessor_node_id=self.previous_node_id,
    )

    yield IterationRunNextEvent(
        iteration_id=self.id,
        iteration_node_id=self.node_id,
        iteration_node_type=self.node_type,
        iteration_node_data=self.node_data,
        index=0,
        pre_iteration_output=None,
        duration=None,
    )
    iter_run_map: dict[str, float] = {}
    outputs: list[Any] = [None] * len(iterator_list_value)
    try:
        if self.node_data.is_parallel:
            futures: list[Future] = []
            q: Queue = Queue()
            thread_pool = GraphEngineThreadPool(
                max_workers=self.node_data.parallel_nums, max_submit_count=dify_config.MAX_SUBMIT_COUNT
            )
            for index, item in enumerate(iterator_list_value):
                future: Future = thread_pool.submit(
                    self._run_single_iter_parallel,
                    flask_app=current_app._get_current_object(),  # type: ignore
                    q=q,
                    context=contextvars.copy_context(),
                    iterator_list_value=iterator_list_value,
                    inputs=inputs,
                    outputs=outputs,
                    start_at=start_at,
                    graph_engine=graph_engine,
                    iteration_graph=iteration_graph,
                    index=index,
                    item=item,
                    iter_run_map=iter_run_map,
                )
                future.add_done_callback(thread_pool.task_done_callback)
                futures.append(future)
            succeeded_count = 0
            while True:
                try:
                    event = q.get(timeout=1)
                    if event is None:
                        break
                    if isinstance(event, IterationRunNextEvent):
                        succeeded_count += 1
                        if succeeded_count == len(futures):
                            q.put(None)
                    yield event
                    if isinstance(event, RunCompletedEvent):
                        q.put(None)
                        for f in futures:
                            if not f.done():
                                f.cancel()
                        yield event
                    if isinstance(event, IterationRunFailedEvent):
                        q.put(None)
                        yield event
                except Empty:
                    continue

            # wait for all threads
            wait(futures)
        else:
            for _ in range(len(iterator_list_value)):
                yield from self._run_single_iter(
                    iterator_list_value=iterator_list_value,
                    variable_pool=variable_pool,
                    inputs=inputs,
                    outputs=outputs,
                    start_at=start_at,
                    graph_engine=graph_engine,
                    iteration_graph=iteration_graph,
                    iter_run_map=iter_run_map,
                )
        if self.node_data.error_handle_mode == ErrorHandleMode.REMOVE_ABNORMAL_OUTPUT:
            outputs = [output for output in outputs if output is not None]

        # Flatten the list of lists
        if isinstance(outputs, list) and all(isinstance(output, list) for output in outputs):
            outputs = [item for sublist in outputs for item in sublist]

        yield IterationRunSucceededEvent(
            iteration_id=self.id,
            iteration_node_id=self.node_id,
            iteration_node_type=self.node_type,
            iteration_node_data=self.node_data,
            start_at=start_at,
            inputs=inputs,
            outputs={"output": outputs},
            steps=len(iterator_list_value),
            metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
        )

        yield RunCompletedEvent(
            run_result=NodeRunResult(
                status=WorkflowNodeExecutionStatus.SUCCEEDED,
                outputs={"output": outputs},
                metadata={
                    WorkflowNodeExecutionMetadataKey.ITERATION_DURATION_MAP: iter_run_map,
                    WorkflowNodeExecutionMetadataKey.TOTAL_TOKENS: graph_engine.graph_runtime_state.total_tokens,
                },
            )
        )
    except IterationNodeError as e:
        # iteration run failed
        logger.warning("Iteration run failed")
        yield IterationRunFailedEvent(
            iteration_id=self.id,
            iteration_node_id=self.node_id,
            iteration_node_type=self.node_type,
            iteration_node_data=self.node_data,
            start_at=start_at,
            inputs=inputs,
            outputs={"output": outputs},
            steps=len(iterator_list_value),
            metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
            error=str(e),
        )

        yield RunCompletedEvent(
            run_result=NodeRunResult(
                status=WorkflowNodeExecutionStatus.FAILED,
                error=str(e),
            )
        )
    finally:
        # remove the iteration variables (item, index) after the iteration run completes
        variable_pool.remove([self.node_id, "index"])
        variable_pool.remove([self.node_id, "item"])

2. Sequential iteration execution
def _run_single_iter(
    self,
    *,
    iterator_list_value: Sequence[str],
    variable_pool: VariablePool,
    inputs: Mapping[str, list],
    outputs: list,
    start_at: datetime,
    graph_engine: "GraphEngine",
    iteration_graph: Graph,
    iter_run_map: dict[str, float],
    parallel_mode_run_id: Optional[str] = None,
) -> Generator[NodeEvent | InNodeEvent, None, None]:
    """Run a single iteration."""
    iter_start_at = datetime.now(UTC).replace(tzinfo=None)

    try:
        rst = graph_engine.run()
        # get the current iteration index
        index_variable = variable_pool.get([self.node_id, "index"])
        if not isinstance(index_variable, IntegerVariable):
            raise IterationIndexNotFoundError(f"iteration {self.node_id} current index not found")
        current_index = index_variable.value
        iteration_run_id = parallel_mode_run_id if parallel_mode_run_id is not None else f"{current_index}"
        next_index = int(current_index) + 1
        for event in rst:
            if isinstance(event, (BaseNodeEvent | BaseParallelBranchEvent)) and not event.in_iteration_id:
                event.in_iteration_id = self.node_id

            if (
                isinstance(event, BaseNodeEvent)
                and event.node_type == NodeType.ITERATION_START
                and not isinstance(event, NodeRunStreamChunkEvent)
            ):
                continue

            if isinstance(event, NodeRunSucceededEvent):
                yield self._handle_event_metadata(
                    event=event, iter_run_index=current_index, parallel_mode_run_id=parallel_mode_run_id
                )
            elif isinstance(event, BaseGraphEvent):
                if isinstance(event, GraphRunFailedEvent):
                    # iteration run failed
                    if self.node_data.is_parallel:
                        yield IterationRunFailedEvent(
                            iteration_id=self.id,
                            iteration_node_id=self.node_id,
                            iteration_node_type=self.node_type,
                            iteration_node_data=self.node_data,
                            parallel_mode_run_id=parallel_mode_run_id,
                            start_at=start_at,
                            inputs=inputs,
                            outputs={"output": outputs},
                            steps=len(iterator_list_value),
                            metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
                            error=event.error,
                        )
                    else:
                        yield IterationRunFailedEvent(
                            iteration_id=self.id,
                            iteration_node_id=self.node_id,
                            iteration_node_type=self.node_type,
                            iteration_node_data=self.node_data,
                            start_at=start_at,
                            inputs=inputs,
                            outputs={"output": outputs},
                            steps=len(iterator_list_value),
                            metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
                            error=event.error,
                        )
                    yield RunCompletedEvent(
                        run_result=NodeRunResult(
                            status=WorkflowNodeExecutionStatus.FAILED,
                            error=event.error,
                        )
                    )
                    return
            elif isinstance(event, InNodeEvent):
                metadata_event = self._handle_event_metadata(
                    event=event, iter_run_index=current_index, parallel_mode_run_id=parallel_mode_run_id
                )
                if isinstance(event, NodeRunFailedEvent):
                    if self.node_data.error_handle_mode == ErrorHandleMode.CONTINUE_ON_ERROR:
                        yield NodeInIterationFailedEvent(
                            **metadata_event.model_dump(),
                        )
                        outputs[current_index] = None
                        variable_pool.add([self.node_id, "index"], next_index)
                        if next_index < len(iterator_list_value):
                            variable_pool.add([self.node_id, "item"], iterator_list_value[next_index])
                        duration = (datetime.now(UTC).replace(tzinfo=None) - iter_start_at).total_seconds()
                        iter_run_map[iteration_run_id] = duration
                        yield IterationRunNextEvent(
                            iteration_id=self.id,
                            iteration_node_id=self.node_id,
                            iteration_node_type=self.node_type,
                            iteration_node_data=self.node_data,
                            index=next_index,
                            parallel_mode_run_id=parallel_mode_run_id,
                            pre_iteration_output=None,
                            duration=duration,
                        )
                        return
                    elif self.node_data.error_handle_mode == ErrorHandleMode.REMOVE_ABNORMAL_OUTPUT:
                        yield NodeInIterationFailedEvent(
                            **metadata_event.model_dump(),
                        )
                        variable_pool.add([self.node_id, "index"], next_index)

                        if next_index < len(iterator_list_value):
                            variable_pool.add([self.node_id, "item"], iterator_list_value[next_index])
                        duration = (datetime.now(UTC).replace(tzinfo=None) - iter_start_at).total_seconds()
                        iter_run_map[iteration_run_id] = duration
                        yield IterationRunNextEvent(
                            iteration_id=self.id,
                            iteration_node_id=self.node_id,
                            iteration_node_type=self.node_type,
                            iteration_node_data=self.node_data,
                            index=next_index,
                            parallel_mode_run_id=parallel_mode_run_id,
                            pre_iteration_output=None,
                            duration=duration,
                        )
                        return
                    elif self.node_data.error_handle_mode == ErrorHandleMode.TERMINATED:
                        yield IterationRunFailedEvent(
                            iteration_id=self.id,
                            iteration_node_id=self.node_id,
                            iteration_node_type=self.node_type,
                            iteration_node_data=self.node_data,
                            start_at=start_at,
                            inputs=inputs,
                            outputs={"output": None},
                            steps=len(iterator_list_value),
                            metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
                            error=event.error,
                        )
                yield metadata_event

        current_output_segment = variable_pool.get(self.node_data.output_selector)
        if current_output_segment is None:
            raise IterationNodeError("iteration output selector not found")
        current_iteration_output = current_output_segment.value
        outputs[current_index] = current_iteration_output
        # remove all node outputs from the variable pool
        for node_id in iteration_graph.node_ids:
            variable_pool.remove([node_id])

        # move on to the next iteration
        variable_pool.add([self.node_id, "index"], next_index)

        if next_index < len(iterator_list_value):
            variable_pool.add([self.node_id, "item"], iterator_list_value[next_index])
        duration = (datetime.now(UTC).replace(tzinfo=None) - iter_start_at).total_seconds()
        iter_run_map[iteration_run_id] = duration
        yield IterationRunNextEvent(
            iteration_id=self.id,
            iteration_node_id=self.node_id,
            iteration_node_type=self.node_type,
            iteration_node_data=self.node_data,
            index=next_index,
            parallel_mode_run_id=parallel_mode_run_id,
            pre_iteration_output=current_iteration_output or None,
            duration=duration,
        )

    except IterationNodeError as e:
        logger.warning(f"Iteration run failed: {str(e)}")
        yield IterationRunFailedEvent(
            iteration_id=self.id,
            iteration_node_id=self.node_id,
            iteration_node_type=self.node_type,
            iteration_node_data=self.node_data,
            start_at=start_at,
            inputs=inputs,
            outputs={"output": None},
            steps=len(iterator_list_value),
            metadata={"total_tokens": graph_engine.graph_runtime_state.total_tokens},
            error=str(e),
        )
        yield RunCompletedEvent(
            run_result=NodeRunResult(
                status=WorkflowNodeExecutionStatus.FAILED,
                error=str(e),
            )
        )

3. Parallel iteration execution
def _run_single_iter_parallel(
    self,
    *,
    flask_app: Flask,
    context: contextvars.Context,
    q: Queue,
    iterator_list_value: Sequence[str],
    inputs: Mapping[str, list],
    outputs: list,
    start_at: datetime,
    graph_engine: "GraphEngine",
    iteration_graph: Graph,
    index: int,
    item: Any,
    iter_run_map: dict[str, float],
):
    """Run a single iteration in parallel mode."""
    for var, val in context.items():
        var.set(val)

    # FIXME(-LAN-): Save current user before entering new app context
    from flask import g

    saved_user = None
    if has_request_context() and hasattr(g, "_login_user"):
        saved_user = g._login_user

    with flask_app.app_context():
        # Restore user in new app context
        if saved_user is not None:
            from flask import g

            g._login_user = saved_user

        parallel_mode_run_id = uuid.uuid4().hex
        graph_engine_copy = graph_engine.create_copy()
        variable_pool_copy = graph_engine_copy.graph_runtime_state.variable_pool
        variable_pool_copy.add([self.node_id, "index"], index)
        variable_pool_copy.add([self.node_id, "item"], item)
        for event in self._run_single_iter(
            iterator_list_value=iterator_list_value,
            variable_pool=variable_pool_copy,
            inputs=inputs,
            outputs=outputs,
            start_at=start_at,
            graph_engine=graph_engine_copy,
            iteration_graph=iteration_graph,
            iter_run_map=iter_run_map,
            parallel_mode_run_id=parallel_mode_run_id,
        ):
            q.put(event)
        graph_engine.graph_runtime_state.total_tokens += graph_engine_copy.graph_runtime_state.total_tokens

Error handling modes

The iteration node supports three error handling modes:

class ErrorHandleMode(StrEnum):
    TERMINATED = "terminated"                          # stop immediately on error
    CONTINUE_ON_ERROR = "continue-on-error"            # keep going, leaving a None placeholder
    REMOVE_ABNORMAL_OUTPUT = "remove-abnormal-output"  # drop the failed outputs afterwards
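
To make the three modes concrete, here is a small self-contained simulation, independent of Dify; the failing condition and the doubling operation are made up for the demo. It mirrors the behaviour visible in the source: CONTINUE_ON_ERROR leaves a None placeholder in the output slot, while REMOVE_ABNORMAL_OUTPUT filters the None entries out afterwards.

from enum import Enum

class ErrorHandleMode(str, Enum):
    TERMINATED = "terminated"
    CONTINUE_ON_ERROR = "continue-on-error"
    REMOVE_ABNORMAL_OUTPUT = "remove-abnormal-output"

def run_iteration(items: list[int], mode: ErrorHandleMode) -> list:
    outputs: list = [None] * len(items)
    for i, item in enumerate(items):
        try:
            if item < 0:  # made-up failure condition for the demo
                raise ValueError(f"item {item} failed")
            outputs[i] = item * 2
        except ValueError:
            if mode == ErrorHandleMode.TERMINATED:
                raise                 # the whole iteration fails immediately
            outputs[i] = None         # CONTINUE_ON_ERROR keeps a None placeholder
    if mode == ErrorHandleMode.REMOVE_ABNORMAL_OUTPUT:
        outputs = [o for o in outputs if o is not None]  # drop the failed slots afterwards
    return outputs

print(run_iteration([1, -2, 3], ErrorHandleMode.CONTINUE_ON_ERROR))       # [2, None, 6]
print(run_iteration([1, -2, 3], ErrorHandleMode.REMOVE_ABNORMAL_OUTPUT))  # [2, 6]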

VI. The Event System

Event type hierarchy

(class-hierarchy diagram omitted in this text version)

When events are emitted

(sequence diagram omitted in this text version)
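
Both node types are generators that stream events to their caller. The toy below mirrors that pattern end to end; the three event classes are simplified stand-ins for Dify's real ones, which carry many more fields as the code above shows.

from dataclasses import dataclass
from typing import Any, Generator

@dataclass
class LoopRunStartedEvent:
    inputs: dict[str, Any]

@dataclass
class LoopRunNextEvent:
    index: int

@dataclass
class LoopRunSucceededEvent:
    outputs: dict[str, Any]

def run_loop(loop_count: int) -> Generator[object, None, None]:
    # Toy generator mirroring how LoopNode._run streams events to its caller
    yield LoopRunStartedEvent(inputs={"loop_count": loop_count})
    for i in range(loop_count):
        yield LoopRunNextEvent(index=i + 1)
    yield LoopRunSucceededEvent(outputs={"loop_round": loop_count})

for event in run_loop(3):
    print(type(event).__name__, vars(event))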

VII. Variable Pool Management

Variable pool structure

The variable pool is the workflow's data hub; it stores and manages all variables:

class VariablePool(BaseModel):
    # Variable dictionary: the first-level key is the node ID,
    # the second-level key is the hash of the rest of the variable selector
    variable_dictionary: dict[str, dict[int, Segment]] = Field(
        description="Variables mapping",
        default=defaultdict(dict),
    )

    # User input variables
    user_inputs: Mapping[str, Any] = Field(
        description="User inputs",
    )

    # System variables
    system_variables: Mapping[SystemVariableKey, Any] = Field(
        description="System variables",
    )

    # Environment variables
    environment_variables: Sequence[Variable] = Field(
        description="Environment variables.",
        default_factory=list,
    )

    # Conversation variables
    conversation_variables: Sequence[Variable] = Field(
        description="Conversation variables.",
        default_factory=list,
    )

Variable scope management

(scope diagram omitted in this text version)

Variable operation methods

def add(self, selector: Sequence[str], value: Any, /) -> None:
    """
    Add a variable to the variable pool.

    Args:
        selector (Sequence[str]): The selector identifying the variable.
        value (VariableValue): The value of the variable.

    Raises:
        ValueError: If the selector is invalid.

    Returns:
        None
    """
    if len(selector) < 2:
        raise ValueError("Invalid selector")

    if isinstance(value, Variable):
        variable = value
    elif isinstance(value, Segment):
        variable = variable_factory.segment_to_variable(segment=value, selector=selector)
    else:
        segment = variable_factory.build_segment(value)
        variable = variable_factory.segment_to_variable(segment=segment, selector=selector)

    hash_key = hash(tuple(selector[1:]))
    self.variable_dictionary[selector[0]][hash_key] = variable

def get(self, selector: Sequence[str], /) -> Segment | None:
    """
    Retrieve a value from the variable pool by selector.

    Args:
        selector (Sequence[str]): The selector identifying the variable.

    Returns:
        Any: The value associated with the given selector.

    Raises:
        ValueError: If the selector is invalid.
    """
    if len(selector) < 2:
        return None

    hash_key = hash(tuple(selector[1:]))
    value = self.variable_dictionary[selector[0]].get(hash_key)

    if value is None:
        selector, attr = selector[:-1], selector[-1]
        # Python supports `attr in FileAttribute` only from 3.12 onwards
        if attr not in {item.value for item in FileAttribute}:
            return None
        value = self.get(selector)
        if not isinstance(value, FileSegment | NoneSegment):
            return None
        if isinstance(value, FileSegment):
            attr = FileAttribute(attr)
            attr_value = file_manager.get_attr(file=value.value, attr=attr)
            return variable_factory.build_segment(attr_value)
        return value

    return value

def remove(self, selector: Sequence[str], /):
    """
    Remove a variable from the variable pool by selector.

    Args:
        selector (Sequence[str]): A sequence of strings representing the selector.

    Returns:
        None
    """
    if not selector:
        return
    if len(selector) == 1:
        self.variable_dictionary[selector[0]] = {}
        return
    hash_key = hash(tuple(selector[1:]))
    self.variable_dictionary[selector[0]].pop(hash_key, None)
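
The selector convention used above (first element: node ID; the remaining elements hashed into a key) is easy to exercise in isolation. Below is a stripped-down, dependency-free sketch of the same add/get/remove shape; unlike the real class it stores raw values with no Segment/Variable wrapping:

from collections import defaultdict
from typing import Any, Sequence

class MiniVariablePool:
    """Dependency-free sketch of VariablePool's selector-keyed storage."""

    def __init__(self) -> None:
        # first-level key: node ID; second-level key: hash of the rest of the selector
        self._store: dict[str, dict[int, Any]] = defaultdict(dict)

    def add(self, selector: Sequence[str], value: Any) -> None:
        if len(selector) < 2:
            raise ValueError("Invalid selector")
        self._store[selector[0]][hash(tuple(selector[1:]))] = value

    def get(self, selector: Sequence[str]) -> Any | None:
        if len(selector) < 2:
            return None
        return self._store[selector[0]].get(hash(tuple(selector[1:])))

    def remove(self, selector: Sequence[str]) -> None:
        if not selector:
            return
        if len(selector) == 1:
            self._store[selector[0]] = {}  # drop every variable under the node
            return
        self._store[selector[0]].pop(hash(tuple(selector[1:])), None)

pool = MiniVariablePool()
pool.add(["loop_node", "index"], 0)      # what LoopNode does before round 0
pool.add(["loop_node", "index"], 1)      # re-adding overwrites: the index advances
print(pool.get(["loop_node", "index"]))  # 1
pool.remove(["loop_node"])               # scope cleanup, as in the finally blocks
print(pool.get(["loop_node", "index"]))  # None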

VIII. Practical Application Scenarios

Scenario 1: Batch data processing

Requirement: run AI analysis on multiple documents uploaded by a user

(workflow diagram omitted in this text version)

Configuration example

{
  "type": "iteration",
  "config": {
    "iterator_selector": ["start", "documents"],
    "is_parallel": true,
    "parallel_nums": 5,
    "error_handle_mode": "continue-on-error"
  }
}

Scenario 2: Conditional loop processing

Requirement: keep calling an API until a satisfactory result is obtained

(workflow diagram omitted in this text version)

Configuration example

{
  "type": "loop",
  "config": {
    "loop_variable_selectors": {
      "loop.attempt": ["start", "initial_attempt"]
    },
    "break_conditions": [
      {
        "variable_selector": ["api_result", "quality_score"],
        "comparison_operator": ">",
        "value": 0.8
      }
    ],
    "logical_operator": "or"
  }
}

Scenario 3: Nested loop processing

Requirement: answer multiple questions from multiple users in one batch

(workflow diagram omitted in this text version)
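
Conceptually, the outer iteration's sub-graph contains a second iteration node. A hypothetical sketch in the same config style as above (the node IDs iterate_users and iterate_questions, and the id field itself, are made up for illustration; the inner iterator_selector reads the outer node's current item, which the variable pool exposes under [node_id, "item"] as shown in the code earlier):

[
  {
    "id": "iterate_users",
    "type": "iteration",
    "config": {
      "iterator_selector": ["start", "users"]
    }
  },
  {
    "id": "iterate_questions",
    "type": "iteration",
    "config": {
      "iterator_selector": ["iterate_users", "item"],
      "is_parallel": true,
      "parallel_nums": 3,
      "error_handle_mode": "continue-on-error"
    }
  }
]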

IX. Summary

The loop and iteration mechanisms are among the core capabilities of Dify's workflow engine. Their data-processing power rests on the following key characteristics:

Core strengths

  1. Flexible control structures

    • Conditional loops and collection iteration
    • Rich exit conditions and error handling modes
    • Support for nested loops and iterations
  2. High-performance execution

    • Parallel iteration support
    • Careful variable pool management
    • Event-driven asynchronous execution
  3. Comprehensive observability

    • A detailed event system
    • Real-time status tracking
    • Rich debugging information
  4. Ease of use

    • Intuitive configuration
    • Clear error messages
    • Thorough documentation

Technical highlights

  • Event-driven architecture: loosely coupled components communicate through the event system
  • Variable scope management: precise control over variable lifetime and visibility
  • Condition-processing engine: supports complex logical condition evaluation
  • Concurrency optimization: sensible thread pool management and resource scheduling

A solid understanding of Dify's loop and iteration mechanisms helps in designing and optimizing AI application workflows for better throughput and user experience; the design is well worth studying and borrowing from.

