
Concurrency and Asynchronous Nodes

Understand Asynchronous Nodes, Concurrency and Parallelism

Translated from: https://www.behaviortree.dev


When designing reactive Behavior Trees, it is important to understand two main concepts:

What we mean by “Asynchronous” Actions versus “Synchronous” ones.
The difference between Concurrency and Parallelism, both in general and in the context of BT.CPP.

Concurrency vs Parallelism


If you search for these terms online, you will find many good articles on the topic.

Definitions

Concurrency is when two or more tasks can start, run, and complete in overlapping time periods. It does not necessarily mean that they will ever be running at the same instant.

Parallelism is when tasks literally run at the same time in different threads, e.g., on a multicore processor.

BT.CPP executes all the nodes concurrently. In other words:

The Tree execution engine itself is single-threaded.
All the tick() methods are always executed sequentially.
If any tick() method is blocking, the entire execution is blocked.

We achieve reactive behaviors through “concurrency” and asynchronous execution.

In other words, an Action that takes a long time to execute should return the RUNNING state as soon as possible to signal that it has started, and check whether it has completed only when it is ticked again.

An Asynchronous node may delegate this long execution to another process, another server, or simply another thread.
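
To make the single-threaded model concrete, here is a minimal sketch that uses only the C++ standard library. The names Status, CountingTask and runInterleaved are hypothetical stand-ins, not BT.CPP types. Two tasks make progress in overlapping time periods, although only one tick() runs at any instant.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a node status; not the BT.CPP type.
enum class Status { RUNNING, SUCCESS };

// A task that needs `steps` ticks to finish. Each tick() does a small
// slice of work and returns immediately instead of blocking.
struct CountingTask
{
  std::string name;
  int steps;

  Status tick(std::vector<std::string>& log)
  {
    assert(steps > 0);  // a finished task must not be ticked again
    log.push_back(name);
    --steps;
    return steps > 0 ? Status::RUNNING : Status::SUCCESS;
  }
};

// Tick both tasks from a single thread until both are done.
// The returned log records the order in which the tasks ran.
std::vector<std::string> runInterleaved()
{
  CountingTask a{"A", 2};
  CountingTask b{"B", 3};
  bool a_done = false;
  bool b_done = false;
  std::vector<std::string> log;

  while (!a_done || !b_done)
  {
    if (!a_done) { a_done = (a.tick(log) == Status::SUCCESS); }
    if (!b_done) { b_done = (b.tick(log) == Status::SUCCESS); }
  }
  return log;
}
```

Calling runInterleaved() yields the order A, B, A, B, B: the tasks overlap in time (concurrency) without ever running at the same instant (no parallelism). If one tick() slept instead of returning, the other task would stall as well.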

Asynchronous vs Synchronous


In general, an Asynchronous Action (or TreeNode) is simply one that:

May return RUNNING instead of SUCCESS or FAILURE when ticked.
Can be stopped as quickly as possible when the halt() method (to be implemented by the developer) is invoked.

When your Tree executes an Asynchronous Action that returns RUNNING, that RUNNING state is usually propagated backward, and the entire Tree is itself in the RUNNING state.

In the example below, “ActionE” is asynchronous and RUNNING; when a node is RUNNING, its parent usually returns RUNNING too.

[Figure: images/RunningTree.svg]

Let’s consider a simple “SleepNode”. A good template to get started is StatefulActionNode.

// Example of an Asynchronous node that uses StatefulActionNode as its base class.
// The header path below assumes BT.CPP 3.x; adjust it to your installed version.
#include <behaviortree_cpp_v3/action_node.h>
#include <chrono>
#include <iostream>
#include <string>

class SleepNode : public BT::StatefulActionNode
{
  public:
    SleepNode(const std::string& name, const BT::NodeConfiguration& config)
      : BT::StatefulActionNode(name, config)
    {}

    static BT::PortsList providedPorts()
    {
      // amount of milliseconds that we want to sleep
      return{ BT::InputPort<int>("msec") };
    }

    /// method invoked the first time the action is ticked.
    BT::NodeStatus onStart() override
    {
      int msec = 0;
      getInput("msec", msec);

      if( msec <= 0 ) {
        // No need to go into the RUNNING state
        return BT::NodeStatus::SUCCESS;
      }
      else {
        using namespace std::chrono;
        // once the deadline is reached, we will return SUCCESS.
        deadline_ = system_clock::now() + milliseconds(msec);
        return BT::NodeStatus::RUNNING;
      }
    }

    /// method invoked by an action in the RUNNING state.
    BT::NodeStatus onRunning() override
    {
      if ( std::chrono::system_clock::now() >= deadline_ ) {
        return BT::NodeStatus::SUCCESS;
      }
      else {
        return BT::NodeStatus::RUNNING;
      }
    }

    /// method invoked when a RUNNING action is halted.
    void onHalted() override
    {
      // nothing to clean up here; just report the interruption
      std::cout << "SleepNode interrupted" << std::endl;
    }

  private:
    std::chrono::system_clock::time_point deadline_;
};

In the code above:

When the SleepNode is ticked for the first time, the onStart() method is executed. It returns SUCCESS immediately if the sleep time is 0 or negative, and RUNNING otherwise.
We should continue ticking the tree in a loop. Each tick invokes onRunning(), which may return RUNNING again or, eventually, SUCCESS.
Another node might trigger a halt() signal. In this case, the onHalted() method is invoked, which gives us the opportunity to clean up our internal state.
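
The dispatch between these three callbacks can be sketched without the library. The classes below (MiniStatefulAction, ThreeTickAction) and the helper ticksUntilDone are hypothetical, simplified stand-ins written for this illustration; they are not the BT.CPP implementation. In particular, resetting the status inside tick() is a shortcut assumed here to keep the sketch small.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; the real classes live in BT.CPP.
enum class Status { IDLE, RUNNING, SUCCESS, FAILURE };

class MiniStatefulAction
{
public:
  virtual ~MiniStatefulAction() = default;

  // The first tick calls onStart(); ticks while RUNNING call onRunning().
  Status tick()
  {
    status_ = (status_ == Status::RUNNING) ? onRunning() : onStart();
    const Status result = status_;
    if (result != Status::RUNNING) { status_ = Status::IDLE; }  // ready to restart
    return result;
  }

  // Halting only matters while the action is RUNNING.
  void halt()
  {
    if (status_ == Status::RUNNING) { onHalted(); }
    status_ = Status::IDLE;
  }

protected:
  virtual Status onStart() = 0;
  virtual Status onRunning() = 0;
  virtual void onHalted() = 0;

private:
  Status status_ = Status::IDLE;
};

// Succeeds on the third tick and records every callback it receives.
class ThreeTickAction : public MiniStatefulAction
{
public:
  std::vector<std::string> calls;

protected:
  Status onStart() override
  {
    calls.push_back("onStart");
    ticks_left_ = 2;
    return Status::RUNNING;
  }
  Status onRunning() override
  {
    assert(ticks_left_ > 0);  // only reachable after onStart()
    calls.push_back("onRunning");
    return (--ticks_left_ == 0) ? Status::SUCCESS : Status::RUNNING;
  }
  void onHalted() override { calls.push_back("onHalted"); }

private:
  int ticks_left_ = 0;
};

// Tick in a loop until the action leaves the RUNNING state.
int ticksUntilDone(MiniStatefulAction& action)
{
  int ticks = 1;
  while (action.tick() == Status::RUNNING) { ++ticks; }
  return ticks;
}
```

With a ThreeTickAction, ticksUntilDone() returns 3 and the recorded calls are onStart, onRunning, onRunning. Ticking once more and then calling halt() appends onStart and onHalted, which mirrors the interruption case described in the list above.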

Avoid blocking the execution of the tree


A wrong way to implement the SleepNode would be the following:

// This is the synchronous version of the Node. Probably not what we want.
// The header path below assumes BT.CPP 3.x.
#include <behaviortree_cpp_v3/action_node.h>
#include <chrono>
#include <string>
#include <thread>

class BadSleepNode : public BT::ActionNodeBase
{
  public:
    BadSleepNode(const std::string& name, const BT::NodeConfiguration& config)
      : BT::ActionNodeBase(name, config)
    {}

    static BT::PortsList providedPorts()
    {
      return{ BT::InputPort<int>("msec") };
    }

    BT::NodeStatus tick() override
    {
      int msec = 0;
      getInput("msec", msec);
      // This blocking function will FREEZE the entire tree :(
      std::this_thread::sleep_for( std::chrono::milliseconds(msec) );
      return BT::NodeStatus::SUCCESS;
    }

    void halt() override
    {
      // No one can invoke this method, because I froze the tree.
      // Even if this method COULD be executed, there is no way I can
      // interrupt std::this_thread::sleep_for()
    }
};

The problem with multi-threading


In the early days of this library (version 1.x), spawning a new thread looked like a good way to build asynchronous Actions.

That was a bad idea, for multiple reasons:

Accessing the blackboard in a thread-safe way is harder (more about this later).
You probably don’t need to.
People think that this will magically make the Action “asynchronous”, but they forget that it is still their responsibility to stop that thread “somehow” when the halt() method is invoked.

For this reason, users are usually discouraged from using BT::AsyncActionNode as a base class. Let’s have another look at the SleepNode.

// This node spawns its own thread, but it still has problems when halted.
// The header path below assumes BT.CPP 3.x.
#include <behaviortree_cpp_v3/action_node.h>
#include <chrono>
#include <string>
#include <thread>

class BadThreadedSleepNode : public BT::AsyncActionNode
{
  public:
    BadThreadedSleepNode(const std::string& name, const BT::NodeConfiguration& config)
      : BT::AsyncActionNode(name, config)
    {}

    static BT::PortsList providedPorts()
    {
      return{ BT::InputPort<int>("msec") };
    }

    BT::NodeStatus tick() override
    {
      // This code runs in its own thread, therefore the Tree keeps running.
      // This seems good, but the thread still cannot be aborted.
      int msec = 0;
      getInput("msec", msec);
      std::this_thread::sleep_for( std::chrono::milliseconds(msec) );
      return BT::NodeStatus::SUCCESS;
    }

    // The halt() method cannot kill the spawned thread :(

    // Keep in mind that most of the time we should not
    // override AsyncActionNode::halt()
};

A correct version would be:

// I will create my own thread here, for no good reason.
// The header path below assumes BT.CPP 3.x.
#include <behaviortree_cpp_v3/action_node.h>
#include <chrono>
#include <string>
#include <thread>

class ThreadedSleepNode : public BT::AsyncActionNode
{
  public:
    ThreadedSleepNode(const std::string& name, const BT::NodeConfiguration& config)
      : BT::AsyncActionNode(name, config)
    {}

    static BT::PortsList providedPorts()
    {
      return{ BT::InputPort<int>("msec") };
    }

    BT::NodeStatus tick() override
    {
      // This code runs in its own thread, therefore the Tree keeps running.
      int msec = 0;
      getInput("msec", msec);

      using namespace std::chrono;
      const auto deadline = system_clock::now() + milliseconds(msec);

      // periodically check isHaltRequested()
      // and sleep for a small amount of time only (1 millisecond)
      while( !isHaltRequested() && system_clock::now() < deadline )
      {
        std::this_thread::sleep_for( std::chrono::milliseconds(1) );
      }
      return BT::NodeStatus::SUCCESS;
    }

    // The halt() method will set isHaltRequested() to true
    // and stop the while loop in the spawned thread.
};

As you can see, this looks more complicated than the version we implemented first, using BT::StatefulActionNode. This pattern can still be useful in some cases, but you must remember that introducing multi-threading makes things more complicated and should be avoided by default.
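
For readers who want to see the underlying idea in isolation, here is a sketch of a cooperative halt built only on the C++ standard library. HaltableSleeper is a hypothetical class written for this illustration and is not part of BT.CPP. It assumes that a halt request is a shared atomic flag which the worker polls between short sleeps, and that halt() must wait for the thread to finish.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical worker, not a BT.CPP class: it sleeps in short slices so
// that a halt request is noticed within roughly one millisecond.
class HaltableSleeper
{
public:
  ~HaltableSleeper() { halt(); }  // never destroy a joinable thread

  void start(std::chrono::milliseconds duration)
  {
    halt();  // make sure a previous run has finished
    halt_requested_ = false;
    completed_ = false;
    worker_ = std::thread([this, duration] {
      const auto deadline = std::chrono::steady_clock::now() + duration;
      while (!halt_requested_ && std::chrono::steady_clock::now() < deadline)
      {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
      }
      completed_ = !halt_requested_;
    });
  }

  // Ask the worker to stop, then wait until it has actually stopped.
  void halt()
  {
    halt_requested_ = true;
    if (worker_.joinable()) { worker_.join(); }
  }

  // True only if the last run reached its deadline without being halted.
  bool completed() const { return completed_; }

private:
  std::atomic<bool> halt_requested_{false};
  std::atomic<bool> completed_{false};
  std::thread worker_;
};
```

Starting a 30-second sleep and calling halt() right away returns almost immediately, with completed() reporting false. Even this small class needs two atomics, a join, and a destructor that halts, which is the extra complexity the paragraph above warns about. On some toolchains the program must be linked with -pthread.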

Advanced example: client / server communication

Frequently, people using BT.CPP execute the actual task in a different process.

A typical (and recommended) way to do this in ROS is using ActionLib.

ActionLib provides exactly the kind of API that we need to implement an asynchronous behavior correctly:

A non-blocking function to start the Action.
A way to monitor the current state of execution of the Action.
A way to retrieve the result or the error messages.
The ability to preempt / abort an action that is being executed.

None of these operations is “blocking”, therefore we don’t need to spawn our own thread.

More generally, let’s assume that the developer has their own inter-process communication, with a client/server relationship between the BT executor and the actual service provider.

The corresponding pseudo-code implementation will look like this:

// This action talks to a remote server (pseudo-code: the server calls are placeholders)
class ActionClientNode : public BT::StatefulActionNode
{
  public:
    ActionClientNode(const std::string& name, const BT::NodeConfiguration& config)
      : BT::StatefulActionNode(name, config)
    {}

    BT::NodeStatus onStart() override
    {
      // send a request to the server
      bool accepted = sendStartRequestToServer();
      // check if the request was rejected by the server
      if( !accepted ) {
        return BT::NodeStatus::FAILURE;
      }
      else {
        return BT::NodeStatus::RUNNING;
      }
    }

    /// method invoked by an action in the RUNNING state.
    BT::NodeStatus onRunning() override
    {
      // more pseudo-code
      auto current_state = getCurrentStateFromServer();

      if( current_state == DONE )
      {
        // retrieve the result
        auto result = getResult();
        // check if this result is "good"
        if( IsValidResult(result) ) {
          return BT::NodeStatus::SUCCESS;
        }
        else {
          return BT::NodeStatus::FAILURE;
        }
      }
      else if( current_state == ABORTED ) {
        // fail if the action was aborted by some other client
        // or by the server itself
        return BT::NodeStatus::FAILURE;
      }
      else {
        // probably (current_state == EXECUTING) ?
        return BT::NodeStatus::RUNNING;
      }
    }

    void onHalted() override
    {
      // notify the server that the operation has been aborted
      sendAbortSignalToServer();
    }
};
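
To experiment with this polling protocol without a real server, one option is an in-process fake. The sketch below is an assumption-laden illustration: FakeServer, ServerState and pollUntilFinished are hypothetical names, and the server's progress is simulated by an explicit advance() call rather than by a separate process.

```cpp
#include <cassert>

// Hypothetical in-process stand-in for a remote server. A real one would be
// reached through ActionLib or another inter-process mechanism.
enum class ServerState { IDLE, EXECUTING, DONE, ABORTED };

class FakeServer
{
public:
  // Non-blocking start: rejects the request if a job is already executing.
  bool sendStartRequest(int work_units)
  {
    assert(work_units >= 0);
    if (state_ == ServerState::EXECUTING) { return false; }
    remaining_ = work_units;
    state_ = (work_units > 0) ? ServerState::EXECUTING : ServerState::DONE;
    return true;
  }

  // Simulates the progress the server makes between two polls.
  void advance()
  {
    if (state_ == ServerState::EXECUTING && --remaining_ == 0)
    {
      state_ = ServerState::DONE;
    }
  }

  ServerState currentState() const { return state_; }

  // Preempts the job; has no effect unless a job is executing.
  void sendAbortSignal()
  {
    if (state_ == ServerState::EXECUTING) { state_ = ServerState::ABORTED; }
  }

private:
  ServerState state_ = ServerState::IDLE;
  int remaining_ = 0;
};

// Polls the server once per "tick" and returns how many polls were needed,
// or -1 if the job was rejected or aborted. abort_after < 0 means "never abort".
int pollUntilFinished(FakeServer& server, int work_units, int abort_after)
{
  if (!server.sendStartRequest(work_units)) { return -1; }
  int polls = 0;
  while (server.currentState() == ServerState::EXECUTING)
  {
    if (polls == abort_after) { server.sendAbortSignal(); break; }
    server.advance();
    ++polls;
  }
  return server.currentState() == ServerState::DONE ? polls : -1;
}
```

With three work units and no abort, pollUntilFinished() returns 3; aborting after the first poll returns -1 and leaves the server in the ABORTED state. None of the calls blocks, so each of them would fit inside onStart(), onRunning() or onHalted() of the node above.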