Apache NiFi Flow Fingerprint Security Vulnerability



In this post, I will discuss a security vulnerability discovered in Apache NiFi (CVE-2020-1942): flow fingerprints containing sensitive property descriptor values appeared in logs. During a troubleshooting session with a NiFi user who had implemented custom Apache NiFi processors, Andy LoPresto, a NiFi PMC member and committer, discovered that sensitive values were output to logs when a processor failed to sync up with a NiFi cluster.


OK, there were a lot of terms and concepts you just read. If you didn’t quite get it all — don’t fret! Let’s break it down and get into detail to understand what is happening.


First, what's a cluster?

Before diving into the vulnerability, I'd like to do a quick overview of Apache NiFi clusters. Depending on your dataset, and in most use cases, a single NiFi instance may not be powerful enough to process high volumes of data. Clustering gives NiFi Administrators or DataFlow Managers (DFMs) the ability to run multiple instances on different servers and, through a single interface, make changes and monitor the dataflow.



NiFi clustering employs a Zero-Master paradigm. Each node in the cluster performs the same tasks on the data, but each operates on a different set of data.


Cluster Coordinator

Under this paradigm, one node is elected Cluster Coordinator (using Apache ZooKeeper) and is responsible for three main tasks:


1. Decide which nodes are allowed to join the cluster.


2. Synchronize cluster nodes with current flows.


3. Disconnect nodes that do not have a heartbeat status after a certain amount of time.


Therefore, when the DFM makes a change from any NiFi node, it is replicated throughout the cluster.
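Task 3 above can be sketched as a simple timeout check. Here is a minimal illustration in Python, assuming a hypothetical 40-second timeout and function names of my own choosing; this is not NiFi's actual code:

```python
# Illustrative sketch (not NiFi's implementation) of heartbeat-based
# disconnection: drop nodes whose last heartbeat is older than a timeout.
HEARTBEAT_TIMEOUT_SECONDS = 40.0  # assumed value; configurable in practice

def nodes_to_disconnect(last_heartbeat, now):
    """Return node IDs whose heartbeat has lapsed beyond the timeout."""
    return sorted(node for node, ts in last_heartbeat.items()
                  if now - ts > HEARTBEAT_TIMEOUT_SECONDS)

# A node silent for 60s is disconnected; one silent for only 10s is kept.
print(nodes_to_disconnect({"node-1": 940.0, "node-2": 990.0}, 1000.0))  # ['node-1']
```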


Examining Flow Fingerprints

Now that we know what a cluster is, let’s focus on one of the responsibilities of the Cluster Coordinator — determining if a node is allowed to join the cluster. When a node is added to the cluster, the Cluster Coordinator will first look at that node’s flow.xml.gz — where a flow fingerprint can be derived using attributes related to data processing.


The flow fingerprint can contain properties such as processor IDs, processor relationships, and processor properties. One set of properties that the Cluster Coordinator checks is the processor flow configurations. If the flow configurations are empty, this indicates a new node, which will be allowed to join the cluster and inherit the current flow configurations. On the other hand, if flow configurations are present and they do not match the configurations of the rest of the nodes, that node will not be allowed to join the cluster.
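The join decision just described can be sketched as follows; `can_inherit` and the fingerprint strings are illustrative names, not NiFi's actual API:

```python
def can_inherit(local_fingerprint: str, cluster_fingerprint: str) -> bool:
    """Sketch of the Cluster Coordinator's flow-inheritance check."""
    # An empty local flow marks a new node: it joins and inherits
    # the cluster's current flow configuration.
    if not local_fingerprint:
        return True
    # Otherwise the local flow must match the cluster flow exactly.
    return local_fingerprint == cluster_fingerprint

print(can_inherit("", "cluster-fp"))          # True: new node inherits
print(can_inherit("local-fp", "cluster-fp"))  # False: node rejected
```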


For example, let’s say we have a GetFTP processor in a cluster. Below is the flow.xml and (some of) its properties:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<flowController encoding-version="1.4">
  <maxTimerDrivenThreadCount>10</maxTimerDrivenThreadCount>
  <maxEventDrivenThreadCount>1</maxEventDrivenThreadCount>
  <registries/>
  <parameterContexts/>
  <rootGroup>
    <id>adb378b3-0170-1000-426f-ff54a5486f97</id>
    <name>NiFi Flow</name>
    <position x="0.0" y="0.0"/>
    <comment/>
    <processor>
      <id>add68dbc-0170-1000-ffff-ffff9e996a54</id>
      <name>GetFTP</name>
      <position x="464.0" y="104.0"/>
      <styles/>
      <comment/>
      <class>org.apache.nifi.processors.standard.GetFTP</class>
      <bundle>
        <group>org.apache.nifi</group>
        <artifact>nifi-standard-nar</artifact>
        <version>1.10.0</version>
      </bundle>
      <maxConcurrentTasks>1</maxConcurrentTasks>
      <schedulingPeriod>0 sec</schedulingPeriod>
      <penalizationPeriod>30 sec</penalizationPeriod>
      <yieldPeriod>1 sec</yieldPeriod>
      <bulletinLevel>WARN</bulletinLevel>
      <lossTolerant>false</lossTolerant>
      <scheduledState>STOPPED</scheduledState>
      <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
      <executionNode>ALL</executionNode>
      <runDurationNanos>0</runDurationNanos>
      <property>
        <name>Hostname</name>
        <value>myHost.com</value>
      </property>
      <property>
        <name>Port</name>
        <value>21</value>
      </property>
      <property>
        <name>Username</name>
        <value>myUsername</value>
      </property>
      <property>
        <name>Password</name>
        <value>myPassword</value>
      </property>
      <property>
        <name>Connection Mode</name>
        <value>Passive</value>
      </property>
      <property>
        <name>Transfer Mode</name>
        <value>Binary</value>
      </property>
      ...
      <autoTerminatedRelationship>success</autoTerminatedRelationship>
    </processor>
  </rootGroup>
  <controllerServices/>
  <reportingTasks/>
</flowController>

In a real NiFi instance, the sensitive values like the password are always stored in an encrypted format — the decrypted example value is shown here for clarity. Next, we manually make local changes to the properties, such as modifying the password.


When this node spins back up, the Cluster Coordinator will examine the flow fingerprints, determine it does not match with the other nodes and, therefore, not allow that node to join the cluster. Now, we are left with a homeless node!


Discovering the vulnerability

This was the scenario during a recent troubleshooting session with a NiFi user who implemented custom processors. Local changes were made to their processors while NiFi was offline. After restarting NiFi, the node was unable to join the cluster.


To help pinpoint the error, NiFi PMC member Andy LoPresto took to the logs with the level set to ‘TRACE’. Upon further inspection, he discovered that when a node failed to join the cluster, the flow fingerprints were printed along with their property names and values.


2020-01-16 14:43:00,458 TRACE [main] o.a.n.c.StandardFlowSynchronizer Exporting snippets from controller
2020-01-16 14:43:00,458 TRACE [main] o.a.n.c.StandardFlowSynchronizer Getting Authorizer fingerprint from controller
2020-01-16 14:43:00,459 TRACE [main] o.a.n.c.StandardFlowSynchronizer Checking flow inheritability
2020-01-16 14:43:00,474 TRACE [main] o.a.n.c.StandardFlowSynchronizer Local Fingerprint Before Hash = NO_VALUENO_PARAMETER_CONTEXTSadb378b3-0170-1000-426f-ff54a5486f97NO_VALUENO_VALUENO_VERSION_CONTROL_INFORMATIONadd68dbc-0170-1000-ffff-ffff9e996a54NO_VALUEorg.apache.nifi.processors.standard.GetFTPNO_VALUEorg.apache.nifinifi-standard-nar1.10.010 sec30 sec1 secWARNfalseTIMER_DRIVENALL0Hostname=myHost.comPassword=myModifiedPasswordUsername=myUsernamesuccess
2020-01-16 14:43:00,474 TRACE [main] o.a.n.c.StandardFlowSynchronizer Proposed Fingerprint Before Hash = NO_VALUENO_PARAMETER_CONTEXTSadb378b3-0170-1000-426f-ff54a5486f97NO_VALUENO_VALUENO_VERSION_CONTROL_INFORMATIONadd68dbc-0170-1000-ffff-ffff9e996a54NO_VALUEorg.apache.nifi.processors.standard.GetFTPNO_VALUEorg.apache.nifinifi-standard-nar1.10.010 sec30 sec1 secWARNfalseTIMER_DRIVENALL0Hostname=myHost.comPassword=myPasswordUsername=myUsernamesuccess
2020-01-16 14:43:00,477 ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow from cluster due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow.
org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow.
at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1026)
at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:539)
at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:1028)
at org.apache.nifi.NiFi.<init>(NiFi.java:158)
at org.apache.nifi.NiFi.<init>(NiFi.java:72)
at org.apache.nifi.NiFi.main(NiFi.java:301)
Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
Local Fingerprint: he.nifinifi-standard-nar1.10.010 sec30 sec1 secWARNfalseTIMER_DRIVENALL0Hostname=myHost.comPassword=myModifiedPasswordUsername=myUsernamesuccess
Cluster Fingerprint: he.nifinifi-standard-nar1.10.010 sec30 sec1 secWARNfalseTIMER_DRIVENALL0Hostname=myHost.comPassword=myPasswordUsername=myUsernamesuccess
at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:315)
at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1368)
at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:88)
at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:812)
at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1001)
… 5 common frames omitted

At the level where these properties are printed, plaintext values that are potentially sensitive are not yet encrypted. As a result, the logs output both sensitive and non-sensitive data.


Now what?

Of course we never want to expose sensitive data. It’s also a direct violation of the OWASP Top 10 most critical security risks, ranked third on the list under A3: Sensitive Data Exposure.


The OWASP Top 10 outline points out that sensitive data must always be encrypted at rest and in transit, taking care not to use weak or outdated cryptographic algorithms.


So, we know not to expose sensitive data and there are a number of ways to prevent this. A simple solution that generally comes to mind is to disable printing the values. But in order to better narrow down cluster errors, comparing any discrepancy in flow fingerprints is key.
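To illustrate why the logged fingerprints are useful for debugging, a small helper can locate where two fingerprint strings first diverge. This is hypothetical code, not part of NiFi:

```python
def first_difference(a: str, b: str) -> int:
    """Return the index where two fingerprint strings first differ,
    or -1 if they are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One string may be a prefix of the other.
    return -1 if len(a) == len(b) else min(len(a), len(b))

# The mismatch starts right after "Password=my".
print(first_difference("Password=myPassword", "Password=myModifiedPassword"))  # 11
```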


Argon2

To keep this capability while keeping our sensitive data safe, a hashing algorithm came into play. In line with OWASP’s advice to use strong cryptographic algorithms, the Argon2 hashing algorithm — winner of the July 2015 Password Hashing Competition — was introduced to the modules.


A partial solution

Similar to other commonly used hashing algorithms, such as Scrypt and Bcrypt, Argon2 concatenates a random salt with a given input and outputs a hashed value. For password hashing, this is desirable as it thwarts accessing plaintext values.


This is not fully effective in our use case. Hashing sensitive property values solves the issue of protecting sensitive data, but a random salt produces a different hashed value each time, so we cannot determine whether one value equals another.
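To see the problem concretely, here is a minimal sketch using `hashlib.scrypt` as a stand-in for Argon2 (which is not in the Python standard library); function names and parameters are illustrative:

```python
import hashlib
import os

def hash_with_random_salt(value: str) -> bytes:
    """Hash a value with a fresh random salt, as password hashing normally does."""
    salt = os.urandom(16)  # new random salt on every call
    return hashlib.scrypt(value.encode(), salt=salt, n=2**14, r=8, p=1)

# The same password hashes differently each time, so two otherwise
# identical fingerprints would never compare equal.
print(hash_with_random_salt("myPassword") == hash_with_random_salt("myPassword"))  # False
```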


The complete solution

A static salt can be swapped in. Therefore, when two identical values are hashed, they will also produce matching hashed values. The FingerprintFactory class is where the flow fingerprint is built. The class contains a method that determines whether the processor property value is encrypted — i.e., a sensitive value. If the value is marked encrypted, it will use Argon2 and a static salt to return a hashed value to be added to the flow fingerprint.
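A minimal sketch of the deterministic approach, again with `hashlib.scrypt` standing in for Argon2 and an illustrative salt value; NiFi's actual implementation lives in its `FingerprintFactory`:

```python
import hashlib

STATIC_SALT = b"nifi-static-salt"  # illustrative; fixed across all nodes and values

def fingerprint_sensitive_value(value: str) -> str:
    """Hash a sensitive property value with a static salt for use in a fingerprint."""
    digest = hashlib.scrypt(value.encode(), salt=STATIC_SALT, n=2**14, r=8, p=1)
    return digest.hex()

# Equal values now hash identically, so fingerprints still compare...
print(fingerprint_sensitive_value("myPassword") == fingerprint_sensitive_value("myPassword"))  # True
# ...while the plaintext never appears in the fingerprint or the logs.
print(fingerprint_sensitive_value("myPassword") == fingerprint_sensitive_value("myModifiedPassword"))  # False
```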


Conclusion

We had a brief introduction to NiFi clusters and the Cluster Coordinator’s responsibilities. When the Cluster Coordinator determines a node is unable to join a cluster, the flow fingerprints are compared for discrepancies. Viewing the flow fingerprints in logs at the ‘TRACE’ level revealed a security vulnerability: processor property values, potentially containing sensitive values, were printed in plaintext.


The implementation of Argon2 secure hasher, in combination with a static salt, allows for deterministic logging of these values.


Originally published at https://medium.com/apache-nifi-security/apache-nifi-flow-fingerprint-security-vulnerability-f105a5a5b0f6
