Get Your Back Covered! Coverage, Codecov, and Tox

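This article introduces coverage.py, the Python test coverage tool: how to install it, how to collect coverage data, and how to generate reports. It then explains in detail how Tox implements matrix testing, covering how Tox works, how to configure it, and how to integrate it with CI. Finally, it describes how to publish coverage reports through Codecov to make an open source project more professional and trustworthy.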

1. Coverage - Measuring Test Coverage

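We now know how to write unit tests. A natural question follows: how do we know how good those tests are? This is where the concept of test coverage comes in. Coverage measurement is commonly used to gauge the effectiveness of tests: it shows which parts of your code have been exercised by tests and which have not.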

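coverage.py is the most widely used tool for measuring code coverage of Python programs. It monitors your program, records which parts of the code have been executed, and then analyzes the source code to identify what was executed and what was not.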
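To make "monitoring" concrete, here is a minimal sketch of line tracing built on Python's standard sys.settrace hook. It is a toy for illustration, not how coverage.py is actually implemented; the helper trace_lines and the sample function classify are hypothetical names invented for this example.

```python
import sys


def trace_lines(func, *args):
    """Run func(*args) and return the line offsets that executed.

    Offset 1 is the `def` line, offset 2 the first body line, and so on.
    """
    executed = set()
    code = func.__code__

    def tracer(frame, event, arg):
        # Ignore every frame except the function being measured.
        if frame.f_code is not code:
            return None
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno + 1)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Always restore whatever trace function was active before.
        sys.settrace(previous)
    return executed


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


print(sorted(trace_lines(classify, 5)))
```

Calling classify(5) never reaches the return "negative" line, so offset 3 is absent from the result. A coverage tool would report that line as missing.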

We can install coverage.py as follows:

$ pip install coverage

To collect coverage data, we simply prefix the original test command with coverage run -m. For example, if we previously ran the tests with pytest arg1 arg2 arg3, we now run:

$ coverage run -m pytest arg1 arg2 arg3

Once the tests have finished, we can view the coverage report with coverage report -m:

Name                      Stmts   Miss  Cover   Missing
-------------------------------------------------------
my_program.py                20      4    80%   33-35, 39
my_other_module.py           56      6    89%   17-23
-------------------------------------------------------
TOTAL                        76     10    87%
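The Cover column follows from the Stmts and Miss columns. The sketch below reproduces the percentages in the report above; cover_percent is a hypothetical helper, and its plain rounding is a simplification that may differ from coverage.py's own rounding in edge cases.

```python
def cover_percent(stmts, miss):
    """Percentage of statements executed, rounded to a whole number."""
    if stmts == 0:
        # Assumption for this sketch: an empty file counts as fully covered.
        return 100
    return round(100 * (stmts - miss) / stmts)


# (Stmts, Miss) pairs copied from the report above.
rows = {
    "my_program.py": (20, 4),
    "my_other_module.py": (56, 6),
}

for name, (stmts, miss) in rows.items():
    print(f"{name}: {cover_percent(stmts, miss)}%")

# The TOTAL row is computed from the summed columns, not by averaging rows.
total_stmts = sum(stmts for stmts, _ in rows.values())
total_miss = sum(miss for _, miss in rows.values())
print(f"TOTAL: {cover_percent(total_stmts, total_miss)}%")
```

Note that the total (87%) is weighted by file size, so it is not simply the mean of 80% and 89%.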

For a nicer presentation, you can also use the coverage html command to generate an annotated HTML report, then open htmlcov/index.html in a browser.

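However, more people choose to collect coverage data with the pytest-cov plugin, and this is also what ppw does. In a project generated by ppw, pytest-cov is already listed among the test dependencies, so it is installed into the environment automatically.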

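As a result, in a project configured by ppw we generally do not need to call the coverage command directly; we simply run the tests with pytest. The pytest-cov plugin collects the coverage data and prints the coverage report to the console when the tests finish. To generate an annotated HTML report, run pytest with the --cov-report=html option (coverage collection itself must be enabled with --cov, either on the command line or in the project's pytest options).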

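By default, coverage.py measures line (statement) coverage, but it can also be configured to measure branch coverage. The following example code illustrates what each of these two kinds of coverage means.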

def my_partial_fn(x):
    if x:
        y = 10
    return y

my_partial_fn(1)

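In the code above, line 2 is an if statement; depending on the value of x, execution continues to either line 3 or line 4. When coverage measures by statement (the default), the call my_partial_fn(1) executes every line of the function, so all of its statements are reported as covered. However, the if condition never evaluated to False, so execution never jumped directly from line 2 to line 4. When coverage is configured to measure branches, it notices this missing jump and marks line 2 as a partial branch, meaning the code is not fully covered.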
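The difference can be sketched by recording jumps between lines instead of individual lines. The toy tracer below uses the standard sys.settrace hook; trace_arcs is a hypothetical helper written for this illustration, and coverage.py's real branch analysis is considerably more sophisticated.

```python
import sys


def trace_arcs(func, *args):
    """Run func(*args) and return the (from_line, to_line) jumps inside func.

    Line numbers are offsets within the function: 1 is the `def` line.
    """
    arcs = set()
    code = func.__code__
    last = None

    def tracer(frame, event, arg):
        nonlocal last
        # Ignore every frame except the function being measured.
        if frame.f_code is not code:
            return None
        if event == "line":
            line = frame.f_lineno - code.co_firstlineno + 1
            if last is not None:
                arcs.add((last, line))
            last = line
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return arcs


def my_partial_fn(x):
    if x:
        y = 10
    return y


arcs = trace_arcs(my_partial_fn, 1)
print(sorted(arcs))
# Lines 2, 3 and 4 all ran, yet the jump from line 2 to line 4 never happened.
print((2, 4) in arcs)
```

Every statement was executed, yet the arc (2, 4) is missing from the recorded set. That missing arc is exactly what a branch measurement flags as a partial branch.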

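Branch coverage is not the only thing that requires configuration; there are several other cases as well. Next we describe how to configure coverage.py.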

The default name of the coverage.py configuration file is .coveragerc.
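The configuration file uses an INI-style syntax. As a rough sketch, the snippet below uses Python's standard configparser module to build a minimal configuration that turns on branch measurement and the Missing column. The two options shown (branch under [run] and show_missing under [report]) are real coverage.py settings, but choosing these two is only an illustrative assumption, not the configuration that ppw generates.

```python
import configparser
import io

# Build the settings in memory; nothing is written to disk in this sketch.
config = configparser.ConfigParser()
config["run"] = {"branch": "True"}           # measure branch coverage too
config["report"] = {"show_missing": "True"}  # list missing lines in reports

buffer = io.StringIO()
config.write(buffer)
rc_text = buffer.getvalue()

# Saving this text as .coveragerc in the project root would apply it.
print(rc_text)
```

With branch = True in place, the report gains branch-related columns, and partial branches such as the one in my_partial_fn are flagged.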

protected-mode no port 6379 tcp-backlog 511 timeout 0 tcp-keepalive 300 daemonize no pidfile /var/run/redis_6379.pid loglevel notice logfile "" databases 16 always-show-logo no set-proc-title yes proc-title-template "{title} {listen-addr} {server-mode}" stop-writes-on-bgsave-error yes rdbcompression yes rdbchecksum yes dbfilename dump.rdb rdb-del-sync-files no dir ./ replica-serve-stale-data yes replica-read-only yes repl-diskless-sync no repl-diskless-sync-delay 5 repl-diskless-load disabled repl-disable-tcp-nodelay no replica-priority 100 acllog-max-len 128 requirepass Guyuan@2021 # New users are initialized with restrictive permissions by default, via the # equivalent of this ACL rule 'off resetkeys -@all'. Starting with Redis 6.2, it # is possible to manage access to Pub/Sub channels with ACL rules as well. The # default Pub/Sub channels permission if new users is controlled by the # acl-pubsub-default configuration directive, which accepts one of these values: # # allchannels: grants access to all Pub/Sub channels # resetchannels: revokes access to all Pub/Sub channels # # To ensure backward compatibility while upgrading Redis 6.0, acl-pubsub-default # defaults to the 'allchannels' permission. # # Future compatibility note: it is very likely that in a future version of Redis # the directive's default of 'allchannels' will be changed to 'resetchannels' in # order to provide better out-of-the-box Pub/Sub security. Therefore, it is # recommended that you explicitly define Pub/Sub permissions for all users # rather then rely on implicit default values. Once you've set explicit # Pub/Sub for all existing users, you should uncomment the following line. # # acl-pubsub-default resetchannels # Command renaming (DEPRECATED). # # ------------------------------------------------------------------------ # WARNING: avoid using this option if possible. 
Instead use ACLs to remove # commands from the default user, and put them only in some admin user you # create for administrative purposes. # ------------------------------------------------------------------------ # # It is possible to change the name of dangerous commands in a shared # environment. For instance the CONFIG command may be renamed into something # hard to guess so that it will still be available for internal-use tools # but not available for general clients. # # Example: # # rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52 # # It is also possible to completely kill a command by renaming it into # an empty string: # # rename-command CONFIG "" # # Please note that changing the name of commands that are logged into the # AOF file or transmitted to replicas may cause problems. ################################### CLIENTS #################################### # Set the max number of connected clients at the same time. By default # this limit is set to 10000 clients, however if the Redis server is not # able to configure the process file limit to allow for the specified limit # the max number of allowed clients is set to the current file limit # minus 32 (as Redis reserves a few file descriptors for internal uses). # # Once the limit is reached Redis will close all the new connections sending # an error 'max number of clients reached'. # # IMPORTANT: When Redis Cluster is used, the max number of connections is also # shared with the cluster bus: every node in the cluster will use two # connections, one incoming and another outgoing. It is important to size the # limit accordingly in case of very large clusters. # # maxclients 10000 ############################## MEMORY MANAGEMENT ################################ # Set a memory usage limit to the specified amount of bytes. # When the memory limit is reached Redis will try to remove keys # according to the eviction policy selected (see maxmemory-policy). 
# # If Redis can't remove keys according to the policy, or if the policy is # set to 'noeviction', Redis will start to reply with errors to commands # that would use more memory, like SET, LPUSH, and so on, and will continue # to reply to read-only commands like GET. # # This option is usually useful when using Redis as an LRU or LFU cache, or to # set a hard memory limit for an instance (using the 'noeviction' policy). # # WARNING: If you have replicas attached to an instance with maxmemory on, # the size of the output buffers needed to feed the replicas are subtracted # from the used memory count, so that network problems / resyncs will # not trigger a loop where keys are evicted, and in turn the output # buffer of replicas is full with DELs of keys evicted triggering the deletion # of more keys, and so forth until the database is completely emptied. # # In short... if you have replicas attached it is suggested that you set a lower # limit for maxmemory so that there is some free RAM on the system for replica # output buffers (but this is not needed if the policy is 'noeviction'). # # maxmemory <bytes> # MAXMEMORY POLICY: how Redis will select what to remove when maxmemory # is reached. You can select one from the following behaviors: # # volatile-lru -> Evict using approximated LRU, only keys with an expire set. # allkeys-lru -> Evict any key using approximated LRU. # volatile-lfu -> Evict using approximated LFU, only keys with an expire set. # allkeys-lfu -> Evict any key using approximated LFU. # volatile-random -> Remove a random key having an expire set. # allkeys-random -> Remove a random key, any key. # volatile-ttl -> Remove the key with the nearest expire time (minor TTL) # noeviction -> Don't evict anything, just return an error on write operations. # # LRU means Least Recently Used # LFU means Least Frequently Used # # Both LRU, LFU and volatile-ttl are implemented using approximated # randomized algorithms. 
# # Note: with any of the above policies, when there are no suitable keys for # eviction, Redis will return an error on write operations that require # more memory. These are usually commands that create new keys, add data or # modify existing keys. A few examples are: SET, INCR, HSET, LPUSH, SUNIONSTORE, # SORT (due to the STORE argument), and EXEC (if the transaction includes any # command that requires memory). # # The default is: # # maxmemory-policy noeviction # LRU, LFU and minimal TTL algorithms are not precise algorithms but approximated # algorithms (in order to save memory), so you can tune it for speed or # accuracy. By default Redis will check five keys and pick the one that was # used least recently, you can change the sample size using the following # configuration directive. # # The default of 5 produces good enough results. 10 Approximates very closely # true LRU but costs more CPU. 3 is faster but not very accurate. # # maxmemory-samples 5 # Eviction processing is designed to function well with the default setting. # If there is an unusually large amount of write traffic, this value may need to # be increased. Decreasing this value may reduce latency at the risk of # eviction processing effectiveness # 0 = minimum latency, 10 = default, 100 = process without regard to latency # # maxmemory-eviction-tenacity 10 # Starting from Redis 5, by default a replica will ignore its maxmemory setting # (unless it is promoted to master after a failover or manually). It means # that the eviction of keys will be just handled by the master, sending the # DEL commands to the replica as keys evict in the master side. 
# # This behavior ensures that masters and replicas stay consistent, and is usually # what you want, however if your replica is writable, or you want the replica # to have a different memory setting, and you are sure all the writes performed # to the replica are idempotent, then you may change this default (but be sure # to understand what you are doing). # # Note that since the replica by default does not evict, it may end using more # memory than the one set via maxmemory (there are certain buffers that may # be larger on the replica, or data structures may sometimes take more memory # and so forth). So make sure you monitor your replicas and make sure they # have enough memory to never hit a real out-of-memory condition before the # master hits the configured maxmemory setting. # # replica-ignore-maxmemory yes # Redis reclaims expired keys in two ways: upon access when those keys are # found to be expired, and also in background, in what is called the # "active expire key". The key space is slowly and interactively scanned # looking for expired keys to reclaim, so that it is possible to free memory # of keys that are expired and will never be accessed again in a short time. # # The default effort of the expire cycle will try to avoid having more than # ten percent of expired keys still in memory, and will try to avoid consuming # more than 25% of total memory and to add latency to the system. However # it is possible to increase the expire "effort" that is normally set to # "1", to a greater value, up to the value "10". At its maximum value the # system will use more CPU, longer cycles (and technically may introduce # more latency), and will tolerate less already expired keys still present # in the system. It's a tradeoff between memory, CPU and latency. # # active-expire-effort 1 ############################# LAZY FREEING #################################### # Redis has two primitives to delete keys. One is called DEL and is a blocking # deletion of the object. 
It means that the server stops processing new commands # in order to reclaim all the memory associated with an object in a synchronous # way. If the key deleted is associated with a small object, the time needed # in order to execute the DEL command is very small and comparable to most other # O(1) or O(log_N) commands in Redis. However if the key is associated with an # aggregated value containing millions of elements, the server can block for # a long time (even seconds) in order to complete the operation. # # For the above reasons Redis also offers non blocking deletion primitives # such as UNLINK (non blocking DEL) and the ASYNC option of FLUSHALL and # FLUSHDB commands, in order to reclaim memory in background. Those commands # are executed in constant time. Another thread will incrementally free the # object in the background as fast as possible. # # DEL, UNLINK and ASYNC option of FLUSHALL and FLUSHDB are user-controlled. # It's up to the design of the application to understand when it is a good # idea to use one or the other. However the Redis server sometimes has to # delete keys or flush the whole database as a side effect of other operations. # Specifically Redis deletes objects independently of a user call in the # following scenarios: # # 1) On eviction, because of the maxmemory and maxmemory policy configurations, # in order to make room for new data, without going over the specified # memory limit. # 2) Because of expire: when a key with an associated time to live (see the # EXPIRE command) must be deleted from memory. # 3) Because of a side effect of a command that stores data on a key that may # already exist. For example the RENAME command may delete the old key # content when it is replaced with another one. Similarly SUNIONSTORE # or SORT with STORE option may delete existing keys. The SET command # itself removes any old content of the specified key in order to replace # it with the specified string. 
# 4) During replication, when a replica performs a full resynchronization with # its master, the content of the whole database is removed in order to # load the RDB file just transferred. # # In all the above cases the default is to delete objects in a blocking way, # like if DEL was called. However you can configure each case specifically # in order to instead release memory in a non-blocking way like if UNLINK # was called, using the following configuration directives. lazyfree-lazy-eviction no lazyfree-lazy-expire no lazyfree-lazy-server-del no replica-lazy-flush no # It is also possible, for the case when to replace the user code DEL calls # with UNLINK calls is not easy, to modify the default behavior of the DEL # command to act exactly like UNLINK, using the following configuration # directive: lazyfree-lazy-user-del no # FLUSHDB, FLUSHALL, and SCRIPT FLUSH support both asynchronous and synchronous # deletion, which can be controlled by passing the [SYNC|ASYNC] flags into the # commands. When neither flag is passed, this directive will be used to determine # if the data should be deleted asynchronously. lazyfree-lazy-user-flush no ################################ THREADED I/O ################################# # Redis is mostly single threaded, however there are certain threaded # operations such as UNLINK, slow I/O accesses and other things that are # performed on side threads. # # Now it is also possible to handle Redis clients socket reads and writes # in different I/O threads. Since especially writing is so slow, normally # Redis users use pipelining in order to speed up the Redis performances per # core, and spawn multiple instances in order to scale more. Using I/O # threads it is possible to easily speedup two times Redis without resorting # to pipelining nor sharding of the instance. # # By default threading is disabled, we suggest enabling it only in machines # that have at least 4 or more cores, leaving at least one spare core. 
# Using more than 8 threads is unlikely to help much. We also recommend using # threaded I/O only if you actually have performance problems, with Redis # instances being able to use a quite big percentage of CPU time, otherwise # there is no point in using this feature. # # So for instance if you have a four cores boxes, try to use 2 or 3 I/O # threads, if you have a 8 cores, try to use 6 threads. In order to # enable I/O threads use the following configuration directive: # # io-threads 4 # # Setting io-threads to 1 will just use the main thread as usual. # When I/O threads are enabled, we only use threads for writes, that is # to thread the write(2) syscall and transfer the client buffers to the # socket. However it is also possible to enable threading of reads and # protocol parsing using the following configuration directive, by setting # it to yes: # # io-threads-do-reads no # # Usually threading reads doesn't help much. # # NOTE 1: This configuration directive cannot be changed at runtime via # CONFIG SET. Aso this feature currently does not work when SSL is # enabled. # # NOTE 2: If you want to test the Redis speedup using redis-benchmark, make # sure you also run the benchmark itself in threaded mode, using the # --threads option to match the number of Redis threads, otherwise you'll not # be able to notice the improvements. ############################ KERNEL OOM CONTROL ############################## # On Linux, it is possible to hint the kernel OOM killer on what processes # should be killed first when out of memory. # # Enabling this feature makes Redis actively control the oom_score_adj value # for all its processes, depending on their role. The default scores will # attempt to have background child processes killed before all others, and # replicas killed before masters. # # Redis supports three options: # # no: Don't make changes to oom-score-adj (default). # yes: Alias to "relative" see below. 
# absolute: Values in oom-score-adj-values are written as is to the kernel. # relative: Values are used relative to the initial value of oom_score_adj when # the server starts and are then clamped to a range of -1000 to 1000. # Because typically the initial value is 0, they will often match the # absolute values. oom-score-adj no # When oom-score-adj is used, this directive controls the specific values used # for master, replica and background child processes. Values range -2000 to # 2000 (higher means more likely to be killed). # # Unprivileged processes (not root, and without CAP_SYS_RESOURCE capabilities) # can freely increase their value, but not decrease it below its initial # settings. This means that setting oom-score-adj to "relative" and setting the # oom-score-adj-values to positive values will always succeed. oom-score-adj-values 0 200 800 #################### KERNEL transparent hugepage CONTROL ###################### # Usually the kernel Transparent Huge Pages control is set to "madvise" or # or "never" by default (/sys/kernel/mm/transparent_hugepage/enabled), in which # case this config has no effect. On systems in which it is set to "always", # redis will attempt to disable it specifically for the redis process in order # to avoid latency problems specifically with fork(2) and CoW. # If for some reason you prefer to keep it enabled, you can set this config to # "no" and the kernel global to "always". disable-thp yes ############################## APPEND ONLY MODE ############################### # By default Redis asynchronously dumps the dataset on disk. This mode is # good enough in many applications, but an issue with the Redis process or # a power outage may result into a few minutes of writes lost (depending on # the configured save points). # # The Append Only File is an alternative persistence mode that provides # much better durability. 
For instance using the default data fsync policy # (see later in the config file) Redis can lose just one second of writes in a # dramatic event like a server power outage, or a single write if something # wrong with the Redis process itself happens, but the operating system is # still running correctly. # # AOF and RDB persistence can be enabled at the same time without problems. # If the AOF is enabled on startup Redis will load the AOF, that is the file # with the better durability guarantees. # # Please check https://redis.io/topics/persistence for more information. appendonly yes # The name of the append only file (default: "appendonly.aof") appendfilename "appendonly.aof" # The fsync() call tells the Operating System to actually write data on disk # instead of waiting for more data in the output buffer. Some OS will really flush # data on disk, some other OS will just try to do it ASAP. # # Redis supports three different modes: # # no: don't fsync, just let the OS flush the data when it wants. Faster. # always: fsync after every write to the append only log. Slow, Safest. # everysec: fsync only one time every second. Compromise. # # The default is "everysec", as that's usually the right compromise between # speed and data safety. It's up to you to understand if you can relax this to # "no" that will let the operating system flush the output buffer when # it wants, for better performances (but if you can live with the idea of # some data loss consider the default persistence mode that's snapshotting), # or on the contrary, use "always" that's very slow but a bit safer than # everysec. # # More details please check the following article: # http://antirez.com/post/redis-persistence-demystified.html # # If unsure, use "everysec". 
# appendfsync always appendfsync everysec # appendfsync no # When the AOF fsync policy is set to always or everysec, and a background # saving process (a background save or AOF log background rewriting) is # performing a lot of I/O against the disk, in some Linux configurations # Redis may block too long on the fsync() call. Note that there is no fix for # this currently, as even performing fsync in a different thread will block # our synchronous write(2) call. # # In order to mitigate this problem it's possible to use the following option # that will prevent fsync() from being called in the main process while a # BGSAVE or BGREWRITEAOF is in progress. # # This means that while another child is saving, the durability of Redis is # the same as "appendfsync none". In practical terms, this means that it is # possible to lose up to 30 seconds of log in the worst scenario (with the # default Linux settings). # # If you have latency problems turn this to "yes". Otherwise leave it as # "no" that is the safest pick from the point of view of durability. no-appendfsync-on-rewrite no # Automatic rewrite of the append only file. # Redis is able to automatically rewrite the log file implicitly calling # BGREWRITEAOF when the AOF log size grows by the specified percentage. # # This is how it works: Redis remembers the size of the AOF file after the # latest rewrite (if no rewrite has happened since the restart, the size of # the AOF at startup is used). # # This base size is compared to the current size. If the current size is # bigger than the specified percentage, the rewrite is triggered. Also # you need to specify a minimal size for the AOF file to be rewritten, this # is useful to avoid rewriting the AOF file even if the percentage increase # is reached but it is still pretty small. # # Specify a percentage of zero in order to disable the automatic AOF # rewrite feature. 
auto-aof-rewrite-percentage 100 auto-aof-rewrite-min-size 64mb # An AOF file may be found to be truncated at the end during the Redis # startup process, when the AOF data gets loaded back into memory. # This may happen when the system where Redis is running # crashes, especially when an ext4 filesystem is mounted without the # data=ordered option (however this can't happen when Redis itself # crashes or aborts but the operating system still works correctly). # # Redis can either exit with an error when this happens, or load as much # data as possible (the default now) and start if the AOF file is found # to be truncated at the end. The following option controls this behavior. # # If aof-load-truncated is set to yes, a truncated AOF file is loaded and # the Redis server starts emitting a log to inform the user of the event. # Otherwise if the option is set to no, the server aborts with an error # and refuses to start. When the option is set to no, the user requires # to fix the AOF file using the "redis-check-aof" utility before to restart # the server. # # Note that if the AOF file will be found to be corrupted in the middle # the server will still exit with an error. This option only applies when # Redis will try to read more data from the AOF file but not enough bytes # will be found. aof-load-truncated yes # When rewriting the AOF file, Redis is able to use an RDB preamble in the # AOF file for faster rewrites and recoveries. When this option is turned # on the rewritten AOF file is composed of two different stanzas: # # [RDB file][AOF tail] # # When loading, Redis recognizes that the AOF file starts with the "REDIS" # string and loads the prefixed RDB file, then continues loading the AOF # tail. aof-use-rdb-preamble yes ################################ LUA SCRIPTING ############################### # Max execution time of a Lua script in milliseconds. 
# # If the maximum execution time is reached Redis will log that a script is # still in execution after the maximum allowed time and will start to # reply to queries with an error. # # When a long running script exceeds the maximum execution time only the # SCRIPT KILL and SHUTDOWN NOSAVE commands are available. The first can be # used to stop a script that did not yet call any write commands. The second # is the only way to shut down the server in the case a write command was # already issued by the script but the user doesn't want to wait for the natural # termination of the script. # # Set it to 0 or a negative value for unlimited execution without warnings. lua-time-limit 5000 ################################ REDIS CLUSTER ############################### # Normal Redis instances can't be part of a Redis Cluster; only nodes that are # started as cluster nodes can. In order to start a Redis instance as a # cluster node enable the cluster support uncommenting the following: # # cluster-enabled yes # Every cluster node has a cluster configuration file. This file is not # intended to be edited by hand. It is created and updated by Redis nodes. # Every Redis Cluster node requires a different cluster configuration file. # Make sure that instances running in the same system do not have # overlapping cluster configuration file names. # # cluster-config-file nodes-6379.conf # Cluster node timeout is the amount of milliseconds a node must be unreachable # for it to be considered in failure state. # Most other internal time limits are a multiple of the node timeout. # # cluster-node-timeout 15000 # A replica of a failing master will avoid to start a failover if its data # looks too old. 
# # There is no simple way for a replica to actually have an exact measure of # its "data age", so the following two checks are performed: # # 1) If there are multiple replicas able to failover, they exchange messages # in order to try to give an advantage to the replica with the best # replication offset (more data from the master processed). # Replicas will try to get their rank by offset, and apply to the start # of the failover a delay proportional to their rank. # # 2) Every single replica computes the time of the last interaction with # its master. This can be the last ping or command received (if the master # is still in the "connected" state), or the time that elapsed since the # disconnection with the master (if the replication link is currently down). # If the last interaction is too old, the replica will not try to failover # at all. # # The point "2" can be tuned by user. Specifically a replica will not perform # the failover if, since the last interaction with the master, the time # elapsed is greater than: # # (node-timeout * cluster-replica-validity-factor) + repl-ping-replica-period # # So for example if node-timeout is 30 seconds, and the cluster-replica-validity-factor # is 10, and assuming a default repl-ping-replica-period of 10 seconds, the # replica will not try to failover if it was not able to talk with the master # for longer than 310 seconds. # # A large cluster-replica-validity-factor may allow replicas with too old data to failover # a master, while a too small value may prevent the cluster from being able to # elect a replica at all. # # For maximum availability, it is possible to set the cluster-replica-validity-factor # to a value of 0, which means, that replicas will always try to failover the # master regardless of the last time they interacted with the master. # (However they'll always try to apply a delay proportional to their # offset rank). 
# # Zero is the only value able to guarantee that when all the partitions heal # the cluster will always be able to continue. # # cluster-replica-validity-factor 10 # Cluster replicas are able to migrate to orphaned masters, that are masters # that are left without working replicas. This improves the cluster ability # to resist to failures as otherwise an orphaned master can't be failed over # in case of failure if it has no working replicas. # # Replicas migrate to orphaned masters only if there are still at least a # given number of other working replicas for their old master. This number # is the "migration barrier". A migration barrier of 1 means that a replica # will migrate only if there is at least 1 other working replica for its master # and so forth. It usually reflects the number of replicas you want for every # master in your cluster. # # Default is 1 (replicas migrate only if their masters remain with at least # one replica). To disable migration just set it to a very large value or # set cluster-allow-replica-migration to 'no'. # A value of 0 can be set but is useful only for debugging and dangerous # in production. # # cluster-migration-barrier 1 # Turning off this option allows to use less automatic cluster configuration. # It both disables migration to orphaned masters and migration from masters # that became empty. # # Default is 'yes' (allow automatic migrations). # # cluster-allow-replica-migration yes # By default Redis Cluster nodes stop accepting queries if they detect there # is at least a hash slot uncovered (no available node is serving it). # This way if the cluster is partially down (for example a range of hash slots # are no longer covered) all the cluster becomes, eventually, unavailable. # It automatically returns available as soon as all the slots are covered again. # # However sometimes you want the subset of the cluster which is working, # to continue to accept queries for the part of the key space that is still # covered. 
In order to do so, just set the cluster-require-full-coverage # option to no. # # cluster-require-full-coverage yes # This option, when set to yes, prevents replicas from trying to failover its # master during master failures. However the replica can still perform a # manual failover, if forced to do so. # # This is useful in different scenarios, especially in the case of multiple # data center operations, where we want one side to never be promoted if not # in the case of a total DC failure. # # cluster-replica-no-failover no # This option, when set to yes, allows nodes to serve read traffic while the # the cluster is in a down state, as long as it believes it owns the slots. # # This is useful for two cases. The first case is for when an application # doesn't require consistency of data during node failures or network partitions. # One example of this is a cache, where as long as the node has the data it # should be able to serve it. # # The second use case is for configurations that don't meet the recommended # three shards but want to enable cluster mode and scale later. A # master outage in a 1 or 2 shard configuration causes a read/write outage to the # entire cluster without this option set, with it set there is only a write outage. # Without a quorum of masters, slot ownership will not change automatically. # # cluster-allow-reads-when-down no # In order to setup your cluster make sure to read the documentation # available at https://redis.io web site. ########################## CLUSTER DOCKER/NAT support ######################## # In certain deployments, Redis Cluster nodes address discovery fails, because # addresses are NAT-ted or because ports are forwarded (the typical case is # Docker and other containers). # # In order to make Redis Cluster working in such environments, a static # configuration where each node knows its public address is needed. 
# The following four options are used for this scope, and are:
#
# * cluster-announce-ip
# * cluster-announce-port
# * cluster-announce-tls-port
# * cluster-announce-bus-port
#
# Each instructs the node about its address, client ports (for connections
# without and with TLS) and cluster message bus port. The information is then
# published in the header of the bus packets so that other nodes will be able to
# correctly map the address of the node publishing the information.
#
# If cluster-tls is set to yes and cluster-announce-tls-port is omitted or set
# to zero, then cluster-announce-port refers to the TLS port. Note also that
# cluster-announce-tls-port has no effect if cluster-tls is set to no.
#
# If the above options are not used, the normal Redis Cluster auto-detection
# will be used instead.
#
# Note that when remapped, the bus port may not be at the fixed offset of
# clients port + 10000, so you can specify any port and bus-port depending
# on how they get remapped. If the bus-port is not set, a fixed offset of
# 10000 will be used as usual.
#
# Example:
#
# cluster-announce-ip 10.1.1.5
# cluster-announce-tls-port 6379
# cluster-announce-port 0
# cluster-announce-bus-port 6380

################################## SLOW LOG ###################################

# The Redis Slow Log is a system to log queries that exceeded a specified
# execution time. The execution time does not include the I/O operations
# like talking with the client, sending the reply and so forth,
# but just the time needed to actually execute the command (this is the only
# stage of command execution where the thread is blocked and can not serve
# other requests in the meantime).
#
# You can configure the slow log with two parameters: one tells Redis
# what is the execution time, in microseconds, to exceed in order for the
# command to get logged, and the other parameter is the length of the
# slow log.
# When a new command is logged the oldest one is removed from the
# queue of logged commands.

# The following time is expressed in microseconds, so 1000000 is equivalent
# to one second. Note that a negative number disables the slow log, while
# a value of zero forces the logging of every command.
slowlog-log-slower-than 10000

# There is no limit to this length. Just be aware that it will consume memory.
# You can reclaim memory used by the slow log with SLOWLOG RESET.
slowlog-max-len 128

################################ LATENCY MONITOR ##############################

# The Redis latency monitoring subsystem samples different operations
# at runtime in order to collect data related to possible sources of
# latency of a Redis instance.
#
# Via the LATENCY command this information is available to the user that can
# print graphs and obtain reports.
#
# The system only logs operations that were performed in a time equal or
# greater than the amount of milliseconds specified via the
# latency-monitor-threshold configuration directive. When its value is set
# to zero, the latency monitor is turned off.
#
# By default latency monitoring is disabled since it is mostly not needed
# if you don't have latency issues, and collecting data has a performance
# impact, that while very small, can be measured under big load. Latency
# monitoring can easily be enabled at runtime using the command
# "CONFIG SET latency-monitor-threshold <milliseconds>" if needed.
latency-monitor-threshold 0

############################# EVENT NOTIFICATION ##############################

# Redis can notify Pub/Sub clients about events happening in the key space.
# This feature is documented at https://redis.io/topics/notifications
#
# For instance if keyspace events notification is enabled, and a client
# performs a DEL operation on key "foo" stored in the Database 0, two
# messages will be published via Pub/Sub:
#
# PUBLISH __keyspace@0__:foo del
# PUBLISH __keyevent@0__:del foo
#
# It is possible to select the events that Redis will notify among a set
# of classes. Every class is identified by a single character:
#
#  K     Keyspace events, published with __keyspace@<db>__ prefix.
#  E     Keyevent events, published with __keyevent@<db>__ prefix.
#  g     Generic commands (non-type specific) like DEL, EXPIRE, RENAME, ...
#  $     String commands
#  l     List commands
#  s     Set commands
#  h     Hash commands
#  z     Sorted set commands
#  x     Expired events (events generated every time a key expires)
#  e     Evicted events (events generated when a key is evicted for maxmemory)
#  t     Stream commands
#  d     Module key type events
#  m     Key-miss events (Note: It is not included in the 'A' class)
#  A     Alias for g$lshzxetd, so that the "AKE" string means all the events
#        (Except key-miss events which are excluded from 'A' due to their
#         unique nature).
#
# The "notify-keyspace-events" takes as argument a string that is composed
# of zero or multiple characters. The empty string means that notifications
# are disabled.
#
# Example: to enable list and generic events, from the point of view of the
# event name, use:
#
# notify-keyspace-events Elg
#
# Example 2: to get the stream of the expired keys subscribing to channel
# name __keyevent@0__:expired use:
#
# notify-keyspace-events Ex
#
# By default all notifications are disabled because most users don't need
# this feature and the feature has some overhead. Note that if you don't
# specify at least one of K or E, no events will be delivered.
notify-keyspace-events ""

############################### GOPHER SERVER #################################

# Redis contains an implementation of the Gopher protocol, as specified in
# the RFC 1436 (https://www.ietf.org/rfc/rfc1436.txt).
#
# The Gopher protocol was very popular in the late '90s. It is an alternative
# to the web, and the implementation both server and client side is so simple
# that the Redis server has just 100 lines of code in order to implement this
# support.
#
# What do you do with Gopher nowadays? Well Gopher never *really* died, and
# lately there is a movement in order for the Gopher more hierarchical content
# composed of just plain text documents to be resurrected. Some want a simpler
# internet, others believe that the mainstream internet became too much
# controlled, and it's cool to create an alternative space for people that
# want a bit of fresh air.
#
# Anyway for the 10nth birthday of the Redis, we gave it the Gopher protocol
# as a gift.
#
# --- HOW IT WORKS? ---
#
# The Redis Gopher support uses the inline protocol of Redis, and specifically
# two kind of inline requests that were anyway illegal: an empty request
# or any request that starts with "/" (there are no Redis commands starting
# with such a slash). Normal RESP2/RESP3 requests are completely out of the
# path of the Gopher protocol implementation and are served as usual as well.
#
# If you open a connection to Redis when Gopher is enabled and send it
# a string like "/foo", if there is a key named "/foo" it is served via the
# Gopher protocol.
#
# In order to create a real Gopher "hole" (the name of a Gopher site in Gopher
# talking), you likely need a script like the following:
#
# https://github.com/antirez/gopher2redis
#
# --- SECURITY WARNING ---
#
# If you plan to put Redis on the internet in a publicly accessible address
# to server Gopher pages MAKE SURE TO SET A PASSWORD to the instance.
# Once a password is set:
#
# 1. The Gopher server (when enabled, not by default) will still serve
#    content via Gopher.
# 2. However other commands cannot be called before the client will
#    authenticate.
#
# So use the 'requirepass' option to protect your instance.
#
# Note that Gopher is not currently supported when 'io-threads-do-reads'
# is enabled.
#
# To enable Gopher support, uncomment the following line and set the option
# from no (the default) to yes.
#
# gopher-enabled no

############################### ADVANCED CONFIG ###############################

# Hashes are encoded using a memory efficient data structure when they have a
# small number of entries, and the biggest entry does not exceed a given
# threshold. These thresholds can be configured using the following directives.
hash-max-ziplist-entries 512
hash-max-ziplist-value 64

# Lists are also encoded in a special way to save a lot of space.
# The number of entries allowed per internal list node can be specified
# as a fixed maximum size or a maximum number of elements.
# For a fixed maximum size, use -5 through -1, meaning:
# -5: max size: 64 Kb  <-- not recommended for normal workloads
# -4: max size: 32 Kb  <-- not recommended
# -3: max size: 16 Kb  <-- probably not recommended
# -2: max size: 8 Kb   <-- good
# -1: max size: 4 Kb   <-- good
# Positive numbers mean store up to _exactly_ that number of elements
# per list node.
# The highest performing option is usually -2 (8 Kb size) or -1 (4 Kb size),
# but if your use case is unique, adjust the settings as necessary.
list-max-ziplist-size -2

# Lists may also be compressed.
# Compress depth is the number of quicklist ziplist nodes from *each* side of
# the list to *exclude* from compression. The head and tail of the list
# are always uncompressed for fast push/pop operations.
# Settings are:
# 0: disable all list compression
# 1: depth 1 means "don't start compressing until after 1 node into the list,
#    going from either the head or tail"
#    So: [head]->node->node->...->node->[tail]
#    [head], [tail] will always be uncompressed; inner nodes will compress.
# 2: [head]->[next]->node->node->...->node->[prev]->[tail]
#    2 here means: don't compress head or head->next or tail->prev or tail,
#    but compress all nodes between them.
# 3: [head]->[next]->[next]->node->node->...->node->[prev]->[prev]->[tail]
# etc.
list-compress-depth 0

# Sets have a special encoding in just one case: when a set is composed
# of just strings that happen to be integers in radix 10 in the range
# of 64 bit signed integers.
# The following configuration setting sets the limit in the size of the
# set in order to use this special memory saving encoding.
set-max-intset-entries 512

# Similarly to hashes and lists, sorted sets are also specially encoded in
# order to save a lot of space. This encoding is only used when the length and
# elements of a sorted set are below the following limits:
zset-max-ziplist-entries 128
zset-max-ziplist-value 64

# HyperLogLog sparse representation bytes limit. The limit includes the
# 16 bytes header. When an HyperLogLog using the sparse representation crosses
# this limit, it is converted into the dense representation.
#
# A value greater than 16000 is totally useless, since at that point the
# dense representation is more memory efficient.
#
# The suggested value is ~ 3000 in order to have the benefits of
# the space efficient encoding without slowing down too much PFADD,
# which is O(N) with the sparse encoding. The value can be raised to
# ~ 10000 when CPU is not a concern, but space is, and the data set is
# composed of many HyperLogLogs with cardinality in the 0 - 15000 range.
hll-sparse-max-bytes 3000

# Streams macro node max size / items. The stream data structure is a radix
# tree of big nodes that encode multiple items inside.
# Using this configuration
# it is possible to configure how big a single node can be in bytes, and the
# maximum number of items it may contain before switching to a new node when
# appending new stream entries. If any of the following settings are set to
# zero, the limit is ignored, so for instance it is possible to set just a
# max entries limit by setting max-bytes to 0 and max-entries to the desired
# value.
stream-node-max-bytes 4096
stream-node-max-entries 100

# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
# order to help rehashing the main Redis hash table (the one mapping top-level
# keys to values). The hash table implementation Redis uses (see dict.c)
# performs a lazy rehashing: the more operation you run into a hash table
# that is rehashing, the more rehashing "steps" are performed, so if the
# server is idle the rehashing is never complete and some more memory is used
# by the hash table.
#
# The default is to use this millisecond 10 times every second in order to
# actively rehash the main dictionaries, freeing memory when possible.
#
# If unsure:
# use "activerehashing no" if you have hard latency requirements and it is
# not a good thing in your environment that Redis can reply from time to time
# to queries with 2 milliseconds delay.
#
# use "activerehashing yes" if you don't have such hard requirements but
# want to free memory asap when possible.
activerehashing yes

# The client output buffer limits can be used to force disconnection of clients
# that are not reading data from the server fast enough for some reason (a
# common reason is that a Pub/Sub client can't consume messages as fast as the
# publisher can produce them).
#
# The limit can be set differently for the three different classes of clients:
#
# normal -> normal clients including MONITOR clients
# replica -> replica clients
# pubsub -> clients subscribed to at least one pubsub channel or pattern
#
# The syntax of every client-output-buffer-limit directive is the following:
#
# client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds>
#
# A client is immediately disconnected once the hard limit is reached, or if
# the soft limit is reached and remains reached for the specified number of
# seconds (continuously).
# So for instance if the hard limit is 32 megabytes and the soft limit is
# 16 megabytes / 10 seconds, the client will get disconnected immediately
# if the size of the output buffers reach 32 megabytes, but will also get
# disconnected if the client reaches 16 megabytes and continuously overcomes
# the limit for 10 seconds.
#
# By default normal clients are not limited because they don't receive data
# without asking (in a push way), but just after a request, so only
# asynchronous clients may create a scenario where data is requested faster
# than it can read.
#
# Instead there is a default limit for pubsub and replica clients, since
# subscribers and replicas receive data in a push fashion.
#
# Both the hard or the soft limit can be disabled by setting them to zero.
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60

# Client query buffers accumulate new commands. They are limited to a fixed
# amount by default in order to avoid that a protocol desynchronization (for
# instance due to a bug in the client) will lead to unbound memory usage in
# the query buffer. However you can configure it here if you have very special
# needs, such us huge multi/exec requests or alike.
#
# client-query-buffer-limit 1gb

# In the Redis protocol, bulk requests, that are, elements representing single
# strings, are normally limited to 512 mb. However you can change this limit
# here, but must be 1mb or greater
#
# proto-max-bulk-len 512mb

# Redis calls an internal function to perform many background tasks, like
# closing connections of clients in timeout, purging expired keys that are
# never requested, and so forth.
#
# Not all tasks are performed with the same frequency, but Redis checks for
# tasks to perform according to the specified "hz" value.
#
# By default "hz" is set to 10. Raising the value will use more CPU when
# Redis is idle, but at the same time will make Redis more responsive when
# there are many keys expiring at the same time, and timeouts may be
# handled with more precision.
#
# The range is between 1 and 500, however a value over 100 is usually not
# a good idea. Most users should use the default of 10 and raise this up to
# 100 only in environments where very low latency is required.
hz 10

# Normally it is useful to have an HZ value which is proportional to the
# number of clients connected. This is useful in order, for instance, to
# avoid too many clients are processed for each background task invocation
# in order to avoid latency spikes.
#
# Since the default HZ value by default is conservatively set to 10, Redis
# offers, and enables by default, the ability to use an adaptive HZ value
# which will temporarily raise when there are many connected clients.
#
# When dynamic HZ is enabled, the actual configured HZ will be used
# as a baseline, but multiples of the configured HZ value will be actually
# used as needed once more clients are connected. In this way an idle
# instance will use very little CPU time while a busy instance will be
# more responsive.
dynamic-hz yes

# When a child rewrites the AOF file, if the following option is enabled
# the file will be fsync-ed every 32 MB of data generated.
# This is useful
# in order to commit the file to the disk more incrementally and avoid
# big latency spikes.
aof-rewrite-incremental-fsync yes

# When redis saves RDB file, if the following option is enabled
# the file will be fsync-ed every 32 MB of data generated. This is useful
# in order to commit the file to the disk more incrementally and avoid
# big latency spikes.
rdb-save-incremental-fsync yes

# Redis LFU eviction (see maxmemory setting) can be tuned. However it is a good
# idea to start with the default settings and only change them after investigating
# how to improve the performances and how the keys LFU change over time, which
# is possible to inspect via the OBJECT FREQ command.
#
# There are two tunable parameters in the Redis LFU implementation: the
# counter logarithm factor and the counter decay time. It is important to
# understand what the two parameters mean before changing them.
#
# The LFU counter is just 8 bits per key, it's maximum value is 255, so Redis
# uses a probabilistic increment with logarithmic behavior. Given the value
# of the old counter, when a key is accessed, the counter is incremented in
# this way:
#
# 1. A random number R between 0 and 1 is extracted.
# 2. A probability P is calculated as 1/(old_value*lfu_log_factor+1).
# 3. The counter is incremented only if R < P.
#
# The default lfu-log-factor is 10.
# This is a table of how the frequency
# counter changes with a different number of accesses with different
# logarithmic factors:
#
# +--------+------------+------------+------------+------------+------------+
# | factor | 100 hits   | 1000 hits  | 100K hits  | 1M hits    | 10M hits   |
# +--------+------------+------------+------------+------------+------------+
# | 0      | 104        | 255        | 255        | 255        | 255        |
# +--------+------------+------------+------------+------------+------------+
# | 1      | 18         | 49         | 255        | 255        | 255        |
# +--------+------------+------------+------------+------------+------------+
# | 10     | 10         | 18         | 142        | 255        | 255        |
# +--------+------------+------------+------------+------------+------------+
# | 100    | 8          | 11         | 49         | 143        | 255        |
# +--------+------------+------------+------------+------------+------------+
#
# NOTE: The above table was obtained by running the following commands:
#
# redis-benchmark -n 1000000 incr foo
# redis-cli object freq foo
#
# NOTE 2: The counter initial value is 5 in order to give new objects a chance
# to accumulate hits.
#
# The counter decay time is the time, in minutes, that must elapse in order
# for the key counter to be divided by two (or decremented if it has a value
# less <= 10).
#
# The default value for the lfu-decay-time is 1. A special value of 0 means to
# decay the counter every time it happens to be scanned.
#
# lfu-log-factor 10
# lfu-decay-time 1

########################### ACTIVE DEFRAGMENTATION #######################
#
# What is active defragmentation?
# -------------------------------
#
# Active (online) defragmentation allows a Redis server to compact the
# spaces left between small allocations and deallocations of data in memory,
# thus allowing to reclaim back memory.
#
# Fragmentation is a natural process that happens with every allocator (but
# less so with Jemalloc, fortunately) and certain workloads.
# Normally a server
# restart is needed in order to lower the fragmentation, or at least to flush
# away all the data and create it again. However thanks to this feature
# implemented by Oran Agra for Redis 4.0 this process can happen at runtime
# in a "hot" way, while the server is running.
#
# Basically when the fragmentation is over a certain level (see the
# configuration options below) Redis will start to create new copies of the
# values in contiguous memory regions by exploiting certain specific Jemalloc
# features (in order to understand if an allocation is causing fragmentation
# and to allocate it in a better place), and at the same time, will release the
# old copies of the data. This process, repeated incrementally for all the keys
# will cause the fragmentation to drop back to normal values.
#
# Important things to understand:
#
# 1. This feature is disabled by default, and only works if you compiled Redis
#    to use the copy of Jemalloc we ship with the source code of Redis.
#    This is the default with Linux builds.
#
# 2. You never need to enable this feature if you don't have fragmentation
#    issues.
#
# 3. Once you experience fragmentation, you can enable this feature when
#    needed with the command "CONFIG SET activedefrag yes".
#
# The configuration parameters are able to fine tune the behavior of the
# defragmentation process. If you are not sure about what they mean it is
# a good idea to leave the defaults untouched.

# Enabled active defragmentation
# activedefrag no

# Minimum amount of fragmentation waste to start active defrag
# active-defrag-ignore-bytes 100mb

# Minimum percentage of fragmentation to start active defrag
# active-defrag-threshold-lower 10

# Maximum percentage of fragmentation at which we use maximum effort
# active-defrag-threshold-upper 100

# Minimal effort for defrag in CPU percentage, to be used when the lower
# threshold is reached
# active-defrag-cycle-min 1

# Maximal effort for defrag in CPU percentage, to be used when the upper
# threshold is reached
# active-defrag-cycle-max 25

# Maximum number of set/hash/zset/list fields that will be processed from
# the main dictionary scan
# active-defrag-max-scan-fields 1000

# Jemalloc background thread for purging will be enabled by default
jemalloc-bg-thread yes

# It is possible to pin different threads and processes of Redis to specific
# CPUs in your system, in order to maximize the performances of the server.
# This is useful both in order to pin different Redis threads in different
# CPUs, but also in order to make sure that multiple Redis instances running
# in the same host will be pinned to different CPUs.
#
# Normally you can do this using the "taskset" command, however it is also
# possible to this via Redis configuration directly, both in Linux and FreeBSD.
#
# You can pin the server/IO threads, bio threads, aof rewrite child process, and
# the bgsave child process.
# The syntax to specify the cpu list is the same as
# the taskset command:
#
# Set redis server/io threads to cpu affinity 0,2,4,6:
# server_cpulist 0-7:2
#
# Set bio threads to cpu affinity 1,3:
# bio_cpulist 1,3
#
# Set aof rewrite child process to cpu affinity 8,9,10,11:
# aof_rewrite_cpulist 8-11
#
# Set bgsave child process to cpu affinity 1,10,11
# bgsave_cpulist 1,10-11

# In some cases redis will emit warnings and even refuse to start if it detects
# that the system is in bad state, it is possible to suppress these warnings
# by setting the following config which takes a space delimited list of warnings
# to suppress
#
# ignore-warnings ARM64-COW-BUG

Add bind 0.0.0.0 to the configuration file above.
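The probabilistic LFU counter described in the configuration comments above (steps 1 to 3) can be sketched in a few lines of Python. This is only an illustration of the rule as the comments state it, not Redis's actual C implementation. The names `lfu_log_incr` and `simulate` are hypothetical, and the counter starts at 0 here for simplicity, whereas the comments say Redis starts new keys at 5.

```python
import random


def lfu_log_incr(counter, lfu_log_factor=10, rng=random):
    """Apply one access to an 8-bit LFU counter, following the commented rule."""
    if counter >= 255:
        return 255  # the counter saturates at the 8-bit maximum
    r = rng.random()                           # 1. random number R in [0, 1)
    p = 1.0 / (counter * lfu_log_factor + 1)   # 2. P = 1/(old_value*factor+1)
    if r < p:                                  # 3. increment only if R < P
        counter += 1
    return counter


def simulate(hits, lfu_log_factor=10, seed=0):
    """Return the counter value after `hits` accesses (assumed start: 0)."""
    rng = random.Random(seed)
    counter = 0
    for _ in range(hits):
        counter = lfu_log_incr(counter, lfu_log_factor, rng)
    return counter


if __name__ == "__main__":
    for factor in (0, 1, 10, 100):
        print(factor, [simulate(n, factor) for n in (100, 1000, 100000)])
```

With a factor of 0 the probability is always 1, so the counter simply counts accesses until it saturates; larger factors make the counter grow more slowly, which is the trend the table in the comments shows.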
#!/usr/bin/env python3 """ submodular_match_selection.py Run: python submodular_match_selection.py left.jpg right.jpg --K 200 """ import argparse import cv2 import numpy as np from heapq import heappush, heappop import math import time # -------------------- # Orientation & Diversity # -------------------- def match_direction_from_pts(pt1, pt2): x1, y1 = pt1 x2, y2 = pt2 return math.atan2(y2 - y1, x2 - x1) def delta_R(candidate, S, sigma=0.6): """ Orientation consistency gain. If |S| < threshold, return neutral 1.0 so it doesn't dominate early. """ if len(S) < 5: return 1.0 dirs = np.array([match_direction_from_pts(s['pt1'], s['pt2']) for s in S]) # compute circular mean mean_dir = math.atan2(np.mean(np.sin(dirs)), np.mean(np.cos(dirs))) dtheta = match_direction_from_pts(candidate['pt1'], candidate['pt2']) - mean_dir dtheta = (dtheta + math.pi) % (2*math.pi) - math.pi return math.exp(-(dtheta ** 2) / (sigma ** 2)) def delta_V(candidate, S, tau=80.0): """ Spatial diversity: larger if candidate is far from nearest selected point. Returns value in (0,1], ~1 when far, -> ~0 when very close. 
""" if len(S) == 0: return 1.0 p = np.array(candidate['pt1']) pts = np.array([s['pt1'] for s in S]) dists = np.linalg.norm(pts - p[None, :], axis=1) dmin = float(np.min(dists)) return math.exp(-dmin / tau) # -------------------- # Visualization utils # -------------------- def draw_matches(img1, kp1, img2, kp2, matches, outpath='selected_matches.png'): h1, w1 = img1.shape[:2] h2, w2 = img2.shape[:2] out_h = max(h1, h2) out_w = w1 + w2 out = np.zeros((out_h, out_w, 3), dtype=np.uint8) out[:h1, :w1] = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB) if img1.ndim == 3 else cv2.cvtColor(img1, cv2.COLOR_GRAY2BGR) out[:h2, w1:w1+w2] = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB) if img2.ndim == 3 else cv2.cvtColor(img2, cv2.COLOR_GRAY2BGR) for m in matches: pt1 = tuple(map(int, kp1[m['qidx']].pt)) pt2 = tuple(map(int, (int(kp2[m['tidx']].pt[0]) + w1, int(kp2[m['tidx']].pt[1])))) color = tuple(np.random.randint(50, 255, size=3).tolist()) cv2.circle(out, pt1, 3, color, -1) cv2.circle(out, pt2, 3, color, -1) cv2.line(out, pt1, pt2, color, 1) # write BGR cv2.imwrite(outpath, out[:,:,::-1]) print(f"Saved visualization to {outpath}") # -------------------- # Sampson distance (for residuals) # -------------------- def sampson_distance(pt1, pt2, F): if F is None: return float('inf') x1 = np.array([pt1[0], pt1[1], 1.0]) x2 = np.array([pt2[0], pt2[1], 1.0]) Fx1 = F.dot(x1) Ftx2 = F.T.dot(x2) num = (x2.T.dot(F).dot(x1))**2 den = Fx1[0]**2 + Fx1[1]**2 + Ftx2[0]**2 + Ftx2[1]**2 if den <= 0: return float('inf') return float(num / (den + 1e-12)) # -------------------- # Candidate generation # -------------------- def detect_and_match(img1, img2, ratio_th=0.75): # ORB detector (fast, doesn't need contrib) orb = cv2.ORB_create(5000) kp1, des1 = orb.detectAndCompute(img1, None) kp2, des2 = orb.detectAndCompute(img2, None) if des1 is None or des2 is None or len(kp1) < 4 or len(kp2) < 4: return [], kp1, kp2 bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=False) raw_matches = bf.knnMatch(des1, des2, k=2) 
candidates = [] for pair in raw_matches: if len(pair) < 2: continue m, n = pair if m.distance < ratio_th * n.distance: conf = 1.0 - (m.distance / 256.0) candidates.append({ 'qidx': m.queryIdx, 'tidx': m.trainIdx, 'distance': m.distance, 'conf': conf }) print(f"Detected {len(kp1)} / {len(kp2)} keypoints, initial candidates after ratio test: {len(candidates)}") return candidates, kp1, kp2 # -------------------- # Grid assignment # -------------------- def assign_cells(kp1, kp2, candidates, grid=(8,8), img1_shape=None, img2_shape=None): gx, gy = grid h1, w1 = img1_shape[:2] h2, w2 = img2_shape[:2] def cell_of(pt, w, h, gx, gy): x, y = pt cx = min(int(x / (w / gx)), gx-1) cy = min(int(y / (h / gy)), gy-1) return cy * gx + cx for c in candidates: pt1 = kp1[c['qidx']].pt pt2 = kp2[c['tidx']].pt c['pt1'] = (float(pt1[0]), float(pt1[1])) c['pt2'] = (float(pt2[0]), float(pt2[1])) c['cellA'] = cell_of(pt1, w1, h1, gx, gy) c['cellB'] = cell_of(pt2, w2, h2, gx, gy) c['cell_pair'] = (c['cellA'], c['cellB']) return candidates # -------------------- # Coverage / Overlap gains # -------------------- def delta_D(candidate, precomp): cell = candidate['cellA'] return 1.0 if cell not in precomp['cell_covered'] else 0.0 def delta_O(candidate, precomp): pair = candidate['cell_pair'] return 1.0 if pair not in precomp['cell_pairs_covered'] else 0.0 # -------------------- # Fit Fundamental matrix (RANSAC) given a list of matches # -------------------- def fit_F_from_matches(match_list): if match_list is None or len(match_list) < 8: return None pts1 = np.float32([m['pt1'] for m in match_list]) pts2 = np.float32([m['pt2'] for m in match_list]) F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99) if F is None or F.shape != (3,3): return None return F def mean_sampson_over_candidates(F, candidates): if F is None: return float('inf') vals = [] for c in candidates: d = sampson_distance(c['pt1'], c['pt2'], F) if np.isfinite(d): vals.append(d) if len(vals) == 0: return 
float('inf')
    return float(np.mean(vals))


# --------------------
# True marginal geometric gain ΔG_real
# --------------------
def delta_G_real(candidate, S, candidates, M_S):
    """
    Compute delta = R(S) - R(S U {candidate})
    where R(·) is mean Sampson distance across all candidates under fitted F.
    M_S is current model (F) for S (may be None).
    """
    # baseline residual under current model
    R_S = mean_sampson_over_candidates(M_S['F']) if (M_S is not None and 'F' in M_S and M_S['F'] is not None) else None

    # fit model for S ∪ {candidate}
    S_new = S + [candidate]
    F_new = fit_F_from_matches(S_new)
    R_new = mean_sampson_over_candidates(F_new)

    # if no baseline model, fall back to candidate.conf as proxy
    if R_S is None or not np.isfinite(R_S):
        # give a proxy: if new model exists, reward by 1/R_new (smaller R_new -> larger gain)
        if F_new is None or not np.isfinite(R_new):
            return 0.0
        # use improvement relative to median/scale: bigger when R_new small
        return max(0.0, 1.0 / (1.0 + R_new))  # normalized small reward

    if not np.isfinite(R_new):
        return 0.0
    delta = R_S - R_new
    return max(0.0, float(delta))


# --------------------
# Initial heap scores
# --------------------
def initial_heap_scores(candidates, M_S, alpha, beta, gamma, mu, lambda_R, eta_V, precomp):
    heap = []
    S_empty = []  # at init S is empty
    for i, c in enumerate(candidates):
        g_geom = delta_G_real(c, S_empty, candidates, M_S) if M_S is not None else c.get('conf', 0.5)
        g = (alpha * g_geom
             + beta * delta_D(c, precomp)
             + gamma * delta_O(c, precomp)
             + lambda_R * delta_R(c, S_empty)
             + eta_V * delta_V(c, S_empty)
             - mu)
        heappush(heap, (-g, i, 0))
    return heap


# --------------------
# Lazy greedy selection (with true delta_G evaluation)
# --------------------
def greedy_select(candidates, K=200, alpha=1.0, beta=0.5, gamma=0.8, mu=0.01,
                  refit_every=50, img1=None, img2=None, kp1=None, kp2=None,
                  lambda_R=0.3, eta_V=0.2):
    S = []
    precomp = {'cell_covered': set(), 'cell_pairs_covered': set()}
    M_S = {'F': None}

    # initial heap
    heap = initial_heap_scores(candidates, M_S, alpha, beta, gamma, mu, lambda_R, eta_V, precomp)
    selected_idx = set()
    iter_count = 0
    t0 = time.time()

    while len(S) < K and heap:
        neg_g, idx, last_eval_size = heappop(heap)
        # skip if already selected
        if idx in selected_idx:
            continue

        # if S changed since last eval -> re-evaluate and push
        if last_eval_size != len(S):
            c = candidates[idx]
            # compute current geometric gain using current model M_S
            g_geom = delta_G_real(c, S, candidates, M_S)
            g = (alpha * g_geom
                 + beta * delta_D(c, precomp)
                 + gamma * delta_O(c, precomp)
                 + lambda_R * delta_R(c, S)
                 + eta_V * delta_V(c, S)
                 - mu)
            heappush(heap, (-g, idx, len(S)))
            continue

        # accept current top
        c = candidates[idx]
        S.append(c)
        selected_idx.add(idx)
        precomp['cell_covered'].add(c['cellA'])
        precomp['cell_pairs_covered'].add(c['cell_pair'])
        iter_count += 1
        # update orientation cache? (not necessary; we compute from S on the fly)

        # Periodically refit fundamental matrix on current S (robust)
        if (iter_count % refit_every == 0) or (len(S) in [8, 16, 32]):
            if len(S) >= 8:
                F = fit_F_from_matches(S)
                if F is not None:
                    M_S['F'] = F
                    # force re-eval of heap by pushing placeholders
                    for i in range(len(candidates)):
                        if i in selected_idx:
                            continue
                        heappush(heap, (0.0, i, len(S)))

    t1 = time.time()
    # print timing summary
    # print(f"greedy_select finished in {t1 - t0:.2f}s, selected {len(S)}")
    return S, M_S


# --------------------
# Main flow
# --------------------
def main(args):
    img1 = cv2.imread(args.img1)
    img2 = cv2.imread(args.img2)
    if img1 is None or img2 is None:
        print("Error: could not read images. Provide valid paths.")
        return

    # grayscale for feature detection
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

    candidates, kp1, kp2 = detect_and_match(g1, g2, ratio_th=args.ratio)
    if len(candidates) == 0:
        print("No matches found after ratio test.")
        return

    candidates = assign_cells(kp1, kp2, candidates, grid=(args.grid, args.grid),
                              img1_shape=img1.shape, img2_shape=img2.shape)

    # optional prefilter: keep top by conf
    candidates.sort(key=lambda c: -c.get('conf', 0.5))
    if args.max_candidates is not None:
        candidates = candidates[:args.max_candidates]
        print(f"Truncated to top {len(candidates)} candidates by descriptor confidence.")

    print("Running greedy selection ... (this may take some time for many candidates)")
    S, M_S = greedy_select(
        candidates, K=args.K, alpha=args.alpha, beta=args.beta, gamma=args.gamma,
        mu=args.mu, refit_every=args.refit_every,
        img1=img1, img2=img2, kp1=kp1, kp2=kp2,
        lambda_R=args.lambda_R, eta_V=args.eta_V
    )
    print(f"Selected {len(S)} matches (requested K={args.K}).")

    # compute simple stats: median sampson of selected
    samps = []
    for s in S:
        samps.append(sampson_distance(s['pt1'], s['pt2'], M_S['F'])
                     if (M_S and M_S.get('F') is not None) else None)
    samps_valid = [x for x in samps if x is not None and np.isfinite(x)]
    if len(samps_valid) > 0:
        print(f"Selected median Sampson distance: {np.median(samps_valid):.4f}")
    else:
        print("No valid Sampson distances (insufficient model).")

    # visualize
    draw_matches(img1, kp1, img2, kp2, S, outpath=args.out)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('img1', type=str, help='left image path')
    parser.add_argument('img2', type=str, help='right image path')
    parser.add_argument('--K', type=int, default=200, help='number of matches to select')
    parser.add_argument('--grid', type=int, default=8, help='grid size (grid x grid)')
    parser.add_argument('--alpha', type=float, default=1.0)
    parser.add_argument('--beta', type=float, default=0.5)
    parser.add_argument('--gamma', type=float, default=0.8)
    parser.add_argument('--lambda_R', type=float, default=0.3, help='weight for orientation consistency')
    parser.add_argument('--eta_V', type=float, default=0.2, help='weight for spatial diversity')
    parser.add_argument('--mu', type=float, default=0.01)
    parser.add_argument('--refit_every', type=int, default=50)
    parser.add_argument('--ratio', type=float, default=0.75, help='ratio test threshold')
    parser.add_argument('--max_candidates', type=int, default=2000,
                        help='truncate top candidates by descriptor confidence (optional)')
    parser.add_argument('--out', type=str, default='selected_matches.png',
                        help='output visualization filename')
    args = parser.parse_args()
    main(args)
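The re-evaluate-on-pop pattern that `greedy_select` relies on is easier to see in isolation. The following is a minimal sketch on a toy set-cover problem, not the script's scoring function: the names `lazy_greedy` and `gain`, and the sample sets, are assumptions made up for illustration. Each heap entry stores the size of the selection at the time its score was computed, so a stale entry is rescored before it can be accepted.

```python
from heapq import heappush, heappop


def lazy_greedy(sets, k):
    """Pick up to k sets that cover the most elements, rescoring lazily."""
    covered = set()
    chosen = []

    def gain(i):
        # Marginal gain: number of new elements set i would add.
        return len(sets[i] - covered)

    # Heap entries: (-gain, index, len(chosen) when the gain was computed).
    heap = []
    for i in range(len(sets)):
        heappush(heap, (-gain(i), i, 0))

    while len(chosen) < k and heap:
        neg_g, i, eval_size = heappop(heap)
        if eval_size != len(chosen):
            # Stale score: recompute and push back instead of accepting.
            heappush(heap, (-gain(i), i, len(chosen)))
            continue
        if neg_g == 0:
            break  # the best fresh candidate adds nothing
        chosen.append(i)
        covered |= sets[i]
    return chosen, covered


sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen, covered = lazy_greedy(sets, k=2)
print(chosen, sorted(covered))
```

Because an entry is accepted only when its stored size matches the current selection size, the accepted candidate always carries a score computed against the current selection.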
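<think>
We are dealing with a lane-line coverage check. The user wants to find the parts of a long track segment that are not covered by lane lines. The problem most likely comes from lane detection and evaluation in computer vision or autonomous driving.

Core requirement: detect which parts of a long track segment (possibly a continuous trajectory or road section) are not covered by lane lines.

Analysis:
1. Input: a long track segment (probably a sequence of consecutive point coordinates) and lane-line data (probably a set of points or line segments).
2. Output: the parts of the track not covered by lane lines (probably a set of sub-segments).

Approach: discretize the track into a sequence of points, then check whether each point is covered by a lane line. A run of consecutive uncovered points forms an uncovered track segment.

Steps:
(1) Discretize the track: sample it at a fixed step (for example 1 m) to obtain the point sequence $P=\{p_1, p_2, ..., p_n\}$.
(2) Represent the lane lines: each lane line is approximated by a polyline. Let the set of lane lines be $L=\{l_1, l_2, ..., l_m\}$, where each $l_i$ is a chain of connected points.
(3) Define coverage: a track point $p$ is covered if its distance to any lane line is at most a threshold (for example 0.5 m).
(4) Scan the track points: for each $p_i$, compute its minimum distance $d_i$ to all lane lines; if $d_i \leq \text{threshold}$, mark it covered (True), otherwise uncovered (False).
(5) Extract uncovered runs: walk the flag sequence and collect each maximal interval of consecutive False flags; these are the uncovered track segments.

Note: the track may be very long, so efficiency matters. A spatial structure such as a KD-tree can speed up distance queries, but lane lines are sets of segments, so a segment tree or a prebuilt spatial index is more appropriate.

Pseudocode:
```
function check_coverage(track, lane_lines, threshold):
    # Discretize the track
    points = discretize_track(track, step=1.0)
    coverage_flags = []  # whether each point is covered

    # Build a spatial index for the lane lines (a KD-tree, for example; note that
    # a KD-tree indexes points, while we need point-to-segment distances).
    # Splitting lane lines into small segments and indexing their midpoints or
    # endpoints would be inaccurate.
    # More accurate: index the segments themselves (an R-tree, for example), or
    # approximate by densely sampling the lane lines into a point cloud and
    # building a KD-tree on it, at the cost of some error.
    # Alternative: no index, compute the distance from every track point to every
    # lane line (slow but simple, acceptable when there are few lane lines).

    # Assume there are few lane lines and compute directly
    for point in points:
        min_dist = infinity
        for line in lane_lines:
            # minimum distance from the point to the polyline
            dist = point_to_line_distance(point, line)
            if dist < min_dist:
                min_dist = dist
        coverage_flags.append(min_dist <= threshold)

    # coverage_flags is now a boolean list: True = covered, False = uncovered
    # Extract runs of consecutive uncovered points
    segments = []  # uncovered runs as (start point, end point)
    start = None
    for i, flag in enumerate(coverage_flags):
        if not flag:  # uncovered
            if start is None:
                start = i
        else:
            if start is not None:
                segments.append((points[start], points[i-1]))  # start and end of the run on the track
                start = None
    # Handle a run that reaches the end of the track
    if start is not None:
        segments.append((points[start], points[-1]))

    return segments
```

Improvement: if the lane-line data is large, the distance computation needs optimizing. All lane lines can be split into small segments and placed in a spatial index. A KD-tree normally indexes points, so an R-tree or a dedicated segment index is a better fit.

One approximation is to sample each lane line densely into points, build a KD-tree over all of them, and use the shortest distance from a track point to that point set as the distance to the lane lines. The error depends on the sampling density.

An exact alternative is to use a fast point-to-segment distance together with a spatial partition (a quadtree or a grid) to reduce the number of lane-line segments that must be checked.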
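Depending on the problem size, optimization may be needed when the track is long and the lane lines are complex. For now we give a simple implementation.

We assume each lane line is given as a list of points, with adjacent points joined by segments. The distance from a point to a lane line is then the minimum distance from the point to any of that lane line's segments.

Point-to-segment distance function (point_to_line_segment_distance): given a point P and a segment AB (defined by the two points A and B), compute the distance from P to the segment.

Method: project P onto the line through A and B and check whether the projection falls on the segment. If it does, the distance is the point-to-line distance; if it falls beyond an endpoint, the distance is the distance to the nearest endpoint.

Formula: if the projection is not on the segment,
$$ d = \min(|\overrightarrow{AP}|, |\overrightarrow{BP}|) $$
otherwise,
$$ d = \frac{| \overrightarrow{AP} \times \overrightarrow{AB} |}{| \overrightarrow{AB} |} $$

Detailed steps:
Let $\overrightarrow{AB} = B - A$ and $\overrightarrow{AP} = P - A$.
Compute $t = \frac{\overrightarrow{AP} \cdot \overrightarrow{AB}}{|\overrightarrow{AB}|^2}$.
If $t < 0$, the nearest point is A and the distance is $|\overrightarrow{AP}|$.
If $t > 1$, the nearest point is B and the distance is $|\overrightarrow{BP}|$.
Otherwise the projection lies on the segment and the distance is $|\overrightarrow{AP} - t \cdot \overrightarrow{AB}|$ (or, using the cross product, $\frac{|\overrightarrow{AP} \times \overrightarrow{AB}|}{|\overrightarrow{AB}|}$).

Note: in two dimensions the cross product is $\overrightarrow{AP} \times \overrightarrow{AB} = AP_x \cdot AB_y - AP_y \cdot AB_x$; take its absolute value and divide by $|\overrightarrow{AB}|$.

We use the vector projection method:
```
import math

def point_to_line_segment_distance(point, segment):
    # segment: (A, B), two points
    A, B = segment
    AB = (B[0]-A[0], B[1]-A[1])
    AP = (point[0]-A[0], point[1]-A[1])
    AB_squared = AB[0]**2 + AB[1]**2
    if AB_squared == 0:
        # the segment degenerates to a single point
        return math.sqrt(AP[0]**2+AP[1]**2)
    t = (AP[0]*AB[0] + AP[1]*AB[1]) / AB_squared
    if t < 0:
        # nearest point is A
        return math.sqrt(AP[0]**2+AP[1]**2)
    elif t > 1:
        # nearest point is B
        BP = (point[0]-B[0], point[1]-B[1])
        return math.sqrt(BP[0]**2+BP[1]**2)
    else:
        # the projection lies on the segment
        projection = (A[0] + t*AB[0], A[1] + t*AB[1])
        dx = point[0] - projection[0]
        dy = point[1] - projection[1]
        return math.sqrt(dx*dx+dy*dy)
```

For one lane line (made of several segments), compute the distance from the point to each segment and take the minimum.

A lane line has many segments, so every segment must be visited; for the whole lane-line set, that means looping over every lane line and then over each of its segments.

The cost can therefore be high. Optimization: build a spatial index (an R-tree, for example) over the lane-line segments first, so that nearby segments can be queried quickly for each track point.

Since the user has not given the data size, we implement the simple method first and remind the user that optimization is needed for large data.

Finally, return the uncovered segments.

The user asks for "long track segments", so a minimum-length threshold is probably also needed: return only uncovered runs longer than some value (for example, runs of uncovered points whose real length exceeds 10 m).

So when emitting uncovered runs we also need the real length of each run on the original track (from the positions of its start and end on the track, or approximately by accumulating the distances between sample points), and short runs are filtered out.

Additional step: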
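After obtaining the uncovered runs (as index ranges over the sample points), compute the track length each run corresponds to (for example, the track length from its start to its end) and keep it only if the length exceeds the threshold.

Change to the pseudocode: when building segments we record index intervals over the sample points, then compute the track length of each interval. Sample spacing is not guaranteed to be uniform: with fixed-step sampling the last step may be shorter, and on a curved track the real distance between samples may not equal the step exactly. For accuracy, record the cumulative arc length of every point during discretization.

Suggestion: while discretizing the track, compute and store the cumulative arc length s[i] of each point. The length of an interval [i, j] is then s[j]-s[i].

So the track is sampled at a fixed arc-length step, and the arc length between adjacent points equals the step (except possibly the last one). A run from point i to point j contains j-i+1 points, and its arc length is s[j]-s[i], where s[i] is the arc length from the track start to point i. With a fixed step this is (j-i)*step.

Equivalently, an uncovered run of n consecutive points has n-1 intervals between its first and last point, so its length is (n-1)*step.

If the uncovered run must be longer than a threshold (say 10 m), the condition is (n-1)*step >= threshold.

So for a run with indices i to j we compute length=(j-i)*step. For example, index 0 to index 1 has length 1*step.

When building segments we therefore record the start and end indices, compute length=(end_index - start_index) * step, and filter.

Note on the step: a smaller step gives higher precision at higher cost. 0.5 m or 1 m is a common choice.

Code structure: organize the track discretization function, the point-to-segment distance function, and the coverage check function together.

The question is fairly abstract, so we give a conceptual solution with a Python code skeleton.

Note: this assumes the track and the lane lines both lie in a 2D plane. In 3D the distance computation must be adjusted (lane lines are usually treated as 2D; even with elevation changes they can be projected onto a plane, or 3D distances can be used).

Below we implement the key steps in Python:
</think>
To find which parts of a long track segment are not covered by lane lines, proceed as follows:

### Solution steps

1. **Data preparation and preprocessing**
   - Obtain the track data (a GPS coordinate sequence) and the lane-line data (polygon or line geometry)
   - Discretize the track into a dense point sequence; a sampling interval of 0.5-1 m is suggested
   - Convert the lane lines into buffers (suggested width 0.3-0.5 m) to represent the covered area

2. **Spatial coverage analysis**
   - For each track point, compute the distance to the nearest lane line:
     $$ d_i = \min(\text{distance}(P_i, L_j)) \quad \forall j $$
   - Mark uncovered points: $ \text{uncovered} = \{P_i | d_i > \text{threshold}\} $ (the threshold is usually 0.3-0.5 m)
   - Use a spatial index (such as a KD-tree) to speed up distance queries[^1]

3. **Identifying contiguous uncovered segments**
   ```python
   import math

   def find_uncovered_segments(points, coverage_flags, min_length=10):
       segments = []
       start_idx = None
       for i, covered in enumerate(coverage_flags):
           if not covered and start_idx is None:
               # an uncovered run starts
               start_idx = i
           elif covered and start_idx is not None:
               # an uncovered run ends; keep it if its straight-line span is long enough
               if math.dist(points[start_idx], points[i-1]) >= min_length:
                   segments.append((points[start_idx], points[i-1]))
               start_idx = None
       # a run that is still open reaches the end of the track
       if start_idx is not None and math.dist(points[start_idx], points[-1]) >= min_length:
           segments.append((points[start_idx], points[-1]))
       return segments
   ```

4. **Visualization and output**
   - Generate a coverage heat map: red for uncovered, green for covered
   - Report uncovered-segment statistics:
     - Total uncovered ratio = $\frac{\text{number of uncovered points}}{\text{total number of points}} \times 100\%$
     - Length of the longest uncovered segment
     - Spatial distribution of the uncovered segments

### Key algorithm notes

**Spatial relation computation** (example using the Shapely library):
```python
from shapely.geometry import MultiLineString, Point

def calculate_coverage(track_points, lane_lines, buffer_dist=0.3):
    # lane_lines: a Shapely geometry such as a MultiLineString
    # track_points: an iterable of Shapely Point objects
    coverage = []
    buffered_lanes = lane_lines.buffer(buffer_dist)  # build the lane buffer

    for point in track_points:
        # check whether the point lies inside the lane buffer
        coverage.append(buffered_lanes.contains(point))

    return coverage
```

### Optimization suggestions

1. **Multi-scale analysis**:
   - Coarse: quickly locate large uncovered regions (sampling interval 5-10 m)
   - Fine: analyze the problem regions in detail (sampling interval 0.1-0.5 m)

2. **Dynamic threshold adjustment**:
   - Adjust the coverage threshold by road class (highway threshold > urban road threshold)
   - Increase the threshold on curves, where $c$ is the local curvature: $ \text{threshold} = \text{base} \times (1 + \frac{|c|}{\text{max\_curvature}}) $

3. **Temporal analysis** (for mobile devices):
   - Combine IMU data to compensate for GPS drift
   - Use a sliding window of width $w$ to detect momentary coverage loss:
     $$ \text{loss\_count} = \sum_{i=k}^{k+w} \mathbb{1}_{\text{uncovered}}(i) $$

### Sample output
```
[Analysis report]
Total track length: 12.8 km
Uncovered ratio: 8.7%
Number of uncovered segments: 23
Longest uncovered segment: 142 m (location: 32°12'N, 118°45'E)
Suggested inspection areas: 3 consecutive curves (marked in the attached figure)
```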
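The discretization and length-filtering steps can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names `resample_polyline` and `uncovered_runs` are hypothetical, the track is a 2D polyline in metres, and the coverage flags are synthetic stand-ins for a real distance check. Storing the cumulative arc length of every sample lets the length of a run be read off as a difference of two arc lengths, which stays correct on a bent track where the straight-line distance between the endpoints would underestimate it.

```python
import math


def resample_polyline(polyline, step=1.0):
    """Return (points, arc) with one sample every `step` metres of arc length."""
    points = [polyline[0]]
    arc = [0.0]
    travelled = 0.0  # arc length at the start of the current segment
    next_s = step    # arc length of the next sample to emit
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        while seg_len > 0 and next_s <= travelled + seg_len:
            t = (next_s - travelled) / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            arc.append(next_s)
            next_s += step
        travelled += seg_len
    return points, arc


def uncovered_runs(flags, arc, min_length=10.0):
    """Return (start_s, end_s) arc-length intervals of uncovered runs >= min_length."""
    runs = []
    start = None
    for i, covered in enumerate(flags):
        if not covered and start is None:
            start = i
        elif covered and start is not None:
            if arc[i - 1] - arc[start] >= min_length:
                runs.append((arc[start], arc[i - 1]))
            start = None
    # A run still open at the end of the track is checked as well.
    if start is not None and arc[-1] - arc[start] >= min_length:
        runs.append((arc[start], arc[-1]))
    return runs


# An L-shaped track, 50 m long in total.
track = [(0.0, 0.0), (30.0, 0.0), (30.0, 20.0)]
points, arc = resample_polyline(track, step=1.0)
# Synthetic flags: uncovered from 5 m to 18 m and over the last 4 m.
flags = [not (5.0 <= s <= 18.0 or s >= 46.0) for s in arc]
print(uncovered_runs(flags, arc, min_length=10.0))
```

In this example the 13 m run is reported and the 4 m run at the end of the track is dropped by the minimum-length filter.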