B. Scenes From a Memory

This post discusses a contest problem: delete digits from an input integer so that what remains (at least one digit) is no longer prime. The strategy is to check single digits first, then pairs of digits, to form a non-prime. The AC code below implements this strategy and passes all tests.

https://codeforces.com/contest/1562/problem/B

INPUT

7
3
237
5
44444
3
221
2
35
3
773
1
4
30
626221626221626221626221626221

OUTPUT

2
27
1
4
1
1
2
35
2
77
1
4
1
6

Problem

You are given T test cases. Each test case contains len, the number of digits, followed by the number itself. Delete as many digits as possible so that the remaining number is as short as possible and is not prime (at least one digit must remain).
Note that 1 also counts as non-prime.
Output format: first print n, the length of the remaining number, then the remaining number itself.

Solution

Since we want the shortest result, if any single digit is non-prime, just output length 1 and that digit. (Remember to special-case 1 when testing primality.)
If every digit is prime (i.e., every digit is 2, 3, 5, or 7), look for any two digits that, kept in their original order, form a two-digit non-prime, and output 2 and that number. For example, 237 has only prime digits, but the subsequence 27 = 3³ is composite.
Checking pairs is always enough: any three prime digits contain a one- or two-digit non-prime subsequence. If some digit repeats, the pair 22, 33, 55, or 77 is composite; if all three digits are distinct, a case check shows some ordered pair is composite (verified by the brute-force sketch below).

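To back up the claim above, here is a minimal brute-force sketch (an addition for this write-up, separate from the AC code below): it enumerates all 4³ = 64 strings of three prime digits and confirms each one contains a non-prime subsequence of length at most 2.

#include<bits/stdc++.h>
using namespace std;

// True if a is NOT prime; 1 counts as non-prime.
bool notPrime(int a)
{
	if(a == 1)	return true;
	for(int i = 2;i < a;i++)
		if(a % i == 0)	return true;
	return false;
}

int main()
{
	const int d[4] = {2, 3, 5, 7};
	// Try every string of three prime digits.
	for(int a : d) for(int b : d) for(int c : d)
	{
		bool ok = notPrime(a) || notPrime(b) || notPrime(c)	// length 1 (never fires: 2,3,5,7 are prime)
			|| notPrime(a * 10 + b)		// length-2 subsequences,
			|| notPrime(a * 10 + c)		// keeping the original
			|| notPrime(b * 10 + c);	// digit order
		if(!ok)
		{
			cout<<"counterexample: "<<a<<b<<c<<endl;
			return 1;
		}
	}
	cout<<"lemma holds: any 3 prime digits contain a non-prime subsequence"<<endl;
	return 0;
}
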
AC Code

#include<bits/stdc++.h>
using namespace std;

// True if a is NOT prime; 1 counts as non-prime.
// Trial division is plenty here: a is at most 99.
bool notPrime(int a)
{
	if(a == 1)	return true;	// special case: 1 is not a prime
	for(int i = 2;i < a;i++)
		if(a % i == 0)
			return true;	// found a divisor, so a is composite
	return false;			// no divisor found, so a is prime
}

int main()
{
	int T;cin>>T;
	string s;
	int len;
	while(T--)
	{
		cin>>len>>s;
		bool found = false;
		// Step 1: a single non-prime digit is the shortest possible answer.
		for(int i = 0;i<len;i++)
		{
			if(notPrime(s[i] - '0'))
			{
				cout<<1<<endl<<(s[i] - '0')<<endl;
				found = true;
				break;
			}
		}
		if(found)	continue;
		// Step 2: every digit is prime, so by the lemma some ordered pair
		// of digits forms a two-digit non-prime.
		for(int i = 0;i<len && !found;i++)
			for(int j = i + 1;j<len && !found;j++)
			{
				int two = (s[i] - '0') * 10 + (s[j] - '0');
				if(notPrime(two))
				{
					cout<<2<<endl<<two<<endl;
					found = true;
				}
			}
	}
	return 0;
}
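
A small design note: once step 1 fails, every digit is one of 2, 3, 5, or 7, so by the lemma the double loop never needs to look past the first three digits; both loops could be capped at min(len, 3). With the input sizes here, the uncapped O(len²) scan is already instant.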

If you spot any problems, please point them out directly!
ε=ε=ε=┏(゜ロ゜;)┛
