# [Repost] PyCharm + Python + word2vec warning: C extension not loaded for Word2Vec, training will be slow.



src="https://csdnimg.cn/release/blogv2/dist/pc/img/original.png" alt="">
于 2023-03-07 14:03:02 发布
阅读量460 收藏 1
点赞数
文章标签: python pycharm word2vec
This is a repost. Original article (CC 4.0 BY-SA, published 2023-03-07): https://blog.youkuaiyun.com/weixin_51867095/article/details/129380877
            <div class="operating">
                <a class="href-article-edit slide-toggle">版权</a>
            </div>
        </div>
    </div>
</div>

<article class="baidu_pl" article="">
    <div id="article_content" class="article_content clearfix">
    <link rel="stylesheet" href="https://csdnimg.cn/release/blogv2/dist/mdeditor/css/editerView/kdoc_html_views-1a98987dfd.css">
    <link rel="stylesheet" href="https://csdnimg.cn/release/blogv2/dist/mdeditor/css/editerView/ck_htmledit_views-044f2cf1dc.css">
         
            <div id="content_views" class="markdown_views prism-atom-one-dark">
                <svg xmlns="http://www.w3.org/2000/svg" style="display: none;">
                    <path stroke-linecap="round" d="M5,0 0,2.5 5,5z" id="raphael-marker-block" style="-webkit-tap-highlight-color: rgba(0, 0, 0, 0);"></path>
                </svg>
                <h3><a name="t0"></a><a id="_0"></a>问题</h3> 

Running word2vec produced the following warning:

```
E:\Projects\word2vec\venv\lib\site-packages\gensim\models\base_any2vec.py:742: UserWarning: C extension not loaded, training will be slow. Install a C compiler and reinstall gensim for fast training.
  warnings.warn(
```

Training then crawled along at 16 words/s; at that rate it would not have finished even after a full day.
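Before changing anything, you can confirm the diagnosis from gensim itself. A minimal check, assuming gensim 3.x, where `FAST_VERSION` is -1 whenever the optimized routines failed to load:

```python
# Check whether the compiled word2vec routines are available (gensim 3.x).
from gensim.models.word2vec import FAST_VERSION

# -1  -> pure-Python fallback in use (the slow path behind this warning)
# 0/1 -> optimized C extension loaded (fast)
print(FAST_VERSION)
```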

### Solution

Since I had originally been following the article 『词向量』用Word2Vec训练中文词向量(一)—— 采用搜狗新闻数据集 (training Chinese word vectors with Word2Vec on the Sogou news dataset), I started with what seemed the easiest of the three fixes it recommends.

#### Fix 1: downgrade gensim

I first downgraded to 3.7.1, which did not help, and then to 3.6, which did not help either.
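For the record, each downgrade is just a pinned pip install (3.6.0 as the exact patch release is my assumption; the text only says "3.6"):

```
pip install gensim==3.7.1
pip install gensim==3.6.0
```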

#### Fix 2: conda

Most articles recommend conda, but I did not have it installed; then again, I did not have the Visual Studio compiler required by the first method either. After weighing the options for a while I decided to install Anaconda, which ultimately worked.

**Step 1: install Anaconda**

During installation I left "Register Anaconda as my default Python 3.9" unchecked.

**Step 2: conda install**

- Failed attempt: opening Anaconda Prompt, `cd`-ing into the original project, and running `conda install` there did not work.
- Working approach:
  - Create a new project in PyCharm and, under "New environment using", select Conda.
  - Under "Conda Executable", point to the Anaconda install directory plus `Scripts\conda.exe`.
  - Create the project.
  - In the Terminal, run:

```
conda install mingw libpython
pip uninstall scipy
conda install scipy
pip uninstall gensim
conda install gensim
```
**Possible error**

The first command (`conda install mingw libpython`) may fail with:

```
PackagesNotFoundError: The following packages are not available from current channels:
  - mingw
```
**Fix**

Run:

```
conda install -c free mingw
```

This pulls mingw from Anaconda's legacy `free` channel, which still hosts the package. After that, everything ran successfully.
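To double-check the speedup, one option is to train on a small throwaway corpus with logging enabled and read the reported words/s. A minimal sketch, assuming gensim 3.x (where the vector dimension parameter is still called `size`; gensim 4.x renamed it to `vector_size`):

```python
import logging
from gensim.models import Word2Vec
from gensim.models.word2vec import FAST_VERSION

logging.basicConfig(format="%(asctime)s : %(levelname)s : %(message)s", level=logging.INFO)

print("FAST_VERSION =", FAST_VERSION)  # should no longer be -1 after the reinstall

# Toy corpus purely for the timing output; substitute real sentences here.
sentences = [["训练", "词", "向量"], ["速度", "测试"]] * 1000
model = Word2Vec(sentences, size=100, window=5, min_count=1, workers=4)
# The INFO log reports effective words/s, which should now be far above 16 words/s.
```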
