Original: StopIteration
StopIteration 09 12:13:10 WRN A exception occurred during Engine initialization, give up running process Traceback (most recent call last): File "/home/lvying/lvying/code/AVSS/LVravs/dist_train.py", line 152, in train_losses = dist_train(eng
2024-12-09 12:38:25
155
Original: Cannot re-initialize CUDA in forked subprocess.
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 37434) of binary: /home/lvying
2024-12-09 11:00:41
373
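For the "Cannot re-initialize CUDA in forked subprocess" entry above, here is a minimal sketch of selecting the 'spawn' start method with Python's standard multiprocessing module. The names worker and run_with_spawn are hypothetical and the worker only squares its rank; in a real job the CUDA work would go where the comment indicates.

```python
import multiprocessing as mp


def worker(rank, queue):
    # Hypothetical worker: in a real job, CUDA would be initialized here,
    # inside the child process, never in the parent before starting workers.
    queue.put(rank * rank)


def run_with_spawn(nprocs=2):
    # 'spawn' starts fresh interpreters instead of forking the parent,
    # which is what the error message asks for.
    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(rank, queue)) for rank in range(nprocs)]
    for p in procs:
        p.start()
    results = sorted(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results
```

Call run_with_spawn() from under an `if __name__ == "__main__":` guard in a script file, because spawned children re-import the main module.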
Original: ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 35585)
FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use_env is set by default in torchrun. If your script expects `--local_rank` argument to be set, please change it to read from `os.env
2024-12-08 13:11:10
569
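The warning in the entry above asks scripts to read the local rank from the environment rather than from `--local_rank`. A minimal sketch of that change, assuming the script is launched with torchrun (which sets the LOCAL_RANK environment variable); the helper name get_local_rank and the fallback to the old argument are illustrative choices, not part of the original post.

```python
import argparse
import os


def get_local_rank(argv=None):
    # Hypothetical helper: prefer the LOCAL_RANK environment variable,
    # and fall back to the legacy --local_rank argument if it is absent.
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    args, _ = parser.parse_known_args(argv)
    return int(os.environ.get("LOCAL_RANK", args.local_rank))


if __name__ == "__main__":
    print("local rank:", get_local_rank())
```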
Original: Error: L6200E: Symbol SystemInit multiply defined (by system_stm32f10x_1.o and system_stm32f10x.o).
The error is as follows: linking... Solution:
2024-04-29 11:26:36
2357
1
Original: [MPU6050] requestFrom(): i2cWriteReadNonStop returned Error -1
This problem occurs when the platform is >= espressif32@~5.1.0. The fix was to change the version in platformio.ini to platform = espressif32@~5.0.0.
2024-03-12 11:09:47
843
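Repost: The torch.flatten() function
1) flatten(x, 1) merges all dimensions of x from dimension 1 onward, so each slice along dimension 0 is laid out as one row; 2) flatten(x, 0) merges all dimensions from dimension 0 onward, which gives a 1-D tensor; 3) sometimes flatten takes two dimension arguments, flatten(x, start_dim, end_dim), in which case it multiplies together the sizes of all dimensions from start_dim to end_dim and leaves the other dimensions unchanged. For example, if x is a tensor of size [4, 5, 6], the result of flatten(x, 0, 1) is a tensor of size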
2022-05-13 14:03:14
1004
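The shape rule described in the torch.flatten() entry above can be sketched in plain Python. This is not the PyTorch implementation; flattened_shape is a hypothetical helper that only computes the resulting shape, and the (2, 3, 4) input is an arbitrary example.

```python
from math import prod


def flattened_shape(shape, start_dim=0, end_dim=-1):
    # Hypothetical helper mirroring the rule: the sizes of the dimensions
    # from start_dim to end_dim (inclusive) are multiplied together,
    # and all other dimensions are kept as they are.
    ndim = len(shape)
    start = start_dim % ndim  # allow negative indices such as -1
    end = end_dim % ndim
    if start > end:
        raise ValueError("start_dim must not come after end_dim")
    merged = prod(shape[start:end + 1])
    return list(shape[:start]) + [merged] + list(shape[end + 1:])


print(flattened_shape((2, 3, 4), 1))     # dimensions 1 and 2 merged
print(flattened_shape((2, 3, 4), 0))     # everything merged into one dimension
print(flattened_shape((2, 3, 4), 0, 1))  # dimensions 0 and 1 merged
```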
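Repost: [Repost] DataParallel & DistributedDataParallel distributed training (reposted only for my own records)
With and without model = nn.DataParallel(model), the test results differ a great deal (as shown in the figure): with this line the mIoU is the first result, 0.8847, and without it the mIoU is 0.4929 (taking mIoU as the example; the other metrics also drop severely), so I decided to put some effort into understanding nn.DataParallel(model). (Figure: metrics drop severely.) One thing to note here: multi-GPU training has to account for communication overhead and is a trade-off. Four GPUs are not necessarily much faster than two, because by four GPUs the I/O communication overhead may already dominate. nn.DataParallel()...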
2022-03-17 11:27:04
1174
1
Original: 144 UserWarning: semaphore_tracker: There appear to be 26 leaked semaphores to clean up at shutdown
/opt/conda/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 26 leaked semaphores to clean up at shutdown len(cache)) A solution reposted from someone else: export PYTHONWARNINGS='ignore:semaphore_tracker:UserWarning'...
2022-03-14 17:33:05
3092
Original: Cluster training bug log
-- Process 0 terminated with the following error: Traceback (most recent call last): File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap fn(i, *args) File "/workspace/geniii-trainingcode-lane/lanenet/dist...
2022-03-08 11:06:18
2543
Original: RuntimeError: Socket Timeout during multi-node multi-GPU distributed training
Traceback (most recent call last): File "distribute_prune_erfnet_cluster.py", line 643, in <module> daemon=False, ) File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn return start_processes(f.
2022-02-24 15:39:44
5276
6
Original: Fixing PyCharm's 'python tests in *****.py'
1. When you right-click to run a Python program in PyCharm and Run 'pytest in *****.py' appears, PyCharm has entered pytest mode. 2. Solution: go to File->Settings->Tools->Python Integrated Tools, find Default test runner under Testing, and change pytest to Unittests...
2022-01-24 17:28:21
7546
9
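Original: Fixing remote: You are not allowed to upload code. fatal: unable to access. The requested URL error: 403
remote: You are not allowed to upload code. fatal: unable to access 'http://10.170.136.211/adas/perception/geniii-trainingcode-lane.git/': The requested URL returned error: 403 This puzzled me for a whole day, damn it; hopefully this is the end of it. The problem is very simple: check whether your username and password, and the password stored in the credential manager, are all correct...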
2022-01-19 16:39:15
3273
1
Original: lambda explained: sorted(lane_coords[i], key=lambda pair: pair[1])
A lambda expression used as a function argument. Example:
list = [(1, 4), (2, 3), (3, 2), (4, 1)]
list.sort(key=lambda pair: pair[1])
print(list)
The output is sorted by the second element of each pair: [(4, 1), (3, 2), (2, 3), (1, 4)]
2022-01-17 15:28:20
595
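A short sketch of the key=lambda pattern from the entry above, using sorted() so the input list is left unchanged. The variable names are illustrative, and pairs is used instead of list to avoid shadowing the built-in.

```python
pairs = [(1, 4), (2, 3), (3, 2), (4, 1)]

# sorted() returns a new list ordered by the second element of each pair.
by_second = sorted(pairs, key=lambda pair: pair[1])

# reverse=True gives descending order on the same key.
by_second_desc = sorted(pairs, key=lambda pair: pair[1], reverse=True)

print(by_second)
print(by_second_desc)
```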
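Original: [Permutations and combinations] What combinations and permutations in itertools do and how they differ
import itertools
s = [1, 2, 3]
print(itertools.permutations(s, 2))
print(list(itertools.permutations(s, 2)))  # order matters
lis = list(itertools.combinations(s, 2))  # order does not matter
print(lis)
Both permutations and combinations return an iterator; use them together with list. combinations generates combinations in which order does not matter, permutatio.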
2022-01-17 11:26:49
1001
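The difference described in the itertools entry above, shown side by side as a small sketch; the variable names are illustrative.

```python
import itertools

s = [1, 2, 3]

# permutations: order matters, so (1, 2) and (2, 1) are both produced.
perms = list(itertools.permutations(s, 2))

# combinations: order does not matter, so each pair appears once.
combs = list(itertools.combinations(s, 2))

print(len(perms), perms)
print(len(combs), combs)
```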
Original: Fixing RuntimeError: Stop_waiting response is expected
RuntimeError: Stop_waiting response is expected
2022-01-05 09:15:46
1758
2
Original: Multi-node multi-GPU distributed training
Traceback (most recent call last): File "train_erfnet_cluster.py", line 714, in <module> os.environ['MASTER_ADDR'] = os.environ['PAI_HOST_IP_worker_0'] File "/opt/conda/lib/python3.7/os.py", line 681, in __getitem__ raise KeyError(key) fr...
2021-12-31 15:50:48
1842
1
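Original: Statistical properties of PyTorch tensors (counting the number of 0s or 1s in a tensor)
Statistical properties of tensors: norm. Viewing the norm. 1-norm: the sum of the absolute values of all elements. 2-norm: the square root of the sum of the squared absolute values of all elements. Example 1:
a = torch.full([8], 1.)  # float fill value, since norm expects a floating-point tensor
b = a.view(2, 4)
c = a.view(2, 2, 2)
a.norm(1), b.norm(1), c.norm(1)  # all tensor(8.)
a.norm(2), b.norm(2), c.norm(2)  # all tensor(2.8284), the square root of 8
Example 2: viewing the norm along a specified dimension: a = t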
2021-12-30 15:12:27
12786
2
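The two norm definitions in the entry above, and the counting of 0s and 1s mentioned in its title, can be checked by hand with plain Python. This is only an arithmetic sketch, not PyTorch code; the labels list is made-up example data.

```python
import math

values = [1.0] * 8  # same contents as a tensor of eight ones

# 1-norm: sum of absolute values.
l1 = sum(abs(v) for v in values)

# 2-norm: square root of the sum of squares.
l2 = math.sqrt(sum(v * v for v in values))

# Counting how many entries equal 0 or 1 (hypothetical example data).
labels = [0, 1, 1, 0, 1]
num_ones = sum(1 for v in labels if v == 1)
num_zeros = labels.count(0)

print(l1, round(l2, 4), num_ones, num_zeros)
```

Reshaping does not change either norm, since both are computed over all elements.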
Original: Lane detection notes
1. Lane detection based on a recurrent feature-shift aggregator (RESA: Recurrent Feature-Shift Aggregator for Lane Detection) https://www.bilibili.com/video/BV1664y1o7wg https://arxiv.org/abs/2008.13719 https://github.com/ZJULearning/resa tusimple dataset: img_height = 368 img_width = 640...
2021-12-30 11:16:24
1652
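Original: Running a program in the background on Linux (how to keep it running after closing the terminal)
(1) Enter the command: nohup <your shell command> & (2) Press Enter to return the terminal to the shell prompt; (3) enter the exit command to leave the terminal: exit (4) Now you can close your terminal software; once enough time has passed for your shell command to finish, log back in to check the result. Here, nohup makes your shell command ignore the SIGHUP signal, so it can run detached from the terminal, and "&" runs the command in the background. Running a shell command in the background, detached from the terminal, has several advantages: once the command has been started, a network interruption will not affect you, and you can then close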
2021-12-23 18:47:20
7799
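The entry above uses nohup and & from the shell. As a related sketch, and not the same mechanism, a Python script can launch a command in its own session with subprocess so that it is not tied to the launching terminal. The helper name run_detached and the log file handling are assumptions for illustration; start_new_session applies on POSIX systems.

```python
import subprocess
import sys


def run_detached(command, log_path):
    # Hypothetical helper: start `command` in a new session (setsid on POSIX)
    # and send its output to a log file, similar in spirit to nohup's nohup.out.
    log = open(log_path, "ab")
    proc = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=log,
        stderr=subprocess.STDOUT,
        start_new_session=True,
    )
    log.close()  # the child keeps its own copy of the file descriptor
    return proc


if __name__ == "__main__":
    p = run_detached([sys.executable, "-c", "print('done')"], "job.log")
    print("started pid", p.pid)
```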
Original: Code for reading XML annotations and converting them to YOLOv5 txt
import os
from glob import glob
import xml.etree.ElementTree as ET
xml_dir = 'E:\\dataset\\钢铁划痕_东大数据\\train\\train\\ANNOTATIONS'
output_txt_dir = 'E:\\dataset\\钢铁划痕_东大数据\\train\\train\\TXT'
def convert(size, box):
    dw = 1./(size[0])
    dh = 1./(size[.
2021-08-27 20:16:38
564
1
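The code in the entry above is cut off, so the following is not the author's convert function. It is a minimal sketch of the usual conversion from a pixel bounding box (xmin, ymin, xmax, ymax) to the normalized center/width/height values that YOLO-style txt labels use; the function name and argument order are assumptions.

```python
def voc_box_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    # Hypothetical helper: box center and size, each divided by the image
    # width or height so that all four values fall in the range [0, 1].
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# One label line: class index followed by the four normalized values.
class_id = 0  # made-up class index for the example
box = voc_box_to_yolo(100, 200, 10, 20, 30, 60)
print(class_id, " ".join(f"{v:.6f}" for v in box))
```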
Repost: Faster R-CNN model structure
FasterRCNN( (backbone): ResNet( (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (maxp.
2021-07-20 12:06:48
424
Original: cmake command error: Could NOT find PythonLibs (missing: PYTHON_LIBRARIES PYTHON_INCLUDE_DIRS)
Problem: Could NOT find PythonLibs (missing: PYTHON_LIBRARIES PYTHON_INCLUDE_DIRS) Solution: cmake -DPYTHON_INCLUDE_DIR=/usr/include/python2.7 -DPYTHON_LIBRARY=/usr/lib/python2.7/config/libpython2.7.so .. In the command above, /usr/include/python2.7 and /usr/lib/python2.7/config
2021-07-14 20:01:46
3764
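The paths passed to cmake in the entry above depend on the local Python installation. One way to look them up, sketched here with the standard sysconfig module, is to ask the interpreter you intend to build against. The helper name python_cmake_hints is hypothetical, and the library directory may be None on some platforms.

```python
import sysconfig


def python_cmake_hints():
    # Hypothetical helper: report where this interpreter keeps its headers
    # and, when known, the directory that holds its shared library.
    include_dir = sysconfig.get_paths()["include"]
    lib_dir = sysconfig.get_config_var("LIBDIR")  # may be None, e.g. on Windows
    return {"include_dir": include_dir, "lib_dir": lib_dir}


hints = python_cmake_hints()
print("-DPYTHON_INCLUDE_DIR=" + hints["include_dir"])
print("look for the libpython file under:", hints["lib_dir"])
```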
Original: Could NOT find PY_pip (missing: PY_PIP)
Could NOT find PY_pip (missing: PY_PIP) -- Found PythonInterp: /usr/bin/python3.6 (found suitable version "3.6.9", minimum required is "3.6") -- Found PythonLibs: /home/lvying/miniconda3/envs/ppocr/lib (found suitable version "3.7.10", minimum required
2021-07-14 20:00:19
1505
6
Original: CMake Error at cmake/nccl.cmake:18 (file): file failed to open for reading (No such file or direct
CMake Error at cmake/nccl.cmake:18 (file): file failed to open for reading (No such file or directory): /home/ps/lvying/Paddle/NCCL_INCLUDE_DIR-NOTFOUND/nccl.h Call Stack (most recent call first): CMakeLists.txt:268 (include) -- Current NCCL hea...
2021-07-14 18:21:23
2988
3
Original: sudo: /etc/sudoers is world writable
sudo: /etc/sudoers is world writable sudo: no valid sudoers sources found, quitting sudo: unable to initialize policy plugin
2021-06-07 21:21:46
191
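Original: Improving cognitive ability | Shape your brain and get to know yourself again
Many people do not really understand themselves, and have never even thought about trying to. Can you "control" yourself? Do you often vent bad moods on others, fail to speak calmly, or find other people's words and behavior hard to tolerate? All of this is the "emotional brain" controlling you, not "you" (meaning specifically the part of your brain that is the "rational brain"). The subconscious is hard to notice and feels as if it were driven by the brain, which is why we are so puzzled by our own problems. We can only act on our moods and feelings, and the results are often not what we want. We seem to understand many principles, yet still cannot live life well. When you manage your emotions well, your inner self also grows stronger and stronger,...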
2021-05-18 15:34:56
792
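Original: Why does working harder make you more anxious?
A while ago a friend told me they had recently been so anxious every day that they could not sleep, and felt they were being swept up in a rat race. For me, the possible sources of anxiety are: (1) Recently I started losing sleep for no obvious reason, most likely because I had been reading and learning so much that it overturned my earlier views; I suddenly felt I had a lot of room to improve and a lot to learn, so anxiety followed naturally. (2) The more you see, and once you leave your original circle and enter another one full of elites, the more anxious you become, regardless of age or gender. Because you meet more and more impressive people, even if you have some standing in a certain circle, you still become extremely anxious, and then you become stronger and better. This should...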
2021-05-18 15:29:13
294
1
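Original: mmdetection2: outputting per-class AP values
Find line 371 of "mmdetection/mmdet/datasets/coco.py" and set classwise=True. If you want the per-class AP at IoU=0.5, change line 355 to iou_thrs=[0.5]; normal training and testing will then output the results you want. IoU=0.50 means that an IoU greater than 0.5 counts as a detection; entries that fall outside the evaluated range get AP=-1...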
2021-04-23 12:20:05
3080
2
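The IoU threshold mentioned in the entry above can be illustrated with a small stand-alone sketch. This is not mmdetection or COCO evaluation code; the function names, the (x1, y1, x2, y2) box format, and the strict "greater than" comparison (taken from the wording of the post) are assumptions.

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def counts_as_detection(pred_box, gt_box, iou_thr=0.5):
    # Hypothetical helper: a prediction matches when its IoU exceeds the threshold.
    return iou(pred_box, gt_box) > iou_thr


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
print(counts_as_detection((0, 0, 2, 2), (0, 0, 2, 1.9)))
```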
Original: Data augmentation methods for object detection
Cutmix
def load_cutmix_image_and_boxes(self, index, imsize=1024):
    """
    This implementation of cutmix author: https://www.kaggle.com/nvnnghia
    Refactoring and adaptation: https://www.kaggle.com/shonenkov
    """
    w,
2021-04-14 16:12:00
415
Original: ModuleNotFoundError: No module named 'projects'
Traceback (most recent call last): File "tools/train.py", line 41, in <module> from ..projects.CenterNet2.centernet.config import add_centernet_config ImportError: attempted relative import with no known parent package (detectron2) lvying@ps:~...
2021-04-14 15:06:51
3347
Original: RuntimeError: Input, output and indices must be on the current device
Traceback (most recent call last): File "detect.py", line 236, in <module> detect() File "detect.py", line 47, in detect torch.onnx.export(model, # model being run File "/home/zjkj/miniconda3/envs/yolov5/lib/python3.8/sit...
2021-04-13 18:13:36
1987
3
Original: AssertionError: Checkpoint /home/yourstorePath/model_final_5bd44e.pkl not found!
Traceback (most recent call last): File "train.py", line 304, in <module> launch( File "/10t/lvying/miniconda3/envs/detectron2/lib/python3.8/site-packages/detectron2/engine/launch.py", line 62, in launch main_func(*args) File "train.py...
2021-04-13 17:32:54
2445
Original: AssertionError: Config file '' does not exist!
Command Line Args: Namespace(config_file='', dist_url='tcp://127.0.0.1:50170', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[], resume=False) Traceback (most recent call last): File "train.py", line 219, in <module> launc...
2021-04-13 13:16:29
5990
13
Original: ImportError: cannot import name '_C' from 'detectron2'
Just delete the entire detectron2 folder inside detectron2/detectron2; it has already been installed, so using the installed copy is enough. Traceback (most recent call last): File "register_dataset.py", line 69, in <module> from detectron2 import model_zoo File "/10t/lvying/detectron2/detectron2/model_zoo/__init__.py...
2021-04-13 11:28:45
5591
9
Original: MMdetection object detection framework installation error: ModuleNotFoundError: No module named 'fvcore.nn.distributed' (solution)
Solution: pip install -U 'git+https://github.com/facebookresearch/fvcore'. Traceback (most recent call last): File "demo/demo.py", line 16, in <module> from predictor import VisualizationDemo File "/home/amax/iKang/dsr/Project/det2/demo/pred..
2021-04-13 10:51:26
3729
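Original: Exporting a model from PyTorch to ONNX and running it with ONNX Runtime
(Optional) Exporting a model from PyTorch to ONNX and running it with ONNX Runtime. In this tutorial, we describe how to convert a model defined in PyTorch into the ONNX format and then run it in ONNX Runtime. ONNX Runtime is a performance-focused engine for ONNX models that runs inference efficiently across multiple platforms and hardware (Windows, Linux, and Mac, on both CPUs and GPUs). ONNX Runtime has been shown to considerably increase performance across multiple models, as explained here. For this tutorial, you will need to install ONNX and ONNX Runtime. You can obtain ONNX and ON.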
2021-04-02 18:03:33
617
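Original: Using autocast half precision in PyTorch to speed up training
Using autocast half precision in PyTorch to speed up training. Preparation: pytorch 1.6+. How do you use autocast? Following the officially provided method, how do you use automatic mixed precision in PyTorch? Answer: autocast + GradScaler. 1. autocast: As mentioned earlier, you need the autocast class from the torch.cuda.amp module. It is also very simple to use:
from torch.cuda.amp import autocast as autocast
# Create the model; the default is torch.FloatT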
2021-03-24 20:12:24
11416
Original: Error: EACCES: permission denied, open '/home/10t/lvying/yolov5/train.py'
Failed to save "train.py": unable to write file "vscode-remote://ssh-remote+192.168.2.20/home/10t/lvying/yolov5/train.py" (NoPermissions (FileSystemError): Error: EACCES: permission denied, open '/home/10t/lvying/yolov5/train.py') Solution: root@psdz:/home/10t# sudo chown -R lvying .
2021-03-16 10:57:49
638
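Meituan Machine Learning Practice, by the Meituan algorithm team
2018-09-20
Machine Learning Yearning (Andrew Ng's book), by Andrew Ng
2018-07-30
Alibaba reinforcement learning
2018-07-30
Dive into Deep Learning (MXNet framework), latest tutorial 2018.7.25
2018-07-28
The Python Language and Its Applications
2018-07-28
Python Machine Learning Basics Tutorial (original edition + HD + bookmarks)
2018-07-28
Python Data Science Handbook (HD + bookmarks + original PDF)
2018-07-28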