Installing and Using h5py in a Virtual Linux Environment

未完成的梦orz

Category: python  Tags: linux h5py HDF5

Article link: https://blog.youkuaiyun.com/qq_23392341/article/details/73649271

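This article walks through installing the h5py library in a virtual Linux environment, using the recommended pre-built installation. h5py provides access to HDF5 files, which contain datasets (similar to NumPy arrays) and folder-like group objects. After installation, creating an HDF5 file from Python failed with a permission error; checking and re-applying Docker's Shared Drives setting fixed it, after which the .hdf5 file was created and could be used for further operations.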

h5py documentation: http://docs.h5py.org/en/latest/high/file.html#
Install h5py using the pre-built installation (recommended):

pip install h5py

Background: An HDF5 file is a container for two kinds of objects: datasets, which are array-like collections of data, and groups, which are folder-like containers that hold datasets and other groups.
Groups work like dictionaries, and datasets work like NumPy arrays.
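
To make the dictionary/array analogy concrete, here is a minimal sketch. It assumes h5py and NumPy are installed. The file name `analogy.hdf5` and the names `measurements` and `values` are made up for illustration, and the `core` driver with `backing_store=False` keeps the file in memory so nothing is written to disk.

```python
import h5py
import numpy as np

# In-memory HDF5 file (hypothetical name; nothing is written to disk).
with h5py.File("analogy.hdf5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("measurements")   # groups behave like dictionaries
    grp["values"] = np.arange(5)           # assigning an array creates a dataset
    keys = list(grp.keys())                # dictionary-style listing of members
    first_three = grp["values"][:3]        # NumPy-style slicing returns an ndarray
    total = int(grp["values"][...].sum())  # read everything, then sum

print(keys, first_three, total)
```

Slicing a dataset reads the selected values into an ordinary NumPy array, which is why `first_three` remains usable after the file is closed.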

After installation, run the python command to enter the interactive Python environment:

import h5py
h5py.run_tests()
This tests whether h5py was installed successfully.
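
Before running the full test suite, a lighter check is to ask Python whether the module can be found at all. The sketch below uses only the standard library; the helper name `module_available` is made up for this example and is not part of h5py.

```python
import importlib.util

def module_available(name):
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(name) is not None

# Report on the two packages used in this article.
for name in ("h5py", "numpy"):
    print(name, "available:", module_available(name))
```

This only confirms that the package is importable from the current interpreter; it does not exercise the HDF5 library the way the test suite does.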

Create a new HDF5 file:
The very first thing you’ll need to do is create a new file:

>>> import h5py
>>> import numpy as np
>>>
>>> f = h5py.File("mytestfile.hdf5", "w")

You can create the file from the interactive environment started with the python command, or put the same code in a .py file and run it with python filename.
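
As a script, the same step might look like the sketch below. It assumes h5py is installed and, purely for illustration, writes into a temporary directory so that it cleans up after itself. The `with` block closes the file automatically, which the interactive session above does not do.

```python
import os
import tempfile

import h5py

# Hypothetical location: a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "mytestfile.hdf5")

    # "w" creates the file, truncating it if it already exists.
    with h5py.File(path, "w") as f:
        root_name = f.name  # the file object doubles as the root group

    created = os.path.exists(path)
    valid = h5py.is_hdf5(path)  # checks the file signature on disk

print(root_name, created, valid)
```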

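The tests confirmed that the installation succeeded, but creating the file failed with a permission denied error.
It turned out the shared folder had dropped out, so it was probably no longer connected to the Linux side.
Opening Docker -> Settings -> Shared Drives and re-applying the setting fixed it.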
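
When a permission error like this appears, one way to narrow it down is to test whether any file can be created in the target directory, independently of h5py. This is a sketch using only the standard library; the helper name `can_create_files` is invented for this example.

```python
import os
import tempfile

def can_create_files(directory):
    """Return True if a file can actually be created in the given directory."""
    try:
        # The temporary file is deleted again as soon as the block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers permission errors as well as missing directories.
        return False

print(can_create_files(os.getcwd()))
```

If this prints False for the shared folder, the problem most likely lies with the mount or its permissions rather than with h5py.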

Linux commands: switch to root with sudo -i
Delete a file: rm filename

With that, the new .hdf5 file is created.

Operating on the file:

#create_dataset
>>> dset=f.create_dataset("mydataset",(100,),dtype='i')
>>> dset.name
'/mydataset'
>>> dset.dtype
dtype('int32')
>>> dset.shape
(100,)
#support array-style slicing
>>> dset[...]=np.arange(100)
>>> dset[0]
0
>>> dset[2]
2
>>> dset[10]
10
>>> dset[0:100:10]
array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=int32)
>>> f.name
'/'
#create_group
>>> grp=f.create_group("subgroup")
# create_dataset inside this group
>>> dset2=grp.create_dataset("another_dataset",(50,),dtype='f')
>>> dset2.name
'/subgroup/another_dataset'
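
The relationship between a group and a dataset created inside it can be checked directly. The sketch below assumes h5py is installed and uses an in-memory file with the made-up name `groups_demo.hdf5`; the group and dataset names mirror the session above.

```python
import h5py

# In-memory file (hypothetical name; nothing is written to disk).
with h5py.File("groups_demo.hdf5", "w", driver="core", backing_store=False) as f:
    grp = f.create_group("subgroup")
    dset2 = grp.create_dataset("another_dataset", (50,), dtype="f")

    # Every object knows its absolute path and its parent group.
    names = (grp.name, dset2.name, dset2.parent.name)

    # Path-style lookup from the root returns the matching object type.
    kinds = (
        isinstance(f["subgroup"], h5py.Group),
        isinstance(f["subgroup/another_dataset"], h5py.Dataset),
    )
    dtype_name = dset2.dtype.name  # 'f' is NumPy shorthand for float32

print(names, kinds, dtype_name)
```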

# create_dataset from the root, writing the group path into the dataset name
>>> dset3=f.create_dataset('subgroup2/dataset_three',(10,),dtype='i')
>>> dset3.name
'/subgroup2/dataset_three'
>>> dataset_three=f['subgroup2/dataset_three']
>>> for name in f:
...     print name
  File "<stdin>", line 2
    print name
             ^
SyntaxError: Missing parentheses in call to 'print'

# Copying `print name` from the documentation raises an error under Python 3, where print is a function; use print(name) instead
# List the groups and datasets directly under f
>>> for name in f:
...     print(name)
...
mydataset
subgroup
subgroup2
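# As this shows, iterating over f with a for loop yields only the members attached directly to f. To traverse the whole file, use the group methods visit() or visititems(), as follows (note the output):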
>>> def printname(name):
...     print(name)
...
>>> f.visit(printname)
mydataset
subgroup
subgroup/another_dataset
subgroup2
subgroup2/dataset_three
# The output includes groups as well as datasets
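
Since visit() passes only names, telling groups from datasets needs visititems(), which also passes the object itself. A minimal sketch, assuming h5py is installed; the file is kept in memory under the made-up name `visit_demo.hdf5`, and the callback name `collect` is invented for this example.

```python
import h5py

datasets, groups = [], []

def collect(name, obj):
    """Sort each visited object into the datasets or groups list."""
    if isinstance(obj, h5py.Dataset):
        datasets.append(name)
    elif isinstance(obj, h5py.Group):
        groups.append(name)
    # Returning None tells visititems() to keep going.

with h5py.File("visit_demo.hdf5", "w", driver="core", backing_store=False) as f:
    # Rebuild the same layout as in the session above.
    f.create_dataset("mydataset", (100,), dtype="i")
    f.create_dataset("subgroup/another_dataset", (50,), dtype="f")
    f.create_dataset("subgroup2/dataset_three", (10,), dtype="i")
    f.visititems(collect)

print(sorted(datasets))
print(sorted(groups))
```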

# Check whether a dataset is in f
>>> "mydataset" in f
True
>>> "somethingelse" in f
False
>>> "subgroup/another_dataset" in f
True
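# Metadata can be stored right next to the data it describes
# All objects support attached named bits of data called attributes
# In other words, custom attributes are supported (as I understand it)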
>>> dset.attrs['temperature']=99.5
>>> dset.attrs['temperature']
99.5
>>> 'temperature' in dset.attrs
True
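
Attributes behave much like a small dictionary attached to each object. The sketch below assumes h5py is installed and uses an in-memory file with the made-up name `attrs_demo.hdf5`; the `units` attribute is an invented example alongside the `temperature` attribute from the session above.

```python
import h5py

# In-memory file (hypothetical name; nothing is written to disk).
with h5py.File("attrs_demo.hdf5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("mydataset", (100,), dtype="i")

    # Attach metadata to the dataset.
    dset.attrs["temperature"] = 99.5
    dset.attrs["units"] = "celsius"  # invented attribute for illustration

    # Dictionary-style access: keys, membership, lookup, deletion.
    attr_names = sorted(dset.attrs.keys())
    temperature = float(dset.attrs["temperature"])
    del dset.attrs["units"]
    units_left = "units" in dset.attrs

print(attr_names, temperature, units_left)
```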