Read EDF Bio-Electricity Datafile

I was studying sleep stages, using the CAP Sleep Database (https://physionet.org/content/capslpdb/1.0.0/) as a reference.

However, I ran into some problems decoding the sequential EEG data files, which end in *.edf.

This dataset contains several kinds of files, including *.edf, *.m, *.xlsx, *.st, and *.txt. The following explanations cover these file types:


*.edf:

Multi-channel raw-signal data file. EDF stands for European Data Format, a name widely known in the field of signal processing. Python users can use pyEDFlib to work with this file type.

https://pyedflib.readthedocs.io/en/latest/

pyEDFlib is a python library to read/write EDF+/BDF+ files based on EDFlib.

EDF means [European Data Format](http://www.edfplus.info/) and was first published in [1992](http://www.sciencedirect.com/science/article/pii/0013469492900097). In 2003, an improved version of the file protocol named EDF+ was published and can be found [here](http://www.sciencedirect.com/science/article/pii/0013469492900097).

The EDF/EDF+ format stores all data as 16-bit values. A version that stores all data as 24-bit values (BDF) was introduced by the company [BioSemi](http://www.biosemi.com/faq/file_format.htm).
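
Because EDF stores each sample as a 16-bit integer, a reader has to scale the stored (digital) values into physical units using the per-signal minimum and maximum values from the file header. pyEDFlib's readSignal does this for you; the following is only a minimal sketch of the idea using the standard library. The function name and the calibration values in the demo are made up for illustration and are not taken from the CAP files.

```python
import struct


def digital_to_physical(raw_bytes, dig_min, dig_max, phys_min, phys_max):
    """Decode little-endian 16-bit samples and scale them to physical units."""
    count = len(raw_bytes) // 2
    # "<h" = little-endian signed 16-bit integer, the sample type used by EDF
    digital = struct.unpack("<%dh" % count, raw_bytes[:count * 2])
    gain = (phys_max - phys_min) / (dig_max - dig_min)
    return [(d - dig_min) * gain + phys_min for d in digital]


# Hypothetical calibration: the full 16-bit range mapped onto -500..500 (e.g. microvolts)
raw = struct.pack("<3h", -32768, 0, 32767)
values = digital_to_physical(raw, -32768, 32767, -500.0, 500.0)
print(values)
```

The smallest digital value maps to the physical minimum and the largest to the physical maximum; everything in between is interpolated linearly.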

The definition of the EDF/EDF+/BDF/BDF+ format can be found under [edfplus.info](http://www.edfplus.info/).
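
As a rough illustration of that definition, an EDF file starts with a fixed 256-byte ASCII header, followed by 256 more header bytes per signal. The sketch below parses only the fixed part with the standard library. The helper names are my own, and the demo header is synthetic (its values are invented, not read from n1.edf), so treat it as an assumption-laden example rather than a replacement for pyEDFlib.

```python
# (field name, width in bytes) for the fixed part of the EDF header
FIXED_HEADER_FIELDS = [
    ("version", 8), ("patient_id", 80), ("recording_id", 80),
    ("start_date", 8), ("start_time", 8), ("header_bytes", 8),
    ("reserved", 44), ("num_data_records", 8),
    ("record_duration_s", 8), ("num_signals", 4),
]


def parse_fixed_header(block):
    """Split the first 256 bytes of an EDF file into named text fields."""
    if len(block) < 256:
        raise ValueError("an EDF header needs at least 256 bytes")
    fields, offset = {}, 0
    for name, width in FIXED_HEADER_FIELDS:
        text = block[offset:offset + width].decode("ascii", errors="replace")
        fields[name] = text.strip()
        offset += width
    return fields


def read_fixed_header(path):
    """Read and parse the fixed header of the EDF file at `path`."""
    with open(path, "rb") as fh:
        return parse_fixed_header(fh.read(256))


# Synthetic header for demonstration only; all values are hypothetical.
demo_values = ["0", "X X X X", "Startdate X X X X", "01.01.10", "22.00.00",
               str(256 * (21 + 1)), "", "34620", "1", "21"]
demo_block = b"".join(
    value.ljust(width).encode("ascii")
    for value, (_, width) in zip(demo_values, FIXED_HEADER_FIELDS)
)
header = parse_fixed_header(demo_block)
print(header["num_signals"], header["header_bytes"])
```

For a real recording you would call read_fixed_header with the path to the .edf file instead of building a block by hand.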

This python toolbox is a fork of the [toolbox from Christopher Lee-Messer](https://bitbucket.org/cleemesser/python-edf/) and uses the [EDFlib](http://www.teuniz.net/edflib/) from Teunis van Beelen. The EDFlib is able to read and write EDF/EDF+/BDF/BDF+ files.

*.st: 

The scores for each recording in PhysioBank-compatible format.

*.txt:

The scores for each recording in REMlogic report format.

The .txt score files have the following fields:

  • Sleep stage (W=wake, S1-S4=sleep stages, R=REM, MT=body movements)
  • Body position (Left, Right, Prone, or Supine; not recorded in some subjects)
  • Time of day [hh:mm:ss]
  • Event (either a sleep stage (SLEEP-S0..S4, REM, MT), or a phase A of CAP)
  • Duration (in seconds)
  • Location (the signal(s) in which the event can be observed)
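
To make the field list concrete, here is a small sketch that parses one score row into a named record. It assumes the six fields appear in the order listed above and are separated by tabs; I have not verified the exact layout of the REMlogic export, and the example row is invented. It also does not handle recordings where the body position column is missing.

```python
from dataclasses import dataclass


@dataclass
class ScoreRow:
    stage: str        # W, S1-S4, R or MT
    position: str     # Left, Right, Prone or Supine
    time_of_day: str  # hh:mm:ss
    event: str        # e.g. SLEEP-S2, REM, MT, or a CAP phase A label
    duration_s: int   # duration in seconds
    location: str     # signal(s) in which the event can be observed


def parse_score_row(line, sep="\t"):
    """Parse one row, assuming six separator-delimited fields in the listed order."""
    parts = [p.strip() for p in line.rstrip("\n").split(sep)]
    if len(parts) != 6:
        raise ValueError("expected 6 fields, got %d" % len(parts))
    stage, position, time_of_day, event, duration, location = parts
    return ScoreRow(stage, position, time_of_day, event, int(duration), location)


# Hypothetical row, not copied from the dataset
row = parse_score_row("S2\tLeft\t22:15:30\tSLEEP-S2\t30\tC4-A1")
print(row)
```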

*.m:

Matlab scripts for reading the .txt score files easily. Launching ScoringReader.m generates an array containing the scoring of the macrostructure, and three arrays containing the start times of the CAP A phases, their durations, and their subtypes. The hypnogram is reported as 0 for wake, 1 to 4 for the sleep stages according to R&K rules, and 5 for REM. Starting from these variables, the Matlab function CAP.m computes the CAP time and CAP rate according to Terzano’s rules.
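
For Python users, the hypnogram coding described above (0 for wake, 1 to 4 for the sleep stages, 5 for REM) can be sketched as a simple lookup. This is not a port of ScoringReader.m. The function name is my own, and mapping unlisted labels such as MT to -1 is my assumption, since the description does not say how they are coded.

```python
# Stage label -> hypnogram code, following the coding described above
STAGE_TO_CODE = {"W": 0, "S1": 1, "S2": 2, "S3": 3, "S4": 4, "R": 5}


def to_hypnogram(stages, unknown=-1):
    """Convert stage labels to numeric codes; unlisted labels become `unknown`."""
    # Assumption: MT and any other label not covered above is marked with -1
    return [STAGE_TO_CODE.get(stage, unknown) for stage in stages]


codes = to_hypnogram(["W", "S1", "S2", "S4", "R", "MT"])
print(codes)
```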

gender-age.xlsx:

Reports the age and gender of the subjects. The first row gives each subject's pathology. These names also appear in the .txt files.


Now that we have the definitions, we can use tools to start working with the data.

Python users can install the EDF library with pip:

pip install pyedflib

Then run the following test code to plot a fragment of one of the sequential data files:

>>> import pyedflib
>>> import numpy as np
>>> file_name = r"\\192.168.1.103\ssd\cap-sleep-database-1.0.0\n1.edf"
>>> f = pyedflib.EdfReader(file_name)
>>> n = f.signals_in_file
>>> f
<pyedflib.edfreader.EdfReader object at 0x000001AEDC0488D8>
>>> n
21
>>> signal_labels = f.getSignalLabels()
>>> signal_labels
['ROC-LOC', 'LOC-ROC', 'F2-F4', 'F4-C4', 'C4-P4', 'P4-O2', 'F1-F3', 'F3-C3', 'C3-P3', 'P3-O1', 'C4-A1', 'EMG1-EMG2', 'ECG1-ECG2', 'TERMISTORE', 'TORACE', 'ADDOME', 'Dx1-DX2', 'SX1-SX2', 'Posizione', 'HR', 'SpO2']
>>> len(signal_labels)
21
>>> sample = f.getNSamples()
>>> len(sample)
21
>>> sample
array([17725440, 17725440, 17725440, 17725440, 17725440, 17725440,
       17725440, 17725440, 17725440, 17725440, 17725440,  8862720,
       17725440,  2215680,  2215680,  2215680,  8862720,  8862720,
        2215680,    34620,    34620])
>>> f.readSignal(1)
array([ -28.96970238,  -19.93384158,   -8.21164377, ..., -244.94509158,
       -242.44191392, -241.61769689])
>>> signal_1 = f.readSignal(1)
>>> len(signal_1)
17725440
>>> import matplotlib.pyplot as plt
>>> plt.scatter(np.arange(10000), signal_1[:10000])
<matplotlib.collections.PathCollection object at 0x000001AEEF75BC50>
>>> plt.show()
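
One thing the session above shows is that the channels do not all have the same number of samples, so they were recorded at different sampling rates. The sketch below takes the labels and sample counts printed above and derives each channel's rate relative to the slowest channel. The absolute rate of the slowest channel (1 Hz here) is purely my assumption; on the real file, f.getSampleFrequency(channel) reports the actual value and should be preferred.

```python
signal_labels = ['ROC-LOC', 'LOC-ROC', 'F2-F4', 'F4-C4', 'C4-P4', 'P4-O2',
                 'F1-F3', 'F3-C3', 'C3-P3', 'P3-O1', 'C4-A1', 'EMG1-EMG2',
                 'ECG1-ECG2', 'TERMISTORE', 'TORACE', 'ADDOME', 'Dx1-DX2',
                 'SX1-SX2', 'Posizione', 'HR', 'SpO2']
# Sample counts as printed by f.getNSamples() in the session above
sample_counts = [17725440] * 11 + [8862720, 17725440, 2215680, 2215680,
                                   2215680, 8862720, 8862720, 2215680,
                                   34620, 34620]

# Rate of each channel as a multiple of the slowest channel's rate
slowest = min(sample_counts)
relative_rate = {label: count // slowest
                 for label, count in zip(signal_labels, sample_counts)}

# Assumption: the slowest channels (HR, SpO2) are sampled once per second
ASSUMED_SLOWEST_HZ = 1.0
duration_hours = slowest / ASSUMED_SLOWEST_HZ / 3600

for label in ('C4-A1', 'EMG1-EMG2', 'TORACE', 'HR'):
    print(label, relative_rate[label])
print(round(duration_hours, 2))
```

Under that assumption the EEG channels run at 512 times the rate of HR and SpO2, which matters when you align a window of EEG samples with the scored sleep stages.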

 

Now, with the labels in hand, you can run tests on this dataset.
