keras.datasets.imdb.py Source Code Analysis

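This article analyzes the `load_data` and `get_word_index` methods in Keras's `keras.datasets.imdb.py`. The IMDB dataset contains 50,000 reviews, split evenly into a training set and a test set. `load_data` downloads and preprocesses the data, including shuffling and limiting the text length, and returns the preprocessed training and test sets. `get_word_index` returns the dictionary that maps words to integer indices, which is used for text processing. In the source, indices 0, 1 and 2 are reserved as special indices; the remaining words are ordered by frequency and only the first `num_words` are kept.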


Overview

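IMDB dataset: 50,000 highly polarized reviews from the Internet Movie Database (IMDB). The dataset is split into 25,000 reviews for training and 25,000 reviews for testing; both sets contain 50% positive and 50% negative reviews.
imdb.py implements downloading and loading of the IMDB dataset file, plus data preprocessing.
It contains two methods: load_data (loads the data) and get_word_index (loads the word index dictionary).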
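To make the index handling concrete, here is a minimal pure-Python sketch of the per-review transformation that `load_data` applies according to its parameters: shift every word index by `index_from`, prepend `start_char`, and replace indices outside `[skip_top, num_words)` with `oov_char`. The helper names `encode_review` and `decode_review`, the toy word index, and the placeholder strings `<pad>`, `<start>`, `<oov>` are made up for illustration. This approximates the behavior and is not the library's own code.

```python
def encode_review(raw_indices, num_words=None, skip_top=0,
                  start_char=1, oov_char=2, index_from=3):
    """Approximate load_data's per-review index handling (illustrative only)."""
    # Shift every raw index so that 0, 1 and 2 stay free for special tokens.
    seq = [w + index_from for w in raw_indices]
    # Mark the beginning of the sequence.
    if start_char is not None:
        seq = [start_char] + seq
    # Without an explicit limit, keep every index that occurs (an assumption
    # of this sketch; it is not copied from the library).
    if not num_words:
        num_words = max(seq) + 1
    # With no OOV marker, out-of-range words are simply dropped.
    if oov_char is None:
        return [w for w in seq if skip_top <= w < num_words]
    # Otherwise replace indices outside [skip_top, num_words) with the marker.
    return [w if skip_top <= w < num_words else oov_char for w in seq]


def decode_review(seq, word_index, index_from=3):
    """Map an encoded review back to words using a word -> index dictionary."""
    # Placeholder names for the three reserved indices (hypothetical labels).
    index_word = {0: "<pad>", 1: "<start>", 2: "<oov>"}
    # Invert the dictionary and apply the same offset used when encoding.
    index_word.update({i + index_from: w for w, i in word_index.items()})
    return " ".join(index_word.get(i, "<oov>") for i in seq)


# Toy dictionary in the shape get_word_index returns: word -> frequency rank.
toy_word_index = {"the": 1, "movie": 2, "great": 3}
encoded = encode_review([1, 2, 3], num_words=6)
print(encoded)
print(decode_review(encoded, toy_word_index))
```

With `num_words=6`, the shifted index of "great" (3 + 3 = 6) falls outside the kept range, so it is replaced by `oov_char`.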

Source code of the load_data method:

def load_data(path='imdb.npz', num_words=None, skip_top=0,
              maxlen=None, seed=113,
              start_char=1, oov_char=2, index_from=3, **kwargs):
  1. Handle arguments left over from earlier versions and make sure the input parameters are valid
    # Legacy support
    if 'nb_words' in kwargs:
        warnings.warn('The `nb_words` argument in `load_data` '
                      'has been renamed `num_words`.')
        num_words = kwargs.pop('nb_words')
    if kwargs:
        raise TypeError('Unrecognized keyword arguments: ' + str(kwargs))
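The legacy-argument handling can be tried in isolation. The sketch below wraps the same logic in a hypothetical stand-in function, `load_data_stub`, which only returns the resolved `num_words`; it does not download or load anything.

```python
import warnings


def load_data_stub(num_words=None, **kwargs):
    """Hypothetical stand-in that reproduces only the argument handling."""
    # Accept the old keyword, but tell the caller about the rename.
    if 'nb_words' in kwargs:
        warnings.warn('The `nb_words` argument in `load_data` '
                      'has been renamed `num_words`.')
        num_words = kwargs.pop('nb_words')
    # Anything still left in kwargs is a name the function does not know.
    if kwargs:
        raise TypeError('Unrecognized keyword arguments: ' + str(kwargs))
    return num_words


# The old name still works (with a warning); an unknown name raises.
print(load_data_stub(nb_words=5000))
try:
    load_data_stub(vocab_size=5000)
except TypeError as exc:
    print(exc)
```

Popping `nb_words` before the final check matters: it is what lets the old name pass while any other unexpected keyword is rejected.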
  2. Fetch imdb.npz
    path = get_file(path,
                    origin='https://s3.amazonaws.com/text-datasets/imdb.npz',
                    file_hash='599dadb1135973df5b59232a0e9a887c')

The get_file method used here is Keras's own keras.utils.data_utils.get_file.
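The `file_hash` argument lets the downloaded file be checked for integrity. As a rough illustration, and assuming the 32-character hexadecimal value above is an MD5 digest, such a check might look like the sketch below. `file_matches_md5` is a hypothetical helper written for this example, not a Keras function.

```python
import hashlib


def file_matches_md5(path, expected_hash, chunk_size=65536):
    """Return True if the file's MD5 digest equals expected_hash (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # Read in chunks so a large archive need not fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest() == expected_hash
```

A caller could, for example, run `file_matches_md5(path, '599dadb1135973df5b59232a0e9a887c')` on a locally cached copy and download the file again if the result is False.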

  3. Read the file and load the data
    with np.load(path) as f:
        x_train, labels_train = f[