Understanding TensorFlow's batch_dot Function

This post walks through Keras's batch_dot function for matrix operations under the TensorFlow framework. Examples demonstrate how batch_dot computes its result under different axes settings, and the results are verified for consistency against a combination of tf.reduce_sum and tf.multiply.

Implementing batch_dot

Load the libraries

import tensorflow as tf
tf.enable_eager_execution()  # TF 1.x only: enables eager mode so .numpy() works; TF 2.x is eager by default
import keras.backend as K
import numpy as np

Generate the input data

x1 = tf.convert_to_tensor([[1,2,3],[4,5,6]])
x2 = tf.convert_to_tensor([[1,2,3],[4,5,6]])

K.batch_dot(x1,x2,axes=1).numpy()

With axes=1, batch_dot dots each row of x1 with the corresponding row of x2 (1·1 + 2·2 + 3·3 = 14 and 4·4 + 5·5 + 6·6 = 77). The output is:

array([[14],
       [77]], dtype=int32)
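As an extra sanity check (a minimal sketch, not in the original post), the same numbers fall out of plain numpy; keepdims=True reproduces the trailing dimension that batch_dot keeps:

x = np.array([[1, 2, 3], [4, 5, 6]])
np.sum(x * x, axis=1, keepdims=True)
array([[14],
       [77]])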

With axes=0, the contraction runs down the columns instead, pairing each column of x1 with the matching column of x2 (1·1 + 4·4 = 17, 2·2 + 5·5 = 29, 3·3 + 6·6 = 45):

K.batch_dot(x1,x2,axes=0).numpy()
array([[17],
       [29],
       [45]], dtype=int32)
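The same style of check works column-wise (again a sketch outside the original post; the transpose restores the column shape):

np.sum(x * x, axis=0, keepdims=True).T
array([[17],
       [29],
       [45]])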

In fact, this matches the result of element-wise multiplication followed by a summation along the chosen axis; the verification is shown below.

tf.reduce_sum(tf.multiply(x1, x2), axis=0).numpy()
array([17, 29, 45], dtype=int32)
tf.reduce_sum(tf.multiply(x1, x2), axis=1).numpy()
array([14, 77], dtype=int32)
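Note that batch_dot keeps a trailing dimension (shapes (2, 1) and (3, 1) above) while reduce_sum returns flat vectors; the values agree, only the shapes differ. Finally, batch_dot is most often used on batches of matrices. As a minimal sketch (not from the original post, assuming the same eager setup and illustrative shapes), contracting axis 2 of x with axis 1 of y performs one matrix multiply per batch element, which np.einsum confirms:

a = np.arange(12).reshape(2, 2, 3)
b = np.arange(12).reshape(2, 3, 2)
# Contract axis 2 of a with axis 1 of b: a per-sample matrix product, shape (2, 2, 2)
K.batch_dot(tf.convert_to_tensor(a), tf.convert_to_tensor(b), axes=(2, 1)).numpy()
np.einsum('bij,bjk->bik', a, b)  # same result: one 2x2 matrix per batch element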
