tf.cast /tf.reduce_mean/tf.random_normal

本文介绍TensorFlow中tf.cast函数用于数据类型转换的方法,包括布尔型与浮点型之间的转换。同时讲解tf.reduce_mean函数求平均值的应用场景,并演示如何使用tf.random_normal生成指定形状的正态分布随机数。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

tf.cast(x, dtype, name=None) 
将x的数据格式转化成dtype.例如,原来x的数据格式是bool, 

那么将其转化成float以后,就能够将其转化成0和1的序列。反之也可以

a = tf.Variable([1,0,0,1,1])
b = tf.cast(a,dtype=tf.bool)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
print(sess.run(b))
#[ True False False  True  True]

tfreduce_mean()主要是用来求均值,在机器学习中可以用来求准确度

x = np.array([[1.,2.,3.],[4.,5.,6.]])
sess = tf.Session()
mean_none = sess.run(tf.reduce_mean(x))#求全部的均值
mean_0 = sess.run(tf.reduce_mean(x, 0))#求列的均值
mean_1 = sess.run(tf.reduce_mean(x, 1))#求行的均值

tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

从正态分布中输出随机值。
参数:

  •     shape: 输出张量的形状,必选
  •     mean: 正态分布的均值,默认为0
  •     stddev: 正态分布的标准差,默认为1.0
  •     dtype: 输出的类型,默认为tf.float32
  •     seed: 随机数种子,是一个整数,当设置之后,每次生成的随机数都一样
  •     name: 操作的名称
随机生成一个两行三列的数组。标准差为1

  1. w1 = tf.Variable(tf.random_normal([23], stddev=1, seed=1))  


参考:https://blog.youkuaiyun.com/luoganttcc/article/details/70315538

对下面代码进行改错 import tensorflow.compat.v1 as tf tf.compat.v1.disable_eager_execution() from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) num_classes = 10 input_size = 784 hidden_units_size = 30 batch_size = 100 training_iterations = 10000 X = tf.placeholder(tf.float32, [None, input_size]) Y = tf.placeholder(tf.float32, [None, num_classes]) W1 = tf.Variable(tf.random_normal([input_size, hidden_units_size],stddev = 0.1)) B1 = tf.Variable(tf.constant([hidden_units_size])) W2 = tf.Variable(tf.random_normal ([hidden_units_size,num_classes],stddev = 0.1)) B2 = tf.Variable(tf.constant(0.1), [num_classes]) hidden_opt = tf.matmul(X, W1) + B1 hidden_opt = tf.nn.relu(hidden_opt) final_opt = tf.matmul(hidden_opt, W2) + B2 final_opt = tf.nn.relu(final_opt) loss1 = tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=final_opt) loss = tf.reduce_mean(loss1) opt = tf.train.GradientDescentOptimizer(0.05).minimize(loss) init = tf.global_variables_initializer() correct_prediction = tf.equal(tf.argmax(Y,1), tf.argmax(final_opt,1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float')) sess = tf.Session() sess.run(init) for i in range(training_iterations): batch = mnist.train.next_batch(batch_size) batch_input = batch[0] batch_labels = batch[1] train_loss = sess.run([opt, loss], feed_dict={X: batch_input, Y: batch_labels}) if i % 100 == 0: train_accuracy = accuracy.eval (session = sess, feed_dict={X: batch_input, Y: batch_labels}) print("step %d, training accuracy %g" % (i, train_accuracy))
04-01
import tensorflow as tf from sklearn import datasets from matplotlib import pyplot as plt import numpy as np # 导入数据,分别为输入特征和标签 x_data = datasets.load_iris().data y_data = datasets.load_iris().target # 随机打乱数据(因为原始数据是顺序的,顺序不打乱会影响准确率) # seed: 随机数种子,是一个整数,当设置之后,每次生成的随机数都一样(为方便教学,以保每位同学结果一致) np.random.seed(116) # 使用相同的seed,保证输入特征和标签一一对应 np.random.shuffle(x_data) np.random.seed(116) np.random.shuffle(y_data) tf.random.set_seed(116) # 将打乱后的数据集分割为训练集和测试集,训练集为前120行,测试集为后30行 x_train = x_data[:-30] y_train = y_data[:-30] x_test = x_data[-30:] y_test = y_data[-30:] # 转换x的数据类型,否则后面矩阵相乘时会因数据类型不一致报错 x_train = tf.cast(x_train, tf.float32) x_test = tf.cast(x_test, tf.float32) # from_tensor_slices函数使输入特征和标签值一一对应。(把数据集分批次,每个批次batch组数据) train_db = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(32) test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32) # 生成神经网络的参数,4个输入特征故,输入层为4个输入节点;因为3分类,故输出层为3个神经元 # 用tf.Variable()标记参数可训练 # 使用seed使每次生成的随机数相同(方便教学,使大家结果都一致,在现实使用时不写seed) w1 = tf.Variable(tf.random.truncated_normal([4, 3], stddev=0.1, seed=1)) b1 = tf.Variable(tf.random.truncated_normal([3], stddev=0.1, seed=1)) lr = 0.1 # 学习率为0.1 train_loss_results = [] # 将每轮的loss记录在此列表中,为后续画loss曲线提供数据 test_acc = [] # 将每轮的acc记录在此列表中,为后续画acc曲线提供数据 epoch = 500 # 循环500轮 loss_all = 0 # 每轮分4个step,loss_all记录四个step生成的4个loss的和 # 训练部分 for epoch in range(epoch): #数据集级别的循环,每个epoch循环一次数据集 for step, (x_train, y_train) in enumerate(train_db): #batch级别的循环 ,每个step循环一个batch with tf.GradientTape() as tape: # with结构记录梯度信息 y = tf.matmul(x_train, w1) + b1 # 神经网络乘加运算 y = tf.nn.softmax(y) # 使输出y符合概率分布(此操作后与独热码同量级,可相减求loss) y_ = tf.one_hot(y_train, depth=3) # 将标签值转换为独热码格式,方便计算loss和accuracy loss = tf.reduce_mean(tf.square(y_ - y)) # 采用均方误差损失函数mse = mean(sum(y-out)^2) loss_all += loss.numpy() # 将每个step计算出的loss累加,为后续求loss平均值提供数据,这样计算的loss更准确 # 计算loss对各个参数的梯度 grads = tape.gradient(loss, [w1, b1]) # 实现梯度更新 w1 = w1 - lr * w1_grad b = b - lr * b_grad w1.assign_sub(lr * grads[0]) # 参数w1自更新 b1.assign_sub(lr * grads[1]) # 参数b自更新 # 每个epoch,打印loss信息 print("Epoch {}, loss: {}".format(epoch, loss_all/4)) train_loss_results.append(loss_all / 4) # 将4个step的loss求平均记录在此变量中 loss_all = 0 # loss_all归零,为记录下一个epoch的loss做准备 # 测试部分 # total_correct为预测对的样本个数, total_number为测试的总样本数,将这两个变量都初始化为0 total_correct, total_number = 0, 0 for x_test, y_test in test_db: # 使用更新后的参数进行预测 y = tf.matmul(x_test, w1) + b1 y = tf.nn.softmax(y) pred = tf.argmax(y, axis=1) # 返回y中最大值的索引,即预测的分类 # 将pred转换为y_test的数据类型 pred = tf.cast(pred, dtype=y_test.dtype) # 若分类正确,则correct=1,否则为0,将bool型的结果转换为int型 correct = tf.cast(tf.equal(pred, y_test), dtype=tf.int32) # 将每个batch的correct数加起来 correct = tf.reduce_sum(correct) # 将所有batch中的correct数加起来 total_correct += int(correct) # total_number为测试的总样本数,也就是x_test的行数,shape[0]返回变量的行数 total_number += x_test.shape[0] # 总的准确率等于total_correct/total_number acc = total_correct / total_number test_acc.append(acc) print("Test_acc:", acc) print("--------------------------") 帮我看看这段代码,我主要是想知道“correct = tf.cast(tf.equal(pred, y_test), dtype=tf.int32) correct = tf.reduce_sum(correct)”这段的correct是怎么做到累积的,不是每次循环都变成0或1了吗
03-08
# Copyright 2018 DeepMind Technologies Limited # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # https://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # ============================================================================ """Code defining LEO inner loop. See "Meta-Learning with Latent Embedding Optimization" by Rusu et al. (https://arxiv.org/pdf/1807.05960.pdf). """ from __future__ import absolute_import from __future__ import division from __future__ import print_function import numpy as np from six.moves import range from six.moves import zip import sonnet as snt import tensorflow as tf import tensorflow_probability as tfp import data as data_module def get_orthogonality_regularizer(orthogonality_penalty_weight): """Returns the orthogonality regularizer.""" def orthogonality(weight): """Calculates the layer-wise penalty encouraging orthogonality.""" with tf.name_scope(None, "orthogonality", [weight]) as name: w2 = tf.matmul(weight, weight, transpose_b=True) wn = tf.norm(weight, ord=2, axis=1, keepdims=True) + 1e-32 correlation_matrix = w2 / tf.matmul(wn, wn, transpose_b=True) matrix_size = correlation_matrix.get_shape().as_list()[0] base_dtype = weight.dtype.base_dtype identity = tf.eye(matrix_size, dtype=base_dtype) weight_corr = tf.reduce_mean( tf.squared_difference(correlation_matrix, identity)) return tf.multiply( tf.cast(orthogonality_penalty_weight, base_dtype), weight_corr, name=name) return orthogonality class LEO(snt.AbstractModule): """Sonnet module implementing the inner loop of LEO.""" def __init__(self, config=None, use_64bits_dtype=True, name="leo"): super(LEO, self).__init__(name=name) self._float_dtype = tf.float64 if use_64bits_dtype else tf.float32 self._int_dtype = tf.int64 if use_64bits_dtype else tf.int32 self._inner_unroll_length = config["inner_unroll_length"] self._finetuning_unroll_length = config["finetuning_unroll_length"] self._inner_lr_init = config["inner_lr_init"] self._finetuning_lr_init = config["finetuning_lr_init"] self._num_latents = config["num_latents"] self._dropout_rate = config["dropout_rate"] self._kl_weight = config["kl_weight"] # beta self._encoder_penalty_weight = config["encoder_penalty_weight"] # gamma self._l2_penalty_weight = config["l2_penalty_weight"] # lambda_1 # lambda_2 self._orthogonality_penalty_weight = config["orthogonality_penalty_weight"] assert self._inner_unroll_length > 0, ("Positive unroll length is necessary" " to create the graph") def _build(self, data, is_meta_training=True): """Connects the LEO module to the graph, creating the variables. Args: data: A data_module.ProblemInstance constaining Tensors with the following shapes: - tr_input: (N, K, dim) - tr_output: (N, K, 1) - tr_info: (N, K) - val_input: (N, K_valid, dim) - val_output: (N, K_valid, 1) - val_info: (N, K_valid) where N is the number of classes (as in N-way) and K and the and K_valid are numbers of training and validation examples within a problem instance correspondingly (as in K-shot), and dim is the dimensionality of the embedding. is_meta_training: A boolean describing whether we run in the training mode. Returns: Tensor with the inner validation loss of LEO (include both adaptation in the latent space and finetuning). """ if isinstance(data, list): data = data_module.ProblemInstance(*data) self.is_meta_training = is_meta_training self.save_problem_instance_stats(data.tr_input) latents, kl = self.forward_encoder(data) tr_loss, adapted_classifier_weights, encoder_penalty = self.leo_inner_loop( data, latents) val_loss, val_accuracy = self.finetuning_inner_loop( data, tr_loss, adapted_classifier_weights) val_loss += self._kl_weight * kl val_loss += self._encoder_penalty_weight * encoder_penalty # The l2 regularization is is already added to the graph when constructing # the snt.Linear modules. We pass the orthogonality regularizer separately, # because it is not used in self.grads_and_vars. regularization_penalty = ( self._l2_regularization + self._decoder_orthogonality_reg) batch_val_loss = tf.reduce_mean(val_loss) batch_val_accuracy = tf.reduce_mean(val_accuracy) return batch_val_loss + regularization_penalty, batch_val_accuracy @snt.reuse_variables def leo_inner_loop(self, data, latents): with tf.variable_scope("leo_inner"): inner_lr = tf.get_variable( "lr", [1, 1, self._num_latents], dtype=self._float_dtype, initializer=tf.constant_initializer(self._inner_lr_init)) starting_latents = latents loss, _ = self.forward_decoder(data, latents) for _ in range(self._inner_unroll_length): loss_grad = tf.gradients(loss, latents) # dLtrain/dz latents -= inner_lr * loss_grad[0] loss, classifier_weights = self.forward_decoder(data, latents) if self.is_meta_training: encoder_penalty = tf.losses.mean_squared_error( labels=tf.stop_gradient(latents), predictions=starting_latents) encoder_penalty = tf.cast(encoder_penalty, self._float_dtype) else: encoder_penalty = tf.constant(0., self._float_dtype) return loss, classifier_weights, encoder_penalty @snt.reuse_variables def finetuning_inner_loop(self, data, leo_loss, classifier_weights): tr_loss = leo_loss with tf.variable_scope("finetuning"): finetuning_lr = tf.get_variable( "lr", [1, 1, self.embedding_dim], dtype=self._float_dtype, initializer=tf.constant_initializer(self._finetuning_lr_init)) for _ in range(self._finetuning_unroll_length): loss_grad = tf.gradients(tr_loss, classifier_weights) classifier_weights -= finetuning_lr * loss_grad[0] tr_loss, _ = self.calculate_inner_loss(data.tr_input, data.tr_output, classifier_weights) val_loss, val_accuracy = self.calculate_inner_loss( data.val_input, data.val_output, classifier_weights) return val_loss, val_accuracy @snt.reuse_variables def forward_encoder(self, data): encoder_outputs = self.encoder(data.tr_input) relation_network_outputs = self.relation_network(encoder_outputs) latent_dist_params = self.average_codes_per_class(relation_network_outputs) latents, kl = self.possibly_sample(latent_dist_params) return latents, kl @snt.reuse_variables def forward_decoder(self, data, latents): weights_dist_params = self.decoder(latents) # Default to glorot_initialization and not stddev=1. fan_in = self.embedding_dim.value fan_out = self.num_classes.value stddev_offset = np.sqrt(2. / (fan_out + fan_in)) classifier_weights, _ = self.possibly_sample(weights_dist_params, stddev_offset=stddev_offset) tr_loss, _ = self.calculate_inner_loss(data.tr_input, data.tr_output, classifier_weights) return tr_loss, classifier_weights @snt.reuse_variables def encoder(self, inputs): with tf.variable_scope("encoder"): after_dropout = tf.nn.dropout(inputs, rate=self.dropout_rate) regularizer = tf.contrib.layers.l2_regularizer(self._l2_penalty_weight) initializer = tf.initializers.glorot_uniform(dtype=self._float_dtype) encoder_module = snt.Linear( self._num_latents, use_bias=False, regularizers={"w": regularizer}, initializers={"w": initializer}, ) outputs = snt.BatchApply(encoder_module)(after_dropout) return outputs @snt.reuse_variables def relation_network(self, inputs): with tf.variable_scope("relation_network"): regularizer = tf.contrib.layers.l2_regularizer(self._l2_penalty_weight) initializer = tf.initializers.glorot_uniform(dtype=self._float_dtype) relation_network_module = snt.nets.MLP( [2 * self._num_latents] * 3, use_bias=False, regularizers={"w": regularizer}, initializers={"w": initializer}, ) total_num_examples = self.num_examples_per_class*self.num_classes inputs = tf.reshape(inputs, [total_num_examples, self._num_latents]) left = tf.tile(tf.expand_dims(inputs, 1), [1, total_num_examples, 1]) right = tf.tile(tf.expand_dims(inputs, 0), [total_num_examples, 1, 1]) concat_codes = tf.concat([left, right], axis=-1) outputs = snt.BatchApply(relation_network_module)(concat_codes) outputs = tf.reduce_mean(outputs, axis=1) # 2 * latents, because we are returning means and variances of a Gaussian outputs = tf.reshape(outputs, [self.num_classes, self.num_examples_per_class, 2 * self._num_latents]) return outputs @snt.reuse_variables def decoder(self, inputs): with tf.variable_scope("decoder"): l2_regularizer = tf.contrib.layers.l2_regularizer(self._l2_penalty_weight) orthogonality_reg = get_orthogonality_regularizer( self._orthogonality_penalty_weight) initializer = tf.initializers.glorot_uniform(dtype=self._float_dtype) # 2 * embedding_dim, because we are returning means and variances decoder_module = snt.Linear( 2 * self.embedding_dim, use_bias=False, regularizers={"w": l2_regularizer}, initializers={"w": initializer}, ) outputs = snt.BatchApply(decoder_module)(inputs) self._orthogonality_reg = orthogonality_reg(decoder_module.w) return outputs def average_codes_per_class(self, codes): codes = tf.reduce_mean(codes, axis=1, keep_dims=True) # K dimension # Keep the shape (N, K, *) codes = tf.tile(codes, [1, self.num_examples_per_class, 1]) return codes def possibly_sample(self, distribution_params, stddev_offset=0.): means, unnormalized_stddev = tf.split(distribution_params, 2, axis=-1) stddev = tf.exp(unnormalized_stddev) stddev -= (1. - stddev_offset) stddev = tf.maximum(stddev, 1e-10) distribution = tfp.distributions.Normal(loc=means, scale=stddev) if not self.is_meta_training: return means, tf.constant(0., dtype=self._float_dtype) samples = distribution.sample() kl_divergence = self.kl_divergence(samples, distribution) return samples, kl_divergence def kl_divergence(self, samples, normal_distribution): random_prior = tfp.distributions.Normal( loc=tf.zeros_like(samples), scale=tf.ones_like(samples)) kl = tf.reduce_mean( normal_distribution.log_prob(samples) - random_prior.log_prob(samples)) return kl def predict(self, inputs, weights): after_dropout = tf.nn.dropout(inputs, rate=self.dropout_rate) # This is 3-dimensional equivalent of a matrix product, where we sum over # the last (embedding_dim) dimension. We get [N, K, N, K] tensor as output. per_image_predictions = tf.einsum("ijk,lmk->ijlm", after_dropout, weights) # Predictions have shape [N, K, N]: for each image ([N, K] of them), what # is the probability of a given class (N)? predictions = tf.reduce_mean(per_image_predictions, axis=-1) return predictions def calculate_inner_loss(self, inputs, true_outputs, classifier_weights): model_outputs = self.predict(inputs, classifier_weights) model_predictions = tf.argmax( model_outputs, -1, output_type=self._int_dtype) accuracy = tf.contrib.metrics.accuracy(model_predictions, tf.squeeze(true_outputs, axis=-1)) return self.loss_fn(model_outputs, true_outputs), accuracy def save_problem_instance_stats(self, instance): num_classes, num_examples_per_class, embedding_dim = instance.get_shape() if hasattr(self, "num_classes"): assert self.num_classes == num_classes, ( "Given different number of classes (N in N-way) in consecutive runs.") if hasattr(self, "num_examples_per_class"): assert self.num_examples_per_class == num_examples_per_class, ( "Given different number of examples (K in K-shot) in consecutive" "runs.") if hasattr(self, "embedding_dim"): assert self.embedding_dim == embedding_dim, ( "Given different embedding dimension in consecutive runs.") self.num_classes = num_classes self.num_examples_per_class = num_examples_per_class self.embedding_dim = embedding_dim @property def dropout_rate(self): return self._dropout_rate if self.is_meta_training else 0.0 def loss_fn(self, model_outputs, original_classes): original_classes = tf.squeeze(original_classes, axis=-1) # Tensorflow doesn't handle second order gradients of a sparse_softmax yet. one_hot_outputs = tf.one_hot(original_classes, depth=self.num_classes) return tf.nn.softmax_cross_entropy_with_logits_v2( labels=one_hot_outputs, logits=model_outputs) def grads_and_vars(self, metatrain_loss): """Computes gradients of metatrain_loss, avoiding NaN. Uses a fixed penalty of 1e-4 to enforce only the l2 regularization (and not minimize the loss) when metatrain_loss or any of its gradients with respect to trainable_vars are NaN. In practice, this approach pulls the variables back into a feasible region of the space when the loss or its gradients are not defined. Args: metatrain_loss: A tensor with the LEO meta-training loss. Returns: A tuple with: metatrain_gradients: A list of gradient tensors. metatrain_variables: A list of variables for this LEO model. """ metatrain_variables = self.trainable_variables metatrain_gradients = tf.gradients(metatrain_loss, metatrain_variables) nan_loss_or_grad = tf.logical_or( tf.is_nan(metatrain_loss), tf.reduce_any([tf.reduce_any(tf.is_nan(g)) for g in metatrain_gradients])) regularization_penalty = ( 1e-4 / self._l2_penalty_weight * self._l2_regularization) zero_or_regularization_gradients = [ g if g is not None else tf.zeros_like(v) for v, g in zip(tf.gradients(regularization_penalty, metatrain_variables), metatrain_variables)] metatrain_gradients = tf.cond(nan_loss_or_grad, lambda: zero_or_regularization_gradients, lambda: metatrain_gradients, strict=True) return metatrain_gradients, metatrain_variables @property def _l2_regularization(self): return tf.cast( tf.reduce_sum(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)), dtype=self._float_dtype) @property def _decoder_orthogonality_reg(self): return self._orthogonality_reg 中解码器将潜在代码映射为分类器参数,然后将分类器参数用于更新分类器,再通过更新后的分类器求损失,这些对应哪段代码?
最新发布
05-15
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值