WGAN-GP (Wasserstein GAN with Gradient Penalty) is an improved generative adversarial network (GAN) that enforces the Lipschitz constraint required by the Wasserstein distance through a gradient-penalty term, which improves training stability and the quality of generated samples. When applying it to continuous-valued data such as time series, you need to decide how to set the batch size, the feature number, and the sequence length. These choices affect model performance, training speed, and memory usage.
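For reference, the critic objective that gives WGAN-GP its name (Gulrajani et al., 2017) is

$$
L_D = \underbrace{\mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\big[D(\tilde{x})\big] - \mathbb{E}_{x \sim \mathbb{P}_r}\big[D(x)\big]}_{\text{Wasserstein term}} + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
$$

where $\hat{x}$ is sampled uniformly along straight lines between pairs of real and generated samples, and the penalty weight $\lambda$ (typically 10; the `gp_weight` in the code further below) replaces the weight clipping of the original WGAN.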
### Batch Size
- **Definition**: the number of samples used in one training iteration.
- **Impact**: a larger batch size gives more stable gradient estimates but increases memory requirements and can hurt generalization; a smaller batch size does the opposite: it adds variance to training but can help escape poor local minima.
- **How to choose**: for time-series data, pick a range that fits your hardware (e.g., GPU memory). Common choices are 32, 64, or 128. If resources allow, start from a moderate value and adjust based on experimental results.
### Feature Number
- **Definition**: the dimensionality of the data at each time step.
- **Impact**: this depends on the application. In finance, e.g. stock-price prediction, there may be only a few features such as open and close prices; in more complex settings there can be hundreds or even thousands.
- **How to choose**: determined by the actual problem. Make sure the selected features are relevant to the task, and apply feature engineering where possible to improve model performance.
### Sequence Length
- **Definition**: the number of time steps fed to the model per input window.
- **Impact**: longer sequences can capture longer-range dependencies but increase model complexity and the risk of overfitting; shorter sequences simplify the model but may fail to represent the relevant temporal structure.
- **How to choose**: again, this depends on the business scenario. A common approach is to start from a reasonable guess (say, a week's or a month's worth of data) and refine it via cross-validation; domain knowledge also helps set a sensible length. The windowing sketch below shows how these three parameters determine the shape of the input tensor.
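To make the three parameters concrete, here is a minimal sketch (using a synthetic array; in practice you would load your own series) that slices a multivariate time series into overlapping windows of shape `(num_windows, seq_length, feature_num)`, which can then be split into batches of `batch_size`:

```python
import numpy as np

def make_windows(series: np.ndarray, seq_length: int, stride: int = 1) -> np.ndarray:
    """Slice a (timesteps, feature_num) array into overlapping windows.

    Returns an array of shape (num_windows, seq_length, feature_num).
    """
    num_windows = (len(series) - seq_length) // stride + 1
    return np.stack([series[i * stride : i * stride + seq_length]
                     for i in range(num_windows)])

# Synthetic example: 1000 time steps, 5 features per time step
raw = np.random.randn(1000, 5).astype("float32")
windows = make_windows(raw, seq_length=100)
print(windows.shape)  # (901, 100, 5)
```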
In summary, these three parameters usually have to be tuned together for the problem at hand. Common techniques include grid search, random search, and Bayesian optimization (a minimal random-search sketch follows below). Given how WGAN-GP behaves, it is also important to keep the critic and the generator balanced; you may additionally need to tune the learning rates, the number of critic updates per generator step, and similar factors to get good results.
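As an example of such tuning, here is a minimal random-search sketch; `train_and_score` is a hypothetical stand-in for a full WGAN-GP training run followed by an evaluation (e.g., some distance between real and generated windows):

```python
import random

search_space = {
    "batch_size": [32, 64, 128],
    "seq_length": [50, 100, 200],
    "learning_rate": [5e-5, 1e-4, 2e-4],
}

def train_and_score(cfg):
    # Hypothetical stand-in: train a WGAN-GP with this configuration and
    # return a validation metric (lower is better). Replaced by a random
    # number here so the sketch runs end to end.
    return random.random()

best_cfg, best_score = None, float("inf")
for _ in range(20):  # 20 random trials
    cfg = {key: random.choice(values) for key, values in search_space.items()}
    score = train_and_score(cfg)
    if score < best_score:
        best_cfg, best_score = cfg, score
print(best_cfg, best_score)
```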
To implement WGAN-GP in TensorFlow for continuous-valued data such as time series, you need to build the generator and critic (discriminator) networks and implement the gradient penalty. Below is a simplified code skeleton that shows where batch size, feature number, and sequence length come in.
First, make sure TensorFlow is installed. If not, install it with pip:
```bash
pip install tensorflow
```
Demo code for the communications use case (an end-to-end autoencoder over a noisy channel, where a GAN generator is trained to imitate the channel so the transmitter can be trained through it):
```python
# !pip install tensorflow==2.0.0
import numpy as np
# %matplotlib inline
import matplotlib.pyplot as plt
import warnings
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=FutureWarning)
    import tensorflow as tf
import os
print(tf.__version__)
from tensorflow import keras
import time
import pandas as pd
import sys
assert sys.version_info >= (3, 5)
import matplotlib as mpl
mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)
from scipy import special
from tensorflow.keras import layers
np.random.seed(42)
tf.random.set_seed(42)
k = 4 # Number of information bits per message, i.e., M=2**k
M = 2 ** k
n = 2 # Number of real channel uses per message
# k = int(np.log2(M))
# n = 2
print(M)
SNR = 7
time_to_train_w_gan = 0
gen_learning_rate = 0.0001
disc_learning_rate = 0.0001 # 0.0001
randN_initial = keras.initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None)
ones_initial = keras.initializers.GlorotNormal()
# Shared layer definitions (some are unused leftovers kept from the notebook)
EncIn = tf.keras.layers.Input(shape=(M,))
e1 = tf.keras.layers.Dense(2 * n, activation=None)
e2 = tf.keras.layers.Lambda(lambda x: tf.reshape(x, shape=[-1, int(n / 2), 2]))
EncOut = tf.keras.layers.Lambda(lambda x: x / tf.sqrt(2 * tf.reduce_mean(tf.square(x))))  # average power normalization
GenIn = tf.keras.layers.Lambda(lambda x: tf.reshape(x, (tf.shape(x)[0], -1)))
DecIn = tf.keras.layers.Lambda(lambda x: tf.reshape(x, shape=[-1, int(n / 2), 2]))
d1 = tf.keras.layers.Lambda(lambda x: tf.reshape(x, shape=[-1, n]))
d2 = tf.keras.layers.Dense(M, activation='relu')
DecOut = tf.keras.layers.Dense(M, activation='softmax')
# Custom functions / layers without weights
norm_layer = keras.layers.Lambda(lambda x: tf.divide(x, tf.sqrt(2 * tf.reduce_mean(tf.square(x)))))
shape_layer = keras.layers.Lambda(lambda x: tf.reshape(x, shape=[-1, 2, n]))
shape_layer2 = keras.layers.Lambda(lambda x: tf.reshape(x, shape=[-1, n]))
channel_layer = keras.layers.Lambda(lambda x: x + tf.random.normal(tf.shape(x), mean=0.0, stddev=noise_std))  # noise_std is defined below, before first use
def EbNo2Sigma(ebnodb):
    '''Convert Eb/No in dB to noise standard deviation.'''
    ebno = 10 ** (ebnodb / 10)
    return 1 / np.sqrt(2 * (2 * k / n) * ebno)

def EbNo_to_noise(ebnodb):
    '''Transform Eb/No [dB] to noise standard deviation.'''
    ebno = 10 ** (ebnodb / 10)
    noise_std = 1 / np.sqrt(2 * (2 * k / n) * ebno)
    return noise_std

def real_channel(x, noise_std):
    '''Black-box AWGN channel.'''
    return x + tf.random.normal(tf.shape(x), mean=0.0, stddev=noise_std)

def rayleigh_channel(x, noise_std):
    return x + tf.sqrt(tf.square(tf.random.normal(tf.shape(x), mean=0.0, stddev=noise_std)) +
                       tf.square(tf.random.normal(tf.shape(x), mean=0.0, stddev=noise_std)))
# Uniform U(-3;3)
# return x + tf.random_uniform(tf.shape(x), minval=-2, maxval=2)
def B_Ber(input_msg, msg):
    '''Calculate the batch block error rate from one-hot messages.'''
    pred_error = tf.not_equal(tf.argmax(msg, 1), tf.argmax(input_msg, 1))
    bber = tf.reduce_mean(tf.cast(pred_error, tf.float32))
    return bber

def random_sample(batch_size=32):
    '''Draw a batch of random message indices in [0, M).'''
    msg = np.random.randint(M, size=batch_size)
    return msg

def B_Ber_m(input_msg, msg):
    '''Calculate the batch block error rate from message indices.'''
    pred_error = tf.not_equal(input_msg, tf.argmax(msg, 1))
    bber = tf.reduce_mean(tf.cast(pred_error, tf.float32))
    return bber

def SNR_to_noise(snrdb):
    '''Transform SNR [dB] to noise standard deviation.'''
    snr = 10 ** (snrdb / 10)
    noise_std = 1 / np.sqrt(2 * snr)
    return noise_std
noise_std = EbNo2Sigma(SNR)
print(EbNo2Sigma(SNR))
print(EbNo_to_noise(SNR))
def test_encoding(M=16, n=1):
    '''Scatter-plot the learned 2-D constellation of the encoder.'''
    inp = np.arange(0, M)
    coding = encoder.predict(inp)
    fig = plt.figure(figsize=(4, 4))
    plt.plot(coding[:, 0], coding[:, 1], "b.")
    plt.xlabel("$x_1$", fontsize=18)
    plt.ylabel("$x_2$", fontsize=18, rotation=0)
    plt.grid(True)
    plt.gca().set_ylim(-2, 2)
    plt.gca().set_xlim(-2, 2)
    plt.show()
def get_generator(n):
    '''Generator: maps an encoded symbol (plus internal noise) to a fake channel output.'''
    input1 = tf.keras.layers.Input(shape=(n,))
    x1 = tf.keras.layers.Dense(n, kernel_initializer=randN_initial)(input1)
    # Noise source with the same shape as the input, wrapped in a Lambda so
    # that it is sampled afresh on every forward pass
    input2 = tf.keras.layers.Lambda(lambda t: tf.random.normal(shape=tf.shape(t)))(input1)
    x2 = tf.keras.layers.Dense(n, kernel_initializer=randN_initial)(input2)
    merged = tf.keras.layers.Concatenate(1)([x1, x2])
    h1 = tf.keras.layers.Dense(64, use_bias=True, activation='relu')(merged)
    h2 = tf.keras.layers.Dense(64, use_bias=True, kernel_initializer=ones_initial, activation='relu')(h1)
    out = tf.keras.layers.Dense(n, use_bias=True, kernel_initializer=ones_initial, activation='linear')(h2)
    generator = tf.keras.models.Model(inputs=[input1], outputs=out)
    return generator

def get_discriminator(n):
    '''Critic: sees (channel output, channel input) pairs of width 2*n.
    NB: a textbook WGAN critic would use a linear output; this code keeps a sigmoid.'''
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(32, use_bias=True, activation='relu', input_shape=(2 * n,)))
    model.add(tf.keras.layers.Dense(32, use_bias=True, activation='relu'))
    model.add(tf.keras.layers.Dense(1, use_bias=False, activation='sigmoid'))
    return model
generator = get_generator(n)
discriminator = get_discriminator(n)
def get_gan_encoder(M):
    model = keras.models.Sequential([
        keras.layers.Embedding(M, M, embeddings_initializer='glorot_normal'),
        keras.layers.Dense(M * 2, activation="elu"),
        keras.layers.Dense(n, kernel_initializer=ones_initial, activation=None),
        e2,       # reshape to [-1, n/2, 2]
        EncOut,   # average power normalization
        GenIn])   # flatten back to [-1, n]
    return model

def get_gan_decoder(M):
    model = keras.models.Sequential([
        keras.layers.Input(shape=(n,)),
        keras.layers.Dense(M * 2, activation="elu"),
        keras.layers.Dense(M, kernel_initializer=ones_initial, activation="softmax")
    ])
    return model
# Optimizers (RMSprop with default settings; tuned learning rates also work)
w_gen_optimizer = tf.keras.optimizers.RMSprop()
w_disc_optimizer = tf.keras.optimizers.RMSprop()
encoder = get_gan_encoder(M)
decoder = get_gan_decoder(M)
@tf.function
def train(batch_size):
    gen_gradients, disc_gradients = compute_gradients(batch_size)
    apply_gradients(gen_gradients, disc_gradients)

@tf.function
def compute_gradients(batch_size):
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        disc_loss, gen_loss = compute_loss(batch_size)
    disc_gradients = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    gen_gradients = gen_tape.gradient(gen_loss, generator.trainable_variables)
    return gen_gradients, disc_gradients

def compute_loss(batch_size):
    '''One pass through the network; returns (disc_loss, gen_loss).'''
    gradient_penalty_weight = 0.1
    # Sample messages with tf.random so that a fresh batch is drawn on every
    # traced call (np.random would be frozen at trace time inside tf.function)
    m = tf.random.uniform([batch_size], minval=0, maxval=M, dtype=tf.int32)
    r = encoder(m)
    # The critic sees (channel output, channel input) pairs of width 2*n
    real_data = tf.concat(values=[real_channel(r, noise_std), r], axis=1)
    fake_data = tf.concat(values=[generator(r), r], axis=1)
    logits_x = discriminator(real_data)
    logits_x_gen = discriminator(fake_data)
    # Gradient penalty on interpolates between real and fake pairs
    d_regularizer = gradient_penalty(real_data, fake_data)
    # Wasserstein-style losses (note the sign convention: the critic pushes
    # fake scores up and real scores down; the generator pulls fake scores down)
    disc_loss = (tf.reduce_mean(logits_x) - tf.reduce_mean(logits_x_gen)
                 + d_regularizer * gradient_penalty_weight)
    gen_loss = tf.reduce_mean(logits_x_gen)
    return disc_loss, gen_loss

def apply_gradients(gen_gradients, disc_gradients):
    w_gen_optimizer.apply_gradients(zip(gen_gradients, generator.trainable_variables))
    w_disc_optimizer.apply_gradients(zip(disc_gradients, discriminator.trainable_variables))

def gradient_penalty(x, x_gen):
    # One interpolation coefficient per sample; x and x_gen are 2-D here, so
    # epsilon must be shaped [batch, 1] to broadcast correctly
    epsilon = tf.random.uniform([tf.shape(x)[0], 1], 0.0, 1.0)
    x_hat = epsilon * x + (1 - epsilon) * x_gen
    with tf.GradientTape() as t:
        t.watch(x_hat)
        d_hat = discriminator(x_hat)
    gradients = t.gradient(d_hat, x_hat)
    ddx = tf.sqrt(tf.reduce_sum(gradients ** 2, axis=1))
    d_regularizer = tf.reduce_mean((ddx - 1.0) ** 2)
    return d_regularizer
def generate_evaluation_data(batch_size=100):
    # Randomly sample input data ("fake" AE messages)
    x = tf.random.normal((batch_size, n), dtype=tf.dtypes.float32)
    # Average power normalization (not required if a standard normal is used)
    x = x / tf.sqrt(2 * tf.reduce_mean(tf.square(x)))
    fake_eval_data = generator([x])
    real_eval_data = real_channel(x, noise_std)
    inputs = x
    return real_eval_data, fake_eval_data, inputs

def get_evaluation_data(evaluation_per_epochs=100):
    real_eval_data = []
    fake_eval_data = []
    inputs = []
    for i in range(evaluation_per_epochs):
        data = generate_evaluation_data()
        real_eval_data.append(data[0])
        fake_eval_data.append(data[1])
        inputs.append(data[2])
    return real_eval_data, fake_eval_data, inputs
def test_eval(real_eval_data, fake_eval_data, inputs):
    hist_range = 1
    fake_output_hist = np.mean(fake_eval_data, axis=0)
    real_output_hist = np.mean(real_eval_data, axis=0)
    inputs_hist = np.mean(inputs, axis=0)
    fake_output_hist1 = np.reshape(fake_output_hist, [-1, ])
    real_output_hist1 = np.reshape(real_output_hist, [-1, ])
    plt.hist(fake_output_hist1, bins=100, range=(-hist_range, hist_range), density=True, histtype='step')
    plt.hist(real_output_hist1, bins=100, range=(-hist_range, hist_range), density=True, histtype='step')
    plt.title("noise distribution")
    plt.legend(["generator", "target"])
    plt.show()
losses = pd.DataFrame(columns=['disc_loss', 'gen_loss'])
def gen_train(n_epochs, batch_size):
    generator.trainable = True
    start = time.time()
    for epoch in range(n_epochs):
        x = tf.random.normal((batch_size, n), dtype=tf.dtypes.float32)
        x_samp = x / tf.sqrt(2 * tf.reduce_mean(tf.square(x)))
        train(batch_size)
        loss = []
        # Evaluate on holdout data every 500 epochs
        if epoch % 500 == 0:
            fake_c = generator(x)
            real_eval_data, fake_eval_data, inputs = get_evaluation_data()
            test_eval(real_eval_data, fake_eval_data, inputs)
            tf.print(fake_c[0])
        loss.append(compute_loss(batch_size))
        losses.loc[len(losses)] = np.mean(loss, axis=0)
        if epoch % 100 == 0:
            print("Epoch: {} | disc_loss: {} | gen_loss: {}".format(
                epoch, losses.disc_loss.values[-1], losses.gen_loss.values[-1]))
    tf.saved_model.save(generator, '/tmp/saved_model/')
    plt.plot(losses.disc_loss.values)
    generator.trainable = False

def creating_and_train_gan(epochs, n_steps, batch_size, SNR_level, n):  # optional learning rates
    # NOTE: train_gan is not defined in this snippet; gen_train above is what is actually used
    generator.trainable = True
    train_gan(epochs, n_steps, batch_size, SNR_level)
def gan_Test_AE(data):
    '''Calculate block error rates over a range of SNRs (AWGN channel).'''
    snr_range = np.linspace(0, 15, 31)
    bber_vec = [None] * len(snr_range)
    for db in range(len(snr_range)):
        noise_std = EbNo_to_noise(snr_range[db])
        code_word = encoder(data)
        rcvd_word = code_word + tf.random.normal(tf.shape(code_word), mean=0.0, stddev=noise_std)
        decoded_msg = decoder(rcvd_word)
        bber_vec[db] = B_Ber_m(data, decoded_msg)
        if (db % 6 == 0) and (db > 0):
            print(f'Progress: {db} of {len(snr_range) - 1} parts')
    return snr_range, bber_vec

def Test_AE_rayleigh(data):
    '''Calculate block error rates over a range of SNRs (Rayleigh-like channel).'''
    snr_range = np.linspace(0, 15, 31)
    bber_vec = [None] * len(snr_range)
    for db in range(len(snr_range)):
        noise_std = EbNo_to_noise(snr_range[db])
        code_word = encoder(data)
        rcvd_word = rayleigh_channel(code_word, noise_std)
        decoded_msg = decoder(rcvd_word)
        bber_vec[db] = B_Ber_m(data, decoded_msg)
        if (db % 6 == 0) and (db > 0):
            print(f'Progress: {db} of {len(snr_range) - 1} parts')
    return snr_range, bber_vec
# Receiver training
decoder = get_gan_decoder(M)
encoder = get_gan_encoder(M)
channel_layer = keras.layers.Lambda(lambda x: real_channel(x, noise_std))
# Decoder Training
data = random_sample(100)
# The decoder ends in a softmax, so the losses must not treat its outputs as logits
loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
loss_object_a = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
decoder_optimizer = tf.keras.optimizers.Adam()
def decoder_loss(decoder, x, y):
    x_ = encoder(x)
    y1 = channel_layer(x_)  # real channel; swap in generator(x_) to train against the GAN channel instead
    y_ = decoder(y1)
    return loss_object(y_true=y, y_pred=y_)

@tf.function
def decoder_grad(model, inputs, targets):
    with tf.GradientTape() as decoder_tape:
        loss_value = decoder_loss(model, inputs, targets)
    return loss_value, decoder_tape.gradient(loss_value, model.trainable_variables)
# Keep results for plotting
train_loss_results = []
train_accuracy_results = []
def decoder_training(num_epochs, batch_size):
    for epoch in range(num_epochs):
        epoch_loss_avg = tf.keras.metrics.Mean()
        epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
        data = random_sample(batch_size)
        # Pass the batch explicitly; a tf.function reading a global numpy
        # array would keep using the batch captured at trace time
        loss_value = decoder_step_training(data)
        epoch_loss_avg(loss_value)  # add current batch loss
        # Compare predicted label to actual label
        epoch_accuracy(data, decoder(channel_layer(encoder(data))))
        train_loss_results.append(epoch_loss_avg.result())
        train_accuracy_results.append(epoch_accuracy.result())
        if epoch % 500 == 0:
            print("Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(
                epoch, float(epoch_loss_avg.result()), float(epoch_accuracy.result())))

@tf.function
def decoder_step_training(data):
    loss_value, decoder_grads = decoder_grad(decoder, data, data)
    decoder_optimizer.apply_gradients(zip(decoder_grads, decoder.trainable_variables))
    return loss_value
# Transmitter Training
encoder_optimizer = tf.keras.optimizers.Adam()
def encoder_loss(encoder, x, y):
    x_ = encoder(x)
    y1 = generator(x_)  # the GAN generator stands in for the channel here
    y_ = decoder(y1)
    return loss_object_a(y_true=y, y_pred=y_)

@tf.function
def encoder_grad(model, inputs, targets):
    with tf.GradientTape() as encoder_tape:
        loss_value = encoder_loss(model, inputs, targets)
    return loss_value, encoder_tape.gradient(loss_value, model.trainable_variables)

# One sanity-check optimization step
loss_value, encoder_grads = encoder_grad(encoder, data, data)
print("Step: {}, Initial Loss: {}".format(encoder_optimizer.iterations.numpy(), loss_value.numpy()))
encoder_optimizer.apply_gradients(zip(encoder_grads, encoder.trainable_variables))
print("Step: {}, Loss: {}".format(encoder_optimizer.iterations.numpy(), encoder_loss(encoder, data, data).numpy()))
# Keep results for plotting
train_loss_results = []
train_accuracy_results = []
def encoder_training(num_epochs, batch_size=400):
    for epoch in range(num_epochs):
        epoch_loss_avg = tf.keras.metrics.Mean()
        epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
        data = random_sample(batch_size)
        loss_value = encoder_step_training(data)
        epoch_loss_avg(loss_value)  # add current batch loss
        # Compare predicted label to actual label
        epoch_accuracy(data, decoder(generator(encoder(data))))
        if epoch % 500 == 0:
            print("Epoch {:03d}: Loss: {:.3f}, Accuracy: {:.3%}".format(
                epoch, float(epoch_loss_avg.result()), float(epoch_accuracy.result())))

@tf.function
def encoder_step_training(data):
    loss_value, encoder_grads = encoder_grad(encoder, data, data)
    encoder_optimizer.apply_gradients(zip(encoder_grads, encoder.trainable_variables))
    return loss_value
# %%time
start = time.time()
for i in range(3):
    test_encoding(M, n)
    decoder_training(2001, 100)
    gen_train(2001, 100)
    encoder_training(1001, 100)
    test_encoding(M, n)
    decoder_training(1001, 100)
    gen_train(1001, 100)
    print("Round", i)
time_to_train_gan = time.time() - start
tf.print('Time for the training is {} sec'.format(time.time() - start))
# %%time
generator.trainable = False
encoder.trainable = True
decoder.trainable = True
gan_AE = tf.keras.models.Sequential([encoder, generator, decoder])
data = random_sample(10000000)
start = time.time()
gan_AE.compile(optimizer=keras.optimizers.Nadam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = gan_AE.fit(data, data, batch_size=500, epochs=10)
test_encoding(M, n)
# AE training
# test msg sequence for normal encoding
N_test = 500000
test_msg = np.random.randint(M, size=N_test)
# Approximate 16-QAM symbol error rate
def SIXT_QAM_sim(ebno):
    return (3.0 / 2) * special.erfc(np.sqrt((4.0 / 10) * 10. ** (ebno / 10)))

# BER curve of the trained AE over the AWGN channel
# (this call is assumed; the original snippet used gan_bber_data without defining it)
gan_bber_data = gan_Test_AE(test_msg)

ebnodbs = np.linspace(0, 15, 16)
fig = plt.figure(figsize=(8, 5))
plt.semilogy(gan_bber_data[0], gan_bber_data[1], '^-')
plt.semilogy(ebnodbs, SIXT_QAM_sim(ebnodbs), '*-')
plt.gca().set_ylim(1e-5, 1)
plt.gca().set_xlim(0, 15)
plt.ylabel("Batch Symbol Error Rate", fontsize=14, rotation=90)
plt.xlabel("SNR [dB]", fontsize=18)
plt.legend(['AE with GAN', '16QAM'], prop={'size': 14}, loc='upper right')
plt.grid(True, which="both")
# Gauss vs. Rayleigh comparison
# (this call is assumed; the original snippet used bber_data_rayleigh without defining it)
bber_data_rayleigh = Test_AE_rayleigh(test_msg)

fig = plt.figure(figsize=(8, 5))
plt.semilogy(gan_bber_data[0], gan_bber_data[1], 'o-')
plt.semilogy(bber_data_rayleigh[0], bber_data_rayleigh[1], '+-')
plt.gca().set_ylim(1e-5, 1)
plt.gca().set_xlim(0, 15)
plt.ylabel("Batch Symbol Error Rate", fontsize=14, rotation=90)
plt.xlabel("SNR [dB]", fontsize=18)
plt.legend(['BLER for Gauss', 'BLER for Rayleigh'], prop={'size': 14}, loc='upper right')
plt.grid(True, which="both")
# plt.savefig('home/ben/Downloads/MineRayleigh.eps', format='eps')
```
Next, here is a basic WGAN-GP implementation for time-series data (with the caveat that this part may not be entirely reliable; it is modeled on a convolutional neural network):
```python
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
# Hyperparameters
batch_size = 64
seq_length = 100     # sequence length (time steps per window)
feature_num = 5      # number of features per time step
latent_dim = 128     # latent-space dimension
epochs = 10000
n_critic = 5         # critic updates per generator update
gp_weight = 10.0     # gradient-penalty weight
# Generator: latent vector -> (seq_length, feature_num) window.
# The upsampling path goes 25 -> 50 -> 100 time steps, so the Dense/Reshape
# below assume seq_length = 100; adjust them together if you change seq_length.
def build_generator():
    model = models.Sequential()
    model.add(layers.Dense(25 * 128, activation='relu', input_dim=latent_dim))
    model.add(layers.Reshape((25, 128)))
    model.add(layers.Conv1DTranspose(128, 4, strides=2, padding='same'))  # -> (50, 128)
    model.add(layers.ReLU())
    model.add(layers.Conv1DTranspose(64, 4, strides=2, padding='same'))   # -> (100, 64)
    model.add(layers.ReLU())
    model.add(layers.Conv1DTranspose(feature_num, 4, strides=1, padding='same', activation='tanh'))  # -> (100, feature_num)
    return model

# Critic (discriminator): note there is no sigmoid on the output, as WGAN-GP requires
def build_discriminator():
    model = models.Sequential()
    model.add(layers.Conv1D(64, 4, strides=2, padding='same', input_shape=[seq_length, feature_num]))
    model.add(layers.LeakyReLU(alpha=0.2))
    model.add(layers.Dropout(0.3))
    model.add(layers.Conv1D(128, 4, strides=2, padding='same'))
    model.add(layers.LeakyReLU(alpha=0.2))
    model.add(layers.Dropout(0.3))
    model.add(layers.Flatten())
    model.add(layers.Dense(1))  # linear output = critic score
    return model
generator = build_generator()
discriminator = build_discriminator()
# Losses and optimizers. WGAN-GP uses Wasserstein (critic-score) losses rather
# than binary cross-entropy: the critic maximizes score(real) - score(fake),
# and the generator maximizes score(fake).

# Critic loss (the gradient penalty is added in train_step)
def discriminator_loss(real_output, fake_output):
    return tf.reduce_mean(fake_output) - tf.reduce_mean(real_output)

# Generator loss
def generator_loss(fake_output):
    return -tf.reduce_mean(fake_output)

generator_optimizer = tf.keras.optimizers.Adam(1e-4, beta_1=0.5, beta_2=0.9)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4, beta_1=0.5, beta_2=0.9)
# Training step
@tf.function
def train_step(real_data):
    # Use the actual batch size so the last (possibly smaller) batch works too
    bsz = tf.shape(real_data)[0]
    for _ in range(n_critic):
        noise = tf.random.normal([bsz, latent_dim])
        with tf.GradientTape() as disc_tape:
            generated_data = generator(noise, training=True)
            real_output = discriminator(real_data, training=True)
            fake_output = discriminator(generated_data, training=True)
            disc_loss = discriminator_loss(real_output, fake_output)
            # Gradient penalty on straight-line interpolates between real and fake
            alpha = tf.random.uniform(shape=[bsz, 1, 1], minval=0., maxval=1.)
            interpolated = real_data + alpha * (generated_data - real_data)
            with tf.GradientTape() as gp_tape:
                gp_tape.watch(interpolated)
                pred = discriminator(interpolated, training=True)
            grads = gp_tape.gradient(pred, [interpolated])[0]
            norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2]))
            gradient_penalty = tf.reduce_mean((norm - 1.) ** 2)
            disc_loss += gp_weight * gradient_penalty
        gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
        discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

    noise = tf.random.normal([bsz, latent_dim])
    with tf.GradientTape() as gen_tape:
        generated_data = generator(noise, training=True)
        fake_output = discriminator(generated_data, training=True)
        gen_loss = generator_loss(fake_output)
    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
# Prepare the data
# This assumes `data` is an array of shape [num_samples, seq_length, feature_num]
# data = ...

# Training loop
for epoch in range(epochs):
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        train_step(batch)
    # Log or print the losses here if desired
```
This example provides a basic skeleton; you can adapt the network architecture and hyperparameters to your specific needs. Note that real applications usually require additional engineering to get good performance, such as data preprocessing and learning-rate decay schedules.
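As a quick smoke test of the skeleton above (a sketch using made-up sine-wave data, since `data = ...` is left unspecified; it reuses `train_step`, `generator`, `batch_size`, and `latent_dim` from the block above), you could run a couple of epochs like this:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy dataset: 256 windows of noisy multi-frequency sine waves,
# shaped [num_samples, seq_length, feature_num] = [256, 100, 5]
t = np.linspace(0, 4 * np.pi, 100, dtype="float32")
clean = np.stack([
    np.stack([np.sin((f + 1) * t + 2 * np.pi * np.random.rand()) for f in range(5)], axis=-1)
    for _ in range(256)
])
data = (clean + 0.05 * np.random.randn(*clean.shape)).astype("float32")

# Two short epochs, just to confirm the shapes and the train step work
for epoch in range(2):
    for i in range(0, len(data), batch_size):
        train_step(tf.constant(data[i:i + batch_size]))
    print("epoch", epoch, "finished")

# Draw a few samples from the trained generator
samples = generator(tf.random.normal([4, latent_dim]), training=False)
print(samples.shape)  # (4, 100, 5)
```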