Style Transfer

a496knl

License: CC 4.0 BY-SA

Tags: python

Article link: https://blog.youkuaiyun.com/a496knl/article/details/145840110

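1. What is style transfer?

Style transfer combines the content of one image with the artistic style of another to produce a new image. For example, the content of a landscape photo can be rendered in the style of a famous artwork such as Van Gogh's "The Starry Night".

Applications: style transfer is commonly used in image generation, artistic creation, and augmented reality.
Goal: this article shows how to implement style transfer with PyTorch and the VGG19 model, and walks through the core code.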

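2. How style transfer works

This section explains how style transfer works and why the method is effective.

Content loss and style loss: the core idea is to optimize the generated image against two losses:
- Content loss: keeps the content structure of the generated image close to that of the content image.
- Style loss: pushes the style elements of the generated image (such as color tone and texture) toward those of the style image.
The role of VGG19: VGG19 is a widely used pretrained convolutional neural network. We use its intermediate layers to extract content features and style features from the images, and compute both losses on those features.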
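
As a rough illustration of the two losses, the sketch below computes a content term (mean squared error between feature maps) and a style term (mean squared error between Gram matrices) on random tensors that stand in for VGG19 activations. The tensor shapes and the helper name `toy_gram` are assumptions made for this example only. It also shuffles the spatial positions of a feature map to show that the Gram matrix, and therefore the style term, does not depend on where a texture appears in the image.

```python
import torch
import torch.nn.functional as F


def toy_gram(feat):
    # feat: (batch, channels, height, width) feature map
    b, c, h, w = feat.size()
    flat = feat.reshape(b * c, h * w)        # one row per channel
    # Channel-by-channel correlations, normalized by the number of elements
    return flat @ flat.t() / (b * c * h * w)


torch.manual_seed(0)
generated = torch.rand(1, 8, 16, 16)   # stand-in: features of the generated image
content = torch.rand(1, 8, 16, 16)     # stand-in: features of the content image
style = torch.rand(1, 8, 16, 16)       # stand-in: features of the style image

# Content term: compares activations position by position
content_term = F.mse_loss(generated, content)
# Style term: compares channel statistics only
style_term = F.mse_loss(toy_gram(generated), toy_gram(style))

# Shuffle all spatial positions with the same permutation for every channel
perm = torch.randperm(16 * 16)
shuffled = generated.reshape(1, 8, -1)[:, :, perm].reshape(1, 8, 16, 16)
same_gram = torch.allclose(toy_gram(generated), toy_gram(shuffled), atol=1e-5)

print(toy_gram(generated).shape)   # torch.Size([8, 8])
print(same_gram)
```

Because the Gram matrix sums over all spatial positions, it keeps information about which channels fire together but discards the layout, which is why it is a reasonable summary of style rather than content.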

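3. Environment setup and dependencies

To run the code you need:

Python 3
PyTorch and torchvision
matplotlib
Pillow (the maintained fork of PIL, the Python Imaging Library)
NumPy (installed automatically as a dependency of matplotlib)

pip install torch torchvision matplotlib Pillow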
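
A quick way to confirm that the packages are importable before running the script is sketched below. The helper name `check_dependencies` is hypothetical; it relies only on the standard library's `importlib.util.find_spec`, so it runs even when some packages are missing.

```python
import importlib.util


def check_dependencies(module_names):
    # Map each top-level module name to True if it can be found on the import path
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}


# Note: Pillow is imported under the module name "PIL"
status = check_dependencies(["torch", "torchvision", "matplotlib", "PIL"])
for name, found in status.items():
    print(f"{name}: {'ok' if found else 'missing'}")
```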

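4. Code implementation

This is the core of the article. The code is explained step by step in the following subsections.

4.1 Setting the compute device

Style transfer runs many forward and backward passes through VGG19, so a GPU speeds it up considerably. The line below selects CUDA when it is available and falls back to the CPU otherwise.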

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
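
The selected device only takes effect once both the model and its inputs are moved to it. A minimal sketch, assuming a tiny convolution layer and a random tensor as stand-ins for VGG19 and a real image:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in layer and input; both must live on the same device,
# otherwise the forward pass raises a device-mismatch error
layer = nn.Conv2d(3, 4, kernel_size=3, padding=1).to(device)
image = torch.rand(1, 3, 32, 32).to(device)

output = layer(image)
print(output.device.type == device.type)  # True
print(tuple(output.shape))                # (1, 4, 32, 32)
```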

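4.2 Defining the style and content layers

The "style layers" and "content layers" are specific convolutional layers of VGG19, identified by their index in the model's features module. Activations at the style layers are used to compute the style loss, and activations at the content layer are used to compute the content loss.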

style_layers = ['0', '5', '10', '19', '28']
content_layers = ['21']
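
To see which layers these indices refer to, the sketch below rebuilds the index-to-name mapping in plain Python. The configuration list and the helper `vgg19_layer_names` are written for this example and are assumed to mirror the layout of torchvision's `vgg19().features` (each convolution is followed by a ReLU, and `"M"` marks a max-pooling layer); the `convX_Y` names follow the usual VGG naming convention.

```python
# Numbers are conv output channels, "M" is a max-pooling layer.
# Assumption: this mirrors the layer order of torchvision's vgg19().features.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def vgg19_layer_names(cfg=VGG19_CFG):
    names = {}
    index, block, conv = 0, 1, 1
    for item in cfg:
        if item == "M":
            names[str(index)] = f"pool{block}"
            index += 1
            block += 1      # a pooling layer closes the current block
            conv = 1
        else:
            names[str(index)] = f"conv{block}_{conv}"
            names[str(index + 1)] = f"relu{block}_{conv}"
            index += 2
            conv += 1
    return names


names = vgg19_layer_names()
style_layers = ['0', '5', '10', '19', '28']
content_layers = ['21']
print([names[i] for i in style_layers])    # ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']
print([names[i] for i in content_layers])  # ['conv4_2']
```

The style layers are the first convolution of each of the five blocks, so they cover textures from fine to coarse, while the single content layer sits deeper in the network where activations reflect object layout more than pixel detail.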

4.3 Complete script

The full script below puts these pieces together.

import torch
import torch.optim as optim
import torchvision
from torchvision import transforms
from PIL import Image
import matplotlib
matplotlib.use('Agg')  # non-interactive backend: figures are saved without a display
import matplotlib.pyplot as plt
import numpy as np
import torch.nn.functional as F

# Set the compute device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# ==================== Feature layers ====================
style_layers = ['0', '5', '10', '19', '28']  # style layers (indices into vgg19.features)
content_layers = ['21']  # content layer


# ==================== Image loading ====================
def image_loader(image_name, max_size=400):
    loader = transforms.Compose([
        transforms.Resize((max_size, max_size)),  # fixed square size; aspect ratio is not preserved
        transforms.ToTensor(),
        # ImageNet mean and std, matching the pretrained VGG19 weights
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])
    image = Image.open(image_name).convert("RGB")
    image = loader(image).unsqueeze(0)  # add a batch dimension
    return image.to(device, torch.float)


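# ==================== Model loading ====================
def load_vgg():
    weights = torchvision.models.VGG19_Weights.IMAGENET1K_V1
    # Only the convolutional part (.features) is needed for feature extraction
    vgg = torchvision.models.vgg19(weights=weights).features
    # Freeze the network: only the generated image is optimized
    for param in vgg.parameters():
        param.requires_grad_(False)
    return vgg.to(device).eval()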


vgg = load_vgg()


# ==================== Feature extraction ====================
def get_features(image, model, layers):
    # Return a dict {layer name: activation}. Keying by name matters because
    # activations are produced in network order ('0', '5', '10', '19', '21', '28'),
    # not in the order of `layers`, so positional indexing would mix up
    # the content layer '21' and the style layer '28'.
    features = {}
    x = image
    for name, layer in model.named_children():
        x = layer(x)
        if name in layers:
            features[name] = x
    return features


# ==================== Loss functions ====================
def content_loss(target, content):
    # Mean squared error between feature maps of the same layer
    return F.mse_loss(target, content)


def gram_matrix(input_tensor):
    # a = batch size, b = channels, (c, d) = spatial size
    a, b, c, d = input_tensor.size()
    features = input_tensor.view(a * b, c * d)
    G = torch.mm(features, features.t())  # channel-by-channel correlations
    return G.div(a * b * c * d)  # normalize by the number of elements


def style_loss(style, target):
    return F.mse_loss(gram_matrix(style), gram_matrix(target))


# ==================== Optimization loop ====================
def run_style_transfer(content_img, style_img, model,
                       style_layers, content_layers,
                       num_steps=300,
                       style_weight=1e6,
                       content_weight=1):
    # Start from a copy of the content image and optimize its pixels
    target = content_img.clone().requires_grad_(True)
    optimizer = optim.Adam([target], lr=0.02)

    # The style and content features never change, so extract them once
    with torch.no_grad():
        style_features = get_features(style_img, model, style_layers)
        content_features = get_features(content_img, model, content_layers)

    for step in range(num_steps):
        # Adam does not need a closure; passing one keeps the loop usable
        # with optimizers that require it, such as LBFGS
        def closure():
            optimizer.zero_grad()
            target_features = get_features(target, model, style_layers + content_layers)

            # Content loss: compare the same layer of the target and content image
            content_loss_value = 0
            for name in content_layers:
                content_loss_value += content_loss(
                    target_features[name],
                    content_features[name]
                )

            # Style loss: sum over all style layers
            style_loss_value = 0
            for name in style_layers:
                style_loss_value += style_loss(
                    target_features[name],
                    style_features[name]
                )

            total_loss = content_weight * content_loss_value + style_weight * style_loss_value
            total_loss.backward()

            if step % 50 == 0:
                print(f"Step {step}, Total Loss: {total_loss.item():.2f}")
            return total_loss

        optimizer.step(closure)

    return target


# ==================== Main ====================
if __name__ == "__main__":
    # Load the images; the style image is resized to match the content image
    content_img = image_loader("img_1.png")
    style_img = image_loader("img.png", max_size=content_img.shape[2])

    # Run style transfer (the layer lists are passed in explicitly)
    output_img = run_style_transfer(
        content_img,
        style_img,
        vgg,
        style_layers,  # style layers
        content_layers,  # content layers
        num_steps=300
    )


    # Visualize and save the result
    def imshow(tensor, save_path="output.jpg"):
        image = tensor.detach().cpu().clone().squeeze(0)  # drop the batch dimension
        image = image.permute(1, 2, 0).numpy()  # CHW -> HWC for matplotlib
        # Undo the ImageNet normalization applied in image_loader
        image = image * np.array([0.229, 0.224, 0.225]) + np.array([0.485, 0.456, 0.406])
        image = np.clip(image, 0, 1)
        plt.imshow(image)
        plt.axis('off')
        plt.savefig(save_path, bbox_inches='tight', pad_inches=0)
        plt.close()


    imshow(output_img)
    print("Style transfer finished; result saved as output.jpg")
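
Saving through `plt.savefig` writes the figure at matplotlib's figure resolution rather than at the tensor's own pixel size. If the exact pixel dimensions matter, one possible alternative is to convert the tensor to a Pillow image directly. The sketch below is an assumption-laden variant, not part of the original script: the helper name `tensor_to_pil` is hypothetical, and a zero tensor stands in for the optimized image.

```python
import numpy as np
import torch
from PIL import Image

# ImageNet statistics used by image_loader, shaped to broadcast over (3, H, W)
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)


def tensor_to_pil(tensor):
    # tensor: (1, 3, H, W), normalized with the ImageNet mean and std
    image = tensor.detach().cpu().squeeze(0)
    image = (image * STD + MEAN).clamp(0, 1)            # undo the normalization
    array = (image * 255).round().byte().permute(1, 2, 0).numpy()
    return Image.fromarray(np.ascontiguousarray(array))  # H x W x 3, uint8


fake_output = torch.zeros(1, 3, 40, 60)   # stand-in for the optimized image
picture = tensor_to_pil(fake_output)
print(picture.size)   # (60, 40): Pillow reports (width, height)
```

Calling `picture.save("output.png")` would then write a file with the same width and height as the tensor.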
