
生成式对抗网络(GAN)
生成式对抗网络(GAN, Generative Adversarial Networks )是一种深度学习模型,是近年来复杂分布上无监督学习最具前景的方法之一。模型通过框架中(至少)两个模块:生成模型(Generative Model)和判别模型(Discriminative Model)的互相博弈学习产生相当好的输出。原始 GAN 理论中,并不要求 G 和 D 都是神经网络,只需要是能拟合相应生成和判别的函数即可。但实用中一般均使用深度神经网络作为 G 和 D 。一个优秀的GAN应用需要有良好的训练方法,否则可能由于神经网络模型的自由性而导致输出不理想。

简单来说,GAN由俩个模块组成,生成模型以及判别模型,生成器将生成假的图片,并欺骗判别器让它误以为真。而判别器则会筛选生成器生成的图片,从中检测真伪。这个有趣的对抗概念是由Ian Goodfellow在2014年提出的。已经有很多学习GAN的资源,因此,为了避免重复,我将不再解释GAN。
StyleGAN(2.0)
NVIDIA在2018年发布了StyleGAN论文“基于GAN样式的体系结构”。该论文提出了GAN的新生成器体系结构,该体系结构使他们可以调节粗略细节(如头部形状)或精细的细节(例如,眼睛颜色)
StyleGAN还结合了Progressive GAN的想法,该网络最初以较低的分辨率(4x4)进行网络训练,然后在稳定后逐渐添加较大的层。这样,训练时间变得更快,并且训练也更加稳定。

生成动漫人物
我将使用Aaron Gokaslan预先训练好的Anime Style2,以便我们可以立即加载模型并生成动漫面孔。因此,打开您的Jupyter笔记本或Google Colab,让我们开始编码。
注意 : 需要科学上网,如果你不知道如何使用 COLAB ,请移步至此
首先,我们需要克隆styleGAN存储库。
$ git clone https://github.com/NVlabs/stylegan2.git #如果你使用的是COLAB,请用 ! 代替 $ ! git clone https://github.com/NVlabs/stylegan2.git
接下来,我们需要下载之前提到的,已训练完毕的模型,COLAB需要设置为GPU运行。

%tensorflow_version 1.x
import tensorflow as tf
# Make sure you use tensoflow version 1
print('Tensorflow version: {}'.format(tf.__version__) )
# Use '%' prefix in colab or run this in command line
%cd /content/stylegan2
import pretrained_networks
from google_drive_downloader import GoogleDriveDownloader as gdd
# Links to the pre-trained anime StyleGAN weights, you can copy the file to your own drive, if it's over the download limit
url = 'https://drive.google.com/open?id=1WNQELgHnaqMTq3TlrnDaVkyrAH8Zrjez'
model_id = url.replace('https://drive.google.com/open?id=', '')
network_pkl = '/content/models/model_%s.pkl' % model_id#(hashlib.md5(model_id.encode()).hexdigest())
gdd.download_file_from_google_drive(file_id=model_id,
dest_path=network_pkl)
# It returns 3 networks, we will be mainly using Gs
# _G = Instantaneous snapshot of the generator. Mainly useful for resuming a previous training run.
# _D = Instantaneous snapshot of the discriminator. Mainly useful for resuming a previous training run.
# Gs = Long-term average of the generator. Yields higher-quality results than the instantaneous snapshot.
_G, _D, Gs = pretrained_networks.load_networks(network_pkl)
填入并执行以上代码,即可完成对模型的下载。
现在,我们需要生成随机向量z,以用作生成器的输入。让我们创建一个方法,以从给定的种子生成潜在代码z
import numpy as np
def generate_zs_from_seeds(seeds):
zs = []
for seed_idx, seed in enumerate(seeds):
rnd = np.random.RandomState(seed)
z = rnd.randn(1, *Gs.input_shape[1:]) # [minibatch, component]
zs.append(z)
return zs
最后, 我们可以创建一个方法,该方法采用生成的随机向量z并生成图像。
import dnnlib
import dnnlib.tflib as tflib
import PIL.Image
from tqdm import tqdm
# Get tf noise variables, for the stochastic variation
noise_vars = [var for name, var in Gs.components.synthesis.vars.items() if name.startswith('noise')]
# Trunctation psi value needed for the truncation trick
def generate_images(zs, truncation_psi):
Gs_kwargs = dnnlib.EasyDict()
Gs_kwargs.output_transform = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
Gs_kwargs.randomize_noise = False
if not isinstance(truncation_psi, list):
truncation_psi = [truncation_psi] * len(zs)
imgs = []
for z_idx, z in tqdm(enumerate(zs)):
Gs_kwargs.truncation_psi = truncation_psi[z_idx]
noise_rnd = np.random.RandomState(1) # fix noise
tflib.set_vars({var: noise_rnd.randn(*var.shape.as_list()) for var in noise_vars}) # [height, width]
images = Gs.run(z, None, **Gs_kwargs) # [minibatch, height, width, channel]
imgs.append(PIL.Image.fromarray(images[0], 'RGB'))
# Return array of PIL.Image
return imgs
def generate_images_from_seeds(seeds, truncation_psi):
return generate_images(generate_zs_from_seeds(seeds), truncation_psi)
好了,接下来让我们生成一组图片尝试一下
# generate 9 random seeds seeds = np.random.randint(10000000, size=9) print(seeds) zs = generate_zs_from_seeds(seeds) imgs = generate_images(zs, 0.5)
#生成了9张图片,使用下标切换显示各个图片 imgs[0]

显示一张图片太麻烦了,我们可以绘制一个九宫格来显示所有图片
from math import ceil
def createImageGrid(images, scale=0.25, rows=1):
w,h = images[0].size
w = int(w*scale)
h = int(h*scale)
height = rows*h
cols = ceil(len(images) / rows)
width = cols*w
canvas = PIL.Image.new('RGBA', (width,height), 'white')
for i,img in enumerate(images):
img = img.resize((w,h), PIL.Image.ANTIALIAS)
canvas.paste(img, (w*(i % cols), h*(i // cols)))
return canvas
createImageGrid(imgs,rows=3)

我们还可以在z向量中获取俩个点,让其进行插值计算,使我们可以查看其过渡效果
#定义方法
def interpolate(zs, steps):
out = []
for i in range(len(zs)-1):
for index in range(steps):
fraction = index/float(steps)
out.append(zs[i+1]*fraction + zs[i]*(1-fraction))
return out
seeds = np.random.randint(10000000, size=2) print(seeds) zs = generate_zs_from_seeds(seeds) imgs = generate_images(interpolate(zs,7), 0.5) createImageGrid(imgs,rows=1)

现在,我们已经完成了插值。我们最终可以尝试在上面的缩略图中制作GIF。我们将使用moviepy库来创建视频或GIF文件。
import scipy
import moviepy.editor
grid_size = [3,3]
duration_sec = 5
smoothing_sec = 1.0
image_zoom = 1
fps = 15
random_seed = np.random.randint(0, 999)
num_frames = int(np.rint(duration_sec * fps))
random_state = np.random.RandomState(random_seed)
# Generate latent vectors
shape = [num_frames, np.prod(grid_size)] + Gs.input_shape[1:] # [frame, image, channel, component]
all_latents = random_state.randn(*shape).astype(np.float32)
all_latents = scipy.ndimage.gaussian_filter(all_latents, [smoothing_sec * fps] + [0] * len(Gs.input_shape), mode='wrap')
all_latents /= np.sqrt(np.mean(np.square(all_latents)))
def create_image_grid(images, grid_size=None):
assert images.ndim == 3 or images.ndim == 4
num, img_h, img_w, channels = images.shape
if grid_size is not None:
grid_w, grid_h = tuple(grid_size)
else:
grid_w = max(int(np.ceil(np.sqrt(num))), 1)
grid_h = max((num - 1) // grid_w + 1, 1)
grid = np.zeros([grid_h * img_h, grid_w * img_w, channels], dtype=images.dtype)
for idx in range(num):
x = (idx % grid_w) * img_w
y = (idx // grid_w) * img_h
grid[y : y + img_h, x : x + img_w] = images[idx]
return grid
# Frame generation func for moviepy.
def make_frame(t):
frame_idx = int(np.clip(np.round(t * fps), 0, num_frames - 1))
latents = all_latents[frame_idx]
fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
images = Gs.run(latents, None, truncation_psi=0.7,
randomize_noise=False, output_transform=fmt,
minibatch_size=16)
grid = create_image_grid(images, grid_size)
if image_zoom > 1:
grid = scipy.ndimage.zoom(grid, [image_zoom, image_zoom, 1], order=0)
if grid.shape[2] == 1:
grid = grid.repeat(3, 2) # grayscale => RGB
return grid
# Generate video.
video_clip = moviepy.editor.VideoClip(make_frame, duration=duration_sec)
# Use this if you want to generate .mp4 video instead
# 建议使用以下代码输出MP4,GIF文件将近有100MB大小
# video_clip.write_videofile('random_grid_%s.mp4' % random_seed, fps=fps, codec='libx264', bitrate='2M')
video_clip.write_gif('random_grid_%s.gif' % random_seed, fps=fps)
当你成功运行以上代码时,一个GIF或者MP4会成功的保存到你的COLAB目录中,手动查看即可。

恭喜,你已经成功使用STYLEGAN2生成新的动漫图片,你可以继续尝试自己的想法,调试以上代码的所有参数。
更进一步
- 已生成的图片浏览器
- 之后你可以尝试训练自己的模型,以让机器生成不同风格的图片
- 当然,你也可以在GITHUB上寻找其他的数据集(模型)
- 共享的 COLAB 工程
索引
- 文章整理自 Generating Anime Characters with StyleGAN2
- 翻译/整理/调试 BY Kamakoto
- COLAB

Comments | NOTHING