
# 1. Batch Normalization

## 1.1 Preparing the data

```python
import torch

BatchNorm2d = torch.nn.BatchNorm2d
test = torch.rand((1, 3, 2, 2))
```

## 1.2 Inspecting the data

```python
test[0, :, :, :]
```
Out:
```
tensor([[[6.7027e-01, 5.3149e-01],
         [4.6797e-01, 3.1028e-02]],

        [[4.1371e-01, 1.2022e-04],
         [2.3150e-01, 2.5120e-01]],

        [[5.2258e-01, 9.6350e-02],
         [4.6467e-01, 3.6091e-01]]])
```

## 1.3 Applying BN

```python
BatchNorm2d(3)(test)
```
Out:
```
tensor([[[[ 1.0252e+00,  4.4465e-01],
          [ 1.7894e-01, -1.6488e+00]],

         [[ 1.2858e+00, -1.5194e+00],
          [ 4.9993e-02,  1.8359e-01]],

         [[ 9.8744e-01, -1.6194e+00],
          [ 6.3328e-01, -1.3302e-03]]]], grad_fn=<NativeBatchNormBackward>)
```

Normalizing channel 0 by hand reproduces the first 2×2 block:
```python
(test[:, 0, :, :] - test[:, 0, :, :].numpy().mean()) / test[:, 0, :, :].numpy().std()
```
Out:
```
tensor([[ 1.0253,  0.4447],
        [ 0.1790, -1.6489]])
```

With a batch of 10 samples, BN still normalizes each channel using statistics pooled across the whole batch:

```python
test = torch.rand((10, 3, 2, 2))
BatchNorm2d(3)(test)[0, 0, :, :]
```
Out:
```
tensor([[ 1.6257,  0.5479],
        [-1.3761,  0.8000]], grad_fn=<SliceBackward>)
```
```python
res = (test[:, 0, :, :] - test[:, 0, :, :].numpy().mean()) / test[:, 0, :, :].numpy().std()
res[0, :, :]
```
Out:
```
tensor([[ 1.6258,  0.5479],
        [-1.3762,  0.8000]])
```

The last-digit discrepancies come from BatchNorm's `eps` (1e-5 by default), which is added to the variance before the square root; the manual computation above omits it.
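As a cross-check, the per-channel comparison above can be sketched in pure torch for all three channels at once, including the `eps` term (a minimal sketch; the variable names are mine):

```python
import torch

# Verify that BatchNorm2d (default gamma=1, beta=0, eps=1e-5, training mode)
# matches a manual per-channel normalization over dims (0, 2, 3).
torch.manual_seed(0)
x = torch.rand((10, 3, 2, 2))

bn = torch.nn.BatchNorm2d(3)
out = bn(x)

# BN pools statistics over the batch and spatial dims for each channel,
# using the biased (unbiased=False) variance.
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + bn.eps)

print(torch.allclose(out, manual, atol=1e-5))  # True
```

With `eps` included, the two results agree to floating-point precision rather than differing in the last digit.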


# 2. Layer Normalization

LN (Layer Normalization) also standardizes the data, but not across samples: the statistics it uses are computed entirely within each individual sample.

The usage example from the `nn.LayerNorm` documentation shows the different ways to specify the normalized shape:

```python
>>> import torch
>>> from torch import nn
>>> input = torch.randn(20, 5, 10, 10)
>>> # With Learnable Parameters
>>> m = nn.LayerNorm(input.size()[1:])
>>> # Without Learnable Parameters
>>> m = nn.LayerNorm(input.size()[1:], elementwise_affine=False)
>>> # Normalize over last two dimensions
>>> m = nn.LayerNorm([10, 10])
>>> # Normalize over last dimension of size 10
>>> m = nn.LayerNorm(10)
>>> # Activating the module
>>> output = m(input)
```

## 2.1 Inspecting the data

```python
squence = torch.rand((2, 3, 10))
squence
```
Out:
```
tensor([[[0.1151, 0.9571, 0.5986, 0.4692, 0.7029, 0.5159, 0.4494, 0.9428,
          0.9714, 0.9938],
         [0.6456, 0.5997, 0.7542, 0.7266, 0.7021, 0.2900, 0.7044, 0.1627,
          0.3725, 0.9454],
         [0.9398, 0.3861, 0.5276, 0.8783, 0.8319, 0.1181, 0.6185, 0.9689,
          0.6393, 0.7770]],

        [[0.2786, 0.8901, 0.7228, 0.3740, 0.4186, 0.6857, 0.8438, 0.4762,
          0.4106, 0.4823],
         [0.5199, 0.7644, 0.2987, 0.3745, 0.6000, 0.7266, 0.0854, 0.1954,
          0.5413, 0.1656],
         [0.5487, 0.2655, 0.9256, 0.7352, 0.4081, 0.8017, 0.7130, 0.5364,
          0.5441, 0.8483]]])
```

## 2.2 Normalizing over one dimension

```python
LN = torch.nn.LayerNorm
LN(10)(squence)
```
Out:
```
tensor([[[-1.9932,  1.0227, -0.2616, -0.7251,  0.1120, -0.5578, -0.7961,
           0.9712,  1.0739,  1.1540],
         [ 0.2423,  0.0411,  0.7180,  0.5971,  0.4899, -1.3160,  0.5000,
          -1.8739, -0.9546,  1.5561],
         [ 1.0619, -1.1060, -0.5519,  0.8214,  0.6396, -2.1551, -0.1960,
           1.1760, -0.1146,  0.4246]],

        [[-1.3968,  1.6568,  0.8218, -0.9200, -0.6974,  0.6363,  1.4258,
          -0.4100, -0.7372, -0.3793],
         [ 0.4093,  1.4885, -0.5673, -0.2324,  0.7629,  1.3218, -1.5084,
          -1.0233,  0.5037, -1.1548],
         [-0.4265, -1.8654,  1.4884,  0.5210, -1.1411,  0.8590,  0.4084,
          -0.4893, -0.4500,  1.0955]]], grad_fn=<NativeLayerNormBackward>)
```

Normalizing one row by hand reproduces the corresponding output row:
```python
(squence[0, 0, :] - squence[0, 0, :].numpy().mean()) / squence[0, 0, :].numpy().std()
```
Out:
```
tensor([-1.9934,  1.0227, -0.2617, -0.7252,  0.1120, -0.5578, -0.7961,
         0.9713,  1.0740,  1.1540])
```
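The same check can be run for every row at once in pure torch, this time including LayerNorm's `eps` (a minimal sketch; the variable names are mine):

```python
import torch

# Verify that LayerNorm(10) (default weight=1, bias=0, eps=1e-5) matches
# a manual normalization over the last dimension only.
torch.manual_seed(0)
x = torch.rand((2, 3, 10))

ln = torch.nn.LayerNorm(10)
out = ln(x)

# Each length-10 row gets its own mean and (biased) variance; no statistics
# are shared across rows or across samples.
mean = x.mean(dim=-1, keepdim=True)
var = x.var(dim=-1, unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + ln.eps)

print(torch.allclose(out, manual, atol=1e-5))  # True
```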

## 2.3 Normalizing over two dimensions

```python
squence2 = torch.rand((2, 2, 7))
LN([2, 7])(squence2)
```
Out:
```
tensor([[[-0.1525, -0.3791,  1.9005,  0.9187, -1.2562, -0.9069,  0.4788],
         [-0.9507, -0.5147, -1.1867,  1.9212,  0.4739, -0.4837,  0.1374]],

        [[-0.4490, -1.2532,  1.2571, -0.7904, -0.7550, -1.0003,  0.2586],
         [ 1.2673, -0.8106, -0.2374,  1.4318,  0.0237,  1.8428, -0.7854]]],
       grad_fn=<NativeLayerNormBackward>)
```
```python
(squence2[0, :, :] - squence2[0, :, :].numpy().mean()) / squence2[0, :, :].numpy().std()
```
Out:
```
tensor([[-0.1525, -0.3791,  1.9006,  0.9188, -1.2563, -0.9070,  0.4788],
        [-0.9508, -0.5148, -1.1867,  1.9214,  0.4739, -0.4838,  0.1374]])
```

Here the mean and standard deviation are taken over the last two dimensions jointly, i.e. over all 14 values of one sample; as before, the last-digit differences come from LayerNorm's `eps`.
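To make the BN-vs-LN contrast from the start of this section concrete: BN shares statistics across the batch, so perturbing one sample changes every sample's output, while LN computes statistics per sample. A minimal sketch (variable names are mine):

```python
import torch

torch.manual_seed(0)
x = torch.rand((4, 3, 2, 2))
x2 = x.clone()
x2[0] += 10.0  # perturb only the first sample

bn = torch.nn.BatchNorm2d(3)
ln = torch.nn.LayerNorm([3, 2, 2])

# BN: sample 1's output moves because the batch statistics changed.
bn_changed = not torch.allclose(bn(x)[1], bn(x2)[1])
# LN: sample 1's output is untouched; its statistics are its own.
ln_same = torch.allclose(ln(x)[1], ln(x2)[1])

print(bn_changed, ln_same)  # True True
```

This per-sample independence is why LN is the usual choice for variable-length sequence models, where batch statistics are unreliable.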
