CENTER:&size(30){''Transfer Learning''};
#br
----
#contents
----
#br
- Code: https://github.com/haru-256/comparison
- Performance comparison experiment between Azure and the local machine in tlab 1-602
#br
* Machine specifications [#o1ca66aa]
[[cuda compute capability:https://developer.nvidia.com/cu...
[[How to check machine specs from the command line on Linux:h...
** tortoise7 [#qfdc0d69]
[[m/2018/yohei/config#w702bc5c]]
- CPU: Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz
#pre{{
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
$ cudnn_version
CUDNN_VERSION: 7005
}}
- How to check the cuDNN version: http://stmind.hatenablog.com...
** rinshpc01 [#m733d198]
- CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
#pre{{
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
$ cudnn_version
CUDNN_VERSION: 7005
}}
* ResNet [#b1875e1a]
- [[''[Kaiming He 2015/12]'' Deep Residual Learning for I...
-- This is Kaiming He again; our first encounter since He initialization.
- ディープラーニング ResNet のヒミツ: http://terada-h.hat...
- Residual Network(ResNet)の理解とチューニングのベストプ...
:https://deepage.net/deep_learning/2016/11/30/resnet.html
- ResNetの仕組み: https://www.slideshare.net/KotaNagasato...
- 機械学習論文読み:Deep Residual Learning for Image Reco...
** Details of the ResNet used in the experiments [#h1523bb5]
The experiments use [[the ResNet18 officially provided by PyTorch:https://p...
- ResNet18
#pre{{
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2...
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=...
  (relu): ReLU(inplace)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1...
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=...
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, aff...
      (relu): ReLU(inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=...
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, aff...
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=...
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, aff...
      (relu): ReLU(inplace)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=...
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, aff...
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride...
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, af...
      (relu): ReLU(inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), strid...
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, af...
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(...
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, af...
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), strid...
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, af...
      (relu): ReLU(inplace)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), strid...
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, af...
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), strid...
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, af...
      (relu): ReLU(inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), strid...
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, af...
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=...
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, af...
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), strid...
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, af...
      (relu): ReLU(inplace)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), strid...
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, af...
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), strid...
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, af...
      (relu): ReLU(inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), strid...
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, af...
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=...
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, af...
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), strid...
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, af...
      (relu): ReLU(inplace)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), strid...
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, af...
    )
  )
  (avgpool): AvgPool2d(kernel_size=7, stride=1, padding=0)
  (fc): Linear(in_features=512, out_features=1000, bias=T...
)
}}
The output layer is Linear(in_features=512, out_features=1000, bias=...
The number of output classes is matched to Food101.
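As a rough check of why the final layer sees 512 features and what changes when the head is resized, the following sketch traces the spatial size of a 224x224 input through the printed layers and counts the parameters of the output layer. It assumes a padding of 3 for conv1 and a stride of 2 in the first 3x3 convolution of layer2 to layer4 (these values are cut off in the printout above), and 101 output classes for Food101. The function names are made up for illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet18(size=224):
    """Follow the feature-map side length through the printed ResNet18 layers."""
    # Assumption: conv1 uses padding=3 (elided in the printout).
    size = conv_out(size, kernel=7, stride=2, padding=3)  # conv1:   224 -> 112
    size = conv_out(size, kernel=3, stride=2, padding=1)  # maxpool: 112 -> 56
    # layer1 keeps the resolution (3x3 convolutions with stride 1, padding 1).
    # Assumption: layer2, layer3 and layer4 each halve it once with stride 2.
    for _ in range(3):
        size = conv_out(size, kernel=3, stride=2, padding=1)  # 56 -> 28 -> 14 -> 7
    size = conv_out(size, kernel=7, stride=1, padding=0)  # avgpool: 7 -> 1
    return size


def linear_params(in_features, out_features):
    """Number of weights plus biases in a fully connected layer."""
    return in_features * out_features + out_features


NUM_FOOD101_CLASSES = 101  # assumption: one output unit per Food101 class

print(trace_resnet18())                          # 1, so fc receives 512 * 1 * 1 features
print(linear_params(512, 1000))                  # 513000 (original ImageNet head)
print(linear_params(512, NUM_FOOD101_CLASSES))   # 51813 (head resized for Food101)
```

Under these assumptions, only the last layer changes shape; all convolutional layers keep the sizes shown in the printout.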
* Data [#pd11fc3b]
The data is [[Food101:https://www.vision.ee.ethz.ch/datasets_...
** Overview [#mfc46d4c]
- From the official page:
>On purpose, the training images were not cleaned, and th...
- Training data: 75750
-- One tenth of this training data was held out as validation data. In doing so, class...
- Test data: 25250
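One way to hold out 10% of the training data is a per-class (stratified) split, sketched below. This is only an illustration: it assumes that the truncated note above refers to keeping the class balance, and that the 75750 training images are spread evenly over 101 classes (750 each). The function name and the label list are hypothetical, not taken from the repository.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.1, seed=0):
    """Split sample indices into train/validation, class by class."""
    rng = random.Random(seed)  # local RNG so the split is reproducible per seed
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)  # same fraction from every class
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return train_idx, val_idx


# Hypothetical label list: 101 classes with 750 training images each.
labels = [class_id for class_id in range(101) for _ in range(750)]
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 68175 7575
```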
** Data preprocessing [#yde689c7]
When using a model trained on ImageNet, as in this experiment, ...
>All pre-trained models expect input images normalized in...
#pre{{
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.40...
                                 std=[0.229, 0.224, 0.225])
}}
Therefore, the same preprocessing was applied to our data.
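To make the normalization concrete, here is a small sketch of the per-channel arithmetic, (value - mean) / std, applied to a single RGB pixel whose values are already scaled to [0, 1] (as ToTensor does for 8-bit images). The helper name is hypothetical; the constants are the ones quoted above.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Normalize one RGB pixel (values in [0, 1]) channel by channel."""
    return tuple((value - m) / s for value, m, s in zip(rgb, mean, std))


# A pixel equal to the dataset mean maps to zero in every channel.
print(normalize_pixel(IMAGENET_MEAN))  # (0.0, 0.0, 0.0)

# A pure white pixel ends up a little more than two standard deviations above the mean.
print(normalize_pixel((1.0, 1.0, 1.0)))
```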
The actual preprocessing was as follows. This is the official PyTorch tuto...
- Training data
#pre{{
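from torchvision import transforms

# Random crop and flip for augmentation, then ImageNet normalization.
trainTransform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])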
}}
- Validation data & test data
#pre{{
transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], ...
}}
* Experiments [#zdde406c]
Three different seed values were used; the execution time was measured for each seed, and the mean...
** Experimental conditions [#pfe9adf7]
- seed: 0, 1, 2
- epoch: 30
- batch size: 128
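The measurement procedure can be sketched as follows: fix the seed, time only the loop over epochs, and average the elapsed times over the seeds. This is a minimal illustration and not the code in the repository. `train_one_epoch` is a hypothetical callback standing in for the real training step, and only Python's own random generator is seeded here.

```python
import random
import statistics
import time


def run_experiment(seed, num_epochs, train_one_epoch):
    """Return the seconds spent inside the epoch loop for one seed."""
    random.seed(seed)  # a real run would also seed the deep learning framework
    # Model construction and dataset setup would happen here, outside the timer.
    start = time.perf_counter()
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(epoch)
    return time.perf_counter() - start


def mean_elapsed(seeds, num_epochs, train_one_epoch):
    """Run once per seed and return (mean seconds, list of per-seed seconds)."""
    times = [run_experiment(seed, num_epochs, train_one_epoch) for seed in seeds]
    return statistics.mean(times), times


# Usage with a dummy epoch function; a real run would use num_epochs=30.
mean_seconds, per_seed = mean_elapsed([0, 1, 2], num_epochs=3,
                                      train_one_epoch=lambda epoch: None)
print(len(per_seed))  # 3
```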
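** tortoise7 [#u335f531]
All values below are taken from the epoch, among the 30 epochs, at which the Val value was best...
Elapsed Time is for the inside of the training loop (the for loop over epochs)...
Including variable initialization before training, creation of the validation dataset, and so on, ...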
|seed|Epoch|Train Loss|Train Acc|Val Loss|Val Acc|Test Lo...
|0|29|0.9391|0.7533|0.9768|0.7516|0.7002|0.8071|2:35:24.6...
|1|30|0.9278|0.7567|0.9445|0.7498|0.6990|0.8086|2:35:16.8...
|2|30|0.9229|0.7580|0.9520|0.7549|0.6995|0.8069|2:35:01.9...
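The rows of such a table can be derived from a per-epoch log roughly as sketched below: pick the epoch with the best validation score, and average the elapsed times across seeds. The sketch assumes that "best" means highest validation accuracy and that elapsed times are strings of the form H:MM:SS.ffffff. All numbers in the example are made up and are not the measured results.

```python
from datetime import timedelta


def best_epoch(history):
    """Return the log entry with the highest validation accuracy (assumed criterion)."""
    return max(history, key=lambda row: row["val_acc"])


def parse_elapsed(text):
    """Convert an 'H:MM:SS.ffffff' string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def average_elapsed(texts):
    """Mean of several elapsed-time strings, as a timedelta."""
    total = sum((parse_elapsed(text) for text in texts), timedelta())
    return total / len(texts)


# Hypothetical per-epoch log and elapsed times, for illustration only.
history = [
    {"epoch": 1, "val_acc": 0.61},
    {"epoch": 2, "val_acc": 0.70},
    {"epoch": 3, "val_acc": 0.68},
]
print(best_epoch(history)["epoch"])                 # 2
print(average_elapsed(["1:00:00.0", "3:00:00.0"]))  # 2:00:00
```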
** rinshpc01 [#v93e2750]
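The Train, Val, and Test values below are, among the 100 epochs, the Val value...
Elapsed Time is for the inside of the training loop (the for loop over epochs)...
Including variable initialization before training, creation of the validation dataset, and so on, ...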
|seed|Epoch|Train Loss|Train Acc|Val Loss|Val Acc|Test Lo...
|0| | | | | | | | | |
|1| | | | | | | | | |
|2| | | | | | | | | |
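- On 2018/08/03 I ran a trial of about 5 epochs, and the estimated finish time was
about 30 minutes earlier on tortoise07. At that time, during training, G...
-- Addendum 2018-08-06: in the Chainer implementation as well, the GPU likewise...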