FCN和Segnet語義分割模型優化 = Optimization of FCN and Segnet Semantic Segmentation Models / 林柏宏
Record type: Bibliographic - language material, printed : Monograph/item
Title/Author: FCN和Segnet語義分割模型優化 = Optimization of FCN and Segnet Semantic Segmentation Models / 林柏宏.
Other title: Optimization of FCN and Segnet Semantic Segmentation Models.
Author: 林柏宏
Publisher: 雲林縣 : 國立虎尾科技大學, 民113.06.
Pagination: [7], 48, [7] pages : illustrations ; 30 cm.
Note: Advisor: 陳柏宏.
Subject: Segnet.
Electronic resource: https://handle.ncl.edu.tw/11296/y45q9x
FCN和Segnet語義分割模型優化 = Optimization of FCN and Segnet Semantic Segmentation Models / 林柏宏. - 初版. - 雲林縣 : 國立虎尾科技大學, 民113.06. - [7], 48, [7]面 : 圖 ; 30公分.
Advisor: 陳柏宏.
Master's thesis--國立虎尾科技大學電子工程系碩士班.
Includes bibliographical references.
(paperback)
Subjects--Topical Terms: Segnet.
LDR
:06356cam a2200241 i 4500
001
1129684
008
241015s2024 ch ak erm 000 0 chi d
035
$a
(THES)112NYPI0428007
040
$a
NFU
$b
chi
$c
NFU
$e
CCR
041
0 #
$a
chi
$b
chi
$b
eng
084
$a
008.160M
$b
4443 113
$2
ncsclt
100
1
$a
林柏宏
$e
譯
$3
1222727
245
1 0
$a
FCN和Segnet語義分割模型優化 =
$b
Optimization of FCN and Segnet Semantic Segmentation Models /
$c
林柏宏.
246
1 1
$a
Optimization of FCN and Segnet Semantic Segmentation Models.
250
$a
初版.
260
#
$a
雲林縣 :
$b
國立虎尾科技大學 ,
$c
民113.06.
300
$a
[7], 48 ,[7]面 :
$b
圖 ;
$c
30公分.
500
$a
指導教授: 陳柏宏.
500
$a
學年度: 112.
502
$a
碩士論文--國立虎尾科技大學電子工程系碩士班.
504
$a
含參考書目.
520
3
$a
自從VGG模型在ImageNet挑戰數據集上表現突出,大量投資研究不斷改進卷積神經網路的架構與訓練方法,使得CNN在圖像提取、圖像分割及自然語言處理任務中得到重視。近年來自動駕駛、無人載具、虛擬實境有快速發展的趨勢出現。諸多發展的技術中,語義分割是屬於圖像處理的重要技術,語義分割模型的辨識準確率在這方面仍存在巨大挑戰。Badrinarayanan提出一種用於語義分割的Encoder-Decoder模型架構,許多基於Encoder-Decoder卷積神經網絡的架構已被開發應用於語義圖像分割中,其中以經典的FCN及Segnet模型為代表。本文首先研究兩個經典模型各自架構功能:FCN model使用Add功能將Encoder與Decoder特徵相互結合,能保留更多語義特徵;Segnet model使用MaxUnpooling與Encoder特徵比對進行傳遞;Segnet-like model使用Upsampling恢復特徵大小。Encoder使用預訓練Vgg16權重,在基礎上有穩定的準確率。為了增強模型的辨識能力,將兩個功能進行結合,提出新的模型SemNet V1、SemNet V2、SemNet V3與SemNet V4並優化其參數,以1:3、2:2及3:1等組態比對模型參數量,明顯看出使用Addition功能逐點相加的方式參數量最多,總參數量高達上億,模型較為複雜。最後本文將模型在三個不同的數據集中訓練。Camvid數據集,提出的SemNet V3(22)訓練後驗證準確率94.39%,相比原始只使用Addition功能逐點相加的方式提升7.18%,其mIOU值69.86%。SemNet V3(22)對於車輛道路辨識準確率為98.45%,能應用於自動駕駛技術中,辨識車道與周圍建築物。在VOC2012數據集,提出的SemNet V2(13)_BN訓練後驗證準確率76.55%,相比原始只使用MaxUnpooling功能保留Maxpooling後的特徵訊息提升3.31%,其mIOU值10.40%,也發現使用MaxUnpooling功能保留Maxpooling後的特徵訊息添加BatchNormalization後準確率更高。SemNet V2(13)_BN對於背景辨識準確率為95.37%,能應用於辨識圖像中的背景區域。最後在Coco2017數據集,提出的SemNet V4(13)訓練後驗證準確率68.54%,相比原始只使用Upsampling恢復特徵大小提升0.99%,其mIOU值10.88%。實驗證明結合後的模型準確率有顯著提升。
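The abstract reports mIOU scores alongside per-class accuracies. mIoU is the mean, over classes, of the intersection-over-union between predicted and ground-truth label maps. A minimal NumPy sketch of the standard metric (an illustration under the usual definition, not the thesis code; the function name is mine):

```python
import numpy as np

def miou(pred, target, num_classes):
    """Mean intersection-over-union over classes.

    Classes absent from both pred and target are skipped,
    so they do not drag the mean down.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

Per-class accuracy (e.g. the 98.45% road figure) is instead the fraction of that class's ground-truth pixels predicted correctly, which is why it can be much higher than the mIoU.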
520
3
$a
Since the VGG model demonstrated outstanding performance on the ImageNet challenge dataset, significant investments and research have continually improved convolutional neural network (CNN) architectures and training methods. This has highlighted the importance of CNNs in image extraction, image segmentation, and natural language processing tasks. Recently, there has been a rapid development trend in autonomous driving, unmanned vehicles, and virtual reality. Among the many developing technologies, semantic segmentation is a crucial technique in image processing. However, achieving high accuracy in semantic segmentation models remains a significant challenge. Badrinarayanan proposed an Encoder-Decoder model architecture for semantic segmentation. Numerous Encoder-Decoder-based CNN architectures have since been developed and applied to semantic image segmentation, with the classic FCN and SegNet models being notable examples. This paper first examines the structure and functions of these two classic models. The FCN model uses an addition function to combine Encoder and Decoder features, preserving more semantic features. The SegNet model employs MaxUnpooling and matches Encoder features for transmission, while the SegNet-like model uses Upsampling to restore feature sizes. The Encoder utilizes pre-trained VGG16 weights, providing a stable accuracy foundation. To enhance the model's recognition capability, the functionalities of both models were combined. New models, SemNet V1, SemNet V2, SemNet V3, and SemNet V4, were proposed and their parameters optimized with configurations such as 1:3, 2:2, and 3:1. Comparing model parameter counts, it was evident that using the Addition function, which sums features point-wise, resulted in the highest parameter count, reaching over a hundred million, making the model more complex. Finally, the models were trained on three different datasets.
On the CamVid dataset, the proposed SemNet V3(22) achieved a validation accuracy of 94.39% after training, a 7.18% improvement over the original method that only used the Addition function; its mIOU value was 69.86%. SemNet V3(22) demonstrated a vehicle and road recognition accuracy of 98.45%, making it applicable in autonomous driving technology for identifying lanes and surrounding buildings. On the VOC2012 dataset, the proposed SemNet V2(13)_BN achieved a validation accuracy of 76.55% after training, a 3.31% improvement over the original method that only used the MaxUnpooling function to retain features after Maxpooling; its mIOU value was 10.40%. It was also observed that adding BatchNormalization after retaining features with the MaxUnpooling function resulted in higher accuracy. SemNet V2(13)_BN had a background recognition accuracy of 95.37%, making it suitable for recognizing background regions in images. Lastly, on the Coco2017 dataset, the proposed SemNet V4(13) achieved a validation accuracy of 68.54% after training, a 0.99% improvement over the original method that only used Upsampling to restore feature sizes; its mIOU value was 10.88%. The experiments demonstrated significant accuracy improvements in the combined models.
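The three decoder strategies contrasted in the abstract -- FCN-style element-wise addition of encoder and decoder features, SegNet's MaxUnpooling using the argmax indices recorded during encoder pooling, and SegNet-like plain Upsampling -- can be sketched in NumPy. This is an illustration of the standard operations on a single 2D feature map with even dimensions, not the thesis implementation; the function names are mine:

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling; also records each max's flat position (SegNet encoder)."""
    h, w = x.shape  # assumes h and w are even
    pooled = np.zeros((h // 2, w // 2))
    idx = np.zeros((h // 2, w // 2), dtype=int)
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            win = x[i:i + 2, j:j + 2]
            k = int(np.argmax(win))          # 0..3 within the window
            pooled[i // 2, j // 2] = win.flat[k]
            idx[i // 2, j // 2] = (i + k // 2) * w + (j + k % 2)
    return pooled, idx

def max_unpool_2x2(pooled, idx, out_shape):
    """SegNet decoder: scatter values back to their recorded max positions."""
    out = np.zeros(out_shape)
    out.flat[idx.ravel()] = pooled.ravel()
    return out

def upsample_2x2(pooled):
    """SegNet-like decoder: nearest-neighbour upsampling, no indices needed."""
    return pooled.repeat(2, axis=0).repeat(2, axis=1)

def fuse_add(decoder_feat, encoder_feat):
    """FCN-style fusion: element-wise addition of a skip connection."""
    return decoder_feat + encoder_feat
```

MaxUnpooling restores activations at their exact pre-pooling locations (sparse output), Upsampling duplicates each value into the whole window (dense but positionless), and addition keeps the encoder's semantic detail at full resolution -- which also explains why the Addition variants carry the most parameters.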
563
$a
(平裝)
650
# 4
$a
Segnet.
$3
1451675
650
# 4
$a
FCN.
$3
1365590
650
# 4
$a
Artificial Intelligence.
$3
646849
650
# 4
$a
Convolutional Neural Networks.
$3
1328816
650
# 4
$a
Semantic Segmentation.
$3
1247352
650
# 4
$a
Deep Learning.
$3
1127423
650
# 7
$a
人工智慧.
$2
lcstt
$3
833500
650
# 4
$a
卷積神經網路.
$3
1127424
650
# 4
$a
語義分割.
$3
1383627
650
# 4
$a
深度學習.
$3
1127425
856
7 #
$u
https://handle.ncl.edu.tw/11296/y45q9x
$z
電子資源
$2
http
Holdings: 1 item
Barcode: T013284
Location: Library B1F Theses & Dissertations Area (圖書館B1F 博碩士論文專區)
Circulation category: Non-circulating (NON_CIR)
Material type: Master's thesis (TM)
Call number: TM 008.160M 4443 113
Use type: Normal
Loan status: On shelf
Hold status: 0