COMP390 - Review of Multi-view 3D Deep Learning Techniques
Project Webpage
Static Webpage
Alternative link: https://kylelowryr.github.io/
Github: https://github.com/kylelowryr/kylelowryr.github.io/
Python Code (including README):
https://github.com/kylelowryr/COMP390_Project
Introduction
Background:
This project focuses on multi-view 3D deep learning techniques for 3D object classification. Specifically, two models (MVCNN and View-GCN) are examined.
Webpage Description:
This webpage covers the introduction, benchmarks, experiment records, conclusions, user manual, data sources, and references.
The main page contains only descriptions, basic results/conclusions, and partial extracts of the data. All detailed data generated in the experiments is recorded in the corresponding subpages; see the remarks.
Datasets, Models, Sources
The datasets are either open source or generated by myself; this subpage stores the licenses.
User Manual
1. Preparing models
See the ‘Source MVCNN & View-GCN’ part, which links the two GitHub projects for MVCNN and View-GCN. The main framework setup and debugging should follow their README files.
Based on my tests, the only thing to watch for is .float() vs. .double(), due to inconsistencies across PyTorch version updates.
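The .float()/.double() pitfall is a dtype mismatch between model weights and input tensors. The sketch below reproduces the idea with NumPy's analogous float32/float64 types (an illustration only; in PyTorch the fix is calling .float() or .double() on the model or tensor so both sides share one dtype):

```python
import numpy as np

# Weights stored in 64-bit ("double") precision, inputs in 32-bit ("float"):
weights = np.ones((3, 3), dtype=np.float64)
inputs = np.ones((2, 3), dtype=np.float32)

# NumPy silently promotes the result to float64; PyTorch instead raises a
# dtype error, which is why .float()/.double() calls may need adjusting
# between versions.
out_promoted = inputs @ weights
assert out_promoted.dtype == np.float64

# The fix mirrors tensor.float() in PyTorch: cast both sides to one dtype.
out_fixed = inputs @ weights.astype(np.float32)
assert out_fixed.dtype == np.float32
```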
2. Preparing ModelNet40
See ‘Source ModelNet40’. I provide the official webpage of the ModelNet project, where the .off files (source files) of ModelNet10/40 can be downloaded. The GitHub projects mentioned above also include some prepared datasets.
2.1 Generating your own ModelNet40 dataset
See ‘Source Blender’: use Blender and its add-on functions to generate view images in batches.
2.2 Manual alignment
Use Blender to adjust the scene setup and the alignment of objects.
I also provide two versions of the dataset generated by myself, differing in shadow rendering; the test portions of both are fully aligned.
3. Running models and gathering info
The first part of my code runs the models for training/testing and then integrates the resulting data into a dataframe, which is saved in preparation for the next step.
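The integration step above can be sketched as follows. The field names here are hypothetical stand-ins; the real record layout is defined in the project code (see its README):

```python
import pandas as pd

# Hypothetical per-run records as they might be collected after each test
# run; the real field names are documented in the project's README.
records = [
    {"model": "MVCNN", "config": "Ep1-S", "view": 2, "accuracy": 0.9064},
    {"model": "View-GCN", "config": "Ep1-S", "view": 4, "accuracy": 0.8189},
]

# Integrate the results into one dataframe and save it for the analysis step.
df = pd.DataFrame(records)
df.to_csv("results.csv", index=False)
print(df.shape)  # (2, 4)
```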
4. Data Analysis
The second part of my code analyses the .csv files, so it can be run separately as another project; this is in fact how I ran it on my PC to keep things clear.
The meanings of the different functions are all documented in the README files.
Datasets
Description:
The Princeton ModelNet project, containing CAD models covering 40 common categories of everyday objects.
Including:
9843 models for training
2468 models for testing
CNN Models
MVCNN
Basic Description:
Uses a simple view-pooling scheme to aggregate the features from different views.
H. Su, 2015
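MVCNN's view pooling reduces the per-view CNN features to one shape descriptor via an element-wise max across views. A minimal NumPy sketch (the 20-view/512-channel shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CNN features for one object: 20 views x 512 feature channels.
view_features = rng.random((20, 512))

# View pooling: an element-wise max across the view axis yields a single
# 512-dim descriptor, which is then fed to the remaining layers.
shape_descriptor = view_features.max(axis=0)

assert shape_descriptor.shape == (512,)
# Each pooled value is the strongest response among the 20 views.
assert np.allclose(shape_descriptor, np.max(view_features, axis=0))
```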
View-GCN
Basic Description:
Adopts hierarchical graph convolutional networks (GCNs) to generate aggregated features from the different views.
X. Wei, 2020
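To make "graph CNN over views" concrete, the sketch below shows the basic GCN propagation rule with views as graph nodes. This is only the generic building block, not View-GCN's actual architecture (which builds and coarsens its view graph differently):

```python
import numpy as np

rng = np.random.default_rng(0)

# 4 view nodes with 8-dim features; the adjacency links neighbouring views
# (a stand-in graph -- View-GCN constructs its view graph differently).
H = rng.random((4, 8))
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)  # self-loops included
W = rng.random((8, 8))

# Basic GCN propagation: normalise by node degree, aggregate each node's
# neighbours, then apply the learned weights and a ReLU.
D_inv = np.diag(1.0 / A.sum(axis=1))
H_next = np.maximum(D_inv @ A @ H @ W, 0.0)

assert H_next.shape == (4, 8)
```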
Benchmark & Basic results
| Model | Time Per Testing (Avg.)* |
|---|---|
| MVCNN | 2m40s |
| View-GCN | 3m30s (~ +31.25%) |

*One group of testing contains 2,468 image models.
Remark: The complexity of View-GCN makes it more time-consuming.
| Model | Config | Group Views (Overall) | Single View (>0.75) | Single View (>0.5) |
|---|---|---|---|---|
| MVCNN | Ep1-S | 2 | 2, 9, 19 | 2, 9, 3 |
| MVCNN | Ep1-NoS | 2 | 2, 6, 12 | 2, 6, 9 |
| MVCNN | Ep10-S | 1, 4 | 2, 8, 9 | 2, 9, 12 |
| MVCNN | Ep10-NoS | 2 | 2, 19, 9 | 2, 7, 19 |
| View-GCN | Ep1-S | 4 | 9, 17, 13 | 9, 17, 13 |
| View-GCN | Ep1-NoS | 3 | 9, 13, 11 | 9, 13, 11 |
| View-GCN | Ep10-S | 3, 4 | 6, 12, 10 | 6, 12, 10 |
| View-GCN | Ep10-NoS | 4 | 6, 10, 12, 2 | 6, 10, 2 |

*A group has 4 candidates/layers; see ‘ModelNet 40 Subpage’. *Single view shows only the top-3 candidates here.
Remark: In MVCNN, View-2 always appears, and Group-2 (Layer-2) contributes the most. In View-GCN, the number of training epochs changes the results: View-9 dominates at low epochs and View-6 at high epochs, and Group-4 contributes the most.
| Model | Training Config | No Disturbance | Disturb 5 (Avg.) | Disturb 10 | Disturb 15 | Disturb 20 |
|---|---|---|---|---|---|---|
| MVCNN | Ep1-S | 90.64% | -0.79% | -1.8% | -3.2% | -4.46% |
| MVCNN | Ep1-NoS | 89.47% | -0.96% | -2.55% | -5.52% | -8.99% |
| MVCNN | Ep10-S | 91.86% | -1.56% | -2.48% | -3.5% | -4.4% |
| MVCNN | Ep10-NoS | 92.18% | -1.66% | -3.28% | -5.46% | -7.32% |
| View-GCN | Ep1-S | 81.89% | -0.96% | -3.8% | -6.2% | -8.26% |
| View-GCN | Ep1-NoS | 80.92% | -0.3% | -2.23% | -5.07% | -7.88% |
| View-GCN | Ep10-S | 91.53% | -0.87% | -2.17% | -3.27% | -3.77% |
| View-GCN | Ep10-NoS | 92.59% | -0.67% | -1.35% | -1.93% | -2.56% |
Remark:
For MVCNN, the shade/no-shade setting has the larger influence, producing roughly a two-fold gap in self-comparison.
For View-GCN, the gap appears across training epochs: more training makes it more stable.
In conclusion, View-GCN appears more stable than MVCNN.
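The exact 'Disturb N' perturbation is defined in the project code; purely as an illustration, the sketch below applies one plausible kind of disturbance (uniform pixel noise of a given strength) to a stack of view images:

```python
import numpy as np

def disturb_views(views, strength, rng):
    """Add uniform pixel noise of a given strength to a stack of views.

    Illustrative only: the actual 'Disturb N' operation used in the
    experiments is defined in the project code.
    """
    noise = rng.uniform(-strength, strength, size=views.shape)
    return np.clip(views + noise, 0.0, 1.0)

rng = np.random.default_rng(0)
views = rng.random((20, 64, 64))        # 20 grayscale view images
disturbed = disturb_views(views, 0.1, rng)

assert disturbed.shape == views.shape
assert disturbed.min() >= 0.0 and disturbed.max() <= 1.0
```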
| Model | Config | Noshade (Avg.) | Shade (Avg.) |
|---|---|---|---|
| MVCNN | Ep1-S | 0.8901945 (-1.62%) | / |
| MVCNN | Ep1-NoS | / | 0.85737437 (-3.73%) |
| MVCNN | Ep10-S | 0.8837115 (-3.48%) | / |
| MVCNN | Ep10-NoS | / | 0.899919 (-2.19%) |
| View-GCN | Ep1-S | 0.6807131 (-13.82%) | / |
| View-GCN | Ep1-NoS | / | 0.70016205 (-10.9%) |
| View-GCN | Ep10-S | 0.84724474 (-6.81%) | / |
| View-GCN | Ep10-NoS | / | 0.8059157 (-11.99%) |
Remark: MVCNN shows good stability in the cross-over experiment, while the accuracy fluctuation of View-GCN is large.
Sub pages
See the following pages for the specific benchmarks and analysis:
MVCNN - Single View
MVCNN - Group View
View-GCN - Single View
View-GCN - Group View
Robustness
Tools Used / Local Setup
| Category | Type | Tools / Versions |
|---|---|---|
| Python related | Basic | Python v3.8.8, numpy v1.20.1, pandas v1.2.4, seaborn v0.11.1, pillow v8.2.0, matplotlib v3.3.4 |
| Python related | Important | PyTorch v1.10.1 |
| PC related | Hardware | Driver ver. 496.76, CUDA v11.3, cuDNN v8.2.1, Windows 10, GPU RTX 3070 |
| PC related | Software | Datawrapper, Blender v2.79b/3.0, Google Drive, GitHub, GitHub Pages, Notion |
Further Development
About Model(s)
Exploring more models and examining the corresponding outputs within this framework.
About Benchmark/Framework
Adding more indicators to improve the framework.
Exploring the relationships between views and classes, defining each class more precisely, and producing more detailed classifications.
About Dataset(s)
Looking for or generating more realistic datasets for training and testing that stay consistent between the 3D models and the real world.
Adding more challenging situations and conditions to the datasets besides shade/shadow and occlusion.
Sources
Models
MVCNN: https://github.com/suhangpro/mvcnn
View-GCN: https://github.com/weixmath/view-GCN
Datasets
ModelNet: https://modelnet.cs.princeton.edu/
Tools
Blender tools:
https://github.com/zeaggler/ModelNet_Blender_OFF2Multiview
Personal/Prepared sources
Datasets:
ModelNet40 - 20views - shade: https://drive.google.com/file/d/1EK7ApY3f_LAy8x1GlFfFFDVnpg8wJIaS/view?usp=sharing
ModelNet40 - 20views - Noshade: https://drive.google.com/file/d/1AO_RQGQ3_aoXpbqzdpGXUC6x4tafuuZy/view?usp=sharing
Pretrained .pth files collection with different configs: https://drive.google.com/file/d/1ejeF8C7n47Chzkt-6iw4NJtkntbWAnyc/view?usp=sharing
Output .csv files collection: https://drive.google.com/file/d/1iRep8bsAQ0ze1BvbPyoii5JKi81qYK_U/view?usp=sharing
Hao Cao
University of Liverpool, Computer Science
Comp 390 Project
sghcao6@liverpool.ac.uk