COMP390 - Review of Multi-view 3D Deep Learning Techniques


Project Webpage

Static Webpage

https://github.com/kylelowryr/kylelowryr.github.io/

Webpage (direct link): https://kylelowryr.github.io/


Python code (including README):

https://github.com/kylelowryr/COMP390_Project


Introduction

Background:

This project focuses on multi-view 3D deep learning techniques for 3D object classification. Specifically, two models (MVCNN and View-GCN) are examined.

Webpage Description:

This webpage covers the introduction, benchmarks, experiment records, conclusions, user manual, data sources, and references.

The main page contains only the description, the basic results/conclusions, and extracts of the data. All detailed data generated in the experiments are recorded in the corresponding subpages; see the remarks.

Datasets, Models, Sources

The datasets are either open source or generated by myself; this subpage stores the corresponding licenses.


User Manual

1. Preparing models

See the ‘Source: MVCNN & View-GCN’ section for the two GitHub projects of MVCNN and View-GCN. Setting up the main framework and debugging should follow their README files.

Based on my tests, the only thing to watch out for is .float() vs .double(), due to dtype inconsistencies across PyTorch versions.
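A dtype mismatch of this kind can be fixed with an explicit cast. A minimal illustrative sketch (not the models' actual code; the layer and tensor are placeholders):

```python
import torch

# Data loaded as float64 ("double") fed into float32 ("float") weights
# raises a dtype mismatch in recent PyTorch versions.
model = torch.nn.Linear(4, 2).float()          # weights stored in float32
views = torch.rand(3, 4, dtype=torch.float64)  # e.g. data loaded as float64

out = model(views.float())                     # cast the input to match the model
print(out.dtype)  # torch.float32
```

Casting the input (or calling `model.double()` instead) makes the dtypes consistent either way.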

2. Preparing ModelNet40

See ‘Source: ModelNet40’. I provide the official webpage of the ModelNet project, which allows downloading the .off source files of ModelNet10/40. The GitHub projects above also ship some prepared datasets.

2.1 Generating own ModelNet40 dataset.

See ‘Source: Blender’. Use Blender and its add-on functions to generate view images in batch.
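The geometry behind such batch view rendering can be sketched as follows; the 12-view ring at 30 degrees elevation is an assumed (classic multi-view) setup, not necessarily the add-on's exact settings:

```python
import math

def camera_positions(num_views=12, radius=2.0, elevation_deg=30.0):
    """Return (x, y, z) camera locations on a ring around the origin."""
    elev = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        azim = 2.0 * math.pi * i / num_views   # equally spaced azimuth angles
        positions.append((
            radius * math.cos(elev) * math.cos(azim),
            radius * math.cos(elev) * math.sin(azim),
            radius * math.sin(elev),           # constant height above the object
        ))
    return positions

for pos in camera_positions():
    # In Blender, each position would get a camera aimed at the object,
    # and one view image would be rendered per camera.
    print(pos)
```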

2.2 Manual alignment

Use Blender to adjust the surrounding setup and the alignment of the objects.

I also provide two versions of the dataset generated by myself, differing in shadow; the test parts of both are fully aligned.

3. Running models and gathering info

The first part of my code trains/tests the model, then integrates the results into a dataframe and saves it for the next step.
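This gathering step can be sketched with pandas; the record layout and column names are assumptions for illustration, not the repository's exact schema:

```python
import pandas as pd

# Hypothetical per-run test results; in the real pipeline these would come
# from the train/test loops of MVCNN and View-GCN.
records = [
    {"model": "MVCNN",    "config": "Ep10-S", "view": 2, "accuracy": 0.9186},
    {"model": "MVCNN",    "config": "Ep10-S", "view": 9, "accuracy": 0.9021},
    {"model": "View-GCN", "config": "Ep10-S", "view": 6, "accuracy": 0.9153},
]

df = pd.DataFrame.from_records(records)
df.to_csv("results.csv", index=False)   # saved for the analysis step
print(df.shape)  # (3, 4)
```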

4. Data Analysis

The second part of my code analyzes the .csv file, so it can be run separately as another project; in fact, I ran it this way on my PC to keep things clear.

The meanings of the different functions are all recorded in the README files.
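A sketch of this analysis part, assuming a results .csv with per-run accuracies (the column names are illustrative; the accuracy values are taken from the benchmark tables below):

```python
import io
import pandas as pd

# Stand-in for the results.csv produced by the first part of the pipeline.
csv_data = io.StringIO(
    "model,config,accuracy\n"
    "MVCNN,Ep10-S,0.9186\n"
    "MVCNN,Ep10-NoS,0.9218\n"
    "View-GCN,Ep10-S,0.9153\n"
    "View-GCN,Ep10-NoS,0.9259\n"
)

df = pd.read_csv(csv_data)
# Average accuracy per model, the kind of summary reported in the subpages.
summary = df.groupby("model")["accuracy"].mean()
print(summary)
```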


Datasets

Description:

The Princeton ModelNet project, containing CAD models that cover 40 categories of common everyday objects.

Including:

9843 models for training

2468 models for testing


CNN Models

MVCNN

Basic description: uses a simple pooling step (view pooling) to merge the features extracted from the different views.

H. Su et al., 2015
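The core of this view pooling can be sketched with numpy; the shapes and names are illustrative (in MVCNN, the max is taken across views over CNN feature maps):

```python
import numpy as np

# Illustrative shapes: 12 views, each already encoded by a shared CNN
# into a 512-dimensional feature vector.
rng = np.random.default_rng(0)
view_features = rng.random((12, 512))

# MVCNN-style view pooling: element-wise max across the view axis,
# producing one shape descriptor that the later layers classify.
shape_descriptor = view_features.max(axis=0)
print(shape_descriptor.shape)  # (512,)
```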

View-GCN

Basic description: adopts hierarchical graph convolutional networks (GCNs) to aggregate the features from the different views.

X. Wei, 2020
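This is not the actual View-GCN architecture (which builds a hierarchy of view graphs), but the basic graph-convolution step it relies on can be sketched as follows (all sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
num_views, in_dim, out_dim = 4, 8, 3
X = rng.random((num_views, in_dim))   # one feature vector per view (graph node)
A = np.ones((num_views, num_views))   # fully connected view graph with self-loops
W = rng.random((in_dim, out_dim))     # learnable weight matrix

# One graph-convolution layer: H = ReLU(D^-1 A X W), i.e. each view's new
# feature mixes its neighbours' features (row-normalised adjacency).
D_inv = np.diag(1.0 / A.sum(axis=1))
H = np.maximum(D_inv @ A @ X @ W, 0.0)
print(H.shape)  # (4, 3)
```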


Benchmark & Basic results

Time per test run (avg.); one test run contains 2468 image models:

    MVCNN       2m40s
    View-GCN    3m30s (~ +31.25%)

Remark: The complexity of View-GCN makes it time consuming.

Top candidate views (*each group has 4 candidates/layers, see ‘ModelNet 40 Subpage’; *single view lists only the top-3 candidates here):

                          Group views (overall)   Single view (>0.75)   Single view (>0.5)
    MVCNN     Ep1-S       2                       2,9,19                2,9,3
              Ep1-NoS     2                       2,6,12                2,6,9
              Ep10-S      1,4                     2,8,9                 2,9,12
              Ep10-NoS    2                       2,19,9                2,7,19
    View-GCN  Ep1-S       4                       9,17,13               9,17,13
              Ep1-NoS     3                       9,13,11               9,13,11
              Ep10-S      3,4                     6,12,10               6,12,10
              Ep10-NoS    4                       6,10,12,2             6,10,2

Remark: In MVCNN, view 2 always appears, and group 2 (layer 2) contributes most. In View-GCN, the number of training epochs changes the results: view 9 dominates at low epochs and view 6 at high epochs, and group 4 contributes most.

Accuracy under disturbance (drops are relative to the no-disturbance accuracy):

    Training config       No disturbance   Disturb 5 (avg.)   Disturb 10   Disturb 15   Disturb 20
    MVCNN     Ep1-S       90.64%           -0.79%             -1.80%       -3.20%       -4.46%
              Ep1-NoS     89.47%           -0.96%             -2.55%       -5.52%       -8.99%
              Ep10-S      91.86%           -1.56%             -2.48%       -3.50%       -4.40%
              Ep10-NoS    92.18%           -1.66%             -3.28%       -5.46%       -7.32%
    View-GCN  Ep1-S       81.89%           -0.96%             -3.80%       -6.20%       -8.26%
              Ep1-NoS     80.92%           -0.30%             -2.23%       -5.07%       -7.88%
              Ep10-S      91.53%           -0.87%             -2.17%       -3.27%       -3.77%
              Ep10-NoS    92.59%           -0.67%             -1.35%       -1.93%       -2.56%

Remark:

For MVCNN, the shade/no-shade setting has the larger influence: the gap in the self-comparison is roughly twice as large.

For View-GCN, the gap comes instead from the number of training epochs: more training makes it more stable.

In conclusion, View-GCN appears more stable than MVCNN under disturbance.
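The exact disturbances used in these experiments are detailed in the subpages; as a generic illustration of the sheltering/occlusion kind of disturbance, a patch can be blanked out of a view image (all sizes and names below are assumptions):

```python
import numpy as np

def occlude(image, top, left, size):
    """Zero out a size x size patch, simulating sheltering/occlusion."""
    disturbed = image.copy()
    disturbed[top:top + size, left:left + size] = 0.0
    return disturbed

view = np.ones((224, 224))            # a stand-in grayscale view image
disturbed = occlude(view, top=50, left=60, size=32)
print(disturbed.sum())  # 224*224 - 32*32 = 49152.0 unoccluded pixels
```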

Cross-over tests (avg. accuracy when shade-trained models are tested on no-shade data and vice versa; "/" = not applicable):

                          Noshade test (avg.)      Shade test (avg.)
    MVCNN     Ep1-S       0.8901945 (-1.62%)       /
              Ep1-NoS     /                        0.85737437 (-3.73%)
              Ep10-S      0.8837115 (-3.48%)       /
              Ep10-NoS    /                        0.899919 (-2.19%)
    View-GCN  Ep1-S       0.6807131 (-13.82%)      /
              Ep1-NoS     /                        0.70016205 (-10.90%)
              Ep10-S      0.84724474 (-6.81%)      /
              Ep10-NoS    /                        0.8059157 (-11.99%)

Remark: MVCNN shows good stability in the cross-over experiment, whereas the accuracy of View-GCN fluctuates considerably.

Sub pages

See the following subpages for the specific benchmarks and analysis:


MVCNN - Single View


MVCNN - Group View


View-GCN - Single View


View-GCN - Group View


Robustness


Tools Used / Local Setup

Python related
    Basic: Python v3.8.8, numpy v1.20.1, pandas v1.2.4, seaborn v0.11.1, pillow v8.2.0, matplotlib v3.3.4
    Important: PyTorch v1.10.1

PC related
    Hardware: NVIDIA driver v496.76, CUDA v11.3, cuDNN v8.2.1, Windows 10, GPU RTX 3070
    Software: Datawrapper, Blender v2.79b/3.0, Google Drive, GitHub, GitHub Pages, Notion


Further Development

About Model(s)

Exploring more models and examining their corresponding outputs within this framework.

About Benchmark/Framework

Adding more indicators to improve the framework.

Exploring the relationships between views and classes, defining each class more precisely, and obtaining finer-grained classification.

About Dataset(s)

Looking for, or generating, more realistic datasets for training and testing that stay consistent between the 3D models and the real world.

Adding more challenging situations and conditions to the datasets besides shade/shadow and sheltering (occlusion).


Sources

Models

MVCNN: https://github.com/suhangpro/mvcnn

View-GCN: https://github.com/weixmath/view-GCN

Datasets

ModelNet: https://modelnet.cs.princeton.edu/

Tools

Blender tools:

Personal/Prepared sources

Datasets:


Hao Cao

University of Liverpool, Computer Science

Comp 390 Project

sghcao6@liverpool.ac.uk