COMP390 - Review of Multi-view 3D Deep Learning Techniques
Project Webpage
Static Webpage
Alternative link: https://kylelowryr.github.io/
Github: https://github.com/kylelowryr/kylelowryr.github.io/
Python Code (including README):
https://github.com/kylelowryr/COMP390_Project
Introduction
Background:
This project focuses on multi-view 3D deep learning techniques for 3D object classification. Specifically, two models (MVCNN and View-GCN) are examined.
Webpage Description:
This webpage covers the introduction, benchmarks, experiment records, conclusions, user manual, data sources, and references.
The main page contains only descriptions, basic results/conclusions, and partial extracts of the data. All detailed data generated in the experiments is recorded in the corresponding subpages; see the remarks.
Datasets, Models, Sources
The datasets are either open source or generated by myself; this subpage stores the licenses.
User Manual
1. Preparing models
See the ‘Source MVCNN & View-GCN’ part, which links the two GitHub projects for MVCNN and View-GCN. The main framework setup and debugging should follow their README files.
Based on my tests, the only thing to watch for is .float() vs. .double(), due to inconsistencies across PyTorch version updates.
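The .float()/.double() pitfall is a dtype mismatch between model weights and input tensors. The sketch below reproduces the idea with NumPy's analogous float32/float64 types (an illustration only; in PyTorch the fix is calling .float() or .double() on the model or tensor so both sides share one dtype):

```python
import numpy as np

# Weights stored in 64-bit ("double") precision, inputs in 32-bit ("float"):
weights = np.ones((3, 3), dtype=np.float64)
inputs = np.ones((2, 3), dtype=np.float32)

# NumPy silently promotes the result to float64; PyTorch instead raises a
# dtype error, which is why .float()/.double() calls may need adjusting
# between versions.
out_promoted = inputs @ weights
assert out_promoted.dtype == np.float64

# The fix mirrors tensor.float() in PyTorch: cast both sides to one dtype.
out_fixed = inputs @ weights.astype(np.float32)
assert out_fixed.dtype == np.float32
```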
2. Preparing ModelNet40
See ‘Source ModelNet40’. I provide the official webpage of the ModelNet project, where the .off files (source files) of ModelNet10/40 can be downloaded. The GitHub projects mentioned above also include some prepared datasets.
2.1 Generating your own ModelNet40 dataset
See ‘Source Blender’: use Blender and its add-on functions to generate view images in batches.
2.2 Manual alignment
Use Blender to adjust the scene setup and the alignment of objects.
I also provide two versions of the dataset generated by myself, differing in shadow rendering; the test portions of both are fully aligned.
3. Running models and gathering info
The first part of my code runs the models for training/testing and then integrates the resulting data into a dataframe, which is saved in preparation for the next step.
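The integration step above can be sketched as follows. The field names here are hypothetical stand-ins; the real record layout is defined in the project code (see its README):

```python
import pandas as pd

# Hypothetical per-run records as they might be collected after each test
# run; the real field names are documented in the project's README.
records = [
    {"model": "MVCNN", "config": "Ep1-S", "view": 2, "accuracy": 0.9064},
    {"model": "View-GCN", "config": "Ep1-S", "view": 4, "accuracy": 0.8189},
]

# Integrate the results into one dataframe and save it for the analysis step.
df = pd.DataFrame(records)
df.to_csv("results.csv", index=False)
print(df.shape)  # (2, 4)
```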
4. Data Analysis
The second part of my code analyses the .csv files, so it can be run separately as another project; this is in fact how I ran it on my PC to keep things clear.
The meanings of the different functions are all documented in the README files.
Datasets
Description:
The Princeton ModelNet project, containing CAD models covering 40 common categories of everyday objects.
Including:
9843 models for training
2468 models for testing
CNN Models
MVCNN
Basic Description:
Uses a simple view-pooling scheme to aggregate the features from different views.
H. Su, 2015
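MVCNN's view pooling reduces the per-view CNN features to one shape descriptor via an element-wise max across views. A minimal NumPy sketch (the 20-view/512-channel shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CNN features for one object: 20 views x 512 feature channels.
view_features = rng.random((20, 512))

# View pooling: an element-wise max across the view axis yields a single
# 512-dim descriptor, which is then fed to the remaining layers.
shape_descriptor = view_features.max(axis=0)

assert shape_descriptor.shape == (512,)
# Each pooled value is the strongest response among the 20 views.
assert np.allclose(shape_descriptor, np.max(view_features, axis=0))
```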
View-GCN
Basic Description:
Adopts hierarchical graph convolutional networks (GCNs) to generate aggregated features from the different views.
X. Wei, 2020
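To make "graph CNN over views" concrete, the sketch below shows the basic GCN propagation rule with views as graph nodes. This is only the generic building block, not View-GCN's actual architecture (which builds and coarsens its view graph differently):

```python
import numpy as np

rng = np.random.default_rng(0)

# 4 view nodes with 8-dim features; the adjacency links neighbouring views
# (a stand-in graph -- View-GCN constructs its view graph differently).
H = rng.random((4, 8))
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)  # self-loops included
W = rng.random((8, 8))

# Basic GCN propagation: normalise by node degree, aggregate each node's
# neighbours, then apply the learned weights and a ReLU.
D_inv = np.diag(1.0 / A.sum(axis=1))
H_next = np.maximum(D_inv @ A @ H @ W, 0.0)

assert H_next.shape == (4, 8)
```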
Benchmark & Basic results
| Model | Time Per Testing (Avg.)* |
|---|---|
| MVCNN | 2m40s |
| View-GCN | 3m30s (~ +31.25%) |

*One group of testing contains 2,468 image models.
Remark: The complexity of View-GCN makes it more time-consuming.
| Model | Config | Group Views (Overall) | Single View (>0.75) | Single View (>0.5) |
|---|---|---|---|---|
| MVCNN | Ep1-S | 2 | 2, 9, 19 | 2, 9, 3 |
| MVCNN | Ep1-NoS | 2 | 2, 6, 12 | 2, 6, 9 |
| MVCNN | Ep10-S | 1, 4 | 2, 8, 9 | 2, 9, 12 |
| MVCNN | Ep10-NoS | 2 | 2, 19, 9 | 2, 7, 19 |
| View-GCN | Ep1-S | 4 | 9, 17, 13 | 9, 17, 13 |
| View-GCN | Ep1-NoS | 3 | 9, 13, 11 | 9, 13, 11 |
| View-GCN | Ep10-S | 3, 4 | 6, 12, 10 | 6, 12, 10 |
| View-GCN | Ep10-NoS | 4 | 6, 10, 12, 2 | 6, 10, 2 |

*A group has 4 candidates/layers; see ‘ModelNet 40 Subpage’. *Single view shows only the top-3 candidates here.
Remark: In MVCNN, View-2 always appears, and Group-2 (Layer-2) contributes the most. In View-GCN, the number of training epochs changes the results: View-9 dominates at low epochs and View-6 at high epochs, and Group-4 contributes the most.
| Model | Training Config | No Disturbance | Disturb 5 (Avg.) | Disturb 10 | Disturb 15 | Disturb 20 |
|---|---|---|---|---|---|---|
| MVCNN | Ep1-S | 90.64% | -0.79% | -1.8% | -3.2% | -4.46% |
| MVCNN | Ep1-NoS | 89.47% | -0.96% | -2.55% | -5.52% | -8.99% |
| MVCNN | Ep10-S | 91.86% | -1.56% | -2.48% | -3.5% | -4.4% |
| MVCNN | Ep10-NoS | 92.18% | -1.66% | -3.28% | -5.46% | -7.32% |
| View-GCN | Ep1-S | 81.89% | -0.96% | -3.8% | -6.2% | -8.26% |
| View-GCN | Ep1-NoS | 80.92% | -0.3% | -2.23% | -5.07% | -7.88% |
| View-GCN | Ep10-S | 91.53% | -0.87% | -2.17% | -3.27% | -3.77% |
| View-GCN | Ep10-NoS | 92.59% | -0.67% | -1.35% | -1.93% | -2.56% |
Remark:
For MVCNN, the shade/no-shade setting has the larger influence, producing roughly a two-fold gap in self-comparison.
For View-GCN, the gap appears across training epochs: more training makes it more stable.
In conclusion, View-GCN appears more stable than MVCNN.
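The exact 'Disturb N' perturbation is defined in the project code; purely as an illustration, the sketch below applies one plausible kind of disturbance (uniform pixel noise of a given strength) to a stack of view images:

```python
import numpy as np

def disturb_views(views, strength, rng):
    """Add uniform pixel noise of a given strength to a stack of views.

    Illustrative only: the actual 'Disturb N' operation used in the
    experiments is defined in the project code.
    """
    noise = rng.uniform(-strength, strength, size=views.shape)
    return np.clip(views + noise, 0.0, 1.0)

rng = np.random.default_rng(0)
views = rng.random((20, 64, 64))        # 20 grayscale view images
disturbed = disturb_views(views, 0.1, rng)

assert disturbed.shape == views.shape
assert disturbed.min() >= 0.0 and disturbed.max() <= 1.0
```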
| Model | Config | Noshade (Avg.) | Shade (Avg.) |
|---|---|---|---|
| MVCNN | Ep1-S | 0.8901945 (-1.62%) | / |
| MVCNN | Ep1-NoS | / | 0.85737437 (-3.73%) |
| MVCNN | Ep10-S | 0.8837115 (-3.48%) | / |
| MVCNN | Ep10-NoS | / | 0.899919 (-2.19%) |
| View-GCN | Ep1-S | 0.6807131 (-13.82%) | / |
| View-GCN | Ep1-NoS | / | 0.70016205 (-10.9%) |
| View-GCN | Ep10-S | 0.84724474 (-6.81%) | / |
| View-GCN | Ep10-NoS | / | 0.8059157 (-11.99%) |
Remark: MVCNN shows good stability in the cross-over experiment, while the accuracy fluctuation of View-GCN is large.
Sub pages
See the following pages for the specific benchmarks and analysis:
MVCNN - Single View
MVCNN - Group View
View-GCN - Single View
View-GCN - Group View
Robustness
Tools Used / Local Setup
| Category | Type | Tools / Versions |
|---|---|---|
| Python related | Basic | Python v3.8.8, numpy v1.20.1, pandas v1.2.4, seaborn v0.11.1, pillow v8.2.0, matplotlib v3.3.4 |
| Python related | Important | PyTorch v1.10.1 |
| PC related | Hardware | Driver ver. 496.76, CUDA v11.3, cuDNN v8.2.1, Windows 10, GPU RTX 3070 |
| PC related | Software | Datawrapper, Blender v2.79b/3.0, Google Drive, GitHub, GitHub Pages, Notion |
Further Development
About Model(s)
Exploring more models and examining the corresponding outputs within this framework.
About Benchmark/Framework
Adding more indicators to improve the framework.
Exploring the relationships between views and classes, defining each class more precisely, and producing more detailed classifications.
About Dataset(s)
Looking for or generating more realistic datasets for training and testing that stay consistent between the 3D models and the real world.
Adding more challenging situations and conditions to the datasets besides shade/shadow and occlusion.
Sources
Models
MVCNN: https://github.com/suhangpro/mvcnn
View-GCN: https://github.com/weixmath/view-GCN
Datasets
ModelNet: https://modelnet.cs.princeton.edu/
Tools
Blender tools:
https://github.com/zeaggler/ModelNet_Blender_OFF2Multiview
Personal/Prepared sources
Datasets:
ModelNet40 - 20views - shade: https://drive.google.com/file/d/1EK7ApY3f_LAy8x1GlFfFFDVnpg8wJIaS/view?usp=sharing
ModelNet40 - 20views - Noshade: https://drive.google.com/file/d/1AO_RQGQ3_aoXpbqzdpGXUC6x4tafuuZy/view?usp=sharing
Pretrained .pth files collection with different configs: https://drive.google.com/file/d/1ejeF8C7n47Chzkt-6iw4NJtkntbWAnyc/view?usp=sharing
Output .csv files collection: https://drive.google.com/file/d/1iRep8bsAQ0ze1BvbPyoii5JKi81qYK_U/view?usp=sharing
Hao Cao
University of Liverpool, Computer Science
Comp 390 Project
sghcao6@liverpool.ac.uk