
[Internal] Performance Consistency Check Leaderboard #53

Open
Frozenmad opened this issue Nov 2, 2021 · 10 comments

Frozenmad commented Nov 2, 2021

This issue is created to check whether the library reproduces the performance of the natively implemented models.

WARNING: These are not evaluation results for this library. For AutoGL benchmarks, please see the provided examples.

Guide to developers

What do we mean by checking performance consistency?

First, remember that a performance inconsistency is not necessarily caused by our implementation. Sometimes you need to increase the number of repeats, or change the range of seeds, to see whether the results match under the "same" setting.

If that does not resolve the gap, carefully check whether your code introduces any unintended behavior. There is also a chance that the performance-check code itself is incorrect; in that case, please report it to @Frozenmad. A rough sketch of such a repeated check is shown below.
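
For reference, a minimal sketch of such a repeated comparison; `train_and_eval` is a hypothetical callable (not part of the test scripts) that trains one model for a given seed and returns its test accuracy:

```python
import numpy as np

def run_consistency_check(train_and_eval, repeats=10, seeds=None):
    """Run the same training routine once per seed and report mean / std,
    so a native implementation and its AutoGL wrapper can be compared
    under the "same" setting."""
    seeds = seeds if seeds is not None else range(repeats)
    accs = np.asarray([train_and_eval(seed) for seed in seeds])
    return accs.mean(), accs.std()

# Hypothetical usage: compare the native implementation against the wrapped one.
# mean_native, std_native = run_consistency_check(native_train_and_eval)
# mean_autogl, std_autogl = run_consistency_check(autogl_train_and_eval)
```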

Note

All performance check results are listed below. Each entry is reported as mean ~ standard deviation over the repeats, with the average time per run in parentheses. Performance inconsistencies are shown in bold in the tables.

Frozenmad changed the title from "[Internal] Performance Check Leaderboard" to "[Internal] Performance Consistency Check Leaderboard" on Nov 2, 2021
Frozenmad commented Nov 2, 2021

[DGL] Homogeneous Node Classification

Starting cmd:

python test/performance/node_classification/dgl/xxx.py --model gcn/gat/sage --repeat 10 --dataset Cora/PubMed/CiteSeer
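
The gcn/gat/sage and Cora/PubMed/CiteSeer arguments are placeholders for individual runs. A minimal sweep sketch over all combinations (the xxx.py script name is kept as the placeholder used above, not a real path):

```python
import itertools
import subprocess

# Hypothetical sweep over the placeholders in the command above; the
# script name ("xxx.py") is left as the placeholder used in this issue.
SCRIPT = "test/performance/node_classification/dgl/xxx.py"
for model, dataset in itertools.product(["gcn", "gat", "sage"],
                                        ["Cora", "CiteSeer", "PubMed"]):
    subprocess.run(["python", SCRIPT, "--model", model,
                    "--repeat", "10", "--dataset", dataset], check=True)
```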

Environment: Tesla V100S-PCIE-32GB

| model | cora | citeseer | pubmed |
| --- | --- | --- | --- |
| base - gcn | 77.93 ~ 1.43 (1.57s/it) | 63.59 ~ 1.81 (1.56s/it) | 75.91 ~ 0.73 (1.67s/it) |
| model - gcn | 77.93 ~ 1.43 (1.63s/it) | 63.59 ~ 1.81 (1.70s/it) | 75.91 ~ 0.73 (1.76s/it) |
| model (decouple) - gcn | 77.93 ~ 1.43 (1.60s/it) | 63.59 ~ 1.81 (1.58s/it) | 75.91 ~ 0.73 (1.59s/it) |
| trainer - gcn | 77.93 ~ 1.43 (1.94s/it) | 63.59 ~ 1.81 (1.96s/it) | 75.91 ~ 0.73 (2.02s/it) |
| trainer + dataset - gcn | 77.93 ~ 1.43 (1.97s/it) | 63.59 ~ 1.81 (1.97s/it) | 75.91 ~ 0.73 (1.96s/it) |
| solver - gcn | 77.93 ~ 1.43 (2.04s/it) | 63.59 ~ 1.81 (1.99s/it) | 75.91 ~ 0.73 (2.00s/it) |
| base - gat | 81.41 ~ 0.80 (2.21s/it) | 67.51 ~ 1.03 (2.29s/it) | 75.55 ~ 0.91 (2.35s/it) |
| model - gat | 81.41 ~ 0.80 (2.39s/it) | 67.51 ~ 1.03 (2.35s/it) | 75.55 ~ 0.91 (2.53s/it) |
| model (decouple) - gat | 81.41 ~ 0.80 (2.20s/it) | 67.51 ~ 1.03 (2.53s/it) | 75.55 ~ 0.91 (2.38s/it) |
| trainer - gat | 81.41 ~ 0.80 (2.83s/it) | 67.51 ~ 1.03 (2.90s/it) | 75.55 ~ 0.91 (2.94s/it) |
| trainer + dataset - gat | 81.41 ~ 0.80 (2.85s/it) | 67.51 ~ 1.03 (2.92s/it) | 75.55 ~ 0.91 (3.04s/it) |
| solver - gat | 81.41 ~ 0.80 (2.95s/it) | 67.51 ~ 1.03 (2.84s/it) | 75.55 ~ 0.91 (3.05s/it) |
| base - sage | 81.23 ~ 0.52 (1.20s/it) | 69.51 ~ 1.12 (1.19s/it) | 76.25 ~ 0.43 (1.27s/it) |
| model - sage | 81.23 ~ 0.52 (1.19s/it) | 69.50 ~ 1.14 (1.18s/it) | 76.25 ~ 0.43 (1.27s/it) |
| model (decouple) - sage | 81.23 ~ 0.52 (1.19s/it) | 69.50 ~ 1.14 (1.27s/it) | 76.25 ~ 0.43 (1.34s/it) |
| trainer - sage | 81.23 ~ 0.52 (1.55s/it) | 69.50 ~ 1.14 (1.58s/it) | 76.25 ~ 0.43 (1.67s/it) |
| trainer + dataset - sage | 81.23 ~ 0.52 (1.53s/it) | 69.50 ~ 1.14 (1.58s/it) | 76.25 ~ 0.43 (1.65s/it) |
| solver - sage | 81.23 ~ 0.52 (1.57s/it) | 69.50 ~ 1.14 (1.61s/it) | 76.25 ~ 0.43 (1.64s/it) |

general502570 commented Nov 2, 2021

[PYG] Homogeneous Node Classification

Starting cmd:

python test/performance/node_classification/pyg/xxx.py --model gcn/gat/sage --repeat 10 --dataset Cora/PubMed/CiteSeer

Environment: Tesla V100S-PCIE-32GB

| model | cora | citeseer | pubmed |
| --- | --- | --- | --- |
| base - gcn | 79.92 ~ 0.45 (1.15s/it) | 67.13 ~ 1.71 (1.12s/it) | 76.74 ~ 0.36 (1.12s/it) |
| model - gcn | 79.92 ~ 0.45 (1.12s/it) | 67.13 ~ 1.71 (1.13s/it) | 76.74 ~ 0.36 (1.16s/it) |
| model (decouple) - gcn | 79.92 ~ 0.45 (1.14s/it) | 67.13 ~ 1.71 (1.16s/it) | 76.74 ~ 0.36 (1.22s/it) |
| trainer - gcn | 79.93 ~ 0.45 (1.42s/it) | 67.13 ~ 1.71 (1.43s/it) | 76.74 ~ 0.36 (1.47s/it) |
| trainer + dataset - gcn | 79.92 ~ 0.45 (1.43s/it) | 67.13 ~ 1.71 (1.42s/it) | 76.74 ~ 0.36 (1.42s/it) |
| solver - gcn | 79.92 ~ 0.45 (1.53s/it) | 67.13 ~ 1.71 (1.60s/it) | 76.74 ~ 0.36 (1.53s/it) |
| base - gat | 81.80 ~ 1.24 (1.73s/it) | 70.75 ~ 0.85 (1.94s/it) | 76.65 ~ 1.02 (1.86s/it) |
| model - gat | 81.80 ~ 1.24 (1.76s/it) | 70.75 ~ 0.85 (1.82s/it) | 76.65 ~ 1.02 (1.87s/it) |
| model (decouple) - gat | 81.80 ~ 1.24 (1.80s/it) | 70.75 ~ 0.85 (1.78s/it) | 76.65 ~ 1.02 (2.05s/it) |
| trainer - gat | 81.80 ~ 1.24 (2.31s/it) | 70.75 ~ 0.85 (2.28s/it) | 76.65 ~ 1.02 (2.40s/it) |
| trainer + dataset - gat | 81.80 ~ 1.24 (2.30s/it) | 70.75 ~ 0.85 (2.31s/it) | 76.65 ~ 1.02 (2.39s/it) |
| solver - gat | 81.80 ~ 1.24 (2.05s/it) | 70.75 ~ 0.85 (2.24s/it) | 76.65 ~ 1.02 (2.33s/it) |
| base - sage | 78.21 ~ 0.60 (1.14s/it) | 67.24 ~ 0.99 (1.18s/it) | 75.61 ~ 0.53 (1.34s/it) |
| model - sage | 78.21 ~ 0.60 (1.05s/it) | 67.24 ~ 0.99 (1.24s/it) | 75.61 ~ 0.53 (1.35s/it) |
| trainer - sage | 78.21 ~ 0.60 (1.24s/it) | 67.24 ~ 0.99 (1.48s/it) | 75.61 ~ 0.53 (1.63s/it) |
| trainer + dataset - sage | 78.21 ~ 0.60 (1.24s/it) | 67.24 ~ 0.99 (1.48s/it) | 75.62 ~ 0.51 (1.62s/it) |
| solver - sage | 78.21 ~ 0.60 (1.30s/it) | 67.24 ~ 0.99 (1.67s/it) | 75.62 ~ 0.51 (1.77s/it) |

Frozenmad commented Nov 19, 2021

[DGL] Heterogeneous Node Classification

Starting cmd:

python test/performance/node_classification/dgl/hetero_xxx.py --model xxx --repeat 10 --dataset xxx

Environment: [fill this env]

| model | ACM | ACM3025 | xxx |
| --- | --- | --- | --- |
| base - hgt | 0.4025 ~ 0.0055 (119.67s/it) | | |
| model - hgt | 0.4007 ~ 0.0051 (119.35s/it) | | |
| trainer - hgt | 0.3946 ~ 0.0067 (33.49s/it) | | |
| trainer + dataset - hgt | | | |
| solver - hgt | | | |
| base - heteroRGCN | 0.4033 ~ 0.0013 (16.20s/it) | | |
| model - heteroRGCN | 0.4043 ~ 0.0015 (14.90s/it) | | |
| trainer - heteroRGCN | 0.3995 ~ 0.0015 (7.20s/it) | | |
| trainer + dataset - heteroRGCN | | | |
| solver - heteroRGCN | | | |
| base - han | 0.9123 ~ 0.0072 (43.62s/it) | 0.8655 ~ 0.0139 (44.58s/it) | |
| model - han | 0.9000 ~ 0.0099 (159.69s/it) | | |
| trainer - han | 0.9048 ~ 0.0055 (92.51s/it) | | |
| trainer + dataset - han | | | |
| solver - han | | | |

Frozenmad commented Nov 20, 2021

[PyG] Homogeneous Graph Classification

Starting cmd:

python test/performance/graph_classification/pyg/xxx.py --model gin --repeat 10 --dataset MUTAG/COLLAB/IMDBBINARY

Environment: Tesla V100S-PCIE-32GB

| model | MUTAG | COLLAB | IMDBBINARY |
| --- | --- | --- | --- |
| base - gin | 82.31 ~ 8.63 (6.68s/it) | | |
| model - gin | 89.23 ~ 4.49 (5.87s/it) | | |
| trainer - gin | 80.00 ~ 7.05 (6.14s/it) | | |
| trainer + dataset - gin | 76.92 ~ 7.50 (5.42s/it) | | |
| solver - gin | 90.38 ~ 6.26 (5.91s/it) | | |

Environment: Tesla V100S-PCIE-32GB
Considering the randomness in PyG, results over 100 repeats are reported below (a seeding sketch follows the table):

| model | MUTAG | COLLAB | IMDBBINARY |
| --- | --- | --- | --- |
| base - gin | 85.27 ~ 6.66 (4.08s/it) | | |
| model - gin | 88.77 ~ 5.64 (4.79s/it) | | |
| trainer - gin | 81.04 ~ 9.28 (5.41s/it) | | |
| trainer + dataset - gin | 80.42 ~ 9.56 (5.24s/it) | | |
| solver - gin | 88.69 ~ 7.11 (5.05s/it) | | |
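
One way to tame that run-to-run randomness is to fix every random source before each repeat. A minimal sketch, assuming standard PyTorch seeding (not necessarily what the test scripts do):

```python
import random
import numpy as np
import torch

def set_all_seeds(seed: int) -> None:
    """Fix the Python, NumPy, and PyTorch RNGs so repeated runs are comparable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels trade speed for reproducibility.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```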

general502570 commented Dec 13, 2021

[PyG] NAS

Environment: GeForce GTX TITAN X

| space | algorithm | cora | citeseer | pubmed |
| --- | --- | --- | --- | --- |
| graphnas | graphnas | 81.5 | 70.8 | |
| graphnas | random | 81.1 | 70.0 | |
| singlepath | enas | 82.3 | 69.9 | |
| singlepath | darts | 81.9 | 72.2 | |

Frozenmad commented Dec 21, 2021

[DGL] Homogeneous Graph Classification

Starting cmd:

python test/performance/graph_classification/dgl/xxx.py --repeat 10 --dataset MUTAG

Environment: Tesla V100S-PCIE-32GB

| model | MUTAG | COLLAB | IMDBBINARY |
| --- | --- | --- | --- |
| base - gin | 89.62 ~ 6.21 (14.98s/it) | 72.86 ~ 1.80 (375.67s/it) | 68.80 ~ 2.68 (69.53s/it) |
| model - gin | 89.62 ~ 6.21 (15.07s/it) | 72.86 ~ 1.80 (366.30s/it) | 68.80 ~ 2.68 (74.64s/it) |
| trainer - gin | 89.62 ~ 6.21 (15.71s/it) | 72.86 ~ 1.80 (396.29s/it) | 68.80 ~ 2.68 (69.95s/it) |
| trainer + dataset - gin | 89.62 ~ 6.21 (15.50s/it) | 72.86 ~ 1.80 (482.89s/it) | 68.80 ~ 2.68 (74.36s/it) |
| solver - gin | 89.62 ~ 6.21 (15.89s/it) | 72.86 ~ 1.80 (481.45s/it) | 68.80 ~ 2.68 (74.02s/it) |

auroraToT commented Dec 22, 2021

[PYG] Homogeneous Link Prediction

Starting cmd:

python test/performance/link_prediction/pyg/xxx.py --model gcn/gat/sage --repeat 10 --dataset Cora/PubMed/CiteSeer

Environment: Tesla V100S-PCIE-32GB

| model | cora | citeseer | pubmed |
| --- | --- | --- | --- |
| base - gcn | 90.44 ~ 0.91 (2.20s/it) | 90.30 ~ 0.79 (2.29s/it) | 95.48 ~ 0.22 (24.06s/it) |
| model - gcn | 90.44 ~ 0.91 (2.24s/it) | 90.30 ~ 0.79 (2.25s/it) | 95.48 ~ 0.22 (22.83s/it) |
| model_decouple - gcn | 90.44 ~ 0.91 (2.20s/it) | 90.30 ~ 0.80 (2.26s/it) | 95.48 ~ 0.22 (23.53s/it) |
| trainer - gcn | 90.44 ~ 0.91 (2.00s/it) | 90.30 ~ 0.79 (2.00s/it) | 95.48 ~ 0.22 (22.88s/it) |
| trainer + dataset - gcn | 90.44 ~ 0.91 (2.01s/it) | 90.30 ~ 0.79 (2.03s/it) | 95.48 ~ 0.22 (24.25s/it) |
| solver - gcn | 90.44 ~ 0.93 (2.56s/it) | 90.30 ~ 0.79 (2.61s/it) | 95.48 ~ 0.22 (26.49s/it) |
| base - gat | 90.72 ~ 0.79 (2.46s/it) | 90.10 ~ 0.84 (2.62s/it) | 91.72 ~ 0.44 (23.11s/it) |
| model - gat | 90.72 ~ 0.79 (2.49s/it) | 90.10 ~ 0.84 (2.54s/it) | 91.72 ~ 0.44 (22.53s/it) |
| model_decouple - gat | 90.72 ~ 0.79 (2.44s/it) | 90.10 ~ 0.84 (2.57s/it) | 91.72 ~ 0.44 (23.85s/it) |
| trainer - gat | 90.72 ~ 0.79 (2.26s/it) | 90.10 ~ 0.84 (2.43s/it) | 91.72 ~ 0.44 (22.56s/it) |
| trainer + dataset - gat | 90.72 ~ 0.79 (2.25s/it) | 90.10 ~ 0.84 (2.38s/it) | 91.72 ~ 0.44 (23.03s/it) |
| solver - gat | 90.72 ~ 0.79 (2.71s/it) | 90.10 ~ 0.84 (3.00s/it) | 91.72 ~ 0.44 (26.98s/it) |
| base - sage | 88.59 ~ 0.99 (1.98s/it) | 84.44 ~ 1.47 (2.23s/it) | 87.11 ~ 1.20 (22.91s/it) |
| model - sage | 88.52 ~ 1.04 (1.99s/it) | 84.44 ~ 1.47 (2.24s/it) | 87.11 ~ 1.20 (22.78s/it) |
| model_decouple - sage | 88.53 ~ 1.05 (1.97s/it) | 84.42 ~ 1.46 (2.16s/it) | 87.11 ~ 1.20 (22.44s/it) |
| trainer - sage | 88.57 ~ 0.98 (1.92s/it) | 84.43 ~ 1.46 (2.16s/it) | 87.11 ~ 1.20 (21.41s/it) |
| trainer + dataset - sage | 88.58 ~ 0.99 (1.91s/it) | 84.42 ~ 1.45 (2.13s/it) | 87.10 ~ 1.20 (23.60s/it) |
| solver - sage | 88.58 ~ 0.99 (2.42s/it) | 84.42 ~ 1.45 (2.59s/it) | 87.10 ~ 1.20 (25.82s/it) |

@Frozenmad

[DGL] Link Prediction

Starting cmd:

python test/performance/link_prediction/dgl/xxx.py --model gcn/gat/sage --repeat 10 --dataset Cora/PubMed/CiteSeer

Environment: Tesla V100S-PCIE-32GB

| model | cora | citeseer | pubmed |
| --- | --- | --- | --- |
| base - gcn | 87.44 ~ 1.72 (1.51s/it) | 84.79 ~ 2.24 (1.80s/it) | 91.23 ~ 1.87 (8.17s/it) |
| model - gcn | 87.44 ~ 1.72 (1.57s/it) | 84.79 ~ 2.24 (1.72s/it) | 91.23 ~ 1.87 (8.09s/it) |
| trainer - gcn | 87.44 ~ 1.72 (2.08s/it) | 84.79 ~ 2.24 (2.83s/it) | 91.23 ~ 1.87 (9.17s/it) |
| trainer + dataset - gcn | 87.44 ~ 1.72 (1.71s/it) | 84.79 ~ 2.24 (2.33s/it) | 91.23 ~ 1.87 (9.14s/it) |
| solver - gcn | 87.44 ~ 1.72 (1.75s/it) | 84.79 ~ 2.24 (2.46s/it) | 91.23 ~ 1.87 (9.74s/it) |
| base - gat | 92.39 ~ 0.40 (1.83s/it) | 91.69 ~ 0.96 (1.98s/it) | 75.55 ~ 0.91 (8.51s/it) |
| model - gat | 92.39 ~ 0.40 (2.02s/it) | 91.69 ~ 0.96 (1.90s/it) | 75.55 ~ 0.91 (8.53s/it) |
| trainer - gat | 92.39 ~ 0.40 (2.47s/it) | 91.69 ~ 0.96 (3.27s/it) | 75.55 ~ 0.91 (9.39s/it) |
| trainer + dataset - gat | 92.39 ~ 0.40 (2.45s/it) | 91.69 ~ 0.96 (3.12s/it) | 75.55 ~ 0.91 (9.48s/it) |
| solver - gat | 92.39 ~ 0.40 (2.13s/it) | 91.69 ~ 0.96 (2.94s/it) | 75.55 ~ 0.91 (9.36s/it) |
| base - sage | 88.49 ~ 0.91 (1.46s/it) | 87.36 ~ 0.74 (1.49s/it) | 76.25 ~ 0.43 (8.06s/it) |
| model - sage | 88.49 ~ 0.91 (1.49s/it) | 87.36 ~ 0.74 (1.61s/it) | 76.25 ~ 0.43 (8.09s/it) |
| trainer - sage | 88.49 ~ 0.91 (1.59s/it) | 87.36 ~ 0.74 (2.58s/it) | 76.25 ~ 0.43 (8.95s/it) |
| trainer + dataset - sage | 88.49 ~ 0.91 (1.70s/it) | 87.36 ~ 0.74 (2.51s/it) | 76.25 ~ 0.43 (8.96s/it) |
| solver - sage | 88.49 ~ 0.91 (1.51s/it) | 87.36 ~ 0.74 (2.30s/it) | 76.25 ~ 0.43 (9.74s/it) |

BeiniXie commented Aug 14, 2022

[PYG] Robust Model under Mettack

Starting cmd:

python test/performance/robust_model/model.py --model gcn/xxx --repeat 10 --dataset Cora/PubMed/CiteSeer

| model | cora (ptb rate 0) | cora (ptb rate 5%) | citeseer (ptb rate 0) | citeseer (ptb rate 5%) | pubmed (ptb rate 0) | pubmed (ptb rate 5%) |
| --- | --- | --- | --- | --- | --- | --- |
| base - gcn | | | | | | |
| model - gcn | | | | | | |
| base - gcnsvd | | | | | | |
| model - gcnsvd | | | | | | |
| base - gnnjaccard | | | | | | |
| model - gnnjaccard | | | | | | |
| base - robustgcn | | | | | | |
| model - robustgcn | | | | | | |
| base - gnnguard | | | | | | |
| model - gnnguard | | | | | | |

[PYG] Robust Model under Mettack

Starting cmd:

python test/performance/robust_model/model.py --model gcn/xxx --repeat 10 --dataset Cora/PubMed/CiteSeer

| model | cora (ptb rate 0) | cora (ptb rate 20%) | citeseer (ptb rate 0) | citeseer (ptb rate 20%) | pubmed (ptb rate 0) | pubmed (ptb rate 20%) |
| --- | --- | --- | --- | --- | --- | --- |
| base - gcn | 0.8351 ~ 0.003 | 0.7884 ~ 0.008 | 0.7340 ~ 0.006 | 0.6963 ~ 0.011 | 0.8515 ~ 0.0007 | 0.7512 ~ 0.0026 |
| model - gcn | 0.8351 ~ 0.003 | 0.7944 ~ 0.005 | 0.7327 ~ 0.008 | 0.7327 ~ 0.008 | 0.8553 ~ 0.0007 | 0.7412 ~ 0.0057 |
| base - gnnguard | 0.7810 ~ 0.006 | 0.7660 ~ 0.005 | 0.6900 ~ 0.013 | 0.6940 ~ 0.008 | 0.8538 ~ 0.0049 | 0.8439 ~ 0.0031 |
| model - gnnguard | 0.7952 ~ 0.003 | 0.7711 ~ 0.003 | 0.7114 ~ 0.008 | 0.7120 ~ 0.002 | 0.8536 ~ 0.0014 | 0.8428 ~ 0.0010 |

defineZYP commented Sep 20, 2022

SSL Test

train set performance

| Model | MUTAG | PTC-MR | PROTEINS | NCI1 |
| --- | --- | --- | --- | --- |
| Ours-GCN | 0.6389 ~ 0.0373 | 0.6382 ~ 0.0188 | 0.7874 ~ 0.0167 | 0.8102 ~ 0.0242 |
| Ours-GIN | 0.6278 ~ 0.1407 | 0.6941 ~ 0.0606 | 0.7568 ~ 0.0783 | 0.8107 ~ 0.0384 |
| Ours-solver + GCN | 0.8222 ~ 0.0222 | 0.7059 ~ 0.0416 | 0.8018 ~ 0.0221 | 0.9130 ~ 0.0125 |
| Ours-solver + GIN | 0.8667 ~ 0.0667 | 0.7059 ~ 0.1162 | 0.7657 ~ 0.0138 | 0.8875 ~ 0.0214 |

valid set performance

| Model | MUTAG | PTC-MR | PROTEINS | NCI1 |
| --- | --- | --- | --- | --- |
| Ours-GCN | 0.7394 ~ 0.0077 | 0.5817 ~ 0.0098 | 0.7323 ~ 0.0138 | 0.7246 ~ 0.0041 |
| Ours-GIN | 0.7735 ~ 0.0274 | 0.5755 ~ 0.0174 | 0.6579 ~ 0.0102 | 0.6956 ~ 0.0095 |
| Ours-solver + GCN | 0.8197 ~ 0.0057 | 0.6481 ~ 0.0031 | 0.7677 ~ 0.0008 | 0.7167 ~ 0.0030 |
| Ours-solver + GIN | 0.8424 ~ 0.0130 | 0.6266 ~ 0.0114 | 0.7271 ~ 0.0049 | 0.7167 ~ 0.0020 |

test set performance

| Model | MUTAG | PTC-MR | PROTEINS | NCI1 |
| --- | --- | --- | --- | --- |
| GraphCL | 0.8680 ~ 0.0134 | - | 0.7417 ~ 0.0034 | 0.7463 ~ 0.0025 |
| Ours-GCN | 0.8155 ~ 0.0457 | 0.6063 ~ 0.0382 | 0.7362 ~ 0.0238 | 0.7440 ~ 0.0050 |
| Ours-GIN | 0.8558 ~ 0.0621 | 0.5133 ~ 0.0245 | 0.7306 ~ 0.0163 | 0.7117 ~ 0.0091 |
| Ours-solver + GCN | 0.8600 ~ 0.0330 | 0.5342 ~ 0.0332 | 0.7471 ~ 0.0092 | 0.7217 ~ 0.0113 |
| Ours-solver + GIN | 0.8694 ~ 0.0134 | 0.5192 ~ 0.0477 | 0.7231 ~ 0.0144 | 0.7224 ~ 0.0103 |
