
VRAM/Speed tests #6

Open
brian6091 opened this issue Dec 7, 2022 · 2 comments
brian6091 commented Dec 7, 2022

Tesla T4

  • GPU=14396/15109MiB

  • 3.66s/it training, 1.08s/it inference

  • BATCH_SIZE=4

  • TRAIN_TEXT_ENCODER

  • USE_8BIT_ADAM

  • FP16

  • GRADIENT_CHECKPOINTING

  • GRADIENT_ACCUMULATION_STEPS=1

  • USE_EMA=False

  • RESOLUTION=512

  • No errors or warnings with xformers-0.0.15.dev0+189828c

diffusers==0.9.0
accelerate==0.14.0
torchvision @ https://download.pytorch.org/whl/cu116/torchvision-0.14.0%2Bcu116-cp38-cp38-linux_x86_64.whl
transformers==4.25.1
xformers @ https://github.com/camenduru/stable-diffusion-webui-colab/releases/download/0.0.15/xformers-0.0.15.dev0+189828c.d20221207-cp38-cp38-linux_x86_64.whl
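The step times above are reported in seconds per iteration, so the effective throughput depends on the batch size. A minimal helper (not part of the training script; just arithmetic on the numbers reported above) to convert s/it into images per second:

```python
def throughput(sec_per_it: float, batch_size: int) -> float:
    """Images processed per second for a step time reported in s/it."""
    return batch_size / sec_per_it

# T4 numbers from above: 3.66 s/it training at BATCH_SIZE=4
print(round(throughput(3.66, 4), 2))  # ~1.09 images/s
```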

Output of accelerate env:

  • Accelerate version: 0.14.0
  • Platform: Linux-5.10.133+-x86_64-with-glibc2.27
  • Python version: 3.8.15
  • Numpy version: 1.21.6
  • PyTorch version (GPU?): 1.13.0+cu116 (True)

brian6091 commented Dec 8, 2022

A100-SXM4-40GB

  • GPU=31142/40536MiB (32814 MiB after first save, 33302 MiB after second save)

  • 1.03s/it training, 3.30s/it inference

  • BATCH_SIZE=4

  • TRAIN_TEXT_ENCODER

  • USE_8BIT_ADAM

  • FP16

  • GRADIENT_CHECKPOINTING

  • GRADIENT_ACCUMULATION_STEPS=1

  • USE_EMA=False

  • RESOLUTION=512

  • Warnings with xformers-0.0.15.dev0+4c06c79 (compiled on A10G)

  • https://github.com/camenduru/stable-diffusion-webui-colab/releases/download/0.0.15/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl

  • /usr/local/lib/python3.8/dist-packages/xformers/_C.so: undefined symbol: _ZNK3c104impl13OperatorEntry20reportSignatureErrorENS0_12CppSignatureE
    WARNING:xformers:Need to compile C++ extensions to get sparse attention support. Please run python setup.py build develop
    /usr/local/lib/python3.8/dist-packages/diffusers/models/attention.py:433: UserWarning: Could not enable memory efficient attention. Make sure xformers is installed correctly and a GPU is available: No such operator xformers::efficient_attention_forward_cutlass - did you forget to build xformers with python setup.py develop?

diffusers==0.9.0
accelerate==0.14.0
torchvision @ https://download.pytorch.org/whl/cu116/torchvision-0.14.0%2Bcu116-cp38-cp38-linux_x86_64.whl
transformers==4.25.1
xformers @ https://github.com/camenduru/stable-diffusion-webui-colab/releases/download/0.0.15/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl
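The warning above shows that a wheel compiled for a different GPU (A10G) imports fine but fails when diffusers tries to enable memory-efficient attention. A hedged sketch of a guard around the pipeline-level `enable_xformers_memory_efficient_attention()` call (present in diffusers of this era; `pipe` is assumed to be any diffusers pipeline) so a bad wheel degrades to default attention instead of crashing:

```python
def enable_xformers_safely(pipe) -> bool:
    """Try to switch a diffusers pipeline to xformers memory-efficient
    attention; fall back to default attention if the wheel was built
    against a different GPU architecture or PyTorch ABI."""
    try:
        pipe.enable_xformers_memory_efficient_attention()
        return True
    except Exception as exc:  # e.g. "No such operator ...cutlass"
        print(f"xformers unavailable, using default attention: {exc}")
        return False
```

Note that with a mismatched wheel the failure may only surface as a warning at attention time (as in the log above), so checking speed/VRAM against a non-xformers run is still the most reliable test.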

Output of accelerate env:

  • Accelerate version: 0.14.0
  • Platform: Linux-5.10.133+-x86_64-with-glibc2.27
  • Python version: 3.8.15
  • Numpy version: 1.21.6
  • PyTorch version (GPU?): 1.13.0+cu116 (True)
  • Accelerate default config:
    Not found


brian6091 commented Dec 8, 2022

A100-SXM4-40GB

  • GPU=16168/40536MiB
  • 1.23s/it training, 5.83 it/s inference
  • BATCH_SIZE=4
  • TRAIN_TEXT_ENCODER
  • USE_8BIT_ADAM
  • FP16
  • GRADIENT_CHECKPOINTING
  • GRADIENT_ACCUMULATION_STEPS=1
  • USE_EMA=False
  • RESOLUTION=512
  • No errors or warnings with xformers-0.0.15.dev0+4c06c79 (recompiled wheel, linked below)

Description: Ubuntu 18.04.6 LTS
diffusers==0.9.0
torchvision @ https://download.pytorch.org/whl/cu116/torchvision-0.14.0%2Bcu116-cp38-cp38-linux_x86_64.whl
transformers==4.25.1
xformers @ https://github.com/brian6091/xformers-wheels/releases/download/0.0.15.dev0%2B4c06c79/xformers-0.0.15.dev0+4c06c79.d20221205-cp38-cp38-linux_x86_64.whl

Output of accelerate env:

  • Accelerate version: 0.14.0
  • Platform: Linux-5.10.133+-x86_64-with-glibc2.27
  • Python version: 3.8.15
  • Numpy version: 1.21.6
  • PyTorch version (GPU?): 1.13.0+cu116 (True)
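Comparing this run (working xformers) against the previous A100 run (broken wheel) quantifies the gain. A small helper, again just arithmetic on the figures reported in the two comments above:

```python
def pct_saving(before_mib: int, after_mib: int) -> float:
    """Percent reduction in peak VRAM between two runs."""
    return 100.0 * (before_mib - after_mib) / before_mib

# A100 peak VRAM: 31142 MiB without working xformers vs 16168 MiB with it
print(round(pct_saving(31142, 16168), 1))  # ~48.1% less VRAM
```

Training step time also improved only modestly (1.03 → 1.23 s/it is within run-to-run variation given the other config changes), so the headline benefit of a correctly built wheel here is the roughly halved memory footprint.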
