rupeshs committed
Commit f254d4c · 1 Parent(s): 21ce3db

removed old files

This view is limited to 50 files because the commit contains too many changes; see the raw diff for the complete change set.
Files changed (50)
  1. .gitattributes +0 -2
  2. .gitignore +0 -8
  3. LICENSE +0 -21
  4. THIRD-PARTY-LICENSES +0 -143
  5. __init__.py +0 -0
  6. app.py +0 -4
  7. app_settings.py +0 -81
  8. backend/__init__.py +0 -0
  9. backend/__pycache__/__init__.cpython-311.pyc +0 -0
  10. backend/__pycache__/device.cpython-311.pyc +0 -0
  11. backend/__pycache__/image_saver.cpython-311.pyc +0 -0
  12. backend/__pycache__/lcm_models.cpython-311.pyc +0 -0
  13. backend/__pycache__/lcm_text_to_image.cpython-311.pyc +0 -0
  14. backend/device.py +0 -23
  15. backend/image_saver.py +0 -40
  16. backend/lcm_models.py +0 -11
  17. backend/lcm_text_to_image.py +0 -355
  18. backend/lcmdiffusion/pipelines/openvino/__pycache__/lcm_ov_pipeline.cpython-311.pyc +0 -0
  19. backend/lcmdiffusion/pipelines/openvino/__pycache__/lcm_scheduler.cpython-311.pyc +0 -0
  20. backend/lcmdiffusion/pipelines/openvino/lcm_ov_pipeline.py +0 -447
  21. backend/lcmdiffusion/pipelines/openvino/lcm_scheduler.py +0 -576
  22. backend/models/__pycache__/lcmdiffusion_setting.cpython-311.pyc +0 -0
  23. backend/models/lcmdiffusion_setting.py +0 -39
  24. backend/openvino/custom_ov_model_vae_decoder.py +0 -21
  25. backend/openvino/pipelines.py +0 -75
  26. backend/pipelines/lcm.py +0 -90
  27. backend/pipelines/lcm_lora.py +0 -25
  28. backend/safety_check.py +0 -17
  29. backend/tiny_decoder.py +0 -30
  30. benchmark-openvino.bat +0 -23
  31. benchmark.bat +0 -23
  32. configs/lcm-lora-models.txt +0 -4
  33. configs/lcm-models.txt +0 -8
  34. configs/openvino-lcm-models.txt +0 -9
  35. configs/stable-diffusion-models.txt +0 -7
  36. constants.py +0 -25
  37. context.py +0 -47
  38. controlnet_models/Readme.txt +0 -3
  39. docs/images/2steps-inference.jpg +0 -3
  40. docs/images/ARCGPU.png +0 -3
  41. docs/images/fastcpu-cli.png +0 -3
  42. docs/images/fastcpu-webui.png +0 -3
  43. docs/images/fastsdcpu-android-termux-pixel7.png +0 -3
  44. docs/images/fastsdcpu-api.png +0 -3
  45. docs/images/fastsdcpu-gui.jpg +0 -3
  46. docs/images/fastsdcpu-mac-gui.jpg +0 -3
  47. docs/images/fastsdcpu-screenshot.png +0 -3
  48. docs/images/fastsdcpu-webui.png +0 -3
  49. docs/images/fastsdcpu_claude.jpg +0 -3
  50. docs/images/fastsdcpu_flux_on_cpu.png +0 -3
.gitattributes DELETED
@@ -1,2 +0,0 @@
- *.jpg filter=lfs diff=lfs merge=lfs -text
- *.png filter=lfs diff=lfs merge=lfs -text
 
.gitignore DELETED
@@ -1,8 +0,0 @@
- env
- env_old
- *.bak
- *.pyc
- __pycache__
- results
- # excluding user settings for the GUI frontend
- configs/settings.yaml
 
LICENSE DELETED
@@ -1,21 +0,0 @@
- MIT License
-
- Copyright (c) 2023 Rupesh Sreeraman
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
 
THIRD-PARTY-LICENSES DELETED
@@ -1,143 +0,0 @@
- stablediffusion.cpp - MIT
-
- OpenVINO stablediffusion engine - Apache 2
-
- SD Turbo - STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE AGREEMENT
-
- MIT License
-
- Copyright (c) 2023 leejet
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- SOFTWARE.
-
- TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
- Definitions.
-
- "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.
-
- "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.
-
- "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.
-
- "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.
-
- "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.
-
- "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
-
- "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below).
-
- "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof.
-
- "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."
-
- "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work.
-
- Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form.
-
- Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
-
- Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions:
-
- (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and
-
- (b) You must cause any modified files to carry prominent notices stating that You changed the files; and
-
- (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
-
- (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.
-
- You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
-
- Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
-
- Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file.
-
- Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License.
-
- Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
-
- Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
-
- END OF TERMS AND CONDITIONS
-
- APPENDIX: How to apply the Apache License to your work.
-
- To apply the Apache License to your work, attach the following
- boilerplate notice, with the fields enclosed by brackets "[]"
- replaced with your own identifying information. (Don't include
- the brackets!) The text should be enclosed in the appropriate
- comment syntax for the file format. We also recommend that a
- file or class name and description of purpose be included on the
- same "printed page" as the copyright notice for easier
- identification within third-party archives.
- Copyright [yyyy] [name of copyright owner]
-
- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
-
- <http://www.apache.org/licenses/LICENSE-2.0>
- Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
-
- STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE AGREEMENT
- Dated: November 28, 2023
-
- By using or distributing any portion or element of the Models, Software, Software Products or Derivative Works, you agree to be bound by this Agreement.
-
- "Agreement" means this Stable Non-Commercial Research Community License Agreement.
-
- “AUP” means the Stability AI Acceptable Use Policy available at <https://stability.ai/use-policy>, as may be updated from time to time.
-
- "Derivative Work(s)” means (a) any derivative work of the Software Products as recognized by U.S. copyright laws and (b) any modifications to a Model, and any other model created which is based on or derived from the Model or the Model’s output. For clarity, Derivative Works do not include the output of any Model.
-
- “Documentation” means any specifications, manuals, documentation, and other written information provided by Stability AI related to the Software.
-
- "Licensee" or "you" means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity's behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf.
-
- “Model(s)" means, collectively, Stability AI’s proprietary models and algorithms, including machine-learning models, trained model weights and other elements of the foregoing, made available under this Agreement.
-
- “Non-Commercial Uses” means exercising any of the rights granted herein for the purpose of research or non-commercial purposes. Non-Commercial Uses does not include any production use of the Software Products or any Derivative Works.
-
- "Stability AI" or "we" means Stability AI Ltd. and its affiliates.
-
- "Software" means Stability AI’s proprietary software made available under this Agreement.
-
- “Software Products” means the Models, Software and Documentation, individually or in any combination.
-
- 1. License Rights and Redistribution.
-
- a. Subject to your compliance with this Agreement, the AUP (which is hereby incorporated herein by reference), and the Documentation, Stability AI grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty free and limited license under Stability AI’s intellectual property or other rights owned or controlled by Stability AI embodied in the Software Products to use, reproduce, distribute, and create Derivative Works of, the Software Products, in each case for Non-Commercial Uses only.
-
- b. You may not use the Software Products or Derivative Works to enable third parties to use the Software Products or Derivative Works as part of your hosted service or via your APIs, whether you are adding substantial additional functionality thereto or not. Merely distributing the Software Products or Derivative Works for download online without offering any related service (ex. by distributing the Models on HuggingFace) is not a violation of this subsection. If you wish to use the Software Products or any Derivative Works for commercial or production use or you wish to make the Software Products or any Derivative Works available to third parties via your hosted service or your APIs, contact Stability AI at <https://stability.ai/contact>.
-
- c. If you distribute or make the Software Products, or any Derivative Works thereof, available to a third party, the Software Products, Derivative Works, or any portion thereof, respectively, will remain subject to this Agreement and you must (i) provide a copy of this Agreement to such third party, and (ii) retain the following attribution notice within a "Notice" text file distributed as a part of such copies: "This Stability AI Model is licensed under the Stability AI Non-Commercial Research Community License, Copyright (c) Stability AI Ltd. All Rights Reserved.” If you create a Derivative Work of a Software Product, you may add your own attribution notices to the Notice file included with the Software Product, provided that you clearly indicate which attributions apply to the Software Product and you must state in the NOTICE file that you changed the Software Product and how it was modified.
-
- 2. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE SOFTWARE PRODUCTS AND ANY OUTPUT AND RESULTS THERE FROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE SOFTWARE PRODUCTS, DERIVATIVE WORKS OR ANY OUTPUT OR RESULTS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE SOFTWARE PRODUCTS, DERIVATIVE WORKS AND ANY OUTPUT AND RESULTS.
-
- 3. Limitation of Liability. IN NO EVENT WILL STABILITY AI OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF STABILITY AI OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
-
- 4. Intellectual Property.
-
- a. No trademark licenses are granted under this Agreement, and in connection with the Software Products or Derivative Works, neither Stability AI nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Software Products or Derivative Works.
-
- b. Subject to Stability AI’s ownership of the Software Products and Derivative Works made by or for Stability AI, with respect to any Derivative Works that are made by you, as between you and Stability AI, you are and will be the owner of such Derivative Works
-
- c. If you institute litigation or other proceedings against Stability AI (including a cross-claim or counterclaim in a lawsuit) alleging that the Software Products, Derivative Works or associated outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Stability AI from and against any claim by any third party arising out of or related to your use or distribution of the Software Products or Derivative Works in violation of this Agreement.
-
- 5. Term and Termination. The term of this Agreement will commence upon your acceptance of this Agreement or access to the Software Products and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Stability AI may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of any Software Products or Derivative Works. Sections 2-4 shall survive the termination of this Agreement.
 
__init__.py DELETED
File without changes
app.py DELETED
@@ -1,4 +0,0 @@
- from frontend.webui.hf_demo import start_demo_text_to_image
-
- print("Starting HF demo text to image")
- start_demo_text_to_image(False)
 
app_settings.py DELETED
@@ -1,81 +0,0 @@
- import yaml
- from os import path, makedirs
- from models.settings import Settings
- from paths import FastStableDiffusionPaths
- from utils import get_models_from_text_file
- from constants import (
-     OPENVINO_LCM_MODELS_FILE,
-     LCM_LORA_MODELS_FILE,
-     SD_MODELS_FILE,
-     LCM_MODELS_FILE,
- )
- from copy import deepcopy
-
-
- class AppSettings:
-     def __init__(self):
-         self.config_path = FastStableDiffusionPaths().get_app_settings_path()
-         self._stable_diffsuion_models = ["Lykon/dreamshaper-8"]
-         self._lcm_lora_models = ["latent-consistency/lcm-lora-sdv1-5"]
-         self._openvino_lcm_models = ["rupeshs/sd-turbo-openvino"]
-         self._lcm_models = ["stabilityai/sd-turbo"]
-
-     @property
-     def settings(self):
-         return self._config
-
-     @property
-     def stable_diffsuion_models(self):
-         return self._stable_diffsuion_models
-
-     @property
-     def openvino_lcm_models(self):
-         return self._openvino_lcm_models
-
-     @property
-     def lcm_models(self):
-         return self._lcm_models
-
-     @property
-     def lcm_lora_models(self):
-         return self._lcm_lora_models
-
-     def load(self, skip_file=False):
-         if skip_file:
-             print("Skipping config file")
-             settings_dict = self._load_default()
-             self._config = Settings.parse_obj(settings_dict)
-         else:
-             if not path.exists(self.config_path):
-                 base_dir = path.dirname(self.config_path)
-                 if not path.exists(base_dir):
-                     makedirs(base_dir)
-                 try:
-                     print("Settings not found creating default settings")
-                     with open(self.config_path, "w") as file:
-                         yaml.dump(
-                             self._load_default(),
-                             file,
-                         )
-                 except Exception as ex:
-                     print(f"Error in creating settings : {ex}")
-                     exit()
-             try:
-                 with open(self.config_path) as file:
-                     settings_dict = yaml.safe_load(file)
-                     self._config = Settings.parse_obj(settings_dict)
-             except Exception as ex:
-                 print(f"Error in loading settings : {ex}")
-
-     def save(self):
-         try:
-             with open(self.config_path, "w") as file:
-                 tmp_cfg = deepcopy(self._config)
-                 tmp_cfg.lcm_diffusion_setting.init_image = None
-                 yaml.dump(tmp_cfg.dict(), file)
-         except Exception as ex:
-             print(f"Error in saving settings : {ex}")
-
-     def _load_default(self) -> dict:
-         defult_config = Settings()
-         return defult_config.dict()
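Note: app_settings.py was the YAML-backed settings store for the app. A minimal usage sketch (a hypothetical call site; the real callers lived in the frontend code, and the Settings model came from the removed models/settings.py):

    from app_settings import AppSettings

    app_settings = AppSettings()
    app_settings.load()  # writes a default configs/settings.yaml on first run
    print(app_settings.lcm_models)  # ["stabilityai/sd-turbo"] unless overridden
    app_settings.settings.lcm_diffusion_setting.image_width = 512
    app_settings.save()  # persists everything except init_image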
 
backend/__init__.py DELETED
File without changes
backend/__pycache__/__init__.cpython-311.pyc DELETED
Binary file (152 Bytes)
 
backend/__pycache__/device.cpython-311.pyc DELETED
Binary file (1.56 kB)
 
backend/__pycache__/image_saver.cpython-311.pyc DELETED
Binary file (2.27 kB)
 
backend/__pycache__/lcm_models.cpython-311.pyc DELETED
Binary file (542 Bytes)
 
backend/__pycache__/lcm_text_to_image.cpython-311.pyc DELETED
Binary file (12.8 kB)
 
backend/device.py DELETED
@@ -1,23 +0,0 @@
- import platform
- from constants import DEVICE
- import torch
- import openvino as ov
-
- core = ov.Core()
-
-
- def is_openvino_device() -> bool:
-     if DEVICE.lower() == "cpu" or DEVICE.lower()[0] == "g":
-         return True
-     else:
-         return False
-
-
- def get_device_name() -> str:
-     if DEVICE == "cuda" or DEVICE == "mps":
-         default_gpu_index = torch.cuda.current_device()
-         return torch.cuda.get_device_name(default_gpu_index)
-     elif platform.system().lower() == "darwin":
-         return platform.processor()
-     elif is_openvino_device():
-         return core.get_property(DEVICE.upper(), "FULL_DEVICE_NAME")
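Note: these two helpers routed the rest of the backend between the torch and OpenVINO code paths. A sketch of the intended use, assuming DEVICE comes from the constants.py removed in this same commit and holds "cpu", "gpu", "cuda", or "mps":

    from backend.device import is_openvino_device, get_device_name

    # "cpu" and anything starting with "g" (i.e. "gpu") select OpenVINO;
    # "cuda"/"mps" fall through to the torch path.
    runtime = "OpenVINO" if is_openvino_device() else "torch"
    print(f"{runtime} device: {get_device_name()}")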
 
backend/image_saver.py DELETED
@@ -1,40 +0,0 @@
- from os import path, mkdir
- from typing import Any
- from uuid import uuid4
- from backend.models.lcmdiffusion_setting import LCMDiffusionSetting
- import json
-
-
- class ImageSaver:
-     @staticmethod
-     def save_images(
-         output_path: str,
-         images: Any,
-         folder_name: str = "",
-         format: str = ".png",
-         lcm_diffusion_setting: LCMDiffusionSetting = None,
-     ) -> None:
-         gen_id = uuid4()
-
-         for index, image in enumerate(images):
-             if not path.exists(output_path):
-                 mkdir(output_path)
-
-             if folder_name:
-                 out_path = path.join(
-                     output_path,
-                     folder_name,
-                 )
-             else:
-                 out_path = output_path
-
-             if not path.exists(out_path):
-                 mkdir(out_path)
-             image.save(path.join(out_path, f"{gen_id}-{index+1}{format}"))
-             if lcm_diffusion_setting:
-                 with open(path.join(out_path, f"{gen_id}.json"), "w") as json_file:
-                     json.dump(
-                         lcm_diffusion_setting.model_dump(exclude="init_image"),
-                         json_file,
-                         indent=4,
-                     )
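Note: ImageSaver named each image <uuid>-<n>.png and, when settings were passed, wrote a <uuid>.json sidecar with the generation parameters next to the images. A hypothetical call site (the blank PIL images stand in for pipeline output):

    from PIL import Image
    from backend.image_saver import ImageSaver
    from backend.models.lcmdiffusion_setting import LCMDiffusionSetting

    images = [Image.new("RGB", (64, 64)), Image.new("RGB", (64, 64))]  # stand-ins
    ImageSaver.save_images(
        "results",
        images=images,
        folder_name="session-01",
        lcm_diffusion_setting=LCMDiffusionSetting(),  # dumped as the JSON sidecar
    )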
 
backend/lcm_models.py DELETED
@@ -1,11 +0,0 @@
- from typing import List
- from constants import LCM_DEFAULT_MODEL
-
-
- def get_available_models() -> List:
-     models = [
-         LCM_DEFAULT_MODEL,
-         "latent-consistency/lcm-sdxl",
-         "latent-consistency/lcm-ssd-1b",
-     ]
-     return models
 
backend/lcm_text_to_image.py DELETED
@@ -1,355 +0,0 @@
- from typing import Any
- from diffusers import LCMScheduler
- import torch
- from backend.models.lcmdiffusion_setting import LCMDiffusionSetting
- import numpy as np
- from constants import DEVICE
- from backend.models.lcmdiffusion_setting import LCMLora
- from backend.device import is_openvino_device
- from backend.openvino.pipelines import (
-     get_ov_text_to_image_pipeline,
-     ov_load_taesd,
-     get_ov_image_to_image_pipeline,
- )
- from backend.pipelines.lcm import (
-     get_lcm_model_pipeline,
-     load_taesd,
-     get_image_to_image_pipeline,
- )
- from backend.pipelines.lcm_lora import get_lcm_lora_pipeline
- from backend.models.lcmdiffusion_setting import DiffusionTask
- from image_ops import resize_pil_image
- from math import ceil
-
-
- class LCMTextToImage:
-     def __init__(
-         self,
-         device: str = "cpu",
-     ) -> None:
-         self.pipeline = None
-         self.use_openvino = False
-         self.device = ""
-         self.previous_model_id = None
-         self.previous_use_tae_sd = False
-         self.previous_use_lcm_lora = False
-         self.previous_ov_model_id = ""
-         self.previous_safety_checker = False
-         self.previous_use_openvino = False
-         self.img_to_img_pipeline = None
-         self.is_openvino_init = False
-         self.task_type = DiffusionTask.text_to_image
-         self.torch_data_type = (
-             torch.float32 if is_openvino_device() or DEVICE == "mps" else torch.float16
-         )
-         print(f"Torch datatype : {self.torch_data_type}")
-
-     def _pipeline_to_device(self):
-         print(f"Pipeline device : {DEVICE}")
-         print(f"Pipeline dtype : {self.torch_data_type}")
-         self.pipeline.to(
-             torch_device=DEVICE,
-             torch_dtype=self.torch_data_type,
-         )
-
-     def _add_freeu(self):
-         pipeline_class = self.pipeline.__class__.__name__
-         if isinstance(self.pipeline.scheduler, LCMScheduler):
-             if pipeline_class == "StableDiffusionPipeline":
-                 print("Add FreeU - SD")
-                 self.pipeline.enable_freeu(
-                     s1=0.9,
-                     s2=0.2,
-                     b1=1.2,
-                     b2=1.4,
-                 )
-             elif pipeline_class == "StableDiffusionXLPipeline":
-                 print("Add FreeU - SDXL")
-                 self.pipeline.enable_freeu(
-                     s1=0.6,
-                     s2=0.4,
-                     b1=1.1,
-                     b2=1.2,
-                 )
-
-     def _update_lcm_scheduler_params(self):
-         if isinstance(self.pipeline.scheduler, LCMScheduler):
-             self.pipeline.scheduler = LCMScheduler.from_config(
-                 self.pipeline.scheduler.config,
-                 beta_start=0.001,
-                 beta_end=0.01,
-             )
-
-     def init(
-         self,
-         device: str = "cpu",
-         lcm_diffusion_setting: LCMDiffusionSetting = LCMDiffusionSetting(),
-     ) -> None:
-         self.device = device
-         self.use_openvino = lcm_diffusion_setting.use_openvino
-         model_id = lcm_diffusion_setting.lcm_model_id
-         use_local_model = lcm_diffusion_setting.use_offline_model
-         use_tiny_auto_encoder = lcm_diffusion_setting.use_tiny_auto_encoder
-         use_lora = lcm_diffusion_setting.use_lcm_lora
-         lcm_lora: LCMLora = lcm_diffusion_setting.lcm_lora
-         ov_model_id = lcm_diffusion_setting.openvino_lcm_model_id
-
-         if lcm_diffusion_setting.diffusion_task == DiffusionTask.image_to_image.value:
-             lcm_diffusion_setting.init_image = resize_pil_image(
-                 lcm_diffusion_setting.init_image,
-                 lcm_diffusion_setting.image_width,
-                 lcm_diffusion_setting.image_height,
-             )
-
-         if (
-             self.pipeline is None
-             or self.previous_model_id != model_id
-             or self.previous_use_tae_sd != use_tiny_auto_encoder
-             or self.previous_lcm_lora_base_id != lcm_lora.base_model_id
-             or self.previous_lcm_lora_id != lcm_lora.lcm_lora_id
-             or self.previous_use_lcm_lora != use_lora
-             or self.previous_ov_model_id != ov_model_id
-             or self.previous_safety_checker != lcm_diffusion_setting.use_safety_checker
-             or self.previous_use_openvino != lcm_diffusion_setting.use_openvino
-             or self.previous_task_type != lcm_diffusion_setting.diffusion_task
-         ):
-             if self.use_openvino and is_openvino_device():
-                 if self.pipeline:
-                     del self.pipeline
-                     self.pipeline = None
-                 self.is_openvino_init = True
-                 if (
-                     lcm_diffusion_setting.diffusion_task
-                     == DiffusionTask.text_to_image.value
-                 ):
-                     print(f"***** Init Text to image (OpenVINO) - {ov_model_id} *****")
-                     self.pipeline = get_ov_text_to_image_pipeline(
-                         ov_model_id,
-                         use_local_model,
-                     )
-                 elif (
-                     lcm_diffusion_setting.diffusion_task
-                     == DiffusionTask.image_to_image.value
-                 ):
-                     print(f"***** Image to image (OpenVINO) - {ov_model_id} *****")
-                     self.pipeline = get_ov_image_to_image_pipeline(
-                         ov_model_id,
-                         use_local_model,
-                     )
-             else:
-                 if self.pipeline:
-                     del self.pipeline
-                     self.pipeline = None
-                 if self.img_to_img_pipeline:
-                     del self.img_to_img_pipeline
-                     self.img_to_img_pipeline = None
-
-                 if use_lora:
-                     print(
-                         f"***** Init LCM-LoRA pipeline - {lcm_lora.base_model_id} *****"
-                     )
-                     self.pipeline = get_lcm_lora_pipeline(
-                         lcm_lora.base_model_id,
-                         lcm_lora.lcm_lora_id,
-                         use_local_model,
-                         torch_data_type=self.torch_data_type,
-                     )
-                 else:
-                     print(f"***** Init LCM Model pipeline - {model_id} *****")
-                     self.pipeline = get_lcm_model_pipeline(
-                         model_id,
-                         use_local_model,
-                     )
-
-                 if (
-                     lcm_diffusion_setting.diffusion_task
-                     == DiffusionTask.image_to_image.value
-                 ):
-                     self.img_to_img_pipeline = get_image_to_image_pipeline(
-                         self.pipeline
-                     )
-                 self._pipeline_to_device()
-
-             if use_tiny_auto_encoder:
-                 if self.use_openvino and is_openvino_device():
-                     print("Using Tiny Auto Encoder (OpenVINO)")
-                     ov_load_taesd(
-                         self.pipeline,
-                         use_local_model,
-                     )
-                 else:
-                     print("Using Tiny Auto Encoder")
-                     if (
-                         lcm_diffusion_setting.diffusion_task
-                         == DiffusionTask.text_to_image.value
-                     ):
-                         load_taesd(
-                             self.pipeline,
-                             use_local_model,
-                             self.torch_data_type,
-                         )
-                     elif (
-                         lcm_diffusion_setting.diffusion_task
-                         == DiffusionTask.image_to_image.value
-                     ):
-                         load_taesd(
-                             self.img_to_img_pipeline,
-                             use_local_model,
-                             self.torch_data_type,
-                         )
-
-             if (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.image_to_image.value
-                 and lcm_diffusion_setting.use_openvino
-             ):
-                 self.pipeline.scheduler = LCMScheduler.from_config(
-                     self.pipeline.scheduler.config,
-                 )
-             else:
-                 self._update_lcm_scheduler_params()
-
-             if use_lora:
-                 self._add_freeu()
-
-             self.previous_model_id = model_id
-             self.previous_ov_model_id = ov_model_id
-             self.previous_use_tae_sd = use_tiny_auto_encoder
-             self.previous_lcm_lora_base_id = lcm_lora.base_model_id
-             self.previous_lcm_lora_id = lcm_lora.lcm_lora_id
-             self.previous_use_lcm_lora = use_lora
-             self.previous_safety_checker = lcm_diffusion_setting.use_safety_checker
-             self.previous_use_openvino = lcm_diffusion_setting.use_openvino
-             self.previous_task_type = lcm_diffusion_setting.diffusion_task
-             if (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.text_to_image.value
-             ):
-                 print(f"Pipeline : {self.pipeline}")
-             elif (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.image_to_image.value
-             ):
-                 if self.use_openvino and is_openvino_device():
-                     print(f"Pipeline : {self.pipeline}")
-                 else:
-                     print(f"Pipeline : {self.img_to_img_pipeline}")
-
-     def generate(
-         self,
-         lcm_diffusion_setting: LCMDiffusionSetting,
-         reshape: bool = False,
-     ) -> Any:
-         guidance_scale = lcm_diffusion_setting.guidance_scale
-         img_to_img_inference_steps = lcm_diffusion_setting.inference_steps
-         check_step_value = int(
-             lcm_diffusion_setting.inference_steps * lcm_diffusion_setting.strength
-         )
-         if (
-             lcm_diffusion_setting.diffusion_task == DiffusionTask.image_to_image.value
-             and check_step_value < 1
-         ):
-             img_to_img_inference_steps = ceil(1 / lcm_diffusion_setting.strength)
-             print(
-                 f"Strength: {lcm_diffusion_setting.strength},{img_to_img_inference_steps}"
-             )
-
-         if lcm_diffusion_setting.use_seed:
-             cur_seed = lcm_diffusion_setting.seed
-             if self.use_openvino:
-                 np.random.seed(cur_seed)
-             else:
-                 torch.manual_seed(cur_seed)
-
-         is_openvino_pipe = lcm_diffusion_setting.use_openvino and is_openvino_device()
-         if is_openvino_pipe:
-             print("Using OpenVINO")
-             if reshape and not self.is_openvino_init:
-                 print("Reshape and compile")
-                 self.pipeline.reshape(
-                     batch_size=-1,
-                     height=lcm_diffusion_setting.image_height,
-                     width=lcm_diffusion_setting.image_width,
-                     num_images_per_prompt=lcm_diffusion_setting.number_of_images,
-                 )
-                 self.pipeline.compile()
-
-             if self.is_openvino_init:
-                 self.is_openvino_init = False
-
-         if not lcm_diffusion_setting.use_safety_checker:
-             self.pipeline.safety_checker = None
-             if (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.image_to_image.value
-                 and not is_openvino_pipe
-             ):
-                 self.img_to_img_pipeline.safety_checker = None
-
-         if (
-             not lcm_diffusion_setting.use_lcm_lora
-             and not lcm_diffusion_setting.use_openvino
-             and lcm_diffusion_setting.guidance_scale != 1.0
-         ):
-             print("Not using LCM-LoRA so setting guidance_scale 1.0")
-             guidance_scale = 1.0
-
-         if lcm_diffusion_setting.use_openvino:
-             if (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.text_to_image.value
-             ):
-                 result_images = self.pipeline(
-                     prompt=lcm_diffusion_setting.prompt,
-                     negative_prompt=lcm_diffusion_setting.negative_prompt,
-                     num_inference_steps=lcm_diffusion_setting.inference_steps,
-                     guidance_scale=guidance_scale,
-                     width=lcm_diffusion_setting.image_width,
-                     height=lcm_diffusion_setting.image_height,
-                     num_images_per_prompt=lcm_diffusion_setting.number_of_images,
-                 ).images
-             elif (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.image_to_image.value
-             ):
-                 result_images = self.pipeline(
-                     image=lcm_diffusion_setting.init_image,
-                     strength=lcm_diffusion_setting.strength,
-                     prompt=lcm_diffusion_setting.prompt,
-                     negative_prompt=lcm_diffusion_setting.negative_prompt,
-                     num_inference_steps=img_to_img_inference_steps * 3,
-                     guidance_scale=guidance_scale,
-                     num_images_per_prompt=lcm_diffusion_setting.number_of_images,
-                 ).images
-
-         else:
-             if (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.text_to_image.value
-             ):
-                 result_images = self.pipeline(
-                     prompt=lcm_diffusion_setting.prompt,
-                     negative_prompt=lcm_diffusion_setting.negative_prompt,
-                     num_inference_steps=lcm_diffusion_setting.inference_steps,
-                     guidance_scale=guidance_scale,
-                     width=lcm_diffusion_setting.image_width,
-                     height=lcm_diffusion_setting.image_height,
-                     num_images_per_prompt=lcm_diffusion_setting.number_of_images,
-                 ).images
-             elif (
-                 lcm_diffusion_setting.diffusion_task
-                 == DiffusionTask.image_to_image.value
-             ):
-                 result_images = self.img_to_img_pipeline(
-                     image=lcm_diffusion_setting.init_image,
-                     strength=lcm_diffusion_setting.strength,
-                     prompt=lcm_diffusion_setting.prompt,
-                     negative_prompt=lcm_diffusion_setting.negative_prompt,
-                     num_inference_steps=img_to_img_inference_steps,
-                     guidance_scale=guidance_scale,
-                     width=lcm_diffusion_setting.image_width,
-                     height=lcm_diffusion_setting.image_height,
-                     num_images_per_prompt=lcm_diffusion_setting.number_of_images,
-                 ).images
-
-         return result_images
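Note: LCMTextToImage was the façade the frontends drove: init() builds (or reuses) whichever pipeline the settings call for, and generate() runs it. A minimal sketch, assuming the default fields of the removed LCMDiffusionSetting model (prompt, inference_steps, and so on):

    from backend.lcm_text_to_image import LCMTextToImage
    from backend.models.lcmdiffusion_setting import LCMDiffusionSetting

    setting = LCMDiffusionSetting()
    setting.prompt = "a cup of coffee, studio lighting"
    setting.inference_steps = 4

    lcm = LCMTextToImage()
    lcm.init(device="cpu", lcm_diffusion_setting=setting)  # picks torch or OpenVINO
    images = lcm.generate(setting)  # list of PIL images
    images[0].save("out.png")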
 
backend/lcmdiffusion/pipelines/openvino/__pycache__/lcm_ov_pipeline.cpython-311.pyc DELETED
Binary file (21.5 kB)
 
backend/lcmdiffusion/pipelines/openvino/__pycache__/lcm_scheduler.cpython-311.pyc DELETED
Binary file (26.6 kB)
 
backend/lcmdiffusion/pipelines/openvino/lcm_ov_pipeline.py DELETED
@@ -1,447 +0,0 @@
1
- # https://huggingface.co/deinferno/LCM_Dreamshaper_v7-openvino
2
-
3
- import inspect
4
-
5
- from pathlib import Path
6
- from tempfile import TemporaryDirectory
7
- from typing import List, Optional, Tuple, Union, Dict, Any, Callable, OrderedDict
8
-
9
- import numpy as np
10
- import openvino
11
- import torch
12
-
13
- from diffusers.pipelines.stable_diffusion import StableDiffusionPipelineOutput
14
- from optimum.intel.openvino.modeling_diffusion import (
15
- OVStableDiffusionPipeline,
16
- OVModelUnet,
17
- OVModelVaeDecoder,
18
- OVModelTextEncoder,
19
- OVModelVaeEncoder,
20
- VaeImageProcessor,
21
- )
22
- from optimum.utils import (
23
- DIFFUSION_MODEL_TEXT_ENCODER_2_SUBFOLDER,
24
- DIFFUSION_MODEL_TEXT_ENCODER_SUBFOLDER,
25
- DIFFUSION_MODEL_UNET_SUBFOLDER,
26
- DIFFUSION_MODEL_VAE_DECODER_SUBFOLDER,
27
- DIFFUSION_MODEL_VAE_ENCODER_SUBFOLDER,
28
- )
29
-
30
-
31
- from diffusers import logging
32
-
33
- logger = logging.get_logger(__name__) # pylint: disable=invalid-name
34
-
35
-
36
- class LCMOVModelUnet(OVModelUnet):
37
- def __call__(
38
- self,
39
- sample: np.ndarray,
40
- timestep: np.ndarray,
41
- encoder_hidden_states: np.ndarray,
42
- timestep_cond: Optional[np.ndarray] = None,
43
- text_embeds: Optional[np.ndarray] = None,
44
- time_ids: Optional[np.ndarray] = None,
45
- ):
46
- self._compile()
47
-
48
- inputs = {
49
- "sample": sample,
50
- "timestep": timestep,
51
- "encoder_hidden_states": encoder_hidden_states,
52
- }
53
-
54
- if timestep_cond is not None:
55
- inputs["timestep_cond"] = timestep_cond
56
- if text_embeds is not None:
57
- inputs["text_embeds"] = text_embeds
58
- if time_ids is not None:
59
- inputs["time_ids"] = time_ids
60
-
61
- outputs = self.request(inputs, shared_memory=True)
62
- return list(outputs.values())
63
-
64
-
65
- class OVLatentConsistencyModelPipeline(OVStableDiffusionPipeline):
66
- def __init__(
67
- self,
68
- vae_decoder: openvino.runtime.Model,
69
- text_encoder: openvino.runtime.Model,
70
- unet: openvino.runtime.Model,
71
- config: Dict[str, Any],
72
- tokenizer: "CLIPTokenizer",
73
- scheduler: Union["DDIMScheduler", "PNDMScheduler", "LMSDiscreteScheduler"],
74
- feature_extractor: Optional["CLIPFeatureExtractor"] = None,
75
- vae_encoder: Optional[openvino.runtime.Model] = None,
76
- text_encoder_2: Optional[openvino.runtime.Model] = None,
77
- tokenizer_2: Optional["CLIPTokenizer"] = None,
78
- device: str = "CPU",
79
- dynamic_shapes: bool = True,
80
- compile: bool = True,
81
- ov_config: Optional[Dict[str, str]] = None,
82
- model_save_dir: Optional[Union[str, Path, TemporaryDirectory]] = None,
83
- **kwargs,
84
- ):
85
- self._internal_dict = config
86
- self._device = device.upper()
87
- self.is_dynamic = dynamic_shapes
88
- self.ov_config = ov_config if ov_config is not None else {}
89
- self._model_save_dir = (
90
- Path(model_save_dir.name)
91
- if isinstance(model_save_dir, TemporaryDirectory)
92
- else model_save_dir
93
- )
94
- self.vae_decoder = OVModelVaeDecoder(vae_decoder, self)
95
- self.unet = LCMOVModelUnet(unet, self)
96
- self.text_encoder = (
97
- OVModelTextEncoder(text_encoder, self) if text_encoder is not None else None
98
- )
99
- self.text_encoder_2 = (
100
- OVModelTextEncoder(
101
- text_encoder_2,
102
- self,
103
- model_name=DIFFUSION_MODEL_TEXT_ENCODER_2_SUBFOLDER,
104
- )
105
- if text_encoder_2 is not None
106
- else None
107
- )
108
- self.vae_encoder = (
109
- OVModelVaeEncoder(vae_encoder, self) if vae_encoder is not None else None
110
- )
111
-
112
- if "block_out_channels" in self.vae_decoder.config:
113
- self.vae_scale_factor = 2 ** (
114
- len(self.vae_decoder.config["block_out_channels"]) - 1
115
- )
116
- else:
117
- self.vae_scale_factor = 8
118
-
119
- self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor)
120
-
121
- self.tokenizer = tokenizer
122
- self.tokenizer_2 = tokenizer_2
123
- self.scheduler = scheduler
124
- self.feature_extractor = feature_extractor
125
- self.safety_checker = None
126
- self.preprocessors = []
127
-
128
- if self.is_dynamic:
129
- self.reshape(batch_size=-1, height=-1, width=-1, num_images_per_prompt=-1)
130
-
131
- if compile:
132
- self.compile()
133
-
134
- sub_models = {
135
- DIFFUSION_MODEL_TEXT_ENCODER_SUBFOLDER: self.text_encoder,
136
- DIFFUSION_MODEL_UNET_SUBFOLDER: self.unet,
137
- DIFFUSION_MODEL_VAE_DECODER_SUBFOLDER: self.vae_decoder,
138
- DIFFUSION_MODEL_VAE_ENCODER_SUBFOLDER: self.vae_encoder,
139
- DIFFUSION_MODEL_TEXT_ENCODER_2_SUBFOLDER: self.text_encoder_2,
140
- }
141
- for name in sub_models.keys():
142
- self._internal_dict[name] = (
143
- ("optimum", sub_models[name].__class__.__name__)
144
- if sub_models[name] is not None
145
- else (None, None)
146
- )
147
-
148
- self._internal_dict.pop("vae", None)
149
-
150
- def _reshape_unet(
151
- self,
152
- model: openvino.runtime.Model,
153
- batch_size: int = -1,
154
- height: int = -1,
155
- width: int = -1,
156
- num_images_per_prompt: int = -1,
157
- tokenizer_max_length: int = -1,
158
- ):
159
- if batch_size == -1 or num_images_per_prompt == -1:
160
- batch_size = -1
161
- else:
162
- batch_size = batch_size * num_images_per_prompt
163
-
164
- height = height // self.vae_scale_factor if height > 0 else height
165
- width = width // self.vae_scale_factor if width > 0 else width
166
- shapes = {}
167
- for inputs in model.inputs:
168
- shapes[inputs] = inputs.get_partial_shape()
169
- if inputs.get_any_name() == "timestep":
170
- shapes[inputs][0] = 1
171
- elif inputs.get_any_name() == "sample":
172
- in_channels = self.unet.config.get("in_channels", None)
173
- if in_channels is None:
174
- in_channels = shapes[inputs][1]
175
- if in_channels.is_dynamic:
176
- logger.warning(
177
- "Could not identify `in_channels` from the unet configuration, to statically reshape the unet please provide a configuration."
178
- )
179
- self.is_dynamic = True
180
-
181
- shapes[inputs] = [batch_size, in_channels, height, width]
182
- elif inputs.get_any_name() == "timestep_cond":
183
- shapes[inputs] = [batch_size, inputs.get_partial_shape()[1]]
184
- elif inputs.get_any_name() == "text_embeds":
185
- shapes[inputs] = [
186
- batch_size,
187
- self.text_encoder_2.config["projection_dim"],
188
- ]
189
- elif inputs.get_any_name() == "time_ids":
190
- shapes[inputs] = [batch_size, inputs.get_partial_shape()[1]]
191
- else:
192
- shapes[inputs][0] = batch_size
193
- shapes[inputs][1] = tokenizer_max_length
194
- model.reshape(shapes)
195
- return model
196
-
197
- def get_guidance_scale_embedding(self, w, embedding_dim=512, dtype=np.float32):
198
- """
199
- see https://github.com/google-research/vdm/blob/dc27b98a554f65cdc654b800da5aa1846545d41b/model_vdm.py#L298
200
- Args:
201
- timesteps: np.array: generate embedding vectors at these timesteps
202
- embedding_dim: int: dimension of the embeddings to generate
203
- dtype: data type of the generated embeddings
204
-
205
- Returns:
206
- embedding vectors with shape `(len(timesteps), embedding_dim)`
207
- """
208
- assert len(w.shape) == 1
209
- w = w * 1000.0
210
-
211
- half_dim = embedding_dim // 2
212
- emb = np.log(np.array(10000.0)) / (half_dim - 1)
213
- emb = np.exp(np.arange(half_dim, dtype=dtype) * -emb)
214
- emb = w.astype(dtype)[:, None] * emb[None, :]
215
- emb = np.concatenate([np.sin(emb), np.cos(emb)], axis=1)
216
- if embedding_dim % 2 == 1: # zero pad
217
- emb = np.pad(emb, (0, 1))
218
- assert emb.shape == (w.shape[0], embedding_dim)
219
- return emb
220
-
221
- # Adapted from https://github.com/huggingface/optimum/blob/15b8d1eed4d83c5004d3b60f6b6f13744b358f01/optimum/pipelines/diffusers/pipeline_stable_diffusion.py#L201
222
- def __call__(
223
- self,
224
- prompt: Optional[Union[str, List[str]]] = None,
225
- height: Optional[int] = None,
226
- width: Optional[int] = None,
227
- num_inference_steps: int = 4,
228
- original_inference_steps: int = None,
229
- guidance_scale: float = 7.5,
230
- num_images_per_prompt: int = 1,
231
- eta: float = 0.0,
232
- generator: Optional[np.random.RandomState] = None,
233
- latents: Optional[np.ndarray] = None,
234
- prompt_embeds: Optional[np.ndarray] = None,
235
- output_type: str = "pil",
236
- return_dict: bool = True,
237
- callback: Optional[Callable[[int, int, np.ndarray], None]] = None,
238
- callback_steps: int = 1,
239
- guidance_rescale: float = 0.0,
240
- ):
241
- r"""
242
- Function invoked when calling the pipeline for generation.
243
-
244
- Args:
245
- prompt (`Optional[Union[str, List[str]]]`, defaults to None):
246
- The prompt or prompts to guide the image generation. If not defined, one has to pass `prompt_embeds`.
247
- instead.
248
- height (`Optional[int]`, defaults to None):
249
- The height in pixels of the generated image.
250
- width (`Optional[int]`, defaults to None):
251
- The width in pixels of the generated image.
252
- num_inference_steps (`int`, defaults to 4):
253
- The number of denoising steps. More denoising steps usually lead to a higher quality image at the
254
- expense of slower inference.
255
- original_inference_steps (`int`, *optional*):
256
- The original number of inference steps use to generate a linearly-spaced timestep schedule, from which
257
- we will draw `num_inference_steps` evenly spaced timesteps from as our final timestep schedule,
258
- following the Skipping-Step method in the paper (see Section 4.3). If not set this will default to the
259
- scheduler's `original_inference_steps` attribute.
260
- guidance_scale (`float`, defaults to 7.5):
261
- Guidance scale as defined in [Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598).
262
- `guidance_scale` is defined as `w` of equation 2. of [Imagen
263
- Paper](https://arxiv.org/pdf/2205.11487.pdf). Guidance scale is enabled by setting `guidance_scale >
264
- 1`. Higher guidance scale encourages to generate images that are closely linked to the text `prompt`,
265
- usually at the expense of lower image quality.
266
- num_images_per_prompt (`int`, defaults to 1):
267
- The number of images to generate per prompt.
268
- eta (`float`, defaults to 0.0):
269
- Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to
270
- [`schedulers.DDIMScheduler`], will be ignored for others.
271
- generator (`Optional[np.random.RandomState]`, defaults to `None`)::
272
- A np.random.RandomState to make generation deterministic.
273
- latents (`Optional[np.ndarray]`, defaults to `None`):
274
- Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image
275
- generation. Can be used to tweak the same generation with different prompts. If not provided, a latents
276
- tensor will ge generated by sampling using the supplied random `generator`.
277
- prompt_embeds (`Optional[np.ndarray]`, defaults to `None`):
278
- Pre-generated text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt weighting. If not
279
- provided, text embeddings will be generated from `prompt` input argument.
280
- output_type (`str`, defaults to `"pil"`):
281
- The output format of the generate image. Choose between
282
- [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `np.array`.
283
- return_dict (`bool`, defaults to `True`):
284
- Whether or not to return a [`~pipelines.stable_diffusion.StableDiffusionPipelineOutput`] instead of a
285
- plain tuple.
286
- callback (Optional[Callable], defaults to `None`):
287
- A function that will be called every `callback_steps` steps during inference. The function will be
288
- called with the following arguments: `callback(step: int, timestep: int, latents: torch.FloatTensor)`.
289
- callback_steps (`int`, defaults to 1):
290
- The frequency at which the `callback` function will be called. If not specified, the callback will be
291
- called at every step.
292
- guidance_rescale (`float`, defaults to 0.0):
293
- Guidance rescale factor proposed by [Common Diffusion Noise Schedules and Sample Steps are
294
- Flawed](https://arxiv.org/pdf/2305.08891.pdf) `guidance_scale` is defined as `φ` in equation 16. of
295
- [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf).
296
- Guidance rescale factor should fix overexposure when using zero terminal SNR.
297
-
298
- Returns:
299
- [`~pipelines.stable_diffusion.StableDiffusionPipelineOutput`] or `tuple`:
300
- [`~pipelines.stable_diffusion.StableDiffusionPipelineOutput`] if `return_dict` is True, otherwise a `tuple.
301
- When returning a tuple, the first element is a list with the generated images, and the second element is a
302
- list of `bool`s denoting whether the corresponding generated image likely represents "not-safe-for-work"
303
- (nsfw) content, according to the `safety_checker`.
304
- """
305
- height = (
306
- height or self.unet.config.get("sample_size", 64) * self.vae_scale_factor
307
- )
308
- width = width or self.unet.config.get("sample_size", 64) * self.vae_scale_factor
309
-
310
- # check inputs. Raise error if not correct
311
- self.check_inputs(
312
- prompt, height, width, callback_steps, None, prompt_embeds, None
313
- )
314
-
315
- # define call parameters
316
- if isinstance(prompt, str):
317
- batch_size = 1
318
- elif isinstance(prompt, list):
319
- batch_size = len(prompt)
320
- else:
321
- batch_size = prompt_embeds.shape[0]
322
-
323
- if generator is None:
324
- generator = np.random
325
-
326
- # Create torch.Generator instance with same state as np.random.RandomState
327
- torch_generator = torch.Generator().manual_seed(
328
- int(generator.get_state()[1][0])
329
- )
330
-
331
- # do_classifier_free_guidance = guidance_scale > 1.0
332
-
333
- # NOTE: when a LCM is distilled from an LDM via latent consistency distillation (Algorithm 1) with guided
334
- # distillation, the forward pass of the LCM learns to approximate sampling from the LDM using CFG with the
335
- # unconditional prompt "" (the empty string). Due to this, LCMs currently do not support negative prompts.
336
- prompt_embeds = self._encode_prompt(
337
- prompt,
338
- num_images_per_prompt,
339
- False,
340
- negative_prompt=None,
341
- prompt_embeds=prompt_embeds,
342
- negative_prompt_embeds=None,
343
- )
344
-
345
- # set timesteps
346
- self.scheduler.set_timesteps(
347
- num_inference_steps,
348
- "cpu",
349
- original_inference_steps=original_inference_steps,
350
- )
351
- timesteps = self.scheduler.timesteps
352
-
353
- latents = self.prepare_latents(
354
- batch_size * num_images_per_prompt,
355
- self.unet.config.get("in_channels", 4),
356
- height,
357
- width,
358
- prompt_embeds.dtype,
359
- generator,
360
- latents,
361
- )
362
-
363
- # Get Guidance Scale Embedding
364
- w = np.tile(guidance_scale - 1, batch_size * num_images_per_prompt)
365
- w_embedding = self.get_guidance_scale_embedding(
366
- w, embedding_dim=self.unet.config.get("time_cond_proj_dim", 256)
367
- )
368
-
369
- # Adapted from diffusers to extend it for other runtimes than ORT
370
- timestep_dtype = self.unet.input_dtype.get("timestep", np.float32)
371
-
372
- # prepare extra kwargs for the scheduler step, since not all schedulers have the same signature
373
- # eta (η) is only used with the DDIMScheduler, it will be ignored for other schedulers.
374
- # eta corresponds to η in DDIM paper: https://arxiv.org/abs/2010.02502
375
- # and should be between [0, 1]
376
- accepts_eta = "eta" in set(
377
- inspect.signature(self.scheduler.step).parameters.keys()
378
- )
379
- extra_step_kwargs = {}
380
- if accepts_eta:
381
- extra_step_kwargs["eta"] = eta
382
-
383
- accepts_generator = "generator" in set(
384
- inspect.signature(self.scheduler.step).parameters.keys()
385
- )
386
- if accepts_generator:
387
- extra_step_kwargs["generator"] = torch_generator
388
-
389
- num_warmup_steps = len(timesteps) - num_inference_steps * self.scheduler.order
390
- for i, t in enumerate(self.progress_bar(timesteps)):
391
- # predict the noise residual
392
- timestep = np.array([t], dtype=timestep_dtype)
393
-
394
- noise_pred = self.unet(
395
- sample=latents,
396
- timestep=timestep,
397
- timestep_cond=w_embedding,
398
- encoder_hidden_states=prompt_embeds,
399
- )[0]
400
-
401
- # compute the previous noisy sample x_t -> x_t-1
402
- latents, denoised = self.scheduler.step(
403
- torch.from_numpy(noise_pred),
404
- t,
405
- torch.from_numpy(latents),
406
- **extra_step_kwargs,
407
- return_dict=False,
408
- )
409
-
410
- latents, denoised = latents.numpy(), denoised.numpy()
411
-
412
- # call the callback, if provided
413
- if i == len(timesteps) - 1 or (
414
- (i + 1) > num_warmup_steps and (i + 1) % self.scheduler.order == 0
415
- ):
416
- if callback is not None and i % callback_steps == 0:
417
- callback(i, t, latents)
418
-
419
- if output_type == "latent":
420
- image = latents
421
- has_nsfw_concept = None
422
- else:
423
- denoised /= self.vae_decoder.config.get("scaling_factor", 0.18215)
424
- # there seems to be a strange result when using a half-precision VAE decoder if batch size > 1
425
- image = np.concatenate(
426
- [
427
- self.vae_decoder(latent_sample=denoised[i : i + 1])[0]
428
- for i in range(latents.shape[0])
429
- ]
430
- )
431
- image, has_nsfw_concept = self.run_safety_checker(image)
432
-
433
- if has_nsfw_concept is None:
434
- do_denormalize = [True] * image.shape[0]
435
- else:
436
- do_denormalize = [not has_nsfw for has_nsfw in has_nsfw_concept]
437
-
438
- image = self.image_processor.postprocess(
439
- image, output_type=output_type, do_denormalize=do_denormalize
440
- )
441
-
442
- if not return_dict:
443
- return (image, has_nsfw_concept)
444
-
445
- return StableDiffusionPipelineOutput(
446
- images=image, nsfw_content_detected=has_nsfw_concept
447
- )
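For orientation, here is a minimal usage sketch of the pipeline whose `__call__` is removed above. The import path follows this file's location in the repo, and the class name `OVLatentConsistencyModelPipeline` is an assumption; note that `generator` is a NumPy `RandomState` rather than a `torch.Generator`, matching the seeding logic in the body.

```python
import numpy as np
from backend.lcmdiffusion.pipelines.openvino.lcm_ov_pipeline import (
    OVLatentConsistencyModelPipeline,  # assumed export name for this module
)

# Model id taken from configs/openvino-lcm-models.txt
pipe = OVLatentConsistencyModelPipeline.from_pretrained(
    "rupeshs/LCM-dreamshaper-v7-openvino"
)
result = pipe(
    prompt="a cat wearing a spacesuit",
    num_inference_steps=4,                 # LCMs need only a few steps
    guidance_scale=8.0,                    # folded into the timestep embedding; no CFG pass
    generator=np.random.RandomState(123),  # NumPy RNG, per the __call__ body above
)
result.images[0].save("out.png")
```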
backend/lcmdiffusion/pipelines/openvino/lcm_scheduler.py DELETED
@@ -1,576 +0,0 @@
1
- # Copyright 2023 Stanford University Team and The HuggingFace Team. All rights reserved.
2
- #
3
- # Licensed under the Apache License, Version 2.0 (the "License");
4
- # you may not use this file except in compliance with the License.
5
- # You may obtain a copy of the License at
6
- #
7
- # http://www.apache.org/licenses/LICENSE-2.0
8
- #
9
- # Unless required by applicable law or agreed to in writing, software
10
- # distributed under the License is distributed on an "AS IS" BASIS,
11
- # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
- # See the License for the specific language governing permissions and
13
- # limitations under the License.
14
-
15
- # DISCLAIMER: This code is strongly influenced by https://github.com/pesser/pytorch_diffusion
16
- # and https://github.com/hojonathanho/diffusion
17
-
18
- import math
19
- from dataclasses import dataclass
20
- from typing import List, Optional, Tuple, Union
21
-
22
- import numpy as np
23
- import torch
24
-
25
- from diffusers.configuration_utils import ConfigMixin, register_to_config
26
- from diffusers.utils import BaseOutput, logging
27
- from diffusers.utils.torch_utils import randn_tensor
28
- from diffusers.schedulers.scheduling_utils import SchedulerMixin
29
-
30
-
31
- logger = logging.get_logger(__name__) # pylint: disable=invalid-name
32
-
33
-
34
- @dataclass
35
- class LCMSchedulerOutput(BaseOutput):
36
- """
37
- Output class for the scheduler's `step` function output.
38
-
39
- Args:
40
- prev_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images):
41
- Computed sample `(x_{t-1})` of previous timestep. `prev_sample` should be used as next model input in the
42
- denoising loop.
43
- pred_original_sample (`torch.FloatTensor` of shape `(batch_size, num_channels, height, width)` for images):
44
- The predicted denoised sample `(x_{0})` based on the model output from the current timestep.
45
- `pred_original_sample` can be used to preview progress or for guidance.
46
- """
47
-
48
- prev_sample: torch.FloatTensor
49
- denoised: Optional[torch.FloatTensor] = None
50
-
51
-
52
- # Copied from diffusers.schedulers.scheduling_ddpm.betas_for_alpha_bar
53
- def betas_for_alpha_bar(
54
- num_diffusion_timesteps,
55
- max_beta=0.999,
56
- alpha_transform_type="cosine",
57
- ):
58
- """
59
- Create a beta schedule that discretizes the given alpha_t_bar function, which defines the cumulative product of
60
- (1-beta) over time from t = [0,1].
61
-
62
- Contains a function alpha_bar that takes an argument t and transforms it to the cumulative product of (1-beta) up
63
- to that part of the diffusion process.
64
-
65
-
66
- Args:
67
- num_diffusion_timesteps (`int`): the number of betas to produce.
68
- max_beta (`float`): the maximum beta to use; use values lower than 1 to
69
- prevent singularities.
70
- alpha_transform_type (`str`, *optional*, default to `cosine`): the type of noise schedule for alpha_bar.
71
- Choose from `cosine` or `exp`
72
-
73
- Returns:
74
- betas (`np.ndarray`): the betas used by the scheduler to step the model outputs
75
- """
76
- if alpha_transform_type == "cosine":
77
-
78
- def alpha_bar_fn(t):
79
- return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2
80
-
81
- elif alpha_transform_type == "exp":
82
-
83
- def alpha_bar_fn(t):
84
- return math.exp(t * -12.0)
85
-
86
- else:
87
- raise ValueError(f"Unsupported alpha_tranform_type: {alpha_transform_type}")
88
-
89
- betas = []
90
- for i in range(num_diffusion_timesteps):
91
- t1 = i / num_diffusion_timesteps
92
- t2 = (i + 1) / num_diffusion_timesteps
93
- betas.append(min(1 - alpha_bar_fn(t2) / alpha_bar_fn(t1), max_beta))
94
- return torch.tensor(betas, dtype=torch.float32)
95
-
96
-
97
- # Copied from diffusers.schedulers.scheduling_ddim.rescale_zero_terminal_snr
98
- def rescale_zero_terminal_snr(betas: torch.FloatTensor) -> torch.FloatTensor:
99
- """
100
- Rescales betas to have zero terminal SNR Based on https://arxiv.org/pdf/2305.08891.pdf (Algorithm 1)
101
-
102
-
103
- Args:
104
- betas (`torch.FloatTensor`):
105
- the betas that the scheduler is being initialized with.
106
-
107
- Returns:
108
- `torch.FloatTensor`: rescaled betas with zero terminal SNR
109
- """
110
- # Convert betas to alphas_bar_sqrt
111
- alphas = 1.0 - betas
112
- alphas_cumprod = torch.cumprod(alphas, dim=0)
113
- alphas_bar_sqrt = alphas_cumprod.sqrt()
114
-
115
- # Store old values.
116
- alphas_bar_sqrt_0 = alphas_bar_sqrt[0].clone()
117
- alphas_bar_sqrt_T = alphas_bar_sqrt[-1].clone()
118
-
119
- # Shift so the last timestep is zero.
120
- alphas_bar_sqrt -= alphas_bar_sqrt_T
121
-
122
- # Scale so the first timestep is back to the old value.
123
- alphas_bar_sqrt *= alphas_bar_sqrt_0 / (alphas_bar_sqrt_0 - alphas_bar_sqrt_T)
124
-
125
- # Convert alphas_bar_sqrt to betas
126
- alphas_bar = alphas_bar_sqrt**2 # Revert sqrt
127
- alphas = alphas_bar[1:] / alphas_bar[:-1] # Revert cumprod
128
- alphas = torch.cat([alphas_bar[0:1], alphas])
129
- betas = 1 - alphas
130
-
131
- return betas
132
-
133
-
134
- class LCMScheduler(SchedulerMixin, ConfigMixin):
135
- """
136
- `LCMScheduler` extends the denoising procedure introduced in denoising diffusion probabilistic models (DDPMs) with
137
- non-Markovian guidance.
138
-
139
- This model inherits from [`SchedulerMixin`] and [`ConfigMixin`]. [`~ConfigMixin`] takes care of storing all config
140
- attributes that are passed in the scheduler's `__init__` function, such as `num_train_timesteps`. They can be
141
- accessed via `scheduler.config.num_train_timesteps`. [`SchedulerMixin`] provides general loading and saving
142
- functionality via the [`SchedulerMixin.save_pretrained`] and [`~SchedulerMixin.from_pretrained`] functions.
143
-
144
- Args:
145
- num_train_timesteps (`int`, defaults to 1000):
146
- The number of diffusion steps to train the model.
147
- beta_start (`float`, defaults to 0.0001):
148
- The starting `beta` value of inference.
149
- beta_end (`float`, defaults to 0.02):
150
- The final `beta` value.
151
- beta_schedule (`str`, defaults to `"linear"`):
152
- The beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from
153
- `linear`, `scaled_linear`, or `squaredcos_cap_v2`.
154
- trained_betas (`np.ndarray`, *optional*):
155
- Pass an array of betas directly to the constructor to bypass `beta_start` and `beta_end`.
156
- original_inference_steps (`int`, *optional*, defaults to 50):
157
- The default number of inference steps used to generate a linearly-spaced timestep schedule, from which we
158
- will ultimately take `num_inference_steps` evenly spaced timesteps to form the final timestep schedule.
159
- clip_sample (`bool`, defaults to `True`):
160
- Clip the predicted sample for numerical stability.
161
- clip_sample_range (`float`, defaults to 1.0):
162
- The maximum magnitude for sample clipping. Valid only when `clip_sample=True`.
163
- set_alpha_to_one (`bool`, defaults to `True`):
164
- Each diffusion step uses the alphas product value at that step and at the previous one. For the final step
165
- there is no previous alpha. When this option is `True` the previous alpha product is fixed to `1`,
166
- otherwise it uses the alpha value at step 0.
167
- steps_offset (`int`, defaults to 0):
168
- An offset added to the inference steps. You can use a combination of `offset=1` and
169
- `set_alpha_to_one=False` to make the last step use step 0 for the previous alpha product like in Stable
170
- Diffusion.
171
- prediction_type (`str`, defaults to `epsilon`, *optional*):
172
- Prediction type of the scheduler function; can be `epsilon` (predicts the noise of the diffusion process),
173
- `sample` (directly predicts the noisy sample) or `v_prediction` (see section 2.4 of [Imagen
174
- Video](https://imagen.research.google/video/paper.pdf) paper).
175
- thresholding (`bool`, defaults to `False`):
176
- Whether to use the "dynamic thresholding" method. This is unsuitable for latent-space diffusion models such
177
- as Stable Diffusion.
178
- dynamic_thresholding_ratio (`float`, defaults to 0.995):
179
- The ratio for the dynamic thresholding method. Valid only when `thresholding=True`.
180
- sample_max_value (`float`, defaults to 1.0):
181
- The threshold value for dynamic thresholding. Valid only when `thresholding=True`.
182
- timestep_spacing (`str`, defaults to `"leading"`):
183
- The way the timesteps should be scaled. Refer to Table 2 of the [Common Diffusion Noise Schedules and
184
- Sample Steps are Flawed](https://huggingface.co/papers/2305.08891) for more information.
185
- rescale_betas_zero_snr (`bool`, defaults to `False`):
186
- Whether to rescale the betas to have zero terminal SNR. This enables the model to generate very bright and
187
- dark samples instead of limiting it to samples with medium brightness. Loosely related to
188
- [`--offset_noise`](https://github.com/huggingface/diffusers/blob/74fd735eb073eb1d774b1ab4154a0876eb82f055/examples/dreambooth/train_dreambooth.py#L506).
189
- """
190
-
191
- order = 1
192
-
193
- @register_to_config
194
- def __init__(
195
- self,
196
- num_train_timesteps: int = 1000,
197
- beta_start: float = 0.00085,
198
- beta_end: float = 0.012,
199
- beta_schedule: str = "scaled_linear",
200
- trained_betas: Optional[Union[np.ndarray, List[float]]] = None,
201
- original_inference_steps: int = 50,
202
- clip_sample: bool = False,
203
- clip_sample_range: float = 1.0,
204
- set_alpha_to_one: bool = True,
205
- steps_offset: int = 0,
206
- prediction_type: str = "epsilon",
207
- thresholding: bool = False,
208
- dynamic_thresholding_ratio: float = 0.995,
209
- sample_max_value: float = 1.0,
210
- timestep_spacing: str = "leading",
211
- rescale_betas_zero_snr: bool = False,
212
- ):
213
- if trained_betas is not None:
214
- self.betas = torch.tensor(trained_betas, dtype=torch.float32)
215
- elif beta_schedule == "linear":
216
- self.betas = torch.linspace(
217
- beta_start, beta_end, num_train_timesteps, dtype=torch.float32
218
- )
219
- elif beta_schedule == "scaled_linear":
220
- # this schedule is very specific to the latent diffusion model.
221
- self.betas = (
222
- torch.linspace(
223
- beta_start**0.5,
224
- beta_end**0.5,
225
- num_train_timesteps,
226
- dtype=torch.float32,
227
- )
228
- ** 2
229
- )
230
- elif beta_schedule == "squaredcos_cap_v2":
231
- # Glide cosine schedule
232
- self.betas = betas_for_alpha_bar(num_train_timesteps)
233
- else:
234
- raise NotImplementedError(
235
- f"{beta_schedule} does is not implemented for {self.__class__}"
236
- )
237
-
238
- # Rescale for zero SNR
239
- if rescale_betas_zero_snr:
240
- self.betas = rescale_zero_terminal_snr(self.betas)
241
-
242
- self.alphas = 1.0 - self.betas
243
- self.alphas_cumprod = torch.cumprod(self.alphas, dim=0)
244
-
245
- # At every step in ddim, we are looking into the previous alphas_cumprod
246
- # For the final step, there is no previous alphas_cumprod because we are already at 0
247
- # `set_alpha_to_one` decides whether we set this parameter simply to one or
248
- # whether we use the final alpha of the "non-previous" one.
249
- self.final_alpha_cumprod = (
250
- torch.tensor(1.0) if set_alpha_to_one else self.alphas_cumprod[0]
251
- )
252
-
253
- # standard deviation of the initial noise distribution
254
- self.init_noise_sigma = 1.0
255
-
256
- # setable values
257
- self.num_inference_steps = None
258
- self.timesteps = torch.from_numpy(
259
- np.arange(0, num_train_timesteps)[::-1].copy().astype(np.int64)
260
- )
261
-
262
- self._step_index = None
263
-
264
- # Copied from diffusers.schedulers.scheduling_euler_discrete.EulerDiscreteScheduler._init_step_index
265
- def _init_step_index(self, timestep):
266
- if isinstance(timestep, torch.Tensor):
267
- timestep = timestep.to(self.timesteps.device)
268
-
269
- index_candidates = (self.timesteps == timestep).nonzero()
270
-
271
- # The sigma index that is taken for the **very** first `step`
272
- # is always the second index (or the last index if there is only 1)
273
- # This way we can ensure we don't accidentally skip a sigma in
274
- # case we start in the middle of the denoising schedule (e.g. for image-to-image)
275
- if len(index_candidates) > 1:
276
- step_index = index_candidates[1]
277
- else:
278
- step_index = index_candidates[0]
279
-
280
- self._step_index = step_index.item()
281
-
282
- @property
283
- def step_index(self):
284
- return self._step_index
285
-
286
- def scale_model_input(
287
- self, sample: torch.FloatTensor, timestep: Optional[int] = None
288
- ) -> torch.FloatTensor:
289
- """
290
- Ensures interchangeability with schedulers that need to scale the denoising model input depending on the
291
- current timestep.
292
-
293
- Args:
294
- sample (`torch.FloatTensor`):
295
- The input sample.
296
- timestep (`int`, *optional*):
297
- The current timestep in the diffusion chain.
298
- Returns:
299
- `torch.FloatTensor`:
300
- A scaled input sample.
301
- """
302
- return sample
303
-
304
- # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler._threshold_sample
305
- def _threshold_sample(self, sample: torch.FloatTensor) -> torch.FloatTensor:
306
- """
307
- "Dynamic thresholding: At each sampling step we set s to a certain percentile absolute pixel value in xt0 (the
308
- prediction of x_0 at timestep t), and if s > 1, then we threshold xt0 to the range [-s, s] and then divide by
309
- s. Dynamic thresholding pushes saturated pixels (those near -1 and 1) inwards, thereby actively preventing
310
- pixels from saturation at each step. We find that dynamic thresholding results in significantly better
311
- photorealism as well as better image-text alignment, especially when using very large guidance weights."
312
-
313
- https://arxiv.org/abs/2205.11487
314
- """
315
- dtype = sample.dtype
316
- batch_size, channels, *remaining_dims = sample.shape
317
-
318
- if dtype not in (torch.float32, torch.float64):
319
- sample = (
320
- sample.float()
321
- ) # upcast for quantile calculation, and clamp not implemented for cpu half
322
-
323
- # Flatten sample for doing quantile calculation along each image
324
- sample = sample.reshape(batch_size, channels * np.prod(remaining_dims))
325
-
326
- abs_sample = sample.abs() # "a certain percentile absolute pixel value"
327
-
328
- s = torch.quantile(abs_sample, self.config.dynamic_thresholding_ratio, dim=1)
329
- s = torch.clamp(
330
- s, min=1, max=self.config.sample_max_value
331
- ) # When clamped to min=1, equivalent to standard clipping to [-1, 1]
332
- s = s.unsqueeze(1) # (batch_size, 1) because clamp will broadcast along dim=0
333
- sample = (
334
- torch.clamp(sample, -s, s) / s
335
- ) # "we threshold xt0 to the range [-s, s] and then divide by s"
336
-
337
- sample = sample.reshape(batch_size, channels, *remaining_dims)
338
- sample = sample.to(dtype)
339
-
340
- return sample
341
-
342
- def set_timesteps(
343
- self,
344
- num_inference_steps: int,
345
- device: Union[str, torch.device] = None,
346
- original_inference_steps: Optional[int] = None,
347
- ):
348
- """
349
- Sets the discrete timesteps used for the diffusion chain (to be run before inference).
350
-
351
- Args:
352
- num_inference_steps (`int`):
353
- The number of diffusion steps used when generating samples with a pre-trained model.
354
- device (`str` or `torch.device`, *optional*):
355
- The device to which the timesteps should be moved to. If `None`, the timesteps are not moved.
356
- original_inference_steps (`int`, *optional*):
357
- The original number of inference steps, which will be used to generate a linearly-spaced timestep
358
- schedule (which is different from the standard `diffusers` implementation). We will then take
359
- `num_inference_steps` timesteps from this schedule, evenly spaced in terms of indices, and use that as
360
- our final timestep schedule. If not set, this will default to the `original_inference_steps` attribute.
361
- """
362
-
363
- if num_inference_steps > self.config.num_train_timesteps:
364
- raise ValueError(
365
- f"`num_inference_steps`: {num_inference_steps} cannot be larger than `self.config.train_timesteps`:"
366
- f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle"
367
- f" maximal {self.config.num_train_timesteps} timesteps."
368
- )
369
-
370
- self.num_inference_steps = num_inference_steps
371
- original_steps = (
372
- original_inference_steps
373
- if original_inference_steps is not None
374
- else self.original_inference_steps
375
- )
376
-
377
- if original_steps > self.config.num_train_timesteps:
378
- raise ValueError(
379
- f"`original_steps`: {original_steps} cannot be larger than `self.config.train_timesteps`:"
380
- f" {self.config.num_train_timesteps} as the unet model trained with this scheduler can only handle"
381
- f" maximal {self.config.num_train_timesteps} timesteps."
382
- )
383
-
384
- if num_inference_steps > original_steps:
385
- raise ValueError(
386
- f"`num_inference_steps`: {num_inference_steps} cannot be larger than `original_inference_steps`:"
387
- f" {original_steps} because the final timestep schedule will be a subset of the"
388
- f" `original_inference_steps`-sized initial timestep schedule."
389
- )
390
-
391
- # LCM Timesteps Setting
392
- # Currently, only linear spacing is supported.
393
- c = self.config.num_train_timesteps // original_steps
394
- # LCM Training Steps Schedule
395
- lcm_origin_timesteps = np.asarray(list(range(1, original_steps + 1))) * c - 1
396
- skipping_step = len(lcm_origin_timesteps) // num_inference_steps
397
- # LCM Inference Steps Schedule
398
- timesteps = lcm_origin_timesteps[::-skipping_step][:num_inference_steps]
399
-
400
- self.timesteps = torch.from_numpy(timesteps.copy()).to(
401
- device=device, dtype=torch.long
402
- )
403
-
404
- self._step_index = None
405
-
406
- def get_scalings_for_boundary_condition_discrete(self, t):
407
- self.sigma_data = 0.5 # Default: 0.5
408
-
409
- # Dividing t by 0.1 makes c_skip decay sharply away from t=0 (nearly a delta function at t=0).
410
- c_skip = self.sigma_data**2 / ((t / 0.1) ** 2 + self.sigma_data**2)
411
- c_out = (t / 0.1) / ((t / 0.1) ** 2 + self.sigma_data**2) ** 0.5
412
- return c_skip, c_out
413
-
414
- def step(
415
- self,
416
- model_output: torch.FloatTensor,
417
- timestep: int,
418
- sample: torch.FloatTensor,
419
- generator: Optional[torch.Generator] = None,
420
- return_dict: bool = True,
421
- ) -> Union[LCMSchedulerOutput, Tuple]:
422
- """
423
- Predict the sample from the previous timestep by reversing the SDE. This function propagates the diffusion
424
- process from the learned model outputs (most often the predicted noise).
425
-
426
- Args:
427
- model_output (`torch.FloatTensor`):
428
- The direct output from learned diffusion model.
429
- timestep (`float`):
430
- The current discrete timestep in the diffusion chain.
431
- sample (`torch.FloatTensor`):
432
- A current instance of a sample created by the diffusion process.
433
- generator (`torch.Generator`, *optional*):
434
- A random number generator.
435
- return_dict (`bool`, *optional*, defaults to `True`):
436
- Whether or not to return a [`~schedulers.scheduling_lcm.LCMSchedulerOutput`] or `tuple`.
437
- Returns:
438
- [`~schedulers.scheduling_utils.LCMSchedulerOutput`] or `tuple`:
439
- If return_dict is `True`, [`~schedulers.scheduling_lcm.LCMSchedulerOutput`] is returned, otherwise a
440
- tuple is returned where the first element is the sample tensor.
441
- """
442
- if self.num_inference_steps is None:
443
- raise ValueError(
444
- "Number of inference steps is 'None', you need to run 'set_timesteps' after creating the scheduler"
445
- )
446
-
447
- if self.step_index is None:
448
- self._init_step_index(timestep)
449
-
450
- # 1. get previous step value
451
- prev_step_index = self.step_index + 1
452
- if prev_step_index < len(self.timesteps):
453
- prev_timestep = self.timesteps[prev_step_index]
454
- else:
455
- prev_timestep = timestep
456
-
457
- # 2. compute alphas, betas
458
- alpha_prod_t = self.alphas_cumprod[timestep]
459
- alpha_prod_t_prev = (
460
- self.alphas_cumprod[prev_timestep]
461
- if prev_timestep >= 0
462
- else self.final_alpha_cumprod
463
- )
464
-
465
- beta_prod_t = 1 - alpha_prod_t
466
- beta_prod_t_prev = 1 - alpha_prod_t_prev
467
-
468
- # 3. Get scalings for boundary conditions
469
- c_skip, c_out = self.get_scalings_for_boundary_condition_discrete(timestep)
470
-
471
- # 4. Compute the predicted original sample x_0 based on the model parameterization
472
- if self.config.prediction_type == "epsilon": # noise-prediction
473
- predicted_original_sample = (
474
- sample - beta_prod_t.sqrt() * model_output
475
- ) / alpha_prod_t.sqrt()
476
- elif self.config.prediction_type == "sample": # x-prediction
477
- predicted_original_sample = model_output
478
- elif self.config.prediction_type == "v_prediction": # v-prediction
479
- predicted_original_sample = (
480
- alpha_prod_t.sqrt() * sample - beta_prod_t.sqrt() * model_output
481
- )
482
- else:
483
- raise ValueError(
484
- f"prediction_type given as {self.config.prediction_type} must be one of `epsilon`, `sample` or"
485
- " `v_prediction` for `LCMScheduler`."
486
- )
487
-
488
- # 5. Clip or threshold "predicted x_0"
489
- if self.config.thresholding:
490
- predicted_original_sample = self._threshold_sample(
491
- predicted_original_sample
492
- )
493
- elif self.config.clip_sample:
494
- predicted_original_sample = predicted_original_sample.clamp(
495
- -self.config.clip_sample_range, self.config.clip_sample_range
496
- )
497
-
498
- # 6. Denoise model output using boundary conditions
499
- denoised = c_out * predicted_original_sample + c_skip * sample
500
-
501
- # 7. Sample and inject noise z ~ N(0, I) for MultiStep Inference
502
- # Noise is not used for one-step sampling.
503
- if len(self.timesteps) > 1:
504
- noise = randn_tensor(
505
- model_output.shape, generator=generator, device=model_output.device
506
- )
507
- prev_sample = (
508
- alpha_prod_t_prev.sqrt() * denoised + beta_prod_t_prev.sqrt() * noise
509
- )
510
- else:
511
- prev_sample = denoised
512
-
513
- # upon completion increase step index by one
514
- self._step_index += 1
515
-
516
- if not return_dict:
517
- return (prev_sample, denoised)
518
-
519
- return LCMSchedulerOutput(prev_sample=prev_sample, denoised=denoised)
520
-
521
- # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.add_noise
522
- def add_noise(
523
- self,
524
- original_samples: torch.FloatTensor,
525
- noise: torch.FloatTensor,
526
- timesteps: torch.IntTensor,
527
- ) -> torch.FloatTensor:
528
- # Make sure alphas_cumprod and timestep have same device and dtype as original_samples
529
- alphas_cumprod = self.alphas_cumprod.to(
530
- device=original_samples.device, dtype=original_samples.dtype
531
- )
532
- timesteps = timesteps.to(original_samples.device)
533
-
534
- sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5
535
- sqrt_alpha_prod = sqrt_alpha_prod.flatten()
536
- while len(sqrt_alpha_prod.shape) < len(original_samples.shape):
537
- sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1)
538
-
539
- sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5
540
- sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten()
541
- while len(sqrt_one_minus_alpha_prod.shape) < len(original_samples.shape):
542
- sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1)
543
-
544
- noisy_samples = (
545
- sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise
546
- )
547
- return noisy_samples
548
-
549
- # Copied from diffusers.schedulers.scheduling_ddpm.DDPMScheduler.get_velocity
550
- def get_velocity(
551
- self,
552
- sample: torch.FloatTensor,
553
- noise: torch.FloatTensor,
554
- timesteps: torch.IntTensor,
555
- ) -> torch.FloatTensor:
556
- # Make sure alphas_cumprod and timestep have same device and dtype as sample
557
- alphas_cumprod = self.alphas_cumprod.to(
558
- device=sample.device, dtype=sample.dtype
559
- )
560
- timesteps = timesteps.to(sample.device)
561
-
562
- sqrt_alpha_prod = alphas_cumprod[timesteps] ** 0.5
563
- sqrt_alpha_prod = sqrt_alpha_prod.flatten()
564
- while len(sqrt_alpha_prod.shape) < len(sample.shape):
565
- sqrt_alpha_prod = sqrt_alpha_prod.unsqueeze(-1)
566
-
567
- sqrt_one_minus_alpha_prod = (1 - alphas_cumprod[timesteps]) ** 0.5
568
- sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.flatten()
569
- while len(sqrt_one_minus_alpha_prod.shape) < len(sample.shape):
570
- sqrt_one_minus_alpha_prod = sqrt_one_minus_alpha_prod.unsqueeze(-1)
571
-
572
- velocity = sqrt_alpha_prod * noise - sqrt_one_minus_alpha_prod * sample
573
- return velocity
574
-
575
- def __len__(self):
576
- return self.config.num_train_timesteps
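To make the scheduler's contract concrete, here is a small self-contained sketch (the import path mirrors this file's location; the random "UNet output" is a stand-in, so the resulting latents are meaningless):

```python
import torch
from backend.lcmdiffusion.pipelines.openvino.lcm_scheduler import LCMScheduler

scheduler = LCMScheduler()  # defaults match the SD scaled_linear schedule
scheduler.set_timesteps(num_inference_steps=4, original_inference_steps=50)
print(scheduler.timesteps)  # 4 timesteps subsampled from the 50-step LCM schedule

sample = torch.randn(1, 4, 64, 64)  # latent-shaped starting noise
for t in scheduler.timesteps:
    model_output = torch.randn_like(sample)  # stand-in for the UNet's epsilon prediction
    # step() returns (prev_sample, denoised); denoised is the consistency estimate of x_0
    sample, denoised = scheduler.step(model_output, t, sample, return_dict=False)
```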
backend/models/__pycache__/lcmdiffusion_setting.cpython-311.pyc DELETED
Binary file (2.68 kB)
 
backend/models/lcmdiffusion_setting.py DELETED
@@ -1,39 +0,0 @@
1
- from typing import Optional, Any
2
- from enum import Enum
3
- from pydantic import BaseModel
4
- from constants import LCM_DEFAULT_MODEL, LCM_DEFAULT_MODEL_OPENVINO
5
-
6
-
7
- class LCMLora(BaseModel):
8
- base_model_id: str = "Lykon/dreamshaper-8"
9
- lcm_lora_id: str = "latent-consistency/lcm-lora-sdv1-5"
10
-
11
-
12
- class DiffusionTask(str, Enum):
13
- """Diffusion task types"""
14
-
15
- text_to_image = "text_to_image"
16
- image_to_image = "image_to_image"
17
-
18
-
19
- class LCMDiffusionSetting(BaseModel):
20
- lcm_model_id: str = LCM_DEFAULT_MODEL
21
- openvino_lcm_model_id: str = LCM_DEFAULT_MODEL_OPENVINO
22
- use_offline_model: bool = False
23
- use_lcm_lora: bool = False
24
- lcm_lora: Optional[LCMLora] = LCMLora()
25
- use_tiny_auto_encoder: bool = False
26
- use_openvino: bool = False
27
- prompt: str = ""
28
- negative_prompt: str = ""
29
- init_image: Any = None
30
- strength: Optional[float] = 0.6
31
- image_height: Optional[int] = 512
32
- image_width: Optional[int] = 512
33
- inference_steps: Optional[int] = 1
34
- guidance_scale: Optional[float] = 1
35
- number_of_images: Optional[int] = 1
36
- seed: Optional[int] = 123123
37
- use_seed: bool = False
38
- use_safety_checker: bool = False
39
- diffusion_task: str = DiffusionTask.text_to_image.value
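A short sketch of how these pydantic models are populated in practice (field values here are illustrative):

```python
from backend.models.lcmdiffusion_setting import DiffusionTask, LCMDiffusionSetting

setting = LCMDiffusionSetting(
    prompt="analog photo of an old lighthouse",
    image_width=768,
    image_height=512,
    inference_steps=4,
    use_openvino=True,
    diffusion_task=DiffusionTask.text_to_image.value,
)
print(setting.model_dump())  # plain dict, e.g. for persisting to configs/settings.yaml
```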
backend/openvino/custom_ov_model_vae_decoder.py DELETED
@@ -1,21 +0,0 @@
1
- from backend.device import is_openvino_device
2
-
3
- if is_openvino_device():
4
- from optimum.intel.openvino.modeling_diffusion import OVModelVaeDecoder
5
-
6
-
7
- class CustomOVModelVaeDecoder(OVModelVaeDecoder):
8
- def __init__(
9
- self,
10
- model,
11
- parent_model,
12
- ov_config=None,
13
- model_dir=None,
14
- ):
15
- super(OVModelVaeDecoder, self).__init__(
16
- model,
17
- parent_model,
18
- ov_config,
19
- "vae_decoder",
20
- model_dir,
21
- )
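The `super(OVModelVaeDecoder, self).__init__(...)` call above is the notable detail: it starts method lookup *above* `OVModelVaeDecoder` in the MRO, so the grandparent's `__init__` runs with an explicit `"vae_decoder"` model part instead of the parent's hard-coded setup. A generic, self-contained illustration of that pattern (class names here are invented, not from optimum):

```python
class Base:
    def __init__(self, part: str) -> None:
        self.part = part

class Parent(Base):
    def __init__(self) -> None:
        super().__init__("fixed_part")  # parent pins the part name

class Child(Parent):
    def __init__(self, part: str) -> None:
        # Skip Parent.__init__ entirely and call Base.__init__ directly,
        # exactly like super(OVModelVaeDecoder, self) above.
        super(Parent, self).__init__(part)

print(Child("vae_decoder").part)  # -> "vae_decoder"
```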
backend/openvino/pipelines.py DELETED
@@ -1,75 +0,0 @@
1
- from constants import DEVICE, LCM_DEFAULT_MODEL_OPENVINO
2
- from backend.tiny_decoder import get_tiny_decoder_vae_model
3
- from typing import Any
4
- from backend.device import is_openvino_device
5
- from paths import get_base_folder_name
6
-
7
- if is_openvino_device():
8
- from huggingface_hub import snapshot_download
9
- from optimum.intel.openvino.modeling_diffusion import OVBaseModel
10
-
11
- from optimum.intel.openvino.modeling_diffusion import (
12
- OVStableDiffusionPipeline,
13
- OVStableDiffusionImg2ImgPipeline,
14
- OVStableDiffusionXLPipeline,
15
- OVStableDiffusionXLImg2ImgPipeline,
16
- )
17
- from backend.openvino.custom_ov_model_vae_decoder import CustomOVModelVaeDecoder
18
-
19
-
20
- def ov_load_taesd(
21
- pipeline: Any,
22
- use_local_model: bool = False,
23
- ):
24
- taesd_dir = snapshot_download(
25
- repo_id=get_tiny_decoder_vae_model(pipeline.__class__.__name__),
26
- local_files_only=use_local_model,
27
- )
28
- pipeline.vae_decoder = CustomOVModelVaeDecoder(
29
- model=OVBaseModel.load_model(f"{taesd_dir}/vae_decoder/openvino_model.xml"),
30
- parent_model=pipeline,
31
- model_dir=taesd_dir,
32
- )
33
-
34
-
35
- def get_ov_text_to_image_pipeline(
36
- model_id: str = LCM_DEFAULT_MODEL_OPENVINO,
37
- use_local_model: bool = False,
38
- ) -> Any:
39
- if "xl" in get_base_folder_name(model_id).lower():
40
- pipeline = OVStableDiffusionXLPipeline.from_pretrained(
41
- model_id,
42
- local_files_only=use_local_model,
43
- ov_config={"CACHE_DIR": ""},
44
- device=DEVICE.upper(),
45
- )
46
- else:
47
- pipeline = OVStableDiffusionPipeline.from_pretrained(
48
- model_id,
49
- local_files_only=use_local_model,
50
- ov_config={"CACHE_DIR": ""},
51
- device=DEVICE.upper(),
52
- )
53
-
54
- return pipeline
55
-
56
-
57
- def get_ov_image_to_image_pipeline(
58
- model_id: str = LCM_DEFAULT_MODEL_OPENVINO,
59
- use_local_model: bool = False,
60
- ) -> Any:
61
- if "xl" in get_base_folder_name(model_id).lower():
62
- pipeline = OVStableDiffusionXLImg2ImgPipeline.from_pretrained(
63
- model_id,
64
- local_files_only=use_local_model,
65
- ov_config={"CACHE_DIR": ""},
66
- device=DEVICE.upper(),
67
- )
68
- else:
69
- pipeline = OVStableDiffusionImg2ImgPipeline.from_pretrained(
70
- model_id,
71
- local_files_only=use_local_model,
72
- ov_config={"CACHE_DIR": ""},
73
- device=DEVICE.upper(),
74
- )
75
- return pipeline
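A hedged usage sketch combining the two helpers above; it assumes an OpenVINO-capable environment (so the guarded imports run) and network access for the first download:

```python
from backend.openvino.pipelines import get_ov_text_to_image_pipeline, ov_load_taesd

pipe = get_ov_text_to_image_pipeline("rupeshs/LCM-dreamshaper-v7-openvino")
ov_load_taesd(pipe)  # swap in the tiny TAESD decoder for faster VAE decoding
image = pipe(prompt="isometric pixel-art castle", num_inference_steps=4).images[0]
image.save("castle.png")
```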
backend/pipelines/lcm.py DELETED
@@ -1,90 +0,0 @@
1
- from constants import LCM_DEFAULT_MODEL
2
- from diffusers import (
3
- DiffusionPipeline,
4
- AutoencoderTiny,
5
- UNet2DConditionModel,
6
- LCMScheduler,
7
- )
8
- import torch
9
- from backend.tiny_decoder import get_tiny_decoder_vae_model
10
- from typing import Any
11
- from diffusers import (
12
- LCMScheduler,
13
- StableDiffusionImg2ImgPipeline,
14
- StableDiffusionXLImg2ImgPipeline,
15
- )
16
-
17
-
18
- def _get_lcm_pipeline_from_base_model(
19
- lcm_model_id: str,
20
- base_model_id: str,
21
- use_local_model: bool,
22
- ):
23
- pipeline = None
24
- unet = UNet2DConditionModel.from_pretrained(
25
- lcm_model_id,
26
- torch_dtype=torch.float32,
27
- local_files_only=use_local_model,
28
- )
29
- pipeline = DiffusionPipeline.from_pretrained(
30
- base_model_id,
31
- unet=unet,
32
- torch_dtype=torch.float32,
33
- local_files_only=use_local_model,
34
- )
35
- pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
36
- return pipeline
37
-
38
-
39
- def load_taesd(
40
- pipeline: Any,
41
- use_local_model: bool = False,
42
- torch_data_type: torch.dtype = torch.float32,
43
- ):
44
- vae_model = get_tiny_decoder_vae_model(pipeline.__class__.__name__)
45
- pipeline.vae = AutoencoderTiny.from_pretrained(
46
- vae_model,
47
- torch_dtype=torch_data_type,
48
- local_files_only=use_local_model,
49
- )
50
-
51
-
52
- def get_lcm_model_pipeline(
53
- model_id: str = LCM_DEFAULT_MODEL,
54
- use_local_model: bool = False,
55
- ):
56
- pipeline = None
57
- if model_id == "latent-consistency/lcm-sdxl":
58
- pipeline = _get_lcm_pipeline_from_base_model(
59
- model_id,
60
- "stabilityai/stable-diffusion-xl-base-1.0",
61
- use_local_model,
62
- )
63
-
64
- elif model_id == "latent-consistency/lcm-ssd-1b":
65
- pipeline = _get_lcm_pipeline_from_base_model(
66
- model_id,
67
- "segmind/SSD-1B",
68
- use_local_model,
69
- )
70
- else:
71
- pipeline = DiffusionPipeline.from_pretrained(
72
- model_id,
73
- local_files_only=use_local_model,
74
- )
75
-
76
- return pipeline
77
-
78
-
79
- def get_image_to_image_pipeline(pipeline: Any) -> Any:
80
- components = pipeline.components
81
- pipeline_class = pipeline.__class__.__name__
82
- if (
83
- pipeline_class == "LatentConsistencyModelPipeline"
84
- or pipeline_class == "StableDiffusionPipeline"
85
- ):
86
- return StableDiffusionImg2ImgPipeline(**components)
87
- elif pipeline_class == "StableDiffusionXLPipeline":
88
- return StableDiffusionXLImg2ImgPipeline(**components)
89
- else:
90
- raise Exception(f"Unknown pipeline {pipeline_class}")
backend/pipelines/lcm_lora.py DELETED
@@ -1,25 +0,0 @@
1
- from diffusers import DiffusionPipeline, LCMScheduler
2
- import torch
3
-
4
-
5
- def get_lcm_lora_pipeline(
6
- base_model_id: str,
7
- lcm_lora_id: str,
8
- use_local_model: bool,
9
- torch_data_type: torch.dtype,
10
- ):
11
- pipeline = DiffusionPipeline.from_pretrained(
12
- base_model_id,
13
- torch_dtype=torch_data_type,
14
- local_files_only=use_local_model,
15
- )
16
- pipeline.load_lora_weights(
17
- lcm_lora_id,
18
- local_files_only=use_local_model,
19
- )
20
- if "lcm" in lcm_lora_id.lower():
21
- print("LCM LoRA model detected so using recommended LCMScheduler")
22
- pipeline.scheduler = LCMScheduler.from_config(pipeline.scheduler.config)
23
- pipeline.fuse_lora()
24
- pipeline.unet.to(memory_format=torch.channels_last)
25
- return pipeline
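A usage sketch with the default ids from `LCMLora` above (float32 keeps it CPU-friendly):

```python
import torch
from backend.pipelines.lcm_lora import get_lcm_lora_pipeline

pipe = get_lcm_lora_pipeline(
    base_model_id="Lykon/dreamshaper-8",
    lcm_lora_id="latent-consistency/lcm-lora-sdv1-5",
    use_local_model=False,
    torch_data_type=torch.float32,
)
image = pipe(
    prompt="macro shot of a dew drop", num_inference_steps=4, guidance_scale=1.0
).images[0]
```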
backend/safety_check.py DELETED
@@ -1,17 +0,0 @@
1
- from transformers import pipeline
2
-
3
-
4
- def is_safe_image(
5
- classifier,
6
- image,
7
- ):
8
- pred = classifier(image)
9
- nsfw_score = 0
10
- normal_score = 0
11
- for label in pred:
12
- if label["label"] == "nsfw":
13
- nsfw_score = label["score"]
14
- elif label["label"] == "normal":
15
- normal_score = label["score"]
16
- print(f"nsfw_score: {nsfw_score}, normal_score: {normal_score}")
17
- return normal_score > nsfw_score
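The `classifier` argument is any `transformers` image-classification pipeline that emits `nsfw`/`normal` labels; the checkpoint below is one such model and is an assumption, not something this repo pins:

```python
from PIL import Image
from transformers import pipeline
from backend.safety_check import is_safe_image

# Assumed checkpoint with "nsfw"/"normal" labels
classifier = pipeline("image-classification", model="Falconsai/nsfw_image_detection")
print(is_safe_image(classifier, Image.open("result.png")))  # True when "normal" outscores "nsfw"
```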
backend/tiny_decoder.py DELETED
@@ -1,30 +0,0 @@
1
- from constants import (
2
- TAESD_MODEL,
3
- TAESDXL_MODEL,
4
- TAESD_MODEL_OPENVINO,
5
- TAESDXL_MODEL_OPENVINO,
6
- )
7
-
8
-
9
- def get_tiny_decoder_vae_model(pipeline_class) -> str:
10
- print(f"Pipeline class : {pipeline_class}")
11
- if (
12
- pipeline_class == "LatentConsistencyModelPipeline"
13
- or pipeline_class == "StableDiffusionPipeline"
14
- or pipeline_class == "StableDiffusionImg2ImgPipeline"
15
- ):
16
- return TAESD_MODEL
17
- elif (
18
- pipeline_class == "StableDiffusionXLPipeline"
19
- or pipeline_class == "StableDiffusionXLImg2ImgPipeline"
20
- ):
21
- return TAESDXL_MODEL
22
- elif (
23
- pipeline_class == "OVStableDiffusionPipeline"
24
- or pipeline_class == "OVStableDiffusionImg2ImgPipeline"
25
- ):
26
- return TAESD_MODEL_OPENVINO
27
- elif pipeline_class == "OVStableDiffusionXLPipeline":
28
- return TAESDXL_MODEL_OPENVINO
29
- else:
30
- raise Exception("No valid pipeline class found!")
benchmark-openvino.bat DELETED
@@ -1,23 +0,0 @@
1
- @echo off
2
- setlocal
3
-
4
- set "PYTHON_COMMAND=python"
5
-
6
- call python --version > nul 2>&1
7
- if %errorlevel% equ 0 (
8
- echo Python command check :OK
9
- ) else (
10
- echo "Error: Python command not found, please install Python (Recommended : Python 3.10 or Python 3.11) and try again"
11
- pause
12
- exit /b 1
13
-
14
- )
15
-
16
- :check_python_version
17
- for /f "tokens=2" %%I in ('%PYTHON_COMMAND% --version 2^>^&1') do (
18
- set "python_version=%%I"
19
- )
20
-
21
- echo Python version: %python_version%
22
-
23
- call "%~dp0env\Scripts\activate.bat" && %PYTHON_COMMAND% src/app.py -b --use_openvino --openvino_lcm_model_id "rupeshs/sd-turbo-openvino"
benchmark.bat DELETED
@@ -1,23 +0,0 @@
1
- @echo off
2
- setlocal
3
-
4
- set "PYTHON_COMMAND=python"
5
-
6
- call python --version > nul 2>&1
7
- if %errorlevel% equ 0 (
8
- echo Python command check :OK
9
- ) else (
10
- echo "Error: Python command not found, please install Python (Recommended : Python 3.10 or Python 3.11) and try again"
11
- pause
12
- exit /b 1
13
-
14
- )
15
-
16
- :check_python_version
17
- for /f "tokens=2" %%I in ('%PYTHON_COMMAND% --version 2^>^&1') do (
18
- set "python_version=%%I"
19
- )
20
-
21
- echo Python version: %python_version%
22
-
23
- call "%~dp0env\Scripts\activate.bat" && %PYTHON_COMMAND% src/app.py -b
configs/lcm-lora-models.txt DELETED
@@ -1,4 +0,0 @@
1
- latent-consistency/lcm-lora-sdv1-5
2
- latent-consistency/lcm-lora-sdxl
3
- latent-consistency/lcm-lora-ssd-1b
4
- rupeshs/hypersd-sd1-5-1-step-lora
configs/lcm-models.txt DELETED
@@ -1,8 +0,0 @@
1
- stabilityai/sd-turbo
2
- rupeshs/sdxs-512-0.9-orig-vae
3
- rupeshs/hyper-sd-sdxl-1-step
4
- rupeshs/SDXL-Lightning-2steps
5
- stabilityai/sdxl-turbo
6
- SimianLuo/LCM_Dreamshaper_v7
7
- latent-consistency/lcm-sdxl
8
- latent-consistency/lcm-ssd-1b
configs/openvino-lcm-models.txt DELETED
@@ -1,9 +0,0 @@
1
- rupeshs/sd-turbo-openvino
2
- rupeshs/sdxs-512-0.9-openvino
3
- rupeshs/hyper-sd-sdxl-1-step-openvino-int8
4
- rupeshs/SDXL-Lightning-2steps-openvino-int8
5
- rupeshs/sdxl-turbo-openvino-int8
6
- rupeshs/LCM-dreamshaper-v7-openvino
7
- Disty0/LCM_SoteMix
8
- rupeshs/FLUX.1-schnell-openvino-int4
9
- rupeshs/sd15-lcm-square-openvino-int8
configs/stable-diffusion-models.txt DELETED
@@ -1,7 +0,0 @@
1
- Lykon/dreamshaper-8
2
- Fictiverse/Stable_Diffusion_PaperCut_Model
3
- stabilityai/stable-diffusion-xl-base-1.0
4
- runwayml/stable-diffusion-v1-5
5
- segmind/SSD-1B
6
- stablediffusionapi/anything-v5
7
- prompthero/openjourney-v4
constants.py DELETED
@@ -1,25 +0,0 @@
1
- from os import environ, cpu_count
2
-
3
- cpu_cores = cpu_count()
4
- cpus = cpu_cores // 2 if cpu_cores else 0
5
- APP_VERSION = "v1.0.0 beta 200"
6
- LCM_DEFAULT_MODEL = "stabilityai/sd-turbo"
7
- LCM_DEFAULT_MODEL_OPENVINO = "rupeshs/sd-turbo-openvino"
8
- APP_NAME = "FastSD CPU"
9
- APP_SETTINGS_FILE = "settings.yaml"
10
- RESULTS_DIRECTORY = "results"
11
- CONFIG_DIRECTORY = "configs"
12
- DEVICE = environ.get("DEVICE", "cpu")
13
- SD_MODELS_FILE = "stable-diffusion-models.txt"
14
- LCM_LORA_MODELS_FILE = "lcm-lora-models.txt"
15
- OPENVINO_LCM_MODELS_FILE = "openvino-lcm-models.txt"
16
- TAESD_MODEL = "madebyollin/taesd"
17
- TAESDXL_MODEL = "madebyollin/taesdxl"
18
- TAESD_MODEL_OPENVINO = "rupeshs/taesd-ov"
19
- LCM_MODELS_FILE = "lcm-models.txt"
20
- TAESDXL_MODEL_OPENVINO = "rupeshs/taesdxl-openvino"
21
- LORA_DIRECTORY = "lora_models"
22
- CONTROLNET_DIRECTORY = "controlnet_models"
23
- MODELS_DIRECTORY = "models"
24
- GGUF_THREADS = environ.get("GGUF_THREADS", cpus)
25
- TAEF1_MODEL_OPENVINO = "rupeshs/taef1-openvino"
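`DEVICE` and `GGUF_THREADS` are read from the environment once, at import time, so overrides must be set before the module is first imported. A sketch (the `gpu` value is illustrative):

```python
import os

os.environ["DEVICE"] = "gpu"  # must happen before `constants` is imported anywhere
import constants

print(constants.DEVICE)        # -> "gpu"
print(constants.GGUF_THREADS)  # half the CPU core count unless overridden
```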
context.py DELETED
@@ -1,47 +0,0 @@
1
- from typing import Any
2
- from app_settings import Settings
3
- from models.interface_types import InterfaceType
4
- from backend.lcm_text_to_image import LCMTextToImage
5
- from time import perf_counter
6
- from backend.image_saver import ImageSaver
7
- from pprint import pprint
8
-
9
-
10
- class Context:
11
- def __init__(
12
- self,
13
- interface_type: InterfaceType,
14
- device="cpu",
15
- ):
16
- self.interface_type = interface_type
17
- self.lcm_text_to_image = LCMTextToImage(device)
18
-
19
- def generate_text_to_image(
20
- self,
21
- settings: Settings,
22
- reshape: bool = False,
23
- device: str = "cpu",
24
- ) -> Any:
25
- tick = perf_counter()
26
- from state import get_settings
27
-
28
- get_settings().save()
29
- pprint(settings.lcm_diffusion_setting.model_dump())
30
- if not settings.lcm_diffusion_setting.lcm_lora:
31
- return None
32
- self.lcm_text_to_image.init(
33
- device,
34
- settings.lcm_diffusion_setting,
35
- )
36
- images = self.lcm_text_to_image.generate(
37
- settings.lcm_diffusion_setting,
38
- reshape,
39
- )
40
- elapsed = perf_counter() - tick
41
- # ImageSaver.save_images(
42
- # settings.results_path,
43
- # images=images,
44
- # lcm_diffusion_setting=settings.lcm_diffusion_setting,
45
- # )
46
- print(f"Latency : {elapsed:.2f} seconds")
47
- return images
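A sketch of driving `Context` directly; `InterfaceType.CLI` and the `get_settings()` accessor (the same one the method imports lazily) are assumptions about the surrounding modules:

```python
from context import Context
from models.interface_types import InterfaceType
from state import get_settings  # assumed accessor, mirroring the lazy import above

context = Context(InterfaceType.CLI)  # assuming the enum exposes a CLI member
settings = get_settings()
settings.lcm_diffusion_setting.prompt = "a cozy cabin in the snow"
images = context.generate_text_to_image(settings)  # prints latency, returns PIL images
```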
controlnet_models/Readme.txt DELETED
@@ -1,3 +0,0 @@
1
- Place your ControlNet models in this folder.
2
- You can download ControlNet models (.safetensors) from https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/tree/main
3
- E.g.: https://huggingface.co/comfyanonymous/ControlNet-v1-1_fp16_safetensors/blob/main/control_v11p_sd15_canny_fp16.safetensors
docs/images/2steps-inference.jpg DELETED

Git LFS Details

  • SHA256: bd0a36195619b4d29a92a1a7c0ee446b7a37caef1097bbaf23e38a0043884426
  • Pointer size: 130 Bytes
  • Size of remote file: 73.2 kB
docs/images/ARCGPU.png DELETED

Git LFS Details

  • SHA256: 9497008babdfd2cc2b5c2c1e58b435e94cbe69406e2ce7e18ea70cc638f7776e
  • Pointer size: 130 Bytes
  • Size of remote file: 51.3 kB
docs/images/fastcpu-cli.png DELETED

Git LFS Details

  • SHA256: 4a9d22055f37ef91b2b097871c5a632eaf2bb7431bbac2405ff487bfcc4a8499
  • Pointer size: 130 Bytes
  • Size of remote file: 56.6 kB
docs/images/fastcpu-webui.png DELETED

Git LFS Details

  • SHA256: d26b4e4bc41d515a730b757a6a984724aa59497235bf771b25e5a7d9c2d5680f
  • Pointer size: 131 Bytes
  • Size of remote file: 263 kB
docs/images/fastsdcpu-android-termux-pixel7.png DELETED

Git LFS Details

  • SHA256: 0e18187cb43b8e905971fd607e6640a66c2877d4dc135647425f3cb66d7f22ae
  • Pointer size: 131 Bytes
  • Size of remote file: 299 kB
docs/images/fastsdcpu-api.png DELETED

Git LFS Details

  • SHA256: 6ed9600154d423cff4e3baaaf6da31f070cb401cf6a1b497e1987a6c5731907d
  • Pointer size: 130 Bytes
  • Size of remote file: 50.7 kB
docs/images/fastsdcpu-gui.jpg DELETED

Git LFS Details

  • SHA256: 03c1fe3b5ea4dfcc25654c4fc76fc392a59bdf668b1c45e5e6fc14edcf2fac5c
  • Pointer size: 131 Bytes
  • Size of remote file: 207 kB
docs/images/fastsdcpu-mac-gui.jpg DELETED

Git LFS Details

  • SHA256: dd6fedc02f2922b7817f6b078c72feae48a8227b69035926bbd15ebbbb2426bb
  • Pointer size: 130 Bytes
  • Size of remote file: 93.3 kB
docs/images/fastsdcpu-screenshot.png DELETED

Git LFS Details

  • SHA256: 3729a8a87629800c63ca98ed36c1a48c7c0bc64c02a144e396aca05e7c529ee6
  • Pointer size: 131 Bytes
  • Size of remote file: 293 kB
docs/images/fastsdcpu-webui.png DELETED

Git LFS Details

  • SHA256: b1378a52fab340ad566e69074f93af40a82f92afb29d8a3b27cbe218b5ee4bff
  • Pointer size: 131 Bytes
  • Size of remote file: 380 kB
docs/images/fastsdcpu_claude.jpg DELETED

Git LFS Details

  • SHA256: 5f1509224a6c0fbf39362b1332f99b5d65ebe52cd9927474dae44d25321bf8a0
  • Pointer size: 131 Bytes
  • Size of remote file: 145 kB
docs/images/fastsdcpu_flux_on_cpu.png DELETED

Git LFS Details

  • SHA256: 5e42851d654dc88a75e479cb6dcf25bc1e7a7463f7c2eb2a98ec77e2b3c74e06
  • Pointer size: 131 Bytes
  • Size of remote file: 383 kB