The main novelty seems to be an extra layer of indirection with the prior network (whether it is an autoregressive transformer or a diffusion network), which predicts an image embedding based on the ...
This is an unofficial implementation maintained by @StaryMoon. If this repository helps your reading, reproduction, or course project, please consider giving it a star and following my GitHub profile.
一些您可能无法访问的结果已被隐去。
显示无法访问的结果