What's interesting to me about Nano Banana is that video models are progressing so quickly that combining the two gets some pretty wild results. Even when ChatGPT released theirs, the video models were very cool, but within the last couple of months the video stuff has gotten crazy.
It's always been tough because consistency has been all over the map. All of a sudden you have image models that are nailing consistency, mixed with video models that can take multiple reference shots.