What is the best way to predict the future?
This is a question worth asking today, because I made many predictions last year and one of them has turned out to be wrong.
I had predicted that Hollywood would soon be disrupted by independent creators using AI tools to make better movies than the cash-rich, creativity-starved Hollywood studios.
But where are the great AI-made films, the brilliant movies that are made by one person or a very small team?
So far, we haven't seen any notable examples, which leads us to ask more questions, like -
Why did giving highly creative people access to powerful tools of creation lead to… nothing? (And early adopters are creative people: you need a very high degree of openness to new experiences to be one.)
Text-to-Video platforms like Luma Labs’ Dream Machine and Runway Gen-3 have recently come out, and it seems like Google DeepMind’s Veo & OpenAI’s Sora are not too far from being released to the public, so everyone will make the same prediction again.
But they’ll be wrong.
Other than video versions of memes (which we’ll see a lot of) and music videos & trailers (where random clips can still work), we won’t see an explosion of films, because Text-to-Video alone isn’t enough.
The problem isn't the people. They're already creative and capable, and they're definitely trying to use these tools & models, but they can’t create anything resembling a film without investing significant amounts of time.
We need something that actually helps turn your stories into reality, but before we get there, let's examine the situation more closely.
What is wrong with the tools?
II.
There are two emerging classes of software right now - software with high controllability, and generative software that doesn't give you granular control.
These two classes are also converging. Apps that offer granular control are introducing generative features, and generative software is getting easier to control.
For example, take text editing. You can open any text editor like Notepad, MS Word, Apple Notes, etc., and every word you write into that file is chosen by you. If you want to write an essay on "Top 5 YouTube creators you should follow if you like Math", you can write such an essay in 30 minutes.
Or, you can open ChatGPT and prompt it with "Write me an Essay titled ‘Top 5 YouTube creators you should follow if you like Math’ and make sure to include the following YouTubers - 3Blue1Brown, Computerphile, Professor Leonard, StatQuest & The Math Sorcerer" and ChatGPT will write your essay in less than 2 minutes.
Notice how you can exert a degree of control over the generated essay. If you want to add a joke about Professor Leonard being the smartest jacked YouTuber you've seen, you can just put that in your prompt.
The more you expand your prompt with your instructions, the more control you get.
Similarly, the text editors of tomorrow that come bundled with AI will allow you to do exactly this - you'll give up some of the control that comes from typing out the entire thing in order to save time. (Unless you're writing to think, which is the best kind of writing, as with this essay. In that case you'll very much prefer to spend more time typing.)
Image generation software is following the same trend. DALL·E 3 is better than previous models at following your instructions, while image-editing software like Photoshop has integrated tons of AI features.
Then there's video generation, which for now just turns prompts into video clips. It's still very early, which means you can't really get any foundation model to follow your exact instructions, but that will change as the models get smarter.
But will those improvements be enough to eventually produce films?
III.
Let's take a step back and ask ourselves, what does creating an AI film mean?
One answer would be modeled on how image generation tools already work - you type in a prompt and the AI creates a feature-length film. Maybe it's a mixture of foundation models: one that writes the script, one that generates the audio, another that generates the visuals, and a final one that stitches it all together.
That doesn't exist yet. It's extremely hard to build, with serious technical limitations like keeping characters consistent, not to mention how compute-intensive it would be.
But even if the technical & compute challenges weren't there, would it be worth watching?
While most would be instantly dismissive, owing to their general aversion to AI, if someone typed in "Star Wars, but with Spiderman in it", I think a lot of people would watch the result regardless, especially if the quality of the movie was all right.
There's craft involved in producing such films as well. Take your judgement of how easy it is out of the equation, and you can start to see that the best films might just start from the best prompts.
And someone more creative than me can come up with a sentence better than "Star Wars, but with Spiderman in it." It's not hard to imagine that. I assure you, even you can come up with something that makes for a great movie.
Without actually launching such a company, it's impossible to predict whether such a product can get traction, but if just AI Text to Image and Text to Video are popular, then Text to Film should definitely be a useful product.
The only caveat is that it has to be watchable. It has to be a visual representation of a story, i.e., it has to tell a story properly and immerse the viewer without calling too much attention to itself.
And today, none of the things that people create with AI tools are watchable. They look cool as music videos, with fancy visuals and trippy transitions, but you can't watch an hour of AI-generated video without losing interest unless you're sufficiently intoxicated, at which point the story won't make sense either way.
Yet there's a class of individuals who can create great films, entirely alone, and who choose not to for very good reasons.
III.
Indie game developers, working alone or in teams of up to ten, can produce very high quality films today. But they don't.
It's not hard to imagine such an individual making a high quality film using software like Unreal Engine, Blender & Premiere Pro.
They opt for making games instead, for two reasons: 1) the gaming market is considerably larger than anything they'd get if they made a high quality movie and just uploaded it to YouTube, and 2) the intersection of people who play a lot of games (and thus know game mechanics thoroughly) and people who can code & use software like Unreal Engine is larger than the intersection of people who want to make films and people who know how to work with these tools.
But, as most game developers will tell you, even with AI tools in the workflow, making a game takes a lot of time, and making a film can take considerably longer.
Thus, even when it's possible to make a film, the incentive to do it isn't there, simply because of the time it takes.
A beginner with no experience has to learn how to use a game engine, then how to edit videos, and probably a little bit of cinematography as well. It can easily take a few months to master the basics of all of these, and a year of consistent effort to get good enough to make something worth watching. Even for those who already know the tools, turning footage made in a game engine into a film can take anywhere from weeks to months.
If only the tools of film making were quicker, like typing your story into a text box and that's it, a lot of people would produce great films. And it doesn't have to be a simple one-line text prompt. It can very well be a short story you've written, with the AI doing the work of translating it into a film, giving you both control and speed so that you can focus on what you do best.
I think that’s the most viable path in this idea maze. I’m reminded of Steve Jobs, who pushed hard in a different direction during the personal computer revolution and kept saying, “We need to move away from programming. Programming computers is great, but the users don’t want to spend time programming. They want a computer that just works.” In the same way, I don’t think people want to spend weeks generating the perfect video clips and stitching them together. They'd rather just bring their story and have AI agents do the rest.
Imagine what this can lead to - films that need to be made, but that no one is making right now. Films on topics too controversial or unprofitable for Hollywood, films about the realities of war that can cause real change, or just films about someone’s life story or the story of a small city that needs to be told. High-quality films like Oppenheimer, except focused on the genius of those in the Manhattan Project like John von Neumann, Stanislaw Ulam and Richard Feynman. Films like these would inspire millions in a new generation.
With text-to-film software, you’ll be able to create such films in a few hours with just a story.
The only skill you'd need is storytelling - something that's innate in most people, and individual and personal enough that almost anyone can create something of their own if given the right power.
Sure, not everyone will choose to create films. But a lot more people would, and the chances of great films emerging would increase dramatically.
These incentives exist in a world with lower friction, and the world of independent Substack writers and YouTube creators has shown us that it is possible.
But given that there are technical limitations & hard challenges to building such an app, who’s going to make it?
IV.
If you don’t make things, things don’t get made. And at the end of the day, very few people have the skills and persistence to make things that people want.
To see something in this world, it’s not enough to write about it and keep exploring the idea mazes.
The intersection of people who play games and can code is smaller than either group on its own, but it’s still larger than most intersections of such proclivities. The intersection of those who can code and those who want to make films is much smaller: very few people are in it.
Thankfully, I’m in that narrow intersection.
For the past few weeks, I’ve been building the Text-To-Film app of my dreams, and I’ll be using it to create films of my own stories & essays. I’ve been thinking deeply about this problem for a long time, and the existence of such a product is necessary not just for me to create my own films, but for anyone in the world with a story to tell.
It is said that only in the process of building something do you realize the depth & complexity of the problem space, and it is only through doing the work that you find out the answers to your problems. I’ve realized this to be true in the past few weeks of building heads-down.
In the process of building the project, I’ve figured out the answers to some very important problems: how to make consistent characters work, how to ensure there’s a consistent aesthetic, and most importantly, how to ensure the films hold your attention. It took a lot of chewing glass, but sometimes that’s just what’s needed to arrive at important breakthroughs.
With the discoveries I’ve made in the process of building this, I’m now confident that creating films (not just videos) requires a radically different approach and it’s one that is going to look inevitable in hindsight.
I’ve named this app Fect (why, and what it stands for, are answered in the Fect docs), and my first goal with Fect was to adapt one of the many short fictional stories I’ve written.
Fiction requires consistent characters to work. Fect works perfectly fine when consistent characters are not required, like in video essays or commentary videos, but films based on real people require… well, people.
I think Fect overcomes the glaring problem with AI video tools today - that they all produce things that are a chore to consume unless you work to package them accordingly.
With Fect, your story takes center stage - and if you have a good story to tell, that’s all your audience will be focused on. Fect uses Ken Burns-style motion meant to keep viewers engaged, and its agentic workflow ensures every frame of your film follows your script. Thus, Fect is something that can be used today, for every one of your stories and essays, to create films & videos that would otherwise have required a team of video editors.
Here’s one of my stories that got 6.6k upvotes on Reddit’s r/NoSleep when I posted it there a few years ago, which I’ve now made into a film using Fect -
This film was made entirely using Fect, with minimal editing in a video editor to stitch the clips together. It took less than 30 minutes to make. If you watched the video from beginning to end, you might have noticed slightly morphed fingers, changing hairstyles or the occasional appearance of a character that’s not supposed to be present in that scene. I could’ve edited them out in post, but it’s good to keep them in so you can see where Fect is right now. The hairstyles can be controlled to remain consistent as well; I skipped that part while making this one. But overall, I think the film looks perfectly fine.
All the problems that I mentioned are secondary if the story kept your attention for the full 9 minutes, and even though this is the worst that Fect will ever be, I’m still happy with what it does now and how capable it is today.
Here’s The Raven by Edgar Allan Poe without consistent characters that took me less than 15 minutes to make -
The visuals are consistent, and add a new dimension to the poem all on their own. Something like this would’ve taken me a couple dozen hours to do without Fect. Now, it’s done within 15 minutes.
I see myself using Fect to turn my essays that I write here into films soon, and if that sounds like a really autistic reason to make an entirely new company, maybe it is! But as user #1, I’m focused on making it as great for myself as it can be - but it doesn’t have to stay that way.
You can try out Fect today yourself by visiting fect.ai!
Anyone who signs up now will be able to get 15 minutes of video generation time for free!
By subscribing to any monthly or yearly plan, you’ll also get access to the Discord, where you get to tell me every day how to make Fect better for you. (And here’s a secret for only Bright Mirror readers - if you subscribe to an annual plan, let me know by emailing founder@fect.ai and I’ll grant you an additional 5 hours of video generation time this month for free!)
With all that said, I’m excited about the future of independent film making and now, I know for a fact that independent individuals creating great films is not just possible, but inevitable.
Let’s revisit the question that I started this essay with -
What is the best way to predict the future?
The answer should be obvious by now, and it is one where you play an active role.
The best way to predict the future -
Is to create it yourself.