Making 'Shinigami'

This is the middle section of 'Shinigami', the short film I'm currently making while researching. The workflow is from ComfyUI to Photoshop to Luma and then back to Photoshop and Premiere.

Don't be misled into thinking "AI" means you can ignore preparation, storyboarding, photography, writing, video editing, VFX, grading and scoring. Machines and genAI models on their own make a lot of mistakes that never make it into the polished, edited material you see online. You will always need to be passionate about film theory to get the best out of new tools.

As mentioned in other blog posts, 90% of the footage that I got out of video generators is unusable even though I gave them completed images to start with. Without a completed image to guide them the discard rate is closer to 100%. Most of the time they won't follow your instructions no matter how clear and simple you make them.

Yesterday Was Better

Currently continuing my deep dive into generative video filmmaking.

I’ve been getting very acquainted with the newest tools and services offered by companies such as Runway and the newcomer Luma. I’m augmenting them with ComfyUI which runs locally on my Mac and PC.

Everything I’ve said previously still holds true and will remain forever relevant. I’ll be doing a full write up once this deep dive is done.

As always I’m not anti “AI”. We need machine learning. I do not subscribe to the anti-AI feelings of some in the creative and artistic communities online. Used properly and responsibly these tools can be immensely useful for VFX work, cover shots, background plates and so on. They might even get to the point where an indie filmmaker can tell stories without worrying about high costs. Runway has a gallery of filmmakers doing great things by mixing generative imaging with film.

Like the brief experiment in my last blog post, generative video works better when you supply it with a still image, whether your own fine art, drawing, CGI, photography or a generated picture. By giving the models visual guidance along with prompts you can control them much better. However, the discard or error rate is still very high; something like 90% of the time you will either be frustrated by the output or laughing at it. Future blooper reels really should include generative errors.

Even on those rare occasions when the model gives you something great, you’ll need to do a lot of clean up and editing to make it useful for storytelling. As mentioned before, generative tools are VFX/CGI tools and that’s the approach to take with them.

For the time being I’m going to post below a comparison of Luma and Runway’s output. I tried to replicate a scene from John Woo’s classic film ‘A Better Tomorrow’ starring the inimitable Chow Yun-fat.

I did not supply a reference image to either service. I wanted to test pure prompting to see what kind of fidelity and errors they could create from a description alone. The prompt was:

‘a Hong Kong movie, night, outdoors, a Chinese man has bruised face, one side of his face has bandages, he is angry and arguing, he wears a blue jacket and pink shirt, close up angle’

Here are the results, Luma in the middle below original footage and Runway at the bottom.


Luma’s output had vigour! They even generated a motorcycle accident in the background! Of course, it’s distorted garbage that you couldn’t add a voice to. If you used some of the generative lip synching tools on offer at the moment the footage would look even more comical. Pause the video at any point and the character’s face looks like a butcher’s worktop.

Runway’s output looks more realistic (except in the background, where they added a drunk man in a gold shirt walking backwards), but it is bloodless, emotionless and the majority of instructions in the prompt were not followed. Runway refused to show blood, violence or anger and repeatedly flagged my instructions for asking.

If you were a film director and your actor, make up artists and stylists were this disobedient you would fire them. In the world of generative filmmaking, however, you have to exercise a lot of patience alongside wishful thinking and wasted money (in the form of the fixed monthly credits these services supply you).

Runway’s own website soon reveals why it performed so badly at this task:

Even if Runway’s Gen3 could help storytellers tell stories with machine learning tools, they won’t be able to tell a story like Fight Club, American History X, American Psycho, Takashi Miike’s Audition or Takeshi Kitano’s Sonatine. You should be allowed to create and upload anything legal, which includes horror, violence and swearing if a story needs it.

The tools are far far far from ready for character work unless you only require the most minimal amount of movement with minimal or no dialogue. You will not be producing AI video with a synthetic Chow Yun-fat, Toshiro Mifune or Al Pacino in the foreseeable future (getting an audience to pay for that is another big problem).

Yesterday was better than this. Knowing the talent of John Woo and Chow Yun-fat it is quite possible that they only needed to film one take for the scene above. It would have been rehearsed in person and mentally many times, but getting the shot done would not have been like using a slot machine, which is how it feels to use video and image generators.

I will end this blog post with a more successful case. As mentioned above, if all you require is minimal movement and almost no dialogue, some interesting things can be created.

The very short Wong Kar-wai inspired film I crafted below started locally on my computer with ComfyUI. After many iterations I took the best images into Photoshop for clean ups and fixes. Then I took the images to Luma’s Dream Machine to add motion. After several iterations I took the footage with the fewest errors into After Effects and Photoshop for more clean ups, and then finally edited the shots in Premiere, adding Wong Kar-wai’s signature picture book slow motion, music and voice over.

For now I call it ‘Farewell My Robot Concubine’. There’s a little twist in the reveal.

It was a fun experiment and I have been thinking of extending it with more scenes, but damn those credits keep going down, down, down. In another blog post I’ll reveal more production details including bloopers.

Sora competitor Dream Machine launches. Appears to have all the same sexist and racial biases.

By now if you’ve been watching the AI hype machine you would have noticed that generative AI models exhibit widely reported sexist and racial biases.

Some have tried to brush this away by saying those biases are baked into humanity anyway and the models are only reflections of us. That opinion is not widely shared because it is only partly true: the data sets used to train the models are limited in range and do not cross many cultural boundaries. They are mostly western data sets and thus reflect western biases, not global biases. Despite Luma having a number of Asians on their team, those biases seem to still exist in their Dream Machine, based on the limited testing I’ve done since launch.

Before we jump into testing Luma’s Dream Machine, we first need to talk about those biases. There was a popular joke in 90s comedy that went something like this:

“All white people/black people/Asians look the same to me.”

The one-liner poked fun at racists for stereotyping foreign people. Popular AI image generators also homogenise and stereotype Asians in much the same way. They’re guilty of doing the same to Caucasians and black people to some degree too, but it is clearest in depictions of East Asians. Asking image generators to depict a Japanese, Korean or Chinese person usually results in someone who looks South-East Asian. The models treat all East Asians the same with only minor variations, but as any Chinese, Korean or Japanese person will tell you, they can tell each other apart even from a distance. Though each group is physically diverse within itself, they recognise ethnic differences between one another in the same way that an Englishman can usually point out a Spaniard or German in the street.

Secondly, image generators have problems depicting interracial and intercultural gatherings. Ask a generator to depict an East Asian man shaking hands with a white woman and it is almost guaranteed the East Asian man won’t make an appearance in the image. He’ll be replaced with a white or black man, or the woman will be replaced with an Asian woman. Some say Meta’s image generators have fixed this bias, but that’s just one model and the problem is still widely seen.

With the two points above in mind, we can begin our tests by using a currently popular ComfyUI workflow called ‘AbominableSpaghetti’ which uses the PixArt E model. This model hasn’t captured any of the media’s attention but can generate images as good as or better than those coming from the biggest tech companies, and it can run locally on your computer with ComfyUI.
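
If you prefer scripting to node graphs, a rough equivalent of a single text-to-image pass with a PixArt model via Hugging Face's diffusers library might look like the sketch below. To be clear, this is a minimal sketch and not the AbominableSpaghetti workflow itself; the checkpoint name, step count and guidance value are illustrative assumptions, not the workflow's actual settings.

```python
# Minimal sketch: text-to-image with a PixArt model via Hugging Face diffusers.
# This is NOT the AbominableSpaghetti ComfyUI workflow; the checkpoint, steps and
# guidance below are illustrative assumptions only.
import torch
from diffusers import PixArtAlphaPipeline

pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",  # assumed checkpoint; swap in whichever PixArt model you use
    torch_dtype=torch.float16,
).to("cuda")  # use "mps" or the CPU with float32 on a Mac

prompt = (
    "A couple sitting in a traditional bar, a Japanese woman drinking whisky, "
    "a suited European man smoking, there is cigarette smoke in the air, "
    "dark room, dramatic lighting on their faces"
)

image = pipe(prompt, num_inference_steps=20, guidance_scale=4.5).images[0]
image.save("bar_couple.png")
```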

Using the AbominableSpaghetti workflow we can ask the model to generate an image from the following prompt:

‘A couple sitting in a traditional bar, a Japanese woman drinking whisky, a suited European man smoking, there is cigarette smoke in the air, dark room, dramatic lighting on their faces’

A minute later we are presented with our image.

The location is great, but as mentioned above, models have a problem depicting a European and an Asian together. In this case it has rendered them both Asian. The couple are smoking odd looking cigarettes and their hands are badly distorted. There is cigarette smoke rising from the whisky glass because the model has merged an ashtray and a glass into one object.

I change the order of the prompt a little and generate again. We aren’t going to easily eliminate the interracial issue so I ask for a Japanese couple. This time it’s better. There are still obvious errors, but I can edit this one in Photoshop.

His hands are badly distorted, there is still smoke rising from the glass, and as mentioned earlier, their faces are homogenised, the woman’s in particular. She looks more Thai than Japanese and her lips look like they have had botox shot into them, a very common thing image generators do to women.

In Photoshop I repair the man’s hand, hair and hunchback, then comp in some newly generated fabrics to fix the sleeves, manually paint the image to make them look more ethnically Japanese, then add an ashtray and cigarette a little behind the whisky glass so that the smoke makes sense.

That looks much better now. From a glance that could pass for a photo. That can be our starting point for testing Dream Machine.

Heading over to Luma’s website we have two options. We can generate a video from a prompt alone or from an image with a prompt. We’re going to use the image above. Dream Machine will take instructions from the prompt to generate subsequent frames. This should be a lot more predictable and accurate than generating from a prompt alone because it has visual guidance.
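
At the time of writing all of this goes through Luma’s web UI, but to make the image-plus-prompt pattern concrete, here is a purely hypothetical client sketch. The endpoint, field names and job states below are invented placeholders for illustration only; they are not Luma’s actual API.

```python
# Hypothetical sketch of an image-plus-prompt video generation request.
# The endpoint, payload fields and job states are INVENTED placeholders,
# not Luma's real API.
import time
import requests

API_KEY = "YOUR_KEY_HERE"              # placeholder credential
BASE = "https://example.invalid/v1"    # placeholder endpoint

def generate_video(image_url: str, prompt: str) -> str:
    """Submit a start-frame image plus a text prompt, then poll until a video URL returns."""
    resp = requests.post(
        f"{BASE}/generations",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "start_image": image_url},
        timeout=30,
    )
    resp.raise_for_status()
    job_id = resp.json()["id"]

    while True:
        job = requests.get(
            f"{BASE}/generations/{job_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=30,
        ).json()
        if job["state"] == "completed":
            return job["video_url"]
        if job["state"] == "failed":
            raise RuntimeError(job.get("error", "generation failed"))
        time.sleep(10)  # my test generation took roughly two minutes

video_url = generate_video(
    "https://example.invalid/bar_couple.png",   # the repaired still from Photoshop
    "A Japanese couple talking in a quiet bar",
)
print(video_url)
```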

Our first prompt is ‘A Japanese couple talking in a quiet bar’

120 seconds later the video appears. I expected it to take longer because I had read the site was getting heavy traffic but half the world was asleep during my test so I got lucky.

The video however was disappointing. (If you see square artefacts at the start of the video that’s Squarespace’s fault.)

The man disappears into a blob leaving the woman by herself wondering wtf just happened. If the model thought that was what I wanted then it did a good job because the woman's lips do appear to quietly say wtf. I do like the movement of the video though. Let's try again.

Prompt 'A Japanese couple sitting quietly together in a bar, he is smoking a cigarette and she is drinking'

Despite the cigarette being easily recognisable and waiting to be picked up, the man performs a magic trick and makes a double cigarette appear out of thin air, and then his face melts as it vanishes in a cloud of overly thick smoke. The woman isn’t fazed by this.

It's time to test some acting direction. If generative video is going to be used in movie production it's no use spending hours and hours generating videos if characters don't follow a storyteller or director's commands.

Prompt 'A Japanese couple in a bar. She is upset with him. He bows his head to say sorry.'

In my attempt to get the characters to follow direction we may have inadvertently discovered more bias. Instead of the woman being upset and the man being apologetic, the roles have been reversed. The male is dominant and aggressive and the woman is subordinate and weak. This isn't the scene I wanted. Could this be the model's bias towards depicting Asian women as subservient? Let's try again.

Prompt 'A Japanese couple in a bar. The wife is angry at the husband. The camera moves in for a close up of the woman's face.'

This should be a clear instruction to follow. What was generated this time was even worse than the prior result.

Here we see the man forcefully bullying the woman to the point of almost making her cry. I wanted a scene depicting a man asking for forgiveness because he knows he has done something wrong, but instead of that I'm presented with a male abuser and possible confirmation that generative AI models don't like to show Asian women in a strong light. In AI circles we often see that generative depictions of East Asian women are either entertaining, kawaii, or subservient with child-like faces. East Asian men are depicted as warriors, ninjas, kung fu masters, and buddhas but never just regular boyfriends, husbands or dads doing normal day to day things.

I decided to give it one last shot but this time it threw an application error.


We have learned that over 95% of material generated by AI is waste. The output gets deleted for being too bad to use.

‘Abel Art, an avid AI artist who had early access to Dream Machine, has created some impressive work. But he said he needed to create hundreds of generations for just one minute of coherent video once unusable clips were discarded. His ratio is roughly 500 clips for 1 minute of video; with each clip at about 5 seconds, he’s discarding 98% of shots to create the perfect scene. I suspect the ratio for Pika Labs and Runway is higher, and reports suggest Sora has a similar discard rate, at least from filmmakers that have used it.’
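
The arithmetic behind that 98% figure is simple enough to check:

```python
# Quick check of the discard ratio quoted above.
clips_generated = 500
seconds_per_clip = 5
seconds_kept = 60  # one minute of finished video

total_generated = clips_generated * seconds_per_clip         # 2,500 seconds generated
discard_rate = 1 - seconds_kept / total_generated            # 1 - 60/2500
print(f"{discard_rate:.1%} of generated footage discarded")  # -> 97.6%, roughly 98%
```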

The marketing demos we see online that hype the latest models are highly curated for many reasons including errors and biases. It's going to be really hard to remove all these biases from the models. The greatest storytellers removed biases from their mind and broke through racial and cultural barriers to create the characters we love and love to hate. If generative AI models can't do that and users spend extraordinary amounts of time and money generating useless videos for the recycle bin then it will be an uphill struggle for acceptance.

Is ToonCrafter a game changer?

I downloaded and played around with ToonCrafter last night until 2AM. As usual, AI bros on social media are using terms like 'game changer!' and 'this new thing changes the old thing forever!'.

So let's break it down.

1. ToonCrafter looks neat, but as with all things generative AI, demos and real world use are very different. Demos are curated to show the best examples. In practice, actual film and animation shots are so varied and expansive that an AI model often can't understand what's going on in a scene and will produce large errors that are very time consuming to manually repair.

2. Motion tweening between keyframes has been possible for around 35 years, and it has been used for tweening facial expressions since the 90s (see the sketch after this list for the basic idea). Motion tweening certainly could do with updates and could use machine learning, but for it to be truly useful it has to be available inside apps like Clip Studio Paint, TVPaint and Procreate Dreams. Animators need layered files, vectors, camera controls and reusable assets.

3. Continuing from the last point, because generative imaging outputs compressed video files there are no reusable assets to work with. That means recolouring, retiming, rescaling, rekeyframing and cleaning up is extremely difficult or often impossible.

4. The compute power required for ToonCrafter is very high, needing at least 24GB of video memory for efficiency, and even then it takes a very long time to output low res 512 pixel compressed videos. That's not something an animation team can work with. They'll spend hours and hours waiting for output and waste a lot of time feeling disappointed by the unpredictable output.

Even the output of the included demo assets can vary widely from useless to not good. Dropping down the resolution can allow it to generate on lower end GPUs but this is what you can expect from an RTX 4070…

5. Accurate lip synch is a known problem in generative video and here in ToonCrafter it would be impossible.

6. ToonCrafter currently won’t work on a Mac because some dependencies are incompatible. ComfyUI also has this issue because the sources for all the various plugins come from different places, some of which are more compatible with one operating system than another.
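
To make point 2 above concrete: classic motion tweening is just interpolation of transform parameters between two keyframes. The sketch below is that decades-old technique in its simplest form, not ToonCrafter's method, which generates the in-between pixels directly.

```python
# Minimal sketch of classic motion tweening: interpolate transform parameters
# (position, rotation, scale) between two keyframes. This illustrates the old
# technique referred to in point 2, not what ToonCrafter does internally.
from dataclasses import dataclass

@dataclass
class Keyframe:
    x: float
    y: float
    rotation: float  # degrees
    scale: float

def tween(a: Keyframe, b: Keyframe, t: float) -> Keyframe:
    """Linear interpolation between keyframes a and b, with t in [0, 1]."""
    lerp = lambda p, q: p + (q - p) * t
    return Keyframe(lerp(a.x, b.x), lerp(a.y, b.y),
                    lerp(a.rotation, b.rotation), lerp(a.scale, b.scale))

start = Keyframe(x=0, y=0, rotation=0, scale=1.0)
end = Keyframe(x=120, y=40, rotation=15, scale=1.2)

# Twelve in-betweens between the two keyframes
frames = [tween(start, end, i / 13) for i in range(1, 13)]
```

A tween like this keeps every layer and parameter editable afterwards, which is exactly the kind of reusable asset that a compressed generated video (point 3) cannot give you.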

So as neat as ToonCrafter is, it's not a game changer. You might be able to use it to animate some small items that you can comp into a scene, but you'll be disappointed trying to use it, now or in the future, on large scene elements and character animation. As always, AI bros just want clicks. They can be helpful with their tutorials, but they are unhelpful when they exaggerate things that are outside their field of knowledge.

Anime Architecture by Stefan Riekeles

Reading through Anime Architecture by curator and producer Stefan Riekeles has been inspiring. It features superb new high resolution scans of some of the best concept designs and artwork ever to be shown on the screen.

Works featured in the book are from Akira, the Patlabor and Ghost in the Shell series, and others. This is the stuff that inspires creatives young and old. When you tell artistic children their dreams are useless because they will be replaced with a robot you are being abusive to them, but when you show them what great people can do with many tools, you raise great children and the future is better for it.

Riekeles also has full size 1:1 prints on display at his gallery and on sale at https://www.riekeles.com

OpenAI releases text to video tool 'Sora' and instantly generates a wave of social media spam

A few hours ago OpenAI launched its text to video tool ‘Sora’ (from the Japanese word (空) for sky) and predictably AI fanatics spammed filmmakers’ forums on Reddit and YouTube claiming the film industry’s days are numbered (they really mean Hollywood Jews; they will not apply this wild idea to filmmakers in the rest of the world) and that the end of the camera is coming too.

Madness.

It should be noted first of all that those wild opinions come mostly from 40-50 year old men (the typical AI YouTuber demographic) stuck in front of a computer all day who aren’t filmmakers and don’t have a real passion for film or storytelling. They're the people generating novels with ChatGPT and flooding Amazon with synthetic books that nobody wants to buy, or making AI art that doesn’t sell. They’re the people who sent ChatGPT generated short stories and Midjourney artwork to Clarkesworld Magazine and then got banned from ever submitting again.

Let’s clarify what we are seeing here. Sora is neat, I really like what it can do. Who wouldn’t be impressed? But it is a VFX and CGI tool. The term ‘computer generated images’ is more applicable to generative AI than 3D modelling and animation ever was. I’m sure someone out there with a big budget and time might make something neat, but it would be great if people who don’t understand filmmaking or audiences would not jump to conclusions.

With demos of Sora you are seeing a curated sample of low res videos in which the prompter has very little control over the output. Like all generative AI tools it is impossible to predict the full output, which means a lot of credits and compute time are wasted generating over and over again. If the people who were bowled over by Sora looked closer they would have noticed videos replete with errors, such as an Asian woman who sometimes had two left feet, street signs with nonsense logos, billboards featuring non-existent gibberish hanzi/kanji characters, and so on.

These issues are not easily fixed as the possible combinations of elements that can make up a scene are infinite. Bugs in the output will be a persistent problem, just like bugs in all software can never be fully quashed. Then you have the biggest compute problem - trying to animate character performances and mouth movements, which requires real time feedback, audio sync and the equivalent of doing multiple takes and shoots to get a character’s performance exactly where a story teller wants it to be. The best film directors have had their actors do numerous takes to fine tune a performance and the same applies in 3D CGI and here in GenAI.

There are also no controls for real time camera movement, changing the angle of a camera or changing the focal length or depth of field. Prompting is the most inefficient and slowest way to do these things in a virtual environment.

Films like Chariots of Fire had some cuts in the same scene shot at 24fps, 72fps and over 200fps. In animé, different animation layers/cels are animated on ones, twos and threes in the same shot. Generative video lacks these fine controls for frame rates and keyframing, making the output, compute demands and costs hard to predict.
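
For anyone unfamiliar with the terms, ‘ones’, ‘twos’ and ‘threes’ simply describe how many frames each drawing is held for; a quick illustration at 24fps:

```python
# How animating "on ones/twos/threes" maps drawings to frames at 24fps.
def drawings_per_second(fps: int, hold: int) -> float:
    """hold = 1 means a new drawing every frame, 2 means each drawing is held for two frames, etc."""
    return fps / hold

for hold, name in [(1, "ones"), (2, "twos"), (3, "threes")]:
    print(f"on {name}: {drawings_per_second(24, hold):g} drawings per second")
# on ones: 24, on twos: 12, on threes: 8
```

In a single shot a character cel might run on twos while an effects layer runs on ones; that per-layer control is exactly what the current generators don’t expose.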

Other issues with using Sora include colour grading. Film editors and colourists understand how important log/raw footage is for the colour grading process. Art directors are always requesting sudden changes and those changes need to be done in real time. GenAI tools like Sora can only output compressed videos and you can’t change anything until you see the output. If you ask Sora to generate a video again with a different colour scheme the video itself might not be the same as the last - objects and characters could be in different positions with different errors from the last generation.

After generating a compressed video it is much harder to change the grade or colour correct. The quality will deteriorate. That’s OK for social media on a phone screen, but not for cinema. On a big screen even the smallest errors and artefacts are distracting. You don’t want audiences walking out of a screening because of quality issues, not purchasing titles because of bad reviews, or asking for refunds.

Even The Guardian’s article on Sora was hyperbolic, stating that the tool could generate video ‘instantly’, never mind the fact that no video content can be generated instantly and that the videos themselves are extremely low resolution and feature a large number of errors that OpenAI highlighted on Sora’s page: ‘Sora might struggle with simulating the physics of a complex scene or understanding cause and effect in specific scenarios. Spatial details in a prompt may also be misinterpreted, and Sora may find precise descriptions of events over time challenging’. To fix those errors will require an extraordinary amount of engineering and an unspeakably large amount of compute power to generate video, especially in native 4K or 8K with HDR support.

The film industry isn’t going anywhere and neither is filming actors on locations and practical sets. People pay to watch actors (including motion captured) and that’s not changing. When it comes to consuming entertainment, media and literature we’re talking about a shared human cultural connection. ChatGPT generated books don’t sell well because readers want to connect with real authors.

I grew up transitioning from celluloid and paint to digital photography and software. I learn every new technology that comes around, but because I have lived through all these cycles and understand consumers from a business and fan perspective I never fall for hype.

I remember clearly in the late 90s when there was fear that CGI would replace actors. It was all over the media. Paul Newman had it written into his family estate that if technology ever allowed him to be revived after he passed away, permission would never be granted. He wanted his likeness to remain his own. In 2002 Andrew Niccol of Gattaca fame made a satire called ‘Simone’ about generative AI and synthetic actors.

It never happened, of course; even though CGI at its best is excellent, it is never quite good enough. All it did was complement film. James Cameron worked as hard as possible to make Avatar as lifelike as possible, but even the sequel looks like a high resolution video game composited with film footage. Hard surfaces are much easier to recreate than organic lifeforms or even water.

No new technology completely displaces and replaces what came before it. Classical instruments weren’t killed by synthesisers and Logic Pro samples. Ebooks didn’t kill books. Streaming didn’t kill vinyl records (vinyl ended up outliving the mighty iPod!). The best film directors in the world still shoot on film. Things co-exist.

We were also told that CGI would replace all traditional animation. When interviewed about his anthology ‘Memories’ which used CGI for a number of difficult shots, the great mangaka and Akira director Katsuhiro Otomo said ‘Because I draw pictures, I don’t have a plan to move away from 2D to the main use of 3DCG. 3DCG anime is like animating dolls, so people like me who have thought through the use of drawings do not have much idea about it. To begin with, it’s certainly true that Japanese like pictures with ‘contour lines’. I don’t think 2D anime will be rendered entirely obsolete, but it will stay as one of a number of diverse choices.’

Over 25 years later 2D anime is more popular than ever and Otomo’s words were prophetic. Yet despite all the evidence, you can still find tech fetishists in software engineering who insist that 2D animation and film photography no longer exist. They live in a bubble so tight and small they can’t see the world outside.

At the end of the day audiences and consumers decide what becomes successful or not. You can have the most outrageous technology in the world but if your content annoys the public it won’t sell. Look at Ghost in the Shell. In 2002 they did a CGI update of the anime classic. Fans hated it and will always prefer the original. A decade later they did a live action movie remake with even better CGI. Fans hated that too, this time for a variety of reasons.

VFX tools like these video/image generators can be incorporated into your work, and if you do it smartly then it is no different from when Ryuichi Sakamoto pioneered electronic music. He never abandoned playing a classical piano in front of an audience, though; he did that until his last days and posthumously continues to perform on his piano in the mixed reality concert experience ‘Kagami’.

Learn everything, absorb what is useful, incorporate technology into traditional arts and crafts, reject hyperbole, lovingly handcraft things that people will love, and don’t spam. The better your work is, the more of yourself that you put into your work, the more fans will reward you for it. Generative AI will have a permanent image problem associated with spam, memes making fun of AI errors, trolls hiding behind AI to mock creative workers, misinformation and climate impacts. You won’t have this image problem.

Claims that AI will replace or displace most jobs are bogus

We are being told by finance bros, Twitter cretins and LinkedIn lunatics that AI (a buzzword that can mean almost anything now) will displace or replace anywhere between 40-90% of workers, enhance our productivity and make us more efficient, that it will free us up from hard work and give us more leisure time, and even more absurdly, that it will create a world of abundance for all.

Let’s break those claims down.

  1. It is impossible for “AI” or robots to do tasks that require the level of dexterity and flexibility that only the human mind and musculoskeletal system can pull off, even ChatGPT says it would be impossible. In real life, unlike science fiction, robots need a degree of bulk to be stable and are not good at self maintenance. AI, being software, will always be buggy and the more tasks you try to teach a system the more buggy and resource hungry it becomes. If robots and AI could displace significant numbers of workers it would come with reduced reliability, reduced dexterity and increased unpredictability in many fields.

  2. Microsoft, Meta and Google are talking up a big game about AI, but just take a look at the state of their platforms and Windows 11’s bloated and buggy condition. Google and YouTube are happy to host fake and scam ads, so their moderation tools are failing to detect wrongful activity, unless they are allowing it. Instagram is infested with bots and sex pests - their AI moderation doesn’t protect users. Windows 11 still randomly crashes, looks like it was designed by a 12 year old in Microsoft Paint, and users are already trying to uninstall or remove the Bing Chat bloat. That’s an operating system under development for decades and it is a mess, so it’s not hard to imagine how bloated and buggy their future Godputer would be in practice. Keep it far away from military bases and defence departments.

  3. The people making such claims don’t have domain expertise of all the jobs and sectors they are talking about. They are salesmen, newsletter shills and report writers who attach themselves to whatever the latest trend is. Some of them were raised by house servants or at the top of a caste system, so they never learned to respect working people anyway. If their reports and posts are super optimistic and buzzwordy, and if they fail to mention the technical limitations and implementation problems of any new technologies, it’s because they don’t really know what they’re talking about. They’re no different to Deepak Chopra talking about quantum physics.

  4. Many sectors are already operating at peak efficiency. We know that because we produce far more goods than we need and generate tons of food, electronic and clothing waste. We actually need to produce less, produce on demand, have more local production, higher quality more expensive long lasting goods, more locally repairable goods, and more just-in-time production and shipping. That’s something the polluting Sheins of the world don’t want to hear about. AI doesn’t solve this problem. Human willpower and cooperation solve this problem. An AI can suggest people take action against over production, slavery and pollution (things we already know) but it requires actual people to make the decisions and do it.

  5. Whenever sectors do use automation to increase efficiency, the time saved is filled up again by producing more goods, more content, more projects and expanding product lines. Employees do not end up working less, just differently. This is exactly what we have seen in creative workflows. As a production and post-production creative, I have used and implemented everything from Photoshop Actions to machine learning based tools in our workflows to speed up work and reduce mental stress. The result of efficiency gains allowed companies to ask us to produce more content. A decade ago we used to produce about 3 images per product. Today we are likely to produce up to 6 images per product and an optional video. Implementing machine learning and automation isn’t plain sailing either and often comes with bugs that are never fully resolved.

  6. At the pharmacy where my brother works they recently installed a state of the art robot for stock tracking and dispensing drugs. It didn’t displace any workers and requires onsite and remote support whenever there’s a hardware or software issue. That’s just how robots are.

  7. “AI will free up our time so we can create art.” Not everyone wants to create art, but remember that one going around on the socials? It didn’t age well considering the web is now being spammed by AI art that anyone can produce and often rips off the styles of well known artists. Generative art looks attractive at first sight because our visual cortex is experiencing something ‘new’ but within moments a kind of dire existential dread sinks in, similar to when you get a robocall or a bot sends you a DM. The images are bland, lifeless and have a ghoulish vibe to them. Apparently even AI systems prefer real art and real photography to the generative kind because when they are fed only generative art their abilities begin to decay.

  8. “Generative AI democratises artistic creation” is another one we sometimes hear, and at first glance it appears to be a true statement, but democratisation of content creation already exists with the plethora of options available. With generative AI (especially in the cloud) you are getting centralisation; it enriches chip makers, rent seekers and energy companies. It pushes up the cost of living while pushing down the cost of labour. It encourages talentless, soulless executives and shareholders to tell artists their skills are near worthless. It reduces the quality of creative production and increases the ease with which spam can now be generated. It contributes not only to the enshittification of the web but to the whole human experience. LinkedIn is full of 40-something men who generate images of women of colour (aka they’re saying please don’t give work to real women of colour) and claim they are part of a youth movement democratising content creation. They tried the same trick with VR, the metaverse, NFTs and crypto, by claiming they were democratising finance and being inclusive. They were largely rejected by society and then pivoted to AI, after they had caused millions of people to lose money.

  9. We could already have a world of abundance. It is the wealthiest who rig economies and create artificial scarcity in order to drive their wealth higher. They’re doing the same thing with AI by driving up the cost of using software or playing games, increasing energy consumption and making energy scarcer, increasing pollution, and hanging the threat of AI over the heads of workers to scare them into complacency and obedience.

  10. Driving a car is something a teenager learns to do, and for the rest of our lives we mostly drive subconsciously because we rely on known routes, laws and landmarks to assist us. It’s when laws are ignored that bad things happen. The best full self driving on offer still can’t consistently perform on the level of a law-abiding driver. If AI can’t actually drive cars yet (and in some parts of the world it won’t be possible at all) despite 40+ years of development, then AI won’t be able to do all those jobs that require a lot more complex real time reasoning, fluid thinking and dynamic responses than driving requires.

  11. If the consuming public were happy with bots replacing people, then athletes and sporting events would have been replaced by bots and virtual sports already. Why spend $20 million on a footballer when a team of computer controlled footballers can play sponsored, advert filled virtual matches? Golf is an extremely wasteful and inefficient use of land and resources. Why can’t that end and be replaced with virtual golf? IBM’s Deep Blue beat Garry Kasparov at chess in 1997. Fast forward 27 years and there’s still no audience to watch AI chess players play against each other in AI chess tournaments. The technology has existed for years, but consumers (fans) won’t pay for that. They will pay to watch real athletes struggle to win. Likewise, consumers will always pay more to read books written by real people, not chatbots. They want to build an emotional connection with the author, visit the author at a meet up, and get a signed copy of the book. A book is not just words on pages.

  12. Finally, if the cost of producing something, whether it is art, literature or clothing, gets closer and closer to nothing, then there’s little incentive for customers to want to pay you good money for whatever you are offering. Your offerings are a McDonald’s Happy Meal at this point, or worse. The world’s economy can’t be made up entirely of Happy Meal and fast fashion equivalents. Every sector depends on diversification of goods and services, from the high end and artisanal to the low end mass produced.

I end this blog entry with a video of a delightful lady who runs one of Tokyo’s many popular food joints ‘Onigiri Bongo’. Japan already has a few restaurants with robot staff (they are gimmicks), but a robot cannot make a thousand onigiri a day without health and safety hazards and causing a mess, cannot build a rapport with customers and cannot make customers wait in a line outside for an hour every day. Connections, traditions and craft are important.

When I started my first novel Scrivener hadn’t been released yet.

Writing this science fiction novel took me 18 years of reading and research. Scrivener came out after I began working on it, and over the years it was so helpful and indispensable for managing all the notes and ideas.

Sometimes I would take a hiatus to read and research other things. Many ideas and scenes were revised or scrapped during those years but the central theme remained constant. I not only wanted the story to be ahead of its time but also contemporary enough to be relatable, so my bookmarks and notes kept growing and growing.

Finally I decided there was nothing left to study. The novel will be finished this summer. There will also be concept designs and artwork to accompany it.

Thanks to Keith of Literature & Latte for helping me stay organised for so long.

Recovered my old film school DV tape from 2001

Today I managed to finally capture my old film school DV tape. The tape had travelled with me for almost 20 years, from flat to flat and from country to country. I thought it wouldn’t have survived after so long. Tape degrades.

I was about to capture the tape in early 2020 but then covid came along and delayed those plans. I didn’t want to ask someone to capture it for me. I really wanted to enjoy the process of capturing tape just like I did when I was young. In fact, the first time I ever captured video, computers weren’t powerful enough to transfer live video from a camera; they needed a special Targa capture card to import each frame individually as a Targa sequence.

Finally covid subsided and after keeping an eye on eBay for a long time I found a Canon XM2 in excellent ‘almost new’ condition at a great price. The short film itself was filmed on the XM2’s big brother the XL1S, but the cameras are very similar internally. Video capture is somewhat similar to film scanning. You grab a coffee, set up the equipment, and then diligently perform the job of transferring media manually into the computer.

Connecting the XM2 for capture proved tricky. First, I had to use an old Mac with FireWire. Second, Adobe Premiere stopped supporting miniDV capture a few years ago and there was no way to install a version of Premiere old enough to still support it. QuickTime does still allow FireWire capture, but I discovered that the start of the tape had degraded from exposure to air and heat. Because of the damage to the tape, QuickTime was unable to capture video and audio together, but it could capture the streams separately!
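
Re-muxing the two separately captured streams back into one file is straightforward with ffmpeg. A minimal sketch, assuming the captures were saved as capture_video.mov and capture_audio.aiff (placeholder names for whatever QuickTime produced):

```python
# Minimal sketch: re-mux separately captured video and audio streams with ffmpeg.
# "capture_video.mov" and "capture_audio.aiff" are placeholder filenames.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "capture_video.mov",   # video-only capture
    "-i", "capture_audio.aiff",  # audio-only capture
    "-map", "0:v:0",             # take video from the first input
    "-map", "1:a:0",             # take audio from the second input
    "-c", "copy",                # no re-encoding, keep the DV stream intact
    "dv_capture_combined.mov",
], check=True)
```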

Capture done, you can see how little detail and resolution we worked with in those days. Imagine if we had 4K or 6K HDR cameras at the time! 🤯 The colours produced by Canon’s 3CCD system were great though. There is hardly any grading applied to the images below, in some scenes none at all.

Screenshots from ‘花’ (‘Hana’) a short film in Japanese that I wrote, shot and directed in film school back in 2001 starring my friend Ryoko who was a news reporter on Nihon TV at the time.

I made a vertical trailer for social media which can be watched below.



Getting it right requires time...and feedback

Even though I've been designing a completely new type of camera system that will be years ahead of anything that currently exists, my favourite camera will probably always be my Leica M3, a custom version with a genuine Italian rosewood body. Whenever I take it to camera shops for servicing it always receives the same compliment: 'It's a unique piece!'

Leica M3 (photo taken with an iPhone :p)

I've used just about every type of camera over the years, but what attracted me to the M3 was that Leica put over a decade of research into it because they wanted to make sure they got the M series just right with the very first release (an extremely rare feat for any device). 

Over that decade, Leica frequently interacted with customers to help them design the M3. Users wanted it to be streamlined and ergonomic compared to the irksome and intimidating Leica III. Because the principal market was street photography, the M3 also had to feel second nature to users so that they could quickly capture moments around them. It also had to be easy to repair and recycle.

It was one of the earliest examples of a company asking for global customer feedback and beta testing the hell out of the product. The incredible results of that collaboration haven't been replicated so well since. Evidence of that can be seen in the fact that people still enjoy using the M3 65 years later.

Only one change was made during the M3's life-cycle - a change from a double stroke to single stroke advance lever (both equally useful). As time went by, Leica added a few more bells and whistles to the M series that weren't possible in the early 1950s, and sometimes they made mistakes in doing so, but the tradition of keeping their renowned product line pure still exists today in the M10.