Issue #42 – I’ll watch the TV news only if Charles Darwin presents

December 30, 2023

Free Edition

Generated with Stable Diffusion XL and ComfyUI

In This Issue

Intro
- New Year’s Resolutions.
What Caught My Attention This Week
- Channel 1 previews the future of news broadcasting, and its implications.
- Microsoft admits that AI will displace jobs and it’s working with US labor unions to address the impact.
- Eminent US cardiologist Eric Topol shows how AI is beating cardiologists and radiologists at diagnosing illness from almost invisible clues.
- Terence Tao, prodigy professor of Mathematics at UCLA, believes AI could help mathematicians discover new theorems.
The Way We Work Now
- A group of artists uses MusicGen, an AI trained by Meta, to create an entire album in 24 hours.
How Do You Feel?
- The head of the UK judiciary is not too happy with the growing use of AI among judges.
Putting Lipstick on a Pig
- It turns out that Presto Automation is using human labor instead of AI to process 70% of fast-food drive-thru orders.

Intro

Aaaaaaaaaand we are back! This is the last issue of the year.

I know you missed reading 10 pages of a dense newsletter for two weeks in a row, and the anticipation of reading this new issue the morning of New Year’s Eve gave you goosebumps.

Before we start, just one thing.

During this 2023, I met a lot of professionals in the most disparate fields, at every level of seniority. And I’ve seen the tasks they perform in their day-to-day.

In most cases, these professionals would have a non-insignificant, tangible benefit from using GTP-4 in almost every one of those tasks. Yet, they don’t use it.

These professionals are curious about AI, and understood the potential of large language models in principle when we discussed the topic. But they didn’t act on it. Even after a crystal clear demonstration of the benefits of using GTP-4 in their work, they returned to their habits, doing things the way they always did.

The productivity gain they are leaving on the table is huge. The burden they insist on carrying is unnecessary.

None of this is surprising. Why people don’t change their habits has been a topic of study for decades and this behavior is well understood and documented.

I have read so much about this topic that, when working for my previous employer, a technology provider, I strenuously advocated that to effectively sell technology to a large organization, you must couple the solution with a change management program.

And because technology providers are not experts in change management, I pushed hard to partner with change management consulting firms to deliver whichever solution my former employer wanted to sell.

All of this to say the following.

At the end of every year, a lot of people make commitments to themselves. They want to change something in their lives.
Well, this year, don’t commit to going to the gym more often. Rather, commit to using GPT-4 every day for a month.

Using GPT-3.5-Turbo won’t do. Using Bard for sure won’t do (and you should stay away from that AI system as much as possible).
Committing to learning how to use GPT-4 won’t do either because it’s too abstract and you have no incentive to put time aside to learn.

Instead, spend $20 for a single month of ChatGPT Plus subscription and commit to trying to use GPT-4 for every task you perform at work. Every single task. Without any formal training on how to use it.

Just open the ChatGPT window and explain to GPT-4 what is that you need to do.

Some tasks will not be suitable as they require you to manipulate sensitive information. But for everything else, commit to trying to use GPT-4, without assuming that it’s not capable enough to do the job.

You will be surprised by the speed of execution you can achieve after just one week. Even if GPT-4’s output is not perfect. Even with the hallucinations.

Good luck with your commitment and happy new year!
Alessandro

What Caught My Attention This Week

The biggest story of the last two weeks is the announcement of an entirely AI-powered news broadcast channel called Channel 1.

If you haven’t watched the 22min video announcement yet, you really, really should:

https://youtu.be/v4MzO6mzdd4

What we are seeing here is a series of synthetic news anchors delivering news reports from across the globe. Which means that the organization behind this video has trained or fine-tuned:

A series of realistic virtual avatars, like the ones we have mentioned in the past issues of this newsletter, capable of accurate lip-sync in multiple languages.
If you have read Synthetic Work long enough, you know that the technology to achieve this is still very immature and the task is extremely challenging. Even in this video, you can tell that the lip sync is not always 100% accurate.

Even more challenging is the fact that the company promises the capability to customize the aspect and personality of the news anchors. This is phenomenally hard to achieve credibly and I’d remain skeptical until I see it in action.

You can tell that there’s a remarkable difference in terms of level of realism between the primary three news anchors in the video and the secondary ones you see from minute 12:20. I would expect the latter to be the ones that the company is capable of offering today for customization.

Regardless, of all the technologies required to build a synthetic news channel, this is the one that is the most difficult to implement.

An AI model capable of translating and lip-syncing the voice of a person appearing in footage captured by ground reporters, from one language to another, without constraints.
Everything I said above applies here. Phenomenally hard to achieve credibly. Hard to believe until we see it in action.

A series of extremely realistic synthetic voices, featuring emotional inflections.
For a project like this you won’t use the stock voices offered by OpenAI, as you don’t want to have competitors replicating your exact anchor. And, probably, you won’t use the voices of any other AI startup, even if they are customizable, due to the extraordinary costs of speaking text 24/7.

More likely, you would fine-tuned open access voices like the ones I used for Fake Show. But even so, the emotional inflections we heard in the videos are exceptionally hard to obtain and come at a very high computation cost.

An AI model capable of summarizing any input text (videos from on-the-ground reporters, PR announcements, tweets, blog posts, human-written news articles, etc.) in the style of a TV news report, in any language.
This is very easy to do and you can see an example resembling this approach by looking at the Breaking News section of Synthetic Work.

It can be done using a model like GPT-4 or, with enough fine-tuning, an open access model like LLaMA 2 or Mistral.

An AI model capable of generating images and videos to support the news being reported.
Generating high-quality images is not very difficult, given the maturity of AI systems like Midjourney 6, but it’s expensive. So, we’ll have to see the frequency these images will be used in the news reports.

Generating high-quality videos is still very far from a reality, but we are seeing the first steps in this direction. It might take another 2-3 years before generated videos will reach the same maturity we are seeing today with images.

Generating charts, even low-quality ones, remains impossible today. So, here the company is betting on a breakthrough in the next few months.

Let’s now talk about the impact on jobs, as this is and will always remain the key focus of Synthetic Work.

The company behind the video promises to rely exclusively on authoritative, trusted news sources. Which means that they will have to sign licensing agreements with the likes of Reuters, AP, etc.

Assuming this is true, what they are cutting out of the equation is the human news anchor, the team that takes care of the makeup and wardrobe of the news anchor, and part of the staff in the newsroom that editorializes the reports, produces the images and charts, and takes care of the video shooting and editing.

This is a non-insignificant number of jobs.

On top of this, unsurprisingly, the company behind this video is calling for citizen journalism, as the language barrier and the quality of the report on the ground can be both addressed with generative AI. And this means a much cheaper news-gathering process, as it’s much cheaper to pay a citizen journalist for a video than to send a professional reporter on the ground.

Will this project succeed? It will depend on the execution. Much of what’s in the video might be just a proof of concept impossible to scale to production before years.

Will a project like this succeed? Very likely.

As we said many times in this newsletter, the Broadcasting & Media industry is one of the most suitable for AI disruption. News is what people do and talk about. And that, today, is religiously documented by our interactions on social media networks.

As long as a company has direct access to those social media networks, there’s little need for broadcasting intermediaries.

Elon Musk constantly laments that news organizations have come to a point where they simply report what’s happening on X. You don’t have to take his word for it. Just look at what’s happening with the Ukraine-Russia war or the Israel-Palestine war.

And if you think that the audience will always choose a human news anchor over a synthetic one, you have to take into account the fact that these synthetic news anchors will be customizable in increasingly sophisticated ways, as technology matures.

People will be able to choose the gender, look, and personality of these news anchors wiping away any human counterpart that they don’t like for one reason or another.
People will be able to choose if they want to hear a more liberal or conservative version of the same news, effectively being told what they want to hear.
Eventually, we’ll get to a point where celebrities in every field will start licensing their virtual image and so, if people please, they might be able to hear the news from a synthetic version of Morgan Freeman.

News organizations might still have enormous value in the future, as long as they refocus all their efforts on investigative journalism, and there are enough people willing to pay for that. But for every other type of news reporting, it’s hard to see how people wouldn’t prefer a realistic synthetic news anchor with a bubbly personality who is available 24/7 and can speak any language.

Two final points about this Channel 1.

First. It’s entirely possible that this is an elaborate hoax, designed to raise awareness about an impending future in the hope that people will reject it before it damages the Broadcasting & Media industry. If that’s the case, I’d say that the organization behind this video has obtained the opposite result.

Second. If synthetic news broadcasting is really going to happen, and we have every reason to believe it will, independently of this specific project, we have to ask ourselves what happens when an organization can augment, tweak, or even fabricate events almost instantaneously, delivering highly personalized reports about them 24/7, in any language, across the planet.

An organization like that has the potential for exceptional advertising revenue. And that advertising revenue can influence news augmentation, tweaking, and fabrication in ways that are not in the interest of the public.

AI has the potential to impact not just jobs, but also the way we think and act as a society.

Microsoft admits that AI will displace jobs and it’s working with US labor unions to address the impact.

Jackie Davalos and Josh Eidelson, reporting for Bloomberg:

The software giant is forming an alliance with the American Federation of Labor and Congress of Industrial Organizations, which comprises 60 labor unions representing 12.5 million workers, according to a statement on Monday.

Under the partnership, Redmond, Washington-based Microsoft will provide labor leaders and workers with formal training on how artificial intelligence works. The education sessions will start in the winter of 2024. Microsoft will also begin gathering feedback from labor groups and will focus on unions and workers in “key selected sectors.”
…
At an event in Washington announcing the partnership, Microsoft President Brad Smith said the goal is to bring both groups to the table to “enhance” the way people work.

“I can’t sit here and say that AI will never displace a job. I don’t think that would be honest,” he said. “AI is well-designed to accelerate and eliminate some of the parts of people’s jobs that you might consider to be drudgery.”

“By working directly with labor leaders, we can help ensure that AI serves the country’s workers,” he said.

An unprecedented admission. And an interesting timing. One wonders if a delay of an entire year is simply due to the time necessary to set up the project (which is significant in scale), or if it’s designed to coincide with the launch of GPT-5.

If Microsoft is seriously concerned about the public backlash against a mighty powerful GPT-5, this is a smart preemptive move.

The eminent American cardiologist Eric Topol shows how AI is beating human doctors at diagnosing illness from almost invisible clues.

Synthetic Work covered some of the studies mentioned in the video in the past. So, to long-term readers of this newsletter, this growing body of evidence won’t be surprising.

What’s surprising is the fact that a top cardiologist openly talks about how ChatGPT correctly diagnosed illnesses of users who simply described their symptoms in plain English after being treated unsuccessfully by human doctors for months or years. And he does that from one of the most influential stages in the world.

A lot of things could go wrong when you seek medical advice from a large language model. Especially if you don’t understand how the technology works. Whoever acknowledges ChatGPT’s successful diagnoses in public puts his/her reputation at risk, and exposes him/herself to mockery and criticism.

Dr. Topol is taking this risk to raise awareness and show the way forward, even if that means admitting that machines are becoming better than humans. Or, as I prefer to say, admitting that hardware machines are becoming better than wetware machines.

In every industry, people like Topol will have to step up and take the same risk.

Terence Tao, prodigy professor of Mathematics at UCLA believes AI could help mathematicians discover new theorems.

From the essay Tao published on Microsoft’s blog called AI Anthology:

Current large language models (LLM) can often persuasively mimic correct expert response in a given knowledge domain (such as my own, research mathematics). But as is infamously known, the response often consists of nonsense when inspected closely. Both humans and AI need to develop skills to analyze this new type of text. The stylistic signals that I traditionally rely on to “smell out” a hopelessly incorrect math argument are of little use with LLM-generated mathematics. Only line-by-line reading can discern if there is any substance.

Strangely, even nonsensical LLM-generated math often references relevant concepts. With effort, human experts can modify ideas that do not work as presented into a correct and original argument.

The 2023-level AI can already generate suggestive hints and promising leads to a working mathematician and participate actively in the decision-making process.

When integrated with tools such as formal proof verifiers, internet search, and symbolic math packages, I expect, say, 2026-level AI, when used properly, will be a trustworthy co-author in mathematical research, and in many other fields as well.

Then what? That depends not just on the technology, but on how existing human institutions and practices adapt. How will research journals change their publishing and referencing practices when entry-level math papers for AI-guided graduate students can now be generated in less than a day—and with the far better accuracy of future AI tools? How will our approach to graduate education change? Will we actively encourage and train our students to use these tools?

We are largely unprepared to address these questions. There will be shocking demonstrations of AI-assisted achievement and courageous experiments to incorporate them into our professional structures. But there will also be embarrassing mistakes, controversies, painful disruptions, heated debates, and hasty decisions.

Our usual technology paradigms will not serve as an adequate guide for navigating these uncharted waters. Perhaps the greatest challenge will be transitioning to a new AI-assisted world as safely, wisely, and equitably as possible.

The Way We Work Now

A section dedicated to all the ways AI is changing how we do things, the new jobs it’s creating, or the old job it's making obsolete.

This is the material that will be greatly expanded in the Splendid Edition of the newsletter.

A group of artists, from New York and Oakland, uses (seemingly for the first time) MusicGen, an AI trained by Meta, to create an entire album in 24 hours.

The process is captured in a two-part documentary from Slouchy Media.

The work in New York:

The work in Oakland:

The full album is here.

The MusicGen model the artists used is here (keep in mind that this is a demo and not the real product that Meta is working on).

Will any of these songs win a Grammy? Maybe not, but remember that such a conversation is already taking place about Whiplash, the song with deepfake voices by Travis Scott and 21 Savage released on TikTok by the anonymous producer Ghostwriter. We talked about this in Issue #28 – Can’t Kill a Ghost.

What matters is how AI influenced the creative process of these artists and their reactions. And how aspiring artists all around the world will feel about this unprecedented opportunity to express themselves.

If you have a kid who is interested in music production, show them these videos and see what they think.

The impact of artificial intelligence on the music industry will be irreversible.

How Do You Feel?

This section is dedicated to the psychological impact of artificial intelligence on people. You might think that this has no relevance to the changing nature of human labour, but it does. Oh, if it does!

For any new technology to be successfully adopted in a work environment or by society, people must feel good about it (before, during, and after its use). No business rollout plan will ever be successful before taking this into account.

The head of the UK judiciary is not too happy with the growing use of AI among judges.

Alistair Gray, reporting for the Financial Times:

Senior judges have warned the judiciary in England and Wales to restrict their use of artificial intelligence in conducting legal research and to avoid divulging information about cases to online chatbots.

Official guidance published on Tuesday for magistrates, tribunal panel members and judges highlighted the risk that AI tools would make factual errors or draw on law from foreign jurisdictions if asked to help with cases.
…
The guidance made clear that judges might find AI useful for some administrative or repetitive tasks. But its use for legal research was “not recommended”, except to remind judges of material about which they were already familiar.
…
Lord Justice Birss, deputy head of civil justice, said it might be possible to use AI to help judges to determine provisional assessments of costs — a data-heavy and time-consuming task.

This story gives us an insight into the adoption trends of AI in the UK legal system. We already discussed this topic on multiple occasions, but you’ll read an interesting new twist in this week’s Splendid Edition of Synthetic Work.

The worst thing that could happen is that the public loses trust in the judiciary because magistrates have become too reliant on an imperfect generative AI for ruling that can have a significant impact on people’s lives.

Putting Lipstick on a Pig

This section of the newsletter is dedicated to AI tools and services that help people and things pretend to be who/what they are not. Another title for this section could be Cutting Corners.

Normally, we reserve this section of the newsletter for those who use AI to make something look better than it actually is. Where that something can be a product, a service, or us.

This week we talk about a different type of pig: the AI itself.

Daniela Sirtori-Cortina, reporting for Bloomberg:

Presto Automation Inc. pitched a restaurant industry desperate to combat rising wages on a talking chatbot that could take orders with almost no human intervention. The firm touted OpenAI’s Sam Altman as an early investor. And it has used the firm’s technology to improve its system as it aims to triple deployments to 1,200 locations next year.

But disclosures in recent filings with the US Securities and Exchange Commission and changes to marketing suggest that the technology is less autonomous than it first appeared. The company, which went public last year, now says “off-site agents” working in locales such as the Philippines help during more than 70% of customer interactions to make sure its AI system doesn’t mess up.
…
The company was notified by the SEC in July that it’s under investigation over disclosures made “regarding certain aspects of its AI technology.” Meanwhile, Rajat Suri, its founder and chief executive officer, left in March and was replaced internally by Xavier Casanova. The new boss has implemented several changes, including laying off 17% of the company’s staff and disclosing the human agents.
…
“Our human agents enter, review, validate and correct orders,” Casanova, who was promoted to Presto’s CEO from chief operating officer, told investors on a recent earnings call. “Human agents will always play a role in ensuring order accuracy.”
…
One issue is that it’s difficult for AI chatbots to discern the variety of accents, speech patterns and noises they encounter at a drive-thru, including a car’s radio or traffic sounds. Orders could also be too complicated for AI to decipher.

When AI trips up or guests ask to speak to a person, many providers of this technology alert a worker at the restaurant, who takes over and talks to the customer directly to finish the transaction. But unlike some competitors, Presto also employs people off-site who check the chatbot to minimize the number of times guests get handed off to a staffer in the kitchen.

Valyant AI said it used to employ a similar technique but stopped earlier this year.

See? It’s not true that AI takes away jobs from humans!

We talked about Presto Automation in Issue #17 – How to prompt to buy a burger, when we discussed the various AI technologies that the fast-food chain White Castle is using for its drive-thru.

Breaking AI News

Older news

Want More? Read the Splendid Edition

This week’s Splendid Edition is titled Say hi to your future synthetic financial analyst.

In it:

Intro
- Happy New Year.
What’s AI Doing for Companies Like Mine?
- Learn what Allen & Overy, Addleshaw Goddard, Travers Smith, Macfarlanesand, H&R Block, and Hiscox are doing with AI.
A Chart to Look Smart
- How AI for financial management is being pushed by CFOs in midsized companies.
Prompting
- 26 tested recommendations to improve your prompting with GPT-3.5-Turbo, GPT-4, and LLaMA 2
The Tools of the Trade
- OpenBB Terminal, the open source alternative to Bloomberg Terminal, is now powered by AI