AI badly suffers from dyscalculia

Posted on February 9, 2025
Tags: Accessibility, AI, large language models, machine learning, text-to-speech

TL;DR: As my girlfriend succinctly sums it up: “AI systems are the first computers that really can’t deal with numbers.”

LLMs need a calculator tool

As we all learnt when the GPT hype started, LLMs actually need access to a calculator tool to do math with numbers correctly. A model might occasionally sum two 2-digit numbers without an error, but it is really a gamble. Stephen Wolfram more or less immediately appeared on YouTube, explaining how the Wolfram Language had been integrated with ChatGPT to solve this exact problem. A few months later, access to Python was more or less standard, and the problem went underground, unnoticed by most users. However, the fundamental issue remains: if your LLM doesn’t have access to a calculator tool, you can’t trust any number it will ever give you. Well, you can’t even trust the facts it is spitting at you, but that is a completely different topic.
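To make the idea concrete, here is a minimal sketch of what such a calculator tool could look like on the framework side: a single function the LLM can call with an arithmetic expression instead of guessing at digits itself. The function name and scope are my own invention for illustration; real integrations (Wolfram, sandboxed Python) are far more capable.

```python
import ast
import operator

# Operators we allow the model to use. Anything else is rejected,
# so the tool cannot be abused to run arbitrary code.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def calculate(expression: str) -> float:
    """Safely evaluate a plain arithmetic expression via the AST."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))
```

The point is not the implementation, but the division of labour: the model decides *when* to calculate, and deterministic code decides *what the result is*.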

OpenAI TTS

However, speech synthesis doesn’t just get numbers wrong; in some cases, it breaks the pronunciation of a multi-digit number completely, to the point where the number it is trying to say is no longer recognizable.

I first noticed this phenomenon when I tried the OpenAI speech API. I was amazed at the speech quality, no question. However, I also immediately noticed that this stuff is not usable for anything that might contain numbers of three or more digits. I basically ran a METAR weather report through gpt-3.5-turbo and passed the expansion to the OpenAI TTS. The LLM did a great job, but the TTS botched every number in subtle ways. I was shocked, to say the least, and immediately reminded of the Xerox compression algorithm bug debacle.
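One defensive workaround, sketched below under my own assumptions, is to never let the TTS normalize digits at all: spell every run of digits out digit by digit before synthesis. For METAR this is actually the conventional reading anyway ("wind 270" is read "two seven zero"). The `spell_digits` helper is hypothetical, not part of any OpenAI API.

```python
import re

# Names of the ten decimal digits, used for digit-wise spellout.
_DIGITS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}

def spell_digits(text: str) -> str:
    """Replace every run of digits with its spelled-out digit names,
    so the synthesizer never has to expand a multi-digit number itself."""
    return re.sub(
        r"\d+",
        lambda m: " ".join(_DIGITS[d] for d in m.group()),
        text,
    )
```

Pre-normalizing like this trades natural phrasing ("two hundred seventy") for robustness, which is exactly the right trade for safety-relevant data like weather reports.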

How can someone release a speech synthesizer that can’t say numbers correctly, in the 21st century, and get away with it without a huge public backlash? Do people really not care about correct data anymore?

iOS VoiceOver

The next time I noticed a similar effect was when I upgraded iOS to 14 or 15, I honestly don’t remember which exactly. This time, the phenomenon was even weirder.

I noticed it for the first time when I read a WhatsApp message that had been sent early in the day. It said “X:15”. But what I heard was the speech synthesizer replacing the 5 with V and the 1 with I. Yes, you probably already caught on.

Something in the training of whatever model is involved apparently picked up roman numerals and decided that it would be fun to randomly replace arabic digits with their roman counterparts. So don’t be surprised if the balance of your bank account is “478.0V”; that’s Apple trying to tell you that you and your requirements don’t matter.
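My best guess at the failure mode, reconstructed purely from what I heard and definitely not Apple's actual code path, is a per-digit substitution rather than any real numeral conversion. Two lines reproduce both observations:

```python
def mangle(text: str) -> str:
    """Hypothetical reproduction of the bug: replace the digits that have
    single-character roman counterparts, digit by digit, in place."""
    return text.replace("1", "I").replace("5", "V")
```

Run on my examples, `mangle("478.05")` gives `"478.0V"`, and `mangle("9:15")` gives `"9:IV"`, so a quarter past the hour is announced as roman four. True conversion would at least have produced XV.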

How is this possible?

Seriously, how is it possible that companies like Apple and OpenAI release software that ruins numbers when speaking them? Are we so deep into Idiocracy already that this goes unnoticed? Does nobody care about anything anymore these days? I mean, really. How can you train a TTS model and forget to check whether it reads numbers correctly? How can you maintain a screen reader and not notice that in your latest major release, arabic digits are being replaced with roman numerals? How can this stuff pass QA? IMO, this can only happen if there is actually no QA.