Transcripts in Apple Podcasts

With the iOS 17.4 update, Apple’s Podcasts app can now generate transcripts of podcast episodes. This is great news. For years, people have asked me to add transcripts to Mac Power Users and my other shows, but the problem has always been that it was cost-prohibitive. With the explosion of artificial intelligence over the last year or two, that is no longer the case. Better still, the feature is built into the app, so we don’t even need to produce the transcripts ourselves.

iPad in landscape mode showing the Podcasts app from Apple. The episode shown is from the Mac Power Users podcast, entitled “I Got to Be the Hero.” You can see the artwork and the play controls on the left, and the new live transcription feature on the right, with some text highlighted at the top.

A couple of nice touches: the transcript is searchable, and tapping a spot in the transcript jumps the audio to that point.

This is a really nice update to Podcasts. Is it going to be enough to pull me away from Overcast? Probably not. But I’m at least going to take a serious look.

Apple Licensing Data for its AI Training

The New York Times reports that Apple is in negotiations to license published materials for training its generative AI models. This shouldn’t be a surprise. A few years ago, when image processing was the big thing, everyone thought Apple would fall behind because it wasn’t collecting all our images for data processing. Then I saw Craig Federighi explain how Apple could get pictures of mountains without needing mine.

Machine learning similarly requires a data set for training, and again, Apple is looking to buy that data rather than setting its AI loose on the Internet. I really wish I had a better idea of what Apple plans to do with AI.

A Different Take on Apple and AI

William Gallagher is a pretty clever guy, and I enjoyed his take on Apple and AI over at AppleInsider. Based on Apple’s latest paper, the company seems (unsurprisingly) interested in ways to run Large Language Models (LLMs) on memory-constrained local devices. In other words, AI without the cloud. We saw this a few years ago with image processing. Apple wants to have the tools while preserving user privacy. Just from speaking to Labs members in privacy-conscious businesses, I expect this will be very popular if it works.

Sam Altman’s Return to OpenAI

It was quite the week over at the OpenAI Office. I’m sure someone will write a book about it at some point. From the outside, it looked like another example of the conflicting priorities that always result when a nonprofit owns a for-profit company. Regardless, those priorities got sorted out this week.

My only other comment on this is the irony that OpenAI is the company making the thing that many fear will replace their jobs. Yet, when push came to shove, OpenAI’s biggest concern was keeping their humans, not their robots.

Is AI Apple’s Siri Moonshot?

The Information has an article by Wayne Ma reporting that Apple is spending “millions of dollars a day” on artificial intelligence initiatives. The article is paywalled, but The Verge summarizes it nicely.

Apple has multiple teams working on different AI initiatives throughout the company, including Large Language Models (LLMs), image generation, and multi-modal AI, which can recognize and produce “images or video as well as text”.

The Information article reports that Apple’s Ajax GPT was trained on more than 200 billion parameters and is more potent than GPT-3.5.

I have a few points on this.

First, this should be no surprise.

I’m sure folks will start writing about how Apple is now desperately playing catch-up. However, I’ve seen no evidence that Apple got caught with its pants down on AI. They’ve been working on Artificial Intelligence for years. Apple’s head of AI, John Giannandrea, came from Google, and he’s been with Apple for years. You’d think that people would know by now that just because Apple doesn’t talk about things doesn’t mean they are not working on things.

Second, this should dovetail into Siri and Apple Automation.

If I were calling the shots at Apple, I’d have the Siri, Shortcuts, and AI teams all share the same workspace in Apple Park. Thus far, AI has been smoke and mirrors for most people. If Apple implemented it in a way that directly impacted our lives, people would notice.

Shortcuts, with its Actions, gives Apple an easy way to pull this off. Example: You leave 20 minutes late for work. When you connect to CarPlay, Siri asks, “I see you are running late for work. Do you want me to text Tom?” That seems doable with an AI and Shortcuts. The trick would be for it to self-generate: it shouldn’t require me to already have an “I’m running late” shortcut. It should build one dynamically as needed. As reported by 9to5Mac, Apple wants to incorporate language models to generate automated tasks.
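None of this exists yet, and Shortcuts actions aren’t actually scriptable this way, but the shape of such a self-generating rule is easy to sketch. Here’s a purely hypothetical Python illustration of the “running late” trigger, where the names and signals (usual departure time, CarPlay connection) are all assumptions for the sake of the example:

```python
from datetime import datetime, time

def suggest_action(now: datetime, usual_departure: time, connected_to_carplay: bool):
    """Return a suggested Siri prompt if the user appears to be running late.

    A real system would pull these signals from calendar, location, and
    CarPlay-connection state; here they are passed in directly.
    """
    minutes_late = (now.hour * 60 + now.minute) - (
        usual_departure.hour * 60 + usual_departure.minute
    )
    if connected_to_carplay and minutes_late >= 15:
        return (
            f"I see you are running about {minutes_late} minutes late for work. "
            "Do you want me to text Tom?"
        )
    return None  # nothing unusual, no prompt

# Left at 8:20 when the usual departure is 8:00, then connected to CarPlay:
print(suggest_action(datetime(2023, 12, 22, 8, 20), time(8, 0), True))
```

The point of the sketch is the last step: the assistant composes the “shortcut” on the fly from context, rather than requiring the user to have built one in advance.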

Similarly, this technology could result in a massive improvement to Siri if done right. Back in reality, however, Siri still fumbles simple requests routinely. There hasn’t been the kind of improvement that users (myself included) want. Could it be that all this behind-the-scenes AI research is Apple’s ultimate answer on improving Siri? I sure hope so.

My Transcription Workflow for the Obsidian Field Guide (MacSparky Labs)

In this video, I demonstrate how I used two AI tools, MacWhisper and ChatGPT, to generate transcripts and SubRip text (SRT) files for the Obsidian Field Guide videos…
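The SRT format itself is simple: numbered cues, each with a start/end timestamp line and the cue text. As a rough illustration (not the workflow from the video, and the sample segments are invented), a Whisper-style transcript with per-segment timestamps can be turned into SRT with a few lines of Python:

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def transcript_to_srt(segments) -> str:
    """segments: list of (start_seconds, end_seconds, text) tuples."""
    cues = []
    for i, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{i}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)  # blank line separates consecutive cues

# Invented example segments, shaped like transcription-tool output:
segments = [
    (0.0, 3.5, "Welcome to the Obsidian Field Guide."),
    (3.5, 7.25, "Let's start with the vault structure."),
]
print(transcript_to_srt(segments))
```

Tools like MacWhisper can export SRT directly; a sketch like this mainly helps when you need to massage timestamps or re-segment cleaned-up text.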

This is a post for MacSparky Labs Level 3 (Early Access) and Level 2 (Backstage) Members only. Care to join? Or perhaps do you need to sign in?

Specific vs. General Artificial Intelligence

The most recent episode of the Ezra Klein podcast includes an interview with Demis Hassabis, the head of Google’s DeepMind, whose AlphaFold project used artificial intelligence to predict the shapes of proteins, work that is essential for addressing numerous genetic diseases, for drug development, and for vaccines.

Before AlphaFold, human scientists had solved the structures of around 150,000 proteins after decades of work. Once AlphaFold got rolling, it solved 200 million protein structures, nearly every known protein, in about a year.

I enjoyed the interview because it focused on Artificial Intelligence to solve specific problems (like protein folds) instead of one all-knowing AI that can do anything. At some point in the future, a more generic AI will be useful, but for now, these smaller specific AI projects seem the best path. They can help us solve complex problems while at the same time being constrained to just those problems while we humans figure out the big-picture implications of artificial intelligence.