Weeknotes 2024.18

This week I’m switching things around with the report on last week. I hope you enjoy the updates on AI’s 90% problem, federation technology’s moment in the spotlight, Apple’s AI-RAM conundrum, and plenty of Software Engineering and Retro Computing this time.

I’m still trying to find my style for these weeknotes, so please excuse the ever-changing format. This time I’m trying a more newsletter-like style, inspired by Molly White’s excellent newsletter posts.

Understanding AI’s 90% Problem

Speaking of Molly White, she was one of three people this week who, independently of each other, noted the same problem AI is facing. The others were Moxie Marlinspike and Martin Kleppmann. If three people of this calibre reach similar conclusions, I listen, and if you deal with an LLM-based product, you should too.

Moxie created an experimental app built on top of an LLM and shared his conclusions. He more or less coins what I call the “90% Problem”: generative AI gets you 90% of the way there, but the last 10% of the product becomes even harder, if not impossible, to figure out. If your target audience is OK with that, you may have found a viable business. If you mainly work in the last 10%, as Martin Kleppmann obviously does, LLMs are currently of little value.

Moxie then goes on to question the economic viability of LLM-based solutions, which is the second piece of the viability puzzle. Inference is expensive, in terms of both computation and energy.

This is where Molly’s great article comes in, and she brings it all together quite nicely: AI isn’t useless; it’s kind of useful, but does that justify the costs? That’s the fundamental question AI products need to answer, whether they are infrastructure (the LLMs themselves) or built upon LLMs.

Federation

Federated media and federated technologies are having a moment in the spotlight. Now that Threads has joined the fold of ActivityPub platforms, the technology is moving, ever so slowly, towards the common consumer. So far the best-known implementations are various decentralised social media apps, but the possible applications go far beyond that.

Ghost is now adopting ActivityPub, and it’s a big enough deal to deserve this microsite. I like the vision behind it. While I don’t use Ghost, I’ve been seriously tempted to try it, because they tick many positive boxes.

The New Stack discusses Federated Identity as the next puzzle piece that needs work and adoption. We're not there yet.

In my personal ranking, federated technologies rank far higher than AI or blockchain, and you don’t need data centres full of expensive GPUs to implement them.

Apple’s RAM problem with AI

How two seemingly disjointed stories can come together. Apple has been under scrutiny for the base RAM configuration of their devices for as long as I can remember, but the current 8GB base for Macs is definitely the biggest let-down in recent history. They will also need to find a way to make RAM upgrades cheaper if they don’t want to lose ground in the AI race[1]. MacRumors resurfaced some charts showing the memory bumps for Apple’s consumer desktops and laptops and concluded that the regular bumps ended with Tim Cook at the helm. I’m not sure he is the only factor.

So what does this have to do with AI? A lot. Those fancy LLMs need memory, and lots of it. You can bring any 8GB MacBook Air to its knees by running a recent LLM (if it even fits). Apple is also the prime candidate for on-device inference, so they want to run models locally, and they know that some of their current devices are a no-go due to memory constraints. Apple is one of the most active researchers in making LLMs run in memory-constrained environments and even released some compact open-source language models recently.
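
A quick back-of-the-envelope calculation shows why 8GB is so tight. This is only a rough rule of thumb (weights × bytes per parameter, ignoring the KV cache, activations, and everything else that wants RAM), not a statement about any specific Apple model:

```python
# Very rough rule of thumb: weight memory = parameter count x bytes per parameter.
# Ignores the KV cache, activations, and everything else competing for RAM.

def weight_memory_gib(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold the model weights, in GiB."""
    total_bytes = params_billions * 1e9 * (bits_per_param / 8)
    return total_bytes / (1024 ** 3)

for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: ~{weight_memory_gib(7, bits):.1f} GiB")

# ~13.0 GiB at 16-bit, ~6.5 GiB at 8-bit, ~3.3 GiB at 4-bit --
# even aggressively quantised, a mid-sized model leaves little headroom
# on an 8GB machine once the OS and your apps want their share.
```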

They might squeeze LLMs in this way, but in the medium to long term, I don’t see a way around some significant RAM increases.

Software Engineering

The first Swift Server Side Meetup took place and the recording is available on YouTube for everyone who couldn’t attend live, like yours truly 😕

Zed Decoded: Rope & SumTree is a fascinating blog post and discussion on the data structures that make editors work.

Gergely Orosz dives deep with Evan Morikawa on the challenges of scaling ChatGPT. Wonderfully nerdy, and a warning for upstarts trying to build the next ChatGPT.

SQLite seems to be the technology du jour. Turns out Bluesky is using SQLite for their backend as well. Every user has their own database. I guess this architecture is about to go mainstream, because it just makes so much sense for many applications.
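 
To make the pattern concrete, here is a minimal sketch of the database-per-user idea using Python’s built-in sqlite3 module. It is not Bluesky’s actual code; the paths, table, and function names are made up for illustration.

```python
# Minimal sketch of a "one SQLite database per user" layout -- not Bluesky's
# actual implementation. Each user gets their own file, so schema migrations,
# backups, and account deletion stay scoped to one small database.

import sqlite3
from pathlib import Path

DATA_DIR = Path("data/users")  # hypothetical storage location

def db_for_user(user_id: str) -> sqlite3.Connection:
    """Open (and lazily initialise) the database belonging to a single user."""
    DATA_DIR.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(DATA_DIR / f"{user_id}.sqlite3")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " id INTEGER PRIMARY KEY,"
        " created_at TEXT DEFAULT CURRENT_TIMESTAMP,"
        " body TEXT NOT NULL)"
    )
    return conn

# Usage: a request only ever touches the author's own database file.
with db_for_user("alice") as conn:
    conn.execute("INSERT INTO posts (body) VALUES (?)", ("hello, federation",))
```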

Hackers are using staggered roll-outs for ransomware, hitting countries with less sophisticated (read: cheaper) countermeasures first and refining the ransomware before moving on to more valuable targets. Staggered roll-outs are good practice; we’ve been preaching that for years.

Voyager is working again thanks to some clever software engineering. This makes me happy.

Retro Computing

Microsoft open-sources MS-DOS 4.0. Historically it’s probably one of the most interesting versions, as you could argue it was the starting point of a collaboration that led to both OS/2 and Windows NT. Unfortunately MT-DOS, as the multitasking DOS version was called, could not be found; the released copy of the source code is one IBM used exclusively.

A fascinating article about the Apple Jonathan, a concept from the 1980s that never shipped. The modularity is intriguing. It might be possible today, but it wasn’t in the 80s. It’s a little bit like a Framework desktop, with a similar market.

Yes, some concepts like the CRT look very NeXT to me 😂

And finally, an Ars retrospective on Palm OS and its devices. My Tungsten E was one of the best gadgets I’ve ever owned.


Until next week!
– JJ


  1. As mentioned before, this opinion is partially due to the AI race. If competitors figure out how to make their CPU/GPU/NPU combos 80-90% as good as Apple Silicon, they can easily move ahead in all but very specific benchmarks by making RAM (upgrades) financially more accessible. ↩︎