Co-Panelist 18:51
You can't on Facebook. It's impossible. The business is just too strong. And I also don't think Mark will be singled out by a regulatory hammer should it come down hard. If there's a data center ban, I don't think it's going to be uniquely focused on Meta. Only Dario, or only Dario and Sam, have true scapegoat risk because they run pure-play labs, have been so noisy about AI, and have the biggest revenues in the category. There might be data center regulation, but it won't unfairly target Meta. So anyway, that's my Social Reckoning take. Let's move on to Fable, which launched yesterday, and the model seems incredibly impressive. I've been seeing these vibe-coded games that look really, really good. Some of those vibe-coded games hit really well on social media because you take a video of them, you show the example of what it built, you share the time, and people just sort of believe it. It could be embellished, but in general these feel pretty solid. Of course, making a great game requires great mechanics and multiple levels, and a single demo of a forest is not quite there, but it's really, really useful, and I'm sure we'll see a bunch more examples of games as memes, simulators, and these things are going to get easier to build. That's really exciting. Of course, there was a bunch of debate on the timeline yesterday, and it's bleeding into today, around the latest model, Fable 5. It's the first Mythos-class model, and it seems remarkably good at long-horizon tasks like software development and knowledge work, but it rejects requests related to biology, cybersecurity, and frontier LLM development. Interestingly, I haven't seen anyone share rejections around anything else. Like, did they remember to reject 'build me a nuke'? Because I haven't seen anyone try that. And it'd be very funny if it was like, 'Oh yeah, we just didn't get around to that.'
Or there are so many other things, but I think a lot of those other queries that you should reject have been ironed out in previous iterations. So going back, saying a slur was a big one for a while or saying something rude or different, pausing a chat, basically just shutting down a conversation versus switching you to a less performant model.
Yeah. And it creates this screenshot that went viral pretty much continuously yesterday. And so this aligns with Anthropic's focus on safety, but as many people have pointed out, it's also just good business. You don't want competitors using your products to directly create competitors, and you also don't want financial liability or negative headlines from bad actors using your models for nefarious purposes. Ben Thompson called it true alignment. The 'take safety seriously' culture aligns with business value creation, which is very, very rare. Oftentimes the 'be good' culture limits what you can do and actually hurts your business, but it's something that you do in favor of brand. Like, Apple would probably be more profitable if they were using diesel generators for all of their data centers. They went clean energy because they wanted to have an environmental brand, and over the long term it's helped them, but in the short term it's been rough. Of course, it's expensive. So Ben Thompson writes, 'What is so fascinating about Anthropic, however, is that while I'm sure some executives of the company are thinking this way, I also totally believe that the employee base broadly also happen to believe that they are doing the right thing. It's fascinating to observe. Me, the rational business analyst, sees a hard-nosed but understandable decision to cut off would-be competitors. Anthropic employees and advocates, the true believers, see a regrettable but understandable safety decision that ensures that responsible and thoughtful people themselves will be the ones guiding our AI AGI future. This is true alignment and it's an incredible accomplishment.' Facebook has tussled with this a bunch. And we already talked about that. But, to be clear, the Fable 5 rejection threshold really does feel way too low from what people are saying. Tons of examples on the timeline of a biologist just saying hi to the model and getting kicked down to Opus.
I saw you shared someone just said 'cyber' with the devil horns, the purple devil horn emoji, and it's like we can't go further. Of course, when you bring down a broad hammer, you can always dial it back over time, but every rejection is this implicit invitation to hop on the phone with an Anthropic sales rep and get on the Mythos enterprise plan. And that's where the real dollars are, too. The timeline is unhappy because the idea of democratizing science, technology, all of this is very alluring, but the pool of dollars available from all the biohackers in the world probably isn't close to the budgets available from big pharma. And so you're again in this rational business analyst situation, and you fail to see how this is damaging, except to the hacker community. The real tricky part is how frontier AI research is handled. Instead of outright rejecting the query and bumping the user down to Opus, the model appears to answer but quietly gives a degraded answer. And this was disclosed in the model card, which is interesting. So this is, again, reasonable business, but it's not disclosed to the user in the product they're paying for, which is a different path than bio and cyber. So if you go 'LLM frontier research' devil horns, it will actually give you an answer, apparently. It won't bump you down immediately. But it doesn't disclose that, which is odd. Outright rejecting requests for AI research and just saying, 'Hey user, sorry, this model doesn't work for that type of project. Please use another model or contact sales if you want help with this,' would have been much more in line with the bio and cybersecurity strategies. And it's also possible that they just didn't need to disclose this at all. Like, they could have just released a model that was intentionally nerfed on AI research. It would have shown up in the benchmarks because people would have benchmarked it on some sort of AI research bench and been like, 'Oh, weird.
It's really good at all these other things, but it's bad at LLM research.' And maybe that would have been a bit of a brand hit, but users might never know that the model was intentionally degraded around this category of work. So that leaves this third, more worrying position: intentional degradation without disclosure in the model card. There's no evidence of this, but it's possible that other workflows might be nerfed, and there's no law or even convention around disclosure. Again, maybe good business, but a weird situation to be in. So probably bullish for evals if you're building a business on top of a big lab. You can imagine a legal AI company will want to be really sure that the models they're using aren't degrading unexpectedly and not telling them. It's different if you're like, 'Hey, I've been a bio researcher for a while. I'm using this and I know that this model was never intended for me.' But what you don't want is, 'I'm using it and now it's leading me astray in my work, and it's also not telling me that it's going to lead me astray,' which would be sort of an odd outcome that I think they'll probably address in the near future anyway.
Yeah, that aspect triggered Dean Ball. He said, 'My last observation re Anthropic secret sabotage safety policy is that it undermines actually good safety policy. How? First, it is very plausible to describe this as anti-competitive behavior. Even if you are maximally sympathetic to Anthropic here, you must admit this. And it is behavior being justified in the name of AI safety. If you believe, as I and many Anthropic staff do, that it may end up being critically important to relax antitrust enforcement so that the frontier labs can cooperate and collaborate on some areas of AI safety, Anthropic just undermined the case for that in a large way. Overall, this massively and profoundly raises the status of the argument that AI safety has been hyped to justify monopolistic behavior by labs. I continue to believe that AI safety is a real and serious issue that is growing in importance rather than diminishing. If you agree with me, this incident is a setback, maybe a serious one.' And third, he says, 'As I have observed elsewhere, Anthropic's official corporate policy is structurally identical to the fact pattern alleged against them by the Department of War. I still think DoW acted both falsely and wrongly in that fight, but it is no longer possible to defend Anthropic with a full throat after this incident. This raises the case for heavier-handed regulations. Anthropic is making an awfully good case here that their products ought to be treated as utilities and thus their alignment practices should be a matter of public policy rather than private property. I am starkly opposed to this sort of state power grab, but Anthropic is doing more to justify it than anyone else. Thus, significant damage has been done to a community and an entire approach to AI governance. It was done unilaterally by Anthropic, likely motivated largely by self-interest and justified within the internal psychology of the firm through the lens of safety.
I suspect this is fixable in the economic and legal senses, but I fear that trust has just been broken and the goodwill extinguished will take very much time to repair.' And just to level set, he wrote the AI Action Plan, but then also came out very publicly in support of Anthropic during the conflict with the Department of War, saying that the Department of War is completely overstepping by pushing towards a supply chain risk designation and putting pressure on Anthropic for not wanting to work with the government in that particular way. Obviously there's a whole bunch of new data that's been released and that conversation has evolved, but he doesn't strike me as some crazy hater.