METR, which runs the benchmark measuring how well models can complete long-duration tasks, found that Claude Mythos Preview ...
Plus: Meta shares fall as FT flags mega share sale ahead; SpaceX USD75b IPO already oversubscribed; Blowout jobs number ...