Want to wade into the snowy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you'll near-instantly regret.
Any awful.systems sub may be subsneered in this subthread, techtakes or no.
If your sneer seems higher quality than you thought, feel free to cut'n'paste it into its own post; there's no quota for posting and the bar really isn't that high.
The post-Xitter web has spawned soo many "esoteric" right-wing freaks, but there's no appropriate sneer-space for them. I'm talking redscare-ish, reality-challenged "culture critics" who write about everything but understand nothing. I'm talking about reply-guys who make the same 6 tweets about the same 3 subjects. They're inescapable at this point, yet I don't see them mocked (as much as they should be).
Like, there was one dude a while back who insisted that women couldn't be surgeons because they didn't believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can't escape them, I would love to sneer at them.
(Credit and/or blame to David Gerard for starting this. Merry Christmas, happy Hanukkah, and happy holidays in general!)


https://www.windowscentral.com/microsoft/windows-11/my-goal-is-to-eliminate-every-line-of-c-and-c-from-microsoft-by-2030-microsoft-bets-on-ai-to-finally-modernize-windows
wow, *and* algorithms? i didn't think anyone had gotten that far
I suppose it was inevitable that the insufferable idiocy that software folk inflict on other fields would eventually be turned against their own kind.
https://xkcd.com/1831/
alt text
An xkcd comic.
Long-haired woman: Our field has been struggling with this problem for years!
Laptop-wielding techbro: Struggle no more! I'm here to solve it with algorithms!
Six months later:
Techbro: This is really hard.
Woman: You don't say.
Q: what kind of algorithms does an AI produce
A: the bubble sort
God damn that's good.
this made me cackle
very nice
Ah yes, I want to see how they eliminate C++ from the Windows Kernel (code notoriously so horrific it breaks and reshapes the minds of all who gaze upon it) with fucking "AI". I'm sure autoplag will do just fine among the skulls and bones of Those Who Came Before.
Before: You were eaten by a grue.
After: Oops, All Grues!
There's a lot going on here, but I started by trying to parse this sentence (assuming it wasn't barfed out by an LLM). I've become dissatisfied lately with my own writing being too redundancy-filled and overwrought, showing I'm probably too far out of practice at serious writing, but what is this future Microsoft Fellow even trying to describe here?
at scale
so, ever watched Godzilla? and then did a twofer with a zombie movie? I think that's essentially the plot here
so what you're saying is undead kaiju, at scale
They've since updated this to say it's just a research project and none of it will be going live. Pinky promise (ok, I added the pinky promise bit).
Not just pinkies, my friend, we are promising with all fingers, at scale!
All twelve fingers? Wow.
So maybe I'm just showing my lack of actual dev experience here, but isn't "making code modifications algorithmically at scale" kind of definitionally the opposite of good software engineering? Like, I'll grant that stuff is complicated, but if you're making the same or similar changes at some massive scale, doesn't that suggest that you could save time, energy, and mental effort by deduplicating somewhere?
This doesnāt directly answer your question but I guess I had a rant in me so I might as well post it. Oops.
It's possible to write tools that make point changes or incremental changes with targeted algorithms in a well-understood problem space: changes that are safe or probably safe, and that get reviewed by humans.
Stuff like turning raw pointers into smart pointers, reducing string copying, reducing certain classes of runtime crashes, etc. You can do a lot of stuff if you hand-code C++ AST transformations using the clang/LLVM tools.
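To make that concrete, here's a minimal sketch of the shape of such a tool: an AST-matcher pass that just flags raw `new` expressions as candidates for a `std::unique_ptr` migration. The matchers are real clang APIs, but the tooling entry points shift between LLVM versions, so treat this as illustrative rather than copy-pasteable.

```cpp
// Minimal clang-tooling sketch: warn on raw `new` expressions so a human
// can consider std::make_unique. Illustrative; details vary by LLVM version.
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/Support/CommandLine.h"

using namespace clang;
using namespace clang::ast_matchers;
using namespace clang::tooling;

static llvm::cl::OptionCategory ToolCategory("raw-new-finder");

class RawNewReporter : public MatchFinder::MatchCallback {
public:
  void run(const MatchFinder::MatchResult &Result) override {
    if (const auto *New = Result.Nodes.getNodeAs<CXXNewExpr>("rawNew")) {
      DiagnosticsEngine &DE = Result.Context->getDiagnostics();
      unsigned ID = DE.getCustomDiagID(DiagnosticsEngine::Warning,
                                       "raw 'new'; consider std::make_unique");
      DE.Report(New->getBeginLoc(), ID);
    }
  }
};

int main(int argc, const char **argv) {
  auto Options = CommonOptionsParser::create(argc, argv, ToolCategory);
  if (!Options) return 1;
  ClangTool Tool(Options->getCompilations(), Options->getSourcePathList());

  RawNewReporter Reporter;
  MatchFinder Finder;
  // Match every `new`, skipping placement new (usually not an owning allocation).
  Finder.addMatcher(
      cxxNewExpr(unless(hasAnyPlacementArg(anything()))).bind("rawNew"),
      &Reporter);

  return Tool.run(newFrontendActionFactory(&Finder).get());
}
```

A real migration would attach FixItHints and run over a whole compilation database, but the point stands: the transformation is explicit, deterministic, and reviewable.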
Of course "let's eliminate 100% of our C code with a chatbot" is… a whole other ballgame and sounds completely infeasible except in the happiest of happy paths.
In my experience even simple LLM changes are wrong somewhere around half the time, often in disturbingly subtle ways that take an expert to spot. Also in my experience, people reviewing LLM code tend to just rubber-stamp it. Multiply that across thousands of changes and it's a recipe for disaster.
And what about third-party libraries? Corporate code bases are built on mountains of MIT-licensed C and C++ code, but surely those won't all switch languages. Which means they'll have a bunch of leaf code in C++ and will either need a C++-compatible target language or have to call all the C++ code via subprocesses, the C ABI, or cross-language wrappers. The former is fine in theory, but I'm not aware of any suitable languages today. The latter can have a huge impact on performance if too much data needs to be serialized and deserialized across that boundary.
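For the wrapper route, the usual shape is a flat `extern "C"` shim over the C++ code: opaque handles, no exceptions across the boundary, plain bytes for anything non-trivial. A rough sketch, with every name (`Engine`, `engine_*`) invented for illustration:

```cpp
// Rough sketch of a C ABI shim over existing C++ leaf code; all names invented.
#include <cstddef>
#include <cstdio>
#include <string>

namespace mylib {  // the existing C++ code being kept around
class Engine {
public:
  int process(const std::string &input) { return static_cast<int>(input.size()); }
};
}  // namespace mylib

extern "C" {
typedef struct EngineHandle EngineHandle;  // opaque handle for foreign callers

EngineHandle *engine_create(void) {
  try {
    return reinterpret_cast<EngineHandle *>(new mylib::Engine());
  } catch (...) {
    return nullptr;  // C++ exceptions must never cross the C ABI
  }
}

int engine_process(EngineHandle *h, const char *utf8, size_t len) {
  // Data crosses as plain bytes; anything richer gets serialized on one side
  // and deserialized on the other (the performance cost mentioned above).
  return reinterpret_cast<mylib::Engine *>(h)->process(std::string(utf8, len));
}

void engine_destroy(EngineHandle *h) {
  delete reinterpret_cast<mylib::Engine *>(h);
}
}  // extern "C"

int main() {  // tiny smoke test standing in for the foreign-language caller
  EngineHandle *h = engine_create();
  std::printf("%d\n", engine_process(h, "hello", 5));
  engine_destroy(h);
}
```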
Windows in particular also has decades of baked in behavior that programs depend on. Any change in those assumptions and whoops some of your favorite retro windows games donāt work anymore!
In the worst case they'd end up with a big pile of spaghetti that mostly works as it does today, but that introduces some extra bugs, is full of code that no one understands, and is completely impossible to change or maintain.
In the best case they're mainly using "AI" for marketing purposes, will try to achieve their goals using more or less conventional means, will ultimately fall short (hopefully not wreaking too much havoc in the process), and will give up halfway and declare the whole thing a glorious success.
Either way, any kind of large-scale rearchitecting that isn't seen through to the end will cause the codebase to have layers. There's the shiny new approach (never finished), the horrors that lie just beneath (also never finished), and the horrors that lie just beneath the horrors (probably written circa 2003). Any new employees start by being told about the shiny new parts. The company will keep a dwindling cohort of people in some dusty corner who have been around long enough to know how the decades of failed code-architecture attempts are duct-taped together.
I just want to add: sailor's reference to "expert" here is no joke. the amount of wild and subtle UB (undefined behaviour) you get in the C family is extremely high-knowledge stuff. it's the sort of stuff that has in recent years become fashionable to describe as "cursed", and often with good reason
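one classic specimen of the genre, as a sketch: signed integer overflow is UB, so the optimiser is allowed to assume it never happens and quietly delete what looks like a perfectly sensible defensive check

```cpp
// signed overflow is undefined behaviour, so a compiler may assume x + 1
// never wraps and legally fold this whole function to `return false` at -O2.
#include <climits>
#include <cstdio>

bool will_overflow(int x) {
  return x + 1 < x;  // looks defensive; is UB when x == INT_MAX
}

int main() {
  // may print 1 or 0 depending on compiler and optimisation level
  std::printf("%d\n", will_overflow(INT_MAX));
}
```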
LLMs being bad at precision and detail is as perfect an antithesis in that picture as I am capable of conceiving. so any thought of a project like this that pulls in LLMs (or, more broadly, any of the current generative family of nonsense) as a dependency in its implementation is just damn wild to me
(and just in case: this post is not an opportunity to quibble about PLT and about what may be or become possible.)
Some of the horrors are also going to be load-bearing for fixes people don't properly realize exist, because the space of computers that can run Windows is so vast.
I think something like that happened with Twitter: when Musk did his bull-in-a-china-shop impression on the stack, they cut out some code which millions of Indians on old phones needed to access the Twitter app.
The short answer is no. Outside of this context, I'd say the idea of "code modifications algorithmically at scale" is the intersection of code generation and code analysis, both of which are integral parts of modern development. That being said, using LLMs to perform large-scale refactors is stupid.
I think I'm with Haunted's intuition in that I don't really buy code generation. (As in automatic code generation.) My understanding was you build a thing that takes some config and poops out code that does certain behaviour. But could you not build a thing instead, that does the behaviour directly?
I know people who worked on a system like that, and maybe there are niches where it makes sense. It just seems like it was a software-architecture fad 20 years ago, and some systems are locked into that now. It doesn't seem like the pinnacle of engineering to me.
@jaschop
"But could you not build a thing instead, that does the behaviour directly?"
Back in the day NeXT's Interface Builder let you connect up and configure "live" UI objects, and then freeze-dry them to a file, which would be rehydrated at runtime to recreate those objects (or copies of them if you needed more).
Apple kept this for a while but doesn't really do it anymore. There were complications with version control, etc.
I have always felt like NeXT/OS X Interface Builder has serious "path not taken" energy, but the fact that OpenStep/Cocoa failed to become a generalized multiplatform API, as well as the version-control issues with the .nib format (never gave much thought to that, but it makes sense), sadly doomed it. And most mobile apps are glorified web pages, each with their own bespoke interface to maintain "brand identity," so it could be argued there's less than zero demand there for the flexibility (and complexity!) that Interface Builder could enable.
Unfortunately, the terms "code generation" and "automatic code generation" are too broad to make any sort of value judgment about their constituents. And I think evaluating software in terms of good or bad engineering is very context-dependent.
To speak to the ideas that have been brought up:
"making the same or similar changes at some massive scale […] suggest[s] that you could save time, energy and mental effort by deduplicating somewhere"
So there are many examples of this in real code bases, ranging from simple to complex changes.
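On the simple end, think of the classic `NULL`-to-`nullptr` migration that tools like clang-tidy's `modernize-use-nullptr` check perform mechanically across a whole tree. A toy before/after of the kind of change being stamped out thousands of times:

```cpp
// The kind of mechanical, AST-aware change a migration tool makes at scale.
#include <cstdio>

struct Widget { int id; };

int main() {
  Widget *w = nullptr;   // was: Widget *w = NULL;
  if (w == nullptr)      // was: if (w == NULL)
    std::puts("no widget yet");
}
```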
Giving a complex example here is… difficult. Anyway, I hope I've been able to illustrate that sometimes you have to use "code generation" because it's the right tool for the job.
"My understanding was you build a thing that takes some config and poops out code that does certain behaviour."
This hypothetical is a few degrees too abstract. It describes a compiler, for example, where the "config" is source code and the "code that does certain behaviour" is the resulting machine code. Yes, you can directly write machine code, but at that point you probably aren't doing software engineering at all.
I know that you probably don't mean a compiler. But unfortunately, it's compilers all the way down. Software is just layers upon layers of abstraction.
Here's an example: a web page. (NB: I am not a web dev and will get details wrong here.) You can write HTML and JavaScript by hand, but most of the time you don't. Instead, you rely on a web framework and templates to generate the HTML/JavaScript for you. I feel like that fits the config concept you're describing. In this case, the templates and framework (and common CSS between pages) double as de-duplication.
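As a toy sketch of that de-duplication (in C++ to match the rest of the thread, not anything a web dev would actually reach for): one loop stamps out the repeated markup instead of someone hand-writing N near-identical blocks.

```cpp
// Toy "template" generating repetitive markup from data: the loop is the
// de-duplication, standing in for what a real web framework does.
#include <cstdio>
#include <string>
#include <vector>

int main() {
  std::vector<std::string> items = {"alpha", "beta", "gamma"};
  std::puts("<ul>");
  for (const std::string &item : items)
    std::printf("  <li class=\"entry\">%s</li>\n", item.c_str());
  std::puts("</ul>");
}
```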
This is like the entire fucking genAI-for-coding discourse. Every time someone talks about LLMs in lieu of proper static analysis I'm just like… Yes, the things you say are of the shape of something real and useful. No, LLMs can't do it. Have you tried applying your efforts to something that isn't stupid?
Hmm, sounds like you are suggesting proper static analysis, at scale
If there's one thing that coding LLMs do "well", it's expose the need for code generation in our frameworks. All of the enterprise applications I have worked on in modernity were, by volume, mostly boilerplate and glue. If a statistically significant portion of a code base is boilerplate and glue, then the magical statistical machine will mirror that.
LLMs may simulate filling this need in some cases but of course are spitting out statistically mid code.
Unfortunately, committing engineering effort to write code that generates code in a reliable fashion doesn't really capture the imagination of money, or else we would be doing that instead of feeding GPUs shit and waiting for digital God to spring forth.
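For a concrete flavour of that kind of reliable code-that-writes-code: even something as crusty as the old X-macro trick generates the enum-and-string boilerplate from a single list, so the copies can't drift apart.

```cpp
// X-macro sketch: one list (COLOR_LIST) expands into both the enum and its
// string table, eliminating boilerplate that would otherwise go stale.
#include <cstdio>

#define COLOR_LIST(X) X(Red) X(Green) X(Blue)

enum class Color {
#define X(name) name,
  COLOR_LIST(X)
#undef X
};

const char *to_string(Color c) {
  switch (c) {
#define X(name) case Color::name: return #name;
    COLOR_LIST(X)
#undef X
  }
  return "unknown";
}

int main() { std::printf("%s\n", to_string(Color::Green)); }
```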
Throw in the rust evangelism and you have a techtakes turducken
If you want a warm and fuzzy Christmas contemplation, imagine turducken production at scale