November ramble
AI coding assistants, teaching kids to code, and general updates and recommendations
I briefly tried mooting in school—a long and embarrassing story. One piece of advice I still remember is that it doesn’t matter how long you take to think before you talk, as any silence will be forgotten as focus shifts to the profundity of your remark. Of course, this assumes something about the effectiveness of one’s silent thinking. Did I already mention that it’s an embarrassing story?
… anyway, it’s been a while since the last of these newsletters! Instead of trying to sound profound enough to make up for the silence, I’m going to try a new format that should be easier both for me to write and for you to skim.
Thoughts on AI coding assistants
I generally prefer not to comment on software development practices, because of something I’ve observed often enough that it feels like a law: for every excellent engineer who swears by a particular practice, there’s an even better one who swears by the opposite. Some people couldn’t imagine coding without unit tests, or code review, or continuous integration, or step-through debugging, or [your preferred “best practice”]. Yet, there are people out there who do the exact opposite and outperform us all.
But people keep asking for my take on AI coding assistants, and with the caveat, again, that there are better programmers than me at both ends of this spectrum, here is my current practice.
For almost all meaningful tasks, I don’t use AI. I use my brain, pen and paper, a minimally configured text editor, and something to run my code. I don’t like something like Cursor for most tasks as it feels like a distraction, as if I’m pair programming with an intern with great recall but poor judgment. To compensate for my imperfect memory, I mostly stick to my ingrained habits of reading official docs, man pages and source code, or just running quick experiments.
I have enjoyed using Cursor—or more often just Claude via the website, since I don’t usually want a whole new editor—for smaller projects, like translating short programs into an unfamiliar language, writing throwaway personal tools, or learning a new library or domain, where I can ask a lot of questions and run faster experiments. I know that the “poor judgment” aspect of the LLM is still problematic here, and that I may have to update my mental models as I go, but I see that as an unavoidable part of learning anyway. I would far prefer an expert human pairing partner for such things, but they too will teach me imperfectly, and in any case there’s not always one around.
I’m also aware that I have some baggage, or “experience”, however you prefer to think about it. Some people are writing their first line of code today with LLM assistance; I wrote mine without an internet connection. Google came later, as did Stack Overflow, as did ubiquitous small open source projects. In each case I could make a conscious decision about the extent to which I would use the new thing—often substantially, but rarely as much as those who arrived after the new-to-me thing was commonplace.
My path isn’t optimal, it’s just the one I happened to take. For somebody learning today, it may well be the case that prudent use of an in-editor LLM for rapid Q&A will expedite their journey. Jeremy Howard’s new course How To Solve It With Code seems to be an earnest exploration of this possibility, so I look forward to working through it, and to seeing similar efforts.
Will LLMs substantially change the industry?
Software engineering tends to take its greatest strides from one major abstraction to another. These are innovations like the stored program, the compiler, the relational database system, and the process abstraction in operating systems. These dramatically changed the kind of work we could do—not to mention the constitution of the labor force—by packing a great deal of functionality behind a high-leverage interface that could still be pierced as required.
My favorite example is Fortran, or more broadly “high level” (higher than assembly) programming languages. When Fortran, COBOL and Lisp all showed up in the late 50s, it was clear that programming would never be the same again: programmers would no longer require an intimate understanding of the machine; they could think in terms of the high level language (“formulas” in the case of Fortran) and have that be compiled or interpreted. As early as 1954, Fortran’s lead designer John Backus imagined that this should virtually eliminate coding and debugging, and he wasn’t entirely wrong. Coding as he knew it is now extremely rare. Very few people write much assembly.
But these languages and their descendants were very thoughtfully designed: the compilers are understandable without strictly needing to be understood, and the disassembly can still be generated and stepped through one instruction at a time if necessary. These are phenomenal abstractions.
I would ask LLM hyperenthusiasts to question the degree to which LLMs are good abstractions, or can be incorporated into good abstractions in the future. Perhaps we end up with programming languages designed as LLM targets, where it’s easy to prompt engineer in front of this abstraction, and to debug deterministically behind it when needed. But these have not yet been invented, and it remains to be seen whether such a target language could be mostly ignored in a way comparable to a compiler’s output or intermediate representation.
If such abstractions don’t eventuate, LLMs may end up only as tools, with an impact comparable to IDEs or Stack Overflow, but well short of the hype.
Teach Yourself CS updates
I’m planning a major round of updates to teachyourselfcs.com around the end of the year, focused on clarifying context for topics and resources that come across as “prerequisites”, like SICP for programming, or mathematics for computer science more broadly. A major failure mode I see with those using the curriculum is that they hit a roadblock on a topic they see as necessary for one they’re actually interested in, and give up rather than either persevering or skipping ahead.
I’m also planning to refresh some of the specific suggestions. On my radar are Andy Pavlo’s intro databases course, John Denero’s Composing Programs for introductory programming, the Kurose and Ross lecture recordings for networking, and a few others. If you have a favorite resource not yet on Teach Yourself CS, or another suggestion, this is a good time to send it through!
I’d also like to add some more self-guided project suggestions for those not able to commit to CS Primer. This is the biggest failure mode I see with Teach Yourself CS: grinding through textbooks and video lectures without consolidating that knowledge with projects and exercises.
CS Primer updates
My content production work on CS Primer is progressing decently, although slower than I’d like as usual. Since the last newsletter I’ve cleaned up a number of existing courses, released an early version of Programming: Beyond the Basics, and am progressively releasing the Relational Databases course, where the major project sequence is to write your own from scratch!
Jason’s MLE club
Former Bradfield student Jason Benn has been running a sabbatical/study club for machine learning engineers, meeting in person in San Francisco. Check it out, or see the podcast episode with Jason below for a little context.
Book recommendations
My favorite non-fiction book I read since last writing was Failure is Not an Option by Gene Kranz, flight director for many of the Gemini and Apollo missions including Apollo 11 and Apollo 13. It’s a fascinating view of the first era of space exploration, best paired with a pilgrimage to see the Saturn V at Kennedy Space Center. I speak about this a little with Charlie in a podcast recording, linked below.
My favorite fiction book over this period was Project Hail Mary by Andy Weir, an extremely fun read, with much of the engineering napkin math style of The Martian, but also some quite interesting exploration of communication with an alien life form.
Some honorable mention recent reads:
Reentry, on the start of the (hopefully) second space age
The Wright Brothers, for clarity on how they actually managed to invent powered flight. Best paired with a visit to the National Air and Space Museum.
Metals in the Service of Man, covering just about the perfect scope of what I wanted to know about metallurgy
Podcast recordings
In case you need to fill the void, here are some podcast recordings since the last newsletter:
“When failure is not an option”, mentioned above
Teaching kids to code
Thorsten Ball asked me to share a little about how I’m teaching my kids to code. We are in the early days but so far I’ve really liked CodeMonkey for its thoughtful skill progression and balance between substance and kid-friendly whimsy. It starts with Lightbot-style icon-based programming puzzles (good for 3+ year olds) and progresses through Scratch-style block coding, to short puzzles in CoffeeScript (don’t laugh, it’s great for young and old) and Python.
With this basic competency in Scratch, my 5 year old was then able to make a few games, mostly riffing off those in the book Coding Games in Scratch, generally with me by her side to debug as she went. It may not seem like “real learning” to copy 80% of a program from a book, but you may be surprised how many coders of my vintage first learned to code by typing in QBASIC games from a magazine, or using “view source”.
At this point my 3 year old is dabbling in CodeMonkey puzzles, mostly motivated by her big sister. She may have only put in a few hours, but she does have a basic intuition for loops and procedures, in a way that’s typically achieved much more circuitously through the “computational thinking” offline games and activities pitched to this age range.
My 5 year old is currently doing simple Python programming, with a mix of the CodeMonkey challenges, the book Coding Projects in Python, and of course a little direct tutoring from dad. For anybody who has taught their kids Python, I’d appreciate any tips for knocking off the sharp edges while using real tools, so that we can keep progressing. At the moment we are using VS Code for a little linting and tab completion, but I haven’t thought much about configuring it for a kid.
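For concreteness, here’s a sketch of the kind of tiny program I have in mind at this stage—a named procedure, a loop, and an immediately visible result. The name and the “game” are made up for illustration, not taken from CodeMonkey or the book:

```python
# A hypothetical first "procedures + loops" program:
# build a cheer that grows longer with each line.

def cheer(name, times):
    """Return a cheer for `name` that repeats more on each line."""
    lines = []
    for i in range(times):
        # Line i has the cheer repeated (i + 1) times.
        lines.append(("Go " + name + "! ") * (i + 1))
    return "\n".join(lines)

print(cheer("Ada", 3))
```

The point is just the shape: small enough to type in, a parameter or two to tweak, and output that changes in an obvious way when you tweak them.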
Overall the core of my pedagogy is trying to find the right challenge for that learner at that time. It’s rarely a good idea to dump information on somebody—particularly a kid!—but it’s also problematic to just provide an open-ended “learning environment” and hope for a good feedback loop. So while some kids can do ok with information-heavy resources at one end or a constructionist playground like Scratch at the other, I think most are going to thrive on something in the middle: a sequence of challenges, or somewhat structured mini projects, or a mix.
Synthesis “AI tutor” is not bad
I mostly avoided this product because I can’t imagine a good “AI tutor” at our current level of AI capability. Thankfully it appears that the use of AI is well restrained, and the positioning is mostly for hype. In practice, there’s a fairly thoughtful sequence of mostly multiple choice prompts around interactive widgets, to teach topics like number systems and arithmetic. The prompts are somewhat responsive—mostly by way of decision trees, and sometimes with a text field allowing for natural language input. My 5 year old quite enjoys it, and is currently using it alongside her long-running favorites Beast Academy and Matific.
This is a good use of current capabilities in my view, and allows for a substantially better experience than less interactive substitutes like Khan Academy. It is of course no replacement for a good tutor, but again, it’s right there, and cheap, and can be done with my loose supervision rather than full attention.
Overall I quite like this approach of focusing on core instructional design, but using LLMs (I presume) to allow for higher interactivity and reduced friction. I previously explored a similar design—with mostly pre-scripted responses, but LLM-based classification of user input and some highly targeted LLM generated feedback—in unpublished prototypes, one for geometric construction and another for a software library. I didn’t continue these mostly for lack of time, but I think it would be particularly great for learning to code, or similar scenarios that involve basic guided feedback on an artefact. If you are interested in working on something like this, let me know.
Good stuff. Yes, LLMs as a complete replacement for existing programming languages seems challenging because human language is necessarily ambiguous. We don't always know exactly what we mean. Computer languages have unintentional ambiguity (C++ undefined behavior, anyone?) but it's more limited in scope. Defining the problem and iterating on it seems to be an essential component that humans will continue to be useful for.
+1 for Andy Pavlo's database course (CS 15-445). I'm working through it now. So awesome with the publicly available Autograder, GitHub repo, and Discord channel.
+1 for the Kurose networking lectures. I watched all of those. Kurose has a gift for making the complex simple. His book doesn't have a lot of projects, however, so there's an opportunity for adding value.
I also enjoy Apollo history. I've got Failure Is Not An Option on my shelf, but haven't made time to dig into it. If you haven't played with https://apolloinrealtime.org/ check it out. Amazing. I'll check out your podcast episode.
Yes, I'm looking for a good AI math tutor as well. We, too, were a bit disappointed by Khanmigo. It sometimes gave wrong answers and the anti-cheating mechanism was more friction than help. I'll check out Synthesis Tutor.
Keep up the amazing work!
For Codemonkey are you doing things sequentially starting from Codemonkey Jr?