Large language models #

2023-12-03 #

So I am realising I am going to have to learn a bit about these large language models, and about artificial intelligence in general: it’s not going anywhere, more and more people are talking about it, and smart people are starting to say things about it that scare me.

  • Some thoughts on why companies will need to focus on “prompt engineering” to differentiate themselves in a crowded AI space.
  • A nice technical summary of transformers, the neural network architecture used to build LLMs.
  • Someone using LLMs to build a “power user” interface for iMessage.
  • The potential impact of LLMs on the creative arts has often reminded me of this fantastic short story by Roald Dahl, which I remember reading when I was younger.
  • An interesting post on how the uncensored open-source Llama 2 model was trained.
  • A really impressive visualization of how LLMs work under the hood.
  • A really great introduction to large language models by Andrej Karpathy - this is meant for a “lay” audience, but I think it works just as well for technically minded people who have not encountered the ideas before.
  • The first paper I’ve seen which claims that LLMs can “discover new mathematics”.
  • A nice notebook from a reliable source about attention in transformers (the core operation is sketched in code after this list).
  • A nice Emacs package for integrating with local LLMs.
  • This post pretty much sums up how I feel heading into 2024.
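
Since a couple of the links above deal with attention in transformers, here is a minimal numpy sketch of scaled dot-product attention, the operation at the core of the architecture. It is a toy illustration of the general idea, not code from any of the linked posts, and all the names in it are my own:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # softmax(Q K^T / sqrt(d_k)) V: each output row is a weighted
        # average of the value rows, weighted by query-key similarity.
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
        scores -= scores.max(axis=-1, keepdims=True)    # stabilise the softmax
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
        return weights @ V

    # Toy self-attention over a "sequence" of 4 tokens with 8-dim embeddings.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))
    print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)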