Into the age of probability
I am almost 26 years old this year, and I have been a dedicated computer user for 19 of them. I have been charmed and fascinated by the fact that I can move a plastic, rodent-like thing and make an impact on people. The first language I tried to learn was C++, but I ended up drawn to C# and .NET for writing Windows Phone apps. The word “instantiate” felt magical to me, for I had created something virtual, yet I could see it and change it, in my mind and in the computer’s memory. Since I wrote my first line of code, I have known one thing: if I mess up my code, the code won’t work. Fix it, and try again.
That, my friend, is what this technology world has always meant to me. This is a world where you have to prove your knowledge. This is a world with only two answers: right or wrong. If the computer says it is wrong, either you are wrong or something else is wrong. That is also what motivated me to study and work every day, because my success proves my knowledge. My idols have been people like John Carmack, who knew how to compute a fast inverse square root with a magic number and put it toward something innovative. You know what I mean now. The finite, discrete outcome of what you choose to feed the computer, built from what you have learned, is what keeps me moving forward.
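For anyone who hasn’t seen the trick, here is a rough Python sketch of the classic Quake III fast inverse square root (the famous magic number `0x5f3759df`), reinterpreting float bits with `struct` since Python has no pointer casts. The function name is mine; this is an illustration, not the original C:

```python
import struct


def q_rsqrt(number: float) -> float:
    """Approximate 1/sqrt(number) for positive inputs, Quake III style."""
    # Reinterpret the 32-bit float's bit pattern as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", number))[0]
    # The magic: shift the exponent and subtract from the magic constant.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits back as a float to get a first guess.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration sharpens the estimate considerably.
    return y * (1.5 - 0.5 * number * y * y)
```

Even with a single Newton step, the result is within a fraction of a percent of the true value, which was good enough for real-time lighting in 1999.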
Fast forward to 2017, one year before I graduated high school. We all know what happened: a new paper was published, named “Attention Is All You Need”. It laid the foundation for GPT, the Generative Pre-trained Transformer. Of course I didn’t know about the paper at the time, but now almost everyone in the computer industry can recite how GPT works: you buy a bunch of GPU cards, gather a big chunk of training data, apply the self-attention mechanism from the paper to contextualize and transform the signal, and bam, you have your LLM (this is not for educational purposes). When GPT-2 was released in 2019, I saw a lot of compliments on social media, so I tried it on Hugging Face. I still vividly recall that day. I was in my freshman writing class, distracted, and I pasted the writing assignment prompt into the model. What it returned was middle-school level at best, so I was not impressed.
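The heart of that recipe, stripped of everything else, is scaled dot-product attention: each token’s query is compared against every token’s key, the scores become weights, and the output is a weighted blend of the values. A minimal pure-Python sketch (toy vectors, no batching or learned projections, so not how production code looks):

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors (one per token)."""
    d = len(K[0])  # key dimension, used to scale the dot products
    out = []
    for q in Q:
        # Similarity of this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out
```

A single token attending only to itself just returns its own value vector; with more tokens, the weights decide whose values dominate the blend.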
My reflection on what the LLM has evolved into today is that I lacked the ability to predict the worst case when I didn’t have the expertise to examine the issue thoroughly. I was naive and irrationally dismissive of the evolution of LLM technology, since GPT-3, ChatGPT, and the other models of the time had obvious flaws: hallucination, limited context windows, and so on. I didn’t buy what the industry leaders had to say, since they would say anything to keep their stock prices afloat. Three years after ChatGPT was introduced, we have several models that don’t hallucinate after 200k tokens. Did I say 200k tokens? Yeah, their context windows are at 400k tokens now (the context window is how much context the model can ingest in a conversation; 400k tokens is roughly 300k words). And we now know how to delegate to the models in natural language. Models can write code within ten seconds, then debug it and run it; models can draft government documents within ten seconds, then critique and revise them; models can call your restaurant for a reservation. In a way, we now have a technology that can literally do anything, as long as it can be done on a computer. I didn’t see this coming so soon.
The US tech industry is having its storm right now, as tech giants slash their headcounts. They say AI innovation lets fewer people do the same work. I call that bullshit, for now, since we all know how companies like Oracle are tightening their wallets as they invest in building more data centers to feed the AI. But we have to see that once we have a better model than today’s, it will know how to architect new applications and write the code itself, and it will be a lot cheaper than hiring a whole team. Now, erring on the pessimistic side since I am not an AI researcher (I learned my lesson), I think the scaling laws are still in effect: give companies like Anthropic, OpenAI, or DeepSeek more computational power and time, and they will produce a more powerful, more knowledgeable model with more parameters. So in the end, we are all going to be out of work. With robotics evolving, I would say no one’s job, not even plumber Mike’s, is secure.
I have explained how ChatGPT works to my friends and parents, who know less about computers: it takes in your prompt, generates a probability distribution over what to say next, and samples a word from that distribution (or simply picks the most probable one). So practically, we have no control over what it will say back to us, even with the same prompt. It is not the technology I prepared myself for, I embarrassingly admit. With those LLMs at the core of everything that can substitute for us (call it an agent, a skill, or whatever), we no longer know what work they will produce. But we know it is highly probable that it will be better than what a human can produce. How can we calculate our future’s probability, for the individual and for society?
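The explanation I give my parents can be sketched in a dozen lines. This is a minimal illustration of temperature-scaled next-token sampling (the function name and toy logits are mine, not any particular model’s API): logits become a probability distribution via softmax, and a word is drawn at random according to those probabilities.

```python
import math
import random


def sample_next_token(logits, temperature=1.0):
    """Turn raw logits into probabilities and sample one token index."""
    # Lower temperature sharpens the distribution toward the top choice.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the distribution (inverse CDF sampling).
    r = random.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i, probs
    return len(probs) - 1, probs
```

The randomness in that one `random.random()` call is exactly why the same prompt can come back worded differently every time; cool the temperature toward zero and the model collapses into always picking its single most probable word.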
The fallen leaves tell a story.
In our home, across the fog, in the Lands Between.
Our seed will look back upon us and recall, An Age of Probability