AI Developer Pioneers GPT-2 in Microsoft Excel
Over the last few years, AI large language models (LLMs) such as ChatGPT have spread remarkably widely. Now a software developer, Ishan Anand, has embedded GPT-2, a precursor to ChatGPT that OpenAI released in 2019 with some hesitance, into a working Microsoft Excel spreadsheet. The project, named “Spreadsheets-are-all-you-need,” is publicly available and aims to help people understand how LLMs work.
Unveiling the Power of GPT-2 in Spreadsheets
On the spreadsheet’s official website, Anand stresses how approachable the tool is: with a spreadsheet, he writes, anyone, including people without a software development background, can explore and experiment directly with how a “real” transformer works under the hood, with minimal abstractions in the way. The name “Spreadsheets-are-all-you-need” is a nod to the 2017 research paper “Attention Is All You Need,” which introduced the Transformer architecture that underpins LLMs.
Anand packaged GPT-2 as a file in the XLSB Microsoft Excel binary format, which requires the latest version of Excel and does not work in the web-based version. The spreadsheet runs entirely offline, with no dependence on cloud-based AI services.
Decoding LLM Functionality in an Excel Environment
Although it contains a fully functional AI language model, the spreadsheet does not support conversational interaction the way ChatGPT does. Instead, users type text into designated cells and see the predicted output appear in adjacent cells immediately. Language models such as GPT-2 are built for next-token prediction: given an input sequence, the model tries to guess the most probable text that follows, whether that is the rest of a sentence or some other text task. The various sheets in Anand’s Excel file let users see the underlying computations as these predictions are generated.
Spreadsheets-are-all-you-need supports only 10 tokens of input, a stark contrast to the 128,000-token context window of GPT-4 Turbo. Even so, that is enough to demonstrate the fundamental principles of how LLMs work, which Anand explains in a series of tutorial videos on YouTube.
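To make next-token prediction concrete, here is a minimal Python sketch of the final step only: turning a score for each candidate token into probabilities and picking the most likely one. The four-word vocabulary and the scores are invented for illustration and do not come from Anand’s spreadsheet; a real GPT-2 model computes its scores from the input text and chooses among roughly 50,000 tokens.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_next_token(vocab, logits):
    """Return the most probable token and its probability (greedy choice)."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]

# Hypothetical values for illustration only: imagine these are the model's
# scores for the word that follows "The cat sat on the".
vocab = ["mat", "moon", "car", "table"]
logits = [3.1, 0.4, -1.2, 1.7]

token, prob = predict_next_token(vocab, logits)
print(f"next token: {token!r} (p = {prob:.2f})")
```

Always taking the highest-probability token is the simplest possible strategy; it is used here only to keep the sketch short.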
Anand’s Motivation and Implementation Journey
In an interview with Ars Technica, Anand said he started the project to satisfy his own curiosity and to understand the Transformer architecture in detail. Modern AI has changed considerably since his academic years, he noted, so he felt he needed to go back to the fundamentals to build a solid mental model of how it works.
Anand initially considered reproducing GPT-2 in JavaScript, but his love of spreadsheets led him in a different direction. Drawing inspiration from data scientist Jeremy Howard’s fast.ai and former OpenAI engineer Andrej Karpathy’s AI tutorials on YouTube, he realized that the entire GPT-2 model could be implemented in Excel.
Asked about the challenges of implementing an LLM in a spreadsheet, Anand said that because GPT-2’s algorithm is largely mathematical, it was a natural fit for a spreadsheet. The hardest part, he said, was text tokenization, a text-processing step that is not mathematical. Anand also credited ChatGPT, GPT-2’s descendant, with helping him solve difficult problems along the way, though its occasional hallucinations meant he had to check its answers carefully.
Anand’s project shows that running an AI model in an unconventional setting such as a spreadsheet can make its inner workings easier to teach and explore.
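To see why tokenization is string processing rather than arithmetic, consider this simplified Python sketch of byte-pair encoding, the general scheme GPT-2 uses to split text into tokens. The three merge rules are hypothetical and chosen for illustration; GPT-2’s real tokenizer works on bytes with a much larger learned merge table, and nothing here reflects how Anand implemented the step in Excel.

```python
def bpe_tokenize(word, merges):
    """Split a word into characters, then apply merge rules in priority order."""
    tokens = list(word)
    for left, right in merges:
        merged = []
        i = 0
        while i < len(tokens):
            # Join the adjacent pair if it matches the current merge rule.
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

# Hypothetical merge rules, highest priority first (illustration only).
merges = [("l", "o"), ("lo", "w"), ("e", "r")]

print(bpe_tokenize("lower", merges))   # ['low', 'er']
print(bpe_tokenize("lowest", merges))  # ['low', 'e', 's', 't']
```

Every step is a comparison or a string join rather than a numeric formula, which suggests why this stage sits less comfortably in a spreadsheet than the matrix math that follows it.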