Nilesh Jasani’s Post

Nilesh Jasani, Innovation Enthusiast

Is it truly hopeless for anyone but the pioneers to build an #AI foundation model? Mounting evidence suggests that, unlike semiconductors or operating systems, these models may not remain the exclusive preserve of a privileged few.

As explained in the post (https://bit.ly/45QZrhZ), the pivotal moment of late 2022 was not an "iPhone"-like product or an Einsteinian breakthrough in methods or hardware; it was simply that, for some, persistence paid off. Their transformer models, which had been trained for years without sufficiently satisfactory results, crossed an unknown threshold of complexity and began producing human-like results. With the transformer neural-net approach validated, everyone accelerated their training, and contenders emerged to challenge the frontrunners within weeks (read the earlier post: https://bit.ly/3qlKUe2).

A graphic analogy to drive home the point: 2022 became the birth year of the first machine brains whose input-output conversions mimicked those of humans. Interestingly, the creators of these early machine brains did not pursue significant patents. Their neighboring peers recognized the quantity game for their own machine-birthing endeavors. While many pioneers may genuinely believe their achievements cannot be replicated by anyone else globally, the author of at least one internal memo was less confident, suggesting the exact opposite (https://bit.ly/42sPdC2). And even though Alpaca was built in days on top of Meta's open-source model for roughly USD 600, it still came from a team in the same nation. For distant observers, it looked as though the best the rest of the world could hope for was to join the queue for the earliest creators' APIs and build use cases on top, similar to their fate in other technology segments.

Falcon 40B shatters all this. Abu Dhabi's TII has unleashed a trillion-token foundation model that leads the Open LLM leaderboard in performance (https://bit.ly/3NwHVJ7). This single example is enough for everyone, in countries from India to Singapore and Japan to the UK, and corporates even in non-technology sectors, to try developing models without backbreaking resource or time requirements. Many will finally note that the teams behind almost all foundation models are relatively small, and so are their codebases (see the short sketch after this post), unlike the teams that built the first smartphones or the code in their operating systems. In short, #samaltman could be wrong about India's chances of building one.

Does one swallow make a summer? While theoretically promising, we must gather more information on the practical use cases of all these models before passing a verdict on whether the world, two years from now, will have ten or a thousand usable foundation models. Many foundation models may prove to be mere namesakes with limited use. The takeaway for policymakers and corporate leaders worldwide is not to resign themselves to being perpetual end-users burdened with exorbitant payments. Abu Dhabi's endeavors deserve far more media attention than they have received so far. #abudhabi #india
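As a rough illustration of how little consumer-side code an open foundation model needs, here is a minimal sketch (my addition, not from the post) that loads TII's Falcon model through the Hugging Face transformers library. The model id, precision, and generation settings are illustrative assumptions; running the 40B variant in practice requires substantial GPU memory or quantization.

```python
# Minimal sketch: run an open foundation model (Falcon) with Hugging Face transformers.
# Assumptions: the "tiiuae/falcon-40b" checkpoint on the Hub, and hardware with enough
# GPU memory for bfloat16 weights (or a smaller Falcon variant swapped in).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-40b"  # open model published by Abu Dhabi's TII

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    device_map="auto",           # spread layers across available GPUs/CPU
    trust_remote_code=True,      # Falcon ships custom modeling code on the Hub
)

# Encode a prompt, generate a continuation, and decode it back to text.
inputs = tokenizer("The pivotal moment of late 2022 was", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```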


Many people are struggling to understand what changed in late 2022 to cause so much excitement in #AI. In one word, the answer is quantity. We crossed a threshold, and everything has been on fast-forward ever since.

Though there was no groundbreaking invention in 2022, or in the immediately preceding years, within the diverse AI domain, we experienced a defining moment when outcomes from models like Stable Diffusion and DALL-E 2 confirmed the course. This validation turned the AI landscape into a hive of activity, and it can be understood through the lens of emergent properties.

Emergent properties are a powerful force in nature and the natural sciences. When the smaller, simpler constituent parts of a system interact, their sheer multiplicity creates complexity which, beyond a point, produces new characteristics that are not visible at the constituent level. In layperson's terms, individual atoms or molecules do not possess a temperature; only when a substantial number of them gather does the concept of temperature emerge. The biological sciences offer even more instances of emergent phenomena than the physical sciences do. A prime example is our brain, where straightforward neuron cells and their simple connectivity rules transform into something vastly complex once their number surpasses roughly 80 billion.

The world of Gen AI has mirrored this natural phenomenon. While small research teams toiled away on foundation models using the novel techniques introduced in the mid-2010s, results were middling until a significant shift occurred: larger training runs, measured in tokens and parameters, led to these models exhibiting human-like behavior once parameter counts approached the hundred-billion mark. This validation set off a chain reaction, with models such as Google's PaLM and LaMDA, Meta's Llama, Hugging Face's models, Firefly, and Falcon springing to life after substantial training.

The ramifications of this no-new-innovation yet massive innovation extend across all disciplines. We have discovered that the neural-network constituents of our foundation models work. Though they can be enhanced and made more efficient, comprehensive training is the key to their power. What is most remarkable, and not well understood, is that at the constituent level the programs are relatively simple. And, as much as people talk about intensive resource requirements, they are relatively modest for any reasonably sized corporation.

This is why understanding the emerging competitive landscape is essential. More about that later. For now, let's marvel at the power of quantity and emergent properties at the heart of the AI revolution; a toy sketch of such a threshold effect follows below. #innovation

PS: some of this author's beliefs on emergent properties are in this book review: https://bit.ly/3Chi0yw
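To make the threshold intuition concrete, here is a toy sketch (an illustrative analogy of my own, not the author's): in a random graph, a giant connected component appears abruptly once the average number of links per node crosses 1.0, even though every individual link follows the same trivial rule throughout. The graph sizes and degrees below are arbitrary illustration values.

```python
# Toy demonstration of an emergent property driven purely by quantity:
# below roughly one link per node the largest cluster stays tiny;
# above it, a single cluster suddenly spans a large fraction of all nodes.
import random

def largest_component_fraction(n, avg_degree, seed=0):
    """Build a random graph on n nodes with the given average degree and
    return the size of its largest connected component as a fraction of n."""
    rng = random.Random(seed)
    parent = list(range(n))  # union-find structure tracking connected clusters

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    num_edges = int(avg_degree * n / 2)
    for _ in range(num_edges):
        union(rng.randrange(n), rng.randrange(n))  # add one random link

    # Count how many nodes each cluster contains and report the biggest one.
    sizes = {}
    for node in range(n):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

for avg_degree in (0.5, 0.9, 1.1, 1.5, 2.0):
    frac = largest_component_fraction(n=100_000, avg_degree=avg_degree)
    print(f"avg degree {avg_degree:.1f} -> largest cluster covers {frac:.1%} of nodes")
```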
