November 13, 2024 | Andrew Lawlor
I’ve spoken with many technology leaders in the past few years, and there’s a palpable excitement in boardrooms about generative AI.
Your competitors are exploring it, your employees are experimenting with it, and you’re right to take it seriously.
But here’s where I see companies making a critical mistake: many are seriously exploring the proposition of building their own large language models (LLMs).
Let me be direct: unless you’re sitting on billions in AI research funding, building your own LLM is likely a shortcut to draining your tech budget with little to show for it.
You might think, “But our industry is unique; we need our own model.”
I hear this often, and I understand the impulse. There’s an AI arms race mentality pushing companies to seek a competitive edge through custom AI solutions.
But what I’ve observed is that this drive to build proprietary LLMs often stems from misconceptions about what it really takes to build, train, and maintain AI models.
I understand the allure of building your own LLM.
These concerns aren’t irrational.
Companies like Anthropic, Microsoft, and Google have shown what’s possible with custom LLMs. They’ve built powerful models that understand context, generate human-like responses, and tackle complex tasks.
Their success makes it tempting to follow their path.
But here’s what I’ve noticed in my conversations with leaders: they look at these tech giants as role models without considering the vast gulf in resources.
These companies pour millions into AI research, infrastructure, and talent.
Some estimates put the cost of training OpenAI’s GPT-3.5, a model with roughly 150 billion parameters, at around $5 million, and GPT-4 is estimated to have cost roughly 20 times that. BloombergGPT, a comparatively small model at 50 billion parameters, is estimated to have cost upwards of $1 million.
These companies maintain massive data centers, employ thousands of AI researchers, and can absorb the costs of multiple failed iterations. And more importantly, AI is their core business—not a tool to support their core business.
Your company likely has different constraints and priorities. Before committing to building an LLM, you need to understand the true scope of what you’re considering. The resources required aren’t just substantial; they’re a barrier to entry in their own right.
Training and serving an LLM isn’t like typical software development. Looking at companies that have successfully built LLMs reveals a sobering picture of resource requirements.
For context, even relatively modest LLMs with fewer parameters than leading models require immense resources to train and maintain.
The costs climb steeply with model size. Leading models like GPT-4 required investments that dwarf the entire technology budgets of most organizations.
Think of it this way: every time you need to update the model with new data or capabilities, you’re looking at another full training cycle. Even if you start small, the resource requirements quickly become unsustainable for most organizations.
Many AI project failures stem from underestimating these fundamental resource requirements. Teams often discover midway through that they’ve burned through their budget without achieving the results they were after.
The computational costs of building an LLM are prohibitive enough on their own. But computation is only part of what many tech leaders forget to budget for.
And there’s the opportunity cost.
While your team struggles with the complexities of LLM development, your competitors may be rapidly deploying solutions built on existing models.
Every month spent on custom LLM development is a month you’re not spending on actual business problems.
So when should you build your own LLM? Almost never.
Consider it only if you have unique requirements that existing models absolutely cannot meet, can justify the massive investment through clear revenue gains, and have both the technical expertise and financial resources for a multi-year commitment.
Even then, I’d recommend you reconsider.
The companies I see succeeding with AI aren’t building models from scratch. They’re focusing their resources on smart implementation, effective integration, and solving actual business problems.
One highly effective approach is the RAG (retrieval-augmented generation) pattern, which creatively leverages existing LLMs to meet specialized needs.
By retrieving relevant information, augmenting it with specific context, and generating targeted outputs, companies can deploy off-the-shelf models that align precisely with their business requirements—without the overhead of building from scratch.
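To make the pattern concrete, here’s a minimal sketch of that flow in Python. The `vector_store` and `llm` objects are hypothetical stand-ins for whatever vector database and hosted model API you already use; the point is the shape of the pipeline, not any particular vendor.

```python
# A minimal sketch of the retrieval-augmented generation flow.
# `vector_store` and `llm` are hypothetical stand-ins for your own
# vector database and hosted LLM API.

def answer_with_rag(question: str, vector_store, llm) -> str:
    # 1. Retrieve: pull the passages most relevant to the question.
    passages = vector_store.search(question, top_k=5)

    # 2. Augment: fold those passages into the prompt as context.
    context = "\n\n".join(p.text for p in passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    # 3. Generate: let a general-purpose model produce the targeted answer.
    return llm.generate(prompt)
```

Notice that the model itself is untouched; all of the specialization lives in what you retrieve and how you assemble the prompt.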
General-purpose LLMs grant you access to advanced AI capabilities while someone else handles the expensive, complicated parts. The costs for access are minimal compared to building and maintaining your own LLM.
Think about what you’re getting.
With companies like OpenAI, Anthropic, and Google continually pushing boundaries, your subscription includes automatic updates and improvements.
This way, your engineering team can focus on adding business value—building applications that solve real problems, integrating AI into workflows, and creating competitive advantages.
That’s where real innovation happens—not in training models, but in applying them cleverly to business problems.
“But what about accuracy? Won’t a custom-trained model better understand our specific needs?”
I hear this concern constantly, and it’s based on a fundamental misconception about how modern LLMs work.
The real key to accuracy isn’t having your own model—it’s optimizing how you work with existing ones. This optimization happens along two critical axes: context and model behavior.
Context optimization becomes your primary focus when your use cases extend beyond the model’s general knowledge.
For instance, a financial services company might need the model to understand proprietary trading strategies. A healthcare provider might need it to work with the latest clinical guidelines. A manufacturing firm might need it to understand specific operational procedures.
In each case, the challenge isn’t the model’s fundamental capabilities; it’s ensuring the model has access to the right information at the right time.
This is where RAG patterns and prompt engineering shine.
Instead of training a new model, RAG lets you give a general-purpose LLM dynamic access to your specific knowledge base.
Vector databases store embeddings that capture the semantic meaning of text passages, so relevant information can be retrieved based on meaning rather than exact wording. This allows for precise identification of the right data for each query.
Prompt engineering then structures this information optimally for the model, essentially teaching it to speak your language and understand your context in real-time.
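If it helps to see the mechanics, here’s a stripped-down sketch of what a vector database is doing under the hood: embed the text, embed the query, and rank passages by similarity of meaning. The `embed` function is a placeholder for whichever embedding provider you use; real vector databases add indexing so this scales to millions of passages.

```python
import numpy as np

# `embed` stands in for your embedding model (a hosted API or a local model);
# it maps a piece of text to a fixed-length vector.
def embed(text: str) -> np.ndarray:
    raise NotImplementedError("call your embedding provider here")

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Similarity of meaning, not of wording.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query: str, passages: list[str], top_k: int = 3) -> list[str]:
    # In a real system the passage vectors are computed once and stored in the
    # vector database; here we embed on the fly to keep the idea visible.
    query_vec = embed(query)
    ranked = sorted(passages,
                    key=lambda p: cosine(query_vec, embed(p)),
                    reverse=True)
    return ranked[:top_k]
```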
Model behavior refinement focuses on getting consistent, properly formatted outputs that align with your needs.
This is where fine-tuning proves invaluable: it reinforces these patterns and expectations, making a general-purpose model behave like a specialized one without the overhead of full model training. More on this in a bit.
The path to effective AI implementation doesn’t run through building your own LLM.
I’ve watched companies exhaust their resources trying to build custom models when they should have been focusing on clever applications of existing technology.
Instead, you could optimize general-purpose LLMs to align outcomes with your goals. There are three ways to do this: in-context learning, model fine-tuning, and prompt engineering.
In-context learning is the most efficient of the three.
Modern LLMs are astonishingly adaptable. They can understand and apply new information on the fly. You provide context in your prompts, guide the model with examples, and get remarkably accurate outputs. No special training is required.
The beauty of this approach is its flexibility—you can adjust your prompts and context as your needs evolve.
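Here’s what that looks like in practice. This is a toy classification prompt; the examples inside it do the work that training data would otherwise do, and the categories and requests are purely illustrative.

```python
# In-context learning: the "training" lives entirely inside the prompt.
# The categories and example requests below are illustrative.
prompt = """You classify incoming member requests into one of:
Technical, Billing, Events.

Request: "I can't log in to the member portal."
Category: Technical

Request: "When does my membership renew?"
Category: Billing

Request: "The conference registration page is down."
Category:"""

# Send `prompt` to any general-purpose LLM API; no training step is involved,
# and you can change the categories or examples whenever your needs change.
```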
Fine-tuning offers another powerful middle ground.
Instead of building a model from scratch, you’re customizing an existing one for your specific needs.
The resource requirements are orders of magnitude smaller than full model training. You need thousands of examples rather than billions, weeks rather than months, and reasonable computing power rather than massive clusters.
Fine-tuning shines when you need consistent behavior or a deep understanding of specialized terminology.
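For a sense of scale, here’s roughly what that workflow looks like with a hosted provider. This sketch assumes OpenAI’s fine-tuning API, with an illustrative file name and base model; other providers follow a similar upload-then-train pattern.

```python
from openai import OpenAI

client = OpenAI()

# training.jsonl holds your examples in chat format, for instance:
# {"messages": [{"role": "system", "content": "You are our claims assistant."},
#               {"role": "user", "content": "Summarize this claim: ..."},
#               {"role": "assistant", "content": "Claim summary: ..."}]}
training_file = client.files.create(
    file=open("training.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # illustrative base model to customize
)
print(job.id)  # poll this job; the result is a custom model ID you call directly
```

Compare that to pretraining: a few thousand curated examples and a training job you can monitor from a laptop, rather than a data center.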
Let me emphasize something that often gets overlooked: prompt engineering.
This is your most powerful tool when working with existing LLMs, yet many organizations rush to build custom models without mastering this crucial skill.
Effective prompt engineering can make a general-purpose LLM perform like a specialized one at a fraction of the cost.
The key is understanding that prompts are essentially programming; they’re how you tell the model exactly what you want and how you want it.
Also, prompt engineering is infinitely adaptable. Unlike a custom model that’s expensive to retrain, you can modify your prompts instantly as your needs change.
You can A/B test different approaches, refine your instructions, and optimize for different scenarios without any additional computing costs.
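Here’s a sketch of what that iteration loop can look like: two prompt variants scored against the same small evaluation set. `call_llm` and `judge` are placeholders for your model API and whatever quality check fits your use case, whether that’s exact match, a rubric, or human review.

```python
# Two candidate prompt templates for the same task; only the instructions differ.
PROMPT_A = "Summarize this support ticket in one sentence:\n{ticket}"
PROMPT_B = ("You are a support lead. In one sentence, state the customer's "
            "problem and the action they are requesting:\n{ticket}")

def score_prompt(template: str, tickets: list[str], call_llm, judge) -> float:
    # Run every ticket through the model with this template and average the
    # judge's quality scores.
    outputs = [call_llm(template.format(ticket=t)) for t in tickets]
    return sum(judge(t, o) for t, o in zip(tickets, outputs)) / len(tickets)

# Compare the templates on the same evaluation set; promoting the winner costs
# nothing, because no model was retrained.
```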
The allure of building your own LLM is understandable, but it’s a path paved with hidden costs and complexity.
Through the right combination of existing general-purpose LLMs, vector databases, and RAG patterns, you can build sophisticated AI systems without the expense and risk of custom model development.
Your competitive advantage in AI won’t come from owning a model. It will come from how effectively you apply existing technology to solve real business problems.
As I mentioned, the organizations I see succeeding aren’t the ones building models from scratch; they’re the ones focusing their resources on smart implementation and effective integration.
This approach is inherently future-proof. As general-purpose models continue to improve, your applications automatically benefit from these advances. While others are stuck maintaining their custom models, you can focus on what really matters: creating value for your business.