Local AI at Sofics

As AI rapidly becomes part of the modern workplace, companies face a fundamental question: should we use public AI services like ChatGPT, Claude or Gemini, or run AI locally? One might expect us to always favor local AI machines (more local AI machines means more demand for our semiconductor IP), but we're not that crooked. This blog post shares our own evaluation approach, with real use cases.

Robin during a presentation about the AI initiatives at Sofics

When the AI hype began, we at Sofics formed an internal AI workgroup of engineers, managers and developers. With a simple and clear mindset, "Don't rush, don't hush", we look for ways AI can help our colleagues, all while not blindly chasing trends.

It quickly became clear that public AI isn't always suitable, especially when working with sensitive content such as confidential information from customers or partners, or internal documents. Public AI services often log inputs and use them for training, which raises concerns. While many paid plans offer the option to opt out of data usage or logging, it's important to remember that these companies employ top-tier legal teams to craft their terms and conditions. I wouldn't be surprised if those terms ensure that, once you're no longer on a paid plan, they can legally exploit any data they still hold.

Sometimes you're contractually or legally (e.g. under GDPR) prohibited from processing certain data with external services. And even where we – as an IP company – legally could, it feels uncomfortable and careless to share our thought process with external services.

We'd advise anyone introducing AI in a professional environment to set up local AI functionality that can be used for testing. AI proofs of concept come together a lot quicker when developers can work on them without initially having to worry about data sensitivity, and a local test setup can be minimal, as the sketch below shows. Once a proof of concept has proven itself on local AI, the only way is up.
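Here's what such a setup might look like, assuming the open-source Ollama runtime and its Python client are installed; the model name and prompt are just placeholders:

```python
# Minimal local-AI prototype: query a model served by Ollama on this machine.
# Assumes `ollama serve` is running and a model has been pulled beforehand,
# e.g. with: ollama pull llama3.1
import ollama

response = ollama.chat(
    model="llama3.1",  # placeholder; any locally pulled model works
    messages=[{"role": "user", "content": "Draft a test plan for our PoC."}],
)
print(response["message"]["content"])  # nothing ever leaves your machine
```

Developers can iterate on prompts and pipelines against this local endpoint using test data, and swap in a bigger model or a hosted service later if the use case allows it.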

The main drawback of running AI locally is that it's more expensive. In the rapidly evolving AI landscape, the most advanced hardware quickly becomes outdated despite its high cost; what is considered cutting-edge today may not hold that status for long. Meanwhile, the competitive nature of the industry has led tons of online services to offer their tools for little or no cost (for now, at least).

Side note:
Shortly after the creation of our internal AI workgroup, I contributed some code to the open-source project “PrivateGPT”. Fast forward one year: the creator of that project is now the CEO of Zylon, a company offering secure on-premise AI for regulated industries.
This is one of many cases demonstrating that local AI is booming.


AI criteria

You have a use case for AI and wonder: should this run locally?
Before deciding which route to take, we've learned to evaluate each use case against six criteria:

1. Result quality

Output should be correct, clear and concise. Some tasks are forgiving. Others, like legal document analysis, demand precision.

2. Speed

Slow AI tools can break a user’s flow. Speed is essential for interactive tools, but less so for batch jobs (e.g. transcriptions).

3. Expenses

  • Public AI: You pay per token (chunks of input and output text), and it can add up fast (see the sketch below).
  • Local AI: Requires hardware and some initial setup. The primary ongoing cost is the electricity it consumes.
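To make that trade-off concrete, here's a back-of-envelope comparison in Python. Every number is an illustrative assumption, not a real quote; plug in your own volumes and prices:

```python
# Rough monthly cost comparison for a recurring summarization workload.
# All numbers are illustrative assumptions -- replace them with your own.
TOKENS_PER_JOB = 30_000        # input + output tokens for one long document
JOBS_PER_MONTH = 40

# Public AI: pay per million tokens (hypothetical blended rate).
PRICE_PER_MILLION_TOKENS = 10.0  # assumed $/1M tokens
public_monthly = TOKENS_PER_JOB * JOBS_PER_MONTH / 1e6 * PRICE_PER_MILLION_TOKENS

# Local AI: hardware amortized over its useful life, plus electricity.
HARDWARE_COST = 3_000.0        # assumed $ for a capable workstation
LIFETIME_MONTHS = 36
POWER_KW, HOURS_PER_MONTH, PRICE_PER_KWH = 0.4, 20, 0.30
local_monthly = (HARDWARE_COST / LIFETIME_MONTHS
                 + POWER_KW * HOURS_PER_MONTH * PRICE_PER_KWH)

print(f"Public: ~${public_monthly:.2f}/month, local: ~${local_monthly:.2f}/month")
```

With these particular assumptions the public service wins on monthly cost, which matches our experience that expense is local AI's main drawback; heavier workloads or longer hardware lifetimes shift the balance.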
4. Privacy

What happens to your data matters. Public AI services often retain data for training or analysis. Local models let you fully control what happens with your data, within your own network.

5. Independence

AI vendors can change terms, pricing and availability. Owning the whole process gives you long-term control and predictability.

6. Usability

Ease of use often depends more on the integration than on the AI model itself. Yet a great AI model is only useful if people can, and want to, interact with it in an easy way.

Concrete examples

Use Case 1: Transcribing & Summarizing Meetings

Scenario: Turn recorded team meetings into readable transcripts and summaries.

  • Results: Transcription quality from open-source models like OpenAI's Whisper is excellent, and they can run locally. Summarizing very long texts in a meaningful, specific way may require a stronger model.
  • Speed: Not critical. Getting the summaries the same or the next day is good enough.
  • Expenses: Summarizing long transcripts with public AI can be pricey since you pay per token, whereas local processing is much more cost-effective.
  • Privacy: Meetings inherently involve internal information. Even when no explicitly sensitive details are discussed, the voices of the participants themselves can be considered sensitive data. With the rapid advances in, for example, voice-cloning technology, you might be surprised how easily this kind of information can be misused. Processing meeting recordings within your own infrastructure lowers the risk of a GDPR violation.

Verdict: We use Whisper locally to transcribe our meetings. For summarizing the transcripts we also use local models, mainly because we consider meetings inherently sensitive. Because speed isn't crucial, no cutting-edge hardware is required. Although the summaries aren't flawless every time, they usually capture the key points well and support personal note-taking. A minimal sketch of this pipeline follows below.
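For the curious, here's what such a pipeline can look like, assuming the open-source openai-whisper package and a local model served by Ollama; the file name, model choices and prompt are placeholders:

```python
# Transcribe a meeting recording locally with Whisper, then summarize the
# transcript with a locally served model. Nothing leaves the machine.
import whisper
import ollama

# 1. Transcription; model size is a quality/speed trade-off.
stt = whisper.load_model("medium")  # "base" is faster, "large" is better
transcript = stt.transcribe("meeting.wav")["text"]  # placeholder file name

# 2. Summarization with a local LLM via Ollama (placeholder model name).
summary = ollama.chat(
    model="llama3.1",
    messages=[{
        "role": "user",
        "content": "Summarize this meeting transcript as bullet points, "
                   "listing decisions and action items:\n\n" + transcript,
    }],
)
print(summary["message"]["content"])
```

A batch job like this can run overnight on modest hardware, which is exactly why speed not being crucial keeps the hardware requirements down.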


Use Case 2: Drafting LinkedIn Posts

Scenario: Help create and polish short professional content.

  • Results: Existing public tools like ChatGPT and DeepSeek Chat are among the latest and greatest.
  • Privacy: Very low risk: the content is intended for public sharing anyway.
  • Speed: Public tools respond quickly.
  • Expenses: Low; some are even free.
  • Independence: The public tool might suddenly disappear or switch to a different model.

Verdict: Public AI is perfect for this case. No reason to over-engineer it. We made a custom GPT with background on Sofics to help us draft posts. Be wary, though: fully outsourcing tasks to your favorite model or tool might seem like a good idea, until something about the tool changes. Suddenly your company's public voice has an oddly timed personality change.


Use Case 3: Check and compare confidential license agreements

Scenario: In the future we’d love to have AI check confidential documents.

  • Results: This concerns very important documents. Naturally, we'd want the very best available models to check them for anomalies and loopholes.
  • Privacy: Disclosure or distribution of these documents is prohibited.
  • Speed: Speed matters little in this case. Slower or older hardware is fine, as long as it has plenty of memory for big models and long contexts.

Verdict: Since public AI is out of the question, this is clearly a job for local AI. For optimal results, we could consider a hybrid approach (sketched below):
Temporarily rent a secure hardware instance from a trusted cloud provider, deploy a cutting-edge open-source language model on it, and analyze the sensitive documents in that protected environment. After completing the analysis, fully dismantle the setup and terminate the rental to ensure no data remnants remain.
This approach may not be physically local, but we'd fully control every step of the way and send data over private, encrypted connections.
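As a rough illustration of that hybrid idea, here's what querying such a temporary instance might look like, assuming the rented machine runs an open-source model behind vLLM's OpenAI-compatible server; the host name, model and file name are all placeholders:

```python
# Query a temporarily rented, self-hosted model over an encrypted connection.
# Assumes the rented instance runs vLLM's OpenAI-compatible server, e.g.:
#   vllm serve meta-llama/Llama-3.1-70B-Instruct
from openai import OpenAI

client = OpenAI(
    base_url="https://rented-instance.example.com/v1",  # placeholder, HTTPS only
    api_key="unused-for-self-hosted",  # vLLM accepts any key unless configured
)

with open("license_agreement.txt") as f:  # placeholder document
    contract = f.read()

reply = client.chat.completions.create(
    model="meta-llama/Llama-3.1-70B-Instruct",  # placeholder model
    messages=[{"role": "user",
               "content": "Check this license agreement for anomalies and "
                          "loopholes:\n\n" + contract}],
)
print(reply.choices[0].message.content)
# Afterwards: dismantle the instance so no data remnants remain.
```

The crucial part is operational rather than technical: the model, the documents and the logs all live on an instance we control and destroy afterwards.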


Wrapping Up

Choosing between public and local AI isn't always as black and white as it may seem. Weigh each case carefully by balancing quality, speed, cost, privacy and control. Performant public AI tools are currently very cost-effective and great for low-risk tasks, while local AI becomes essential once you move to more sensitive or confidential data. This case-by-case approach captures the benefits of AI without compromising on security or reliability.