PHP at Scale #8
Welcome to the eighth edition of PHP at Scale. I am diving deep into the principles, best practices, and practical lessons learned from scaling PHP projects — not only performance-wise but also code quality and maintainability.
This month, I would like to discuss AI. I struggled a lot with choosing this topic. I know everyone has been talking about LLMs for the last couple of months, and it is not directly connected with PHP, but I believe some findings are still worth sharing. It seems this is something we (the PHP community) need to adjust to. If you’ve read my interview with the PHP Foundation’s Roman Pronskiy, he actually mentioned that in one of his answers.
In the AI discussion, I would like to focus on one thing that might make it different from what you have read/researched so far: I will talk about adding features to products we work on instead of using AI to simply generate code.
I promise to get back to pure scalability topics next month 😉.
I am not an AI/LLM expert, but I have done enough tests to guide you through some possibilities that we have when developing our products.
Unfortunately, there are not many good articles on that topic. Because of that, I had to switch the approach a bit and add more information from my side. I still linked to a couple of interesting articles. I hope that the little change of format for this edition will work for you. As always, any feedback is more than appreciated.
Groundwork
In order to test some LLM integrations, you obviously need an LLM 😉. You can go ahead with Anthropic, OpenAI (and compatible APIs), or… you can run an LLM on your machine. It will of course not be as good as the paid ones, but still pretty decent.
I started by spending $5 on Anthropic API access; it covers months of tests and works very well. If you can afford this or ChatGPT, I highly recommend going this way.
The free alternative is Ollama. It’s super easy to set up and will take only a couple of minutes. I use it quite a lot.
For Ollama to work, you will need to pull a model; you can search for available models here.
On my rather old MacBook Pro (M1 Pro), I can run models up to ~16B parameters without any performance issues. As an example, below is a simple question I sent to qwen2.5:14b (just run `ollama run qwen2.5:14b` in your console).
Usually, the models that fit on my MacBook are significantly worse than the state-of-the-art models you pay for, but they handle simple cases very well. A simple case could be a product comment summary.
But this is a PHP newsletter, right? For most of my PHP-related tests, I tend to use LLPhant. It’s not perfect, but it allows me to easily check most of the things I would like to discuss today.
Chatting and system prompts
Just to kick off our journey with LLPhant, let’s try sending messages to an LLM:
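A minimal sketch of a chat call, assuming the class names from LLPhant’s README (`OpenAIConfig`/`OpenAIChat`; exact names may differ between versions):

```php
<?php

require 'vendor/autoload.php';

use LLPhant\OpenAIConfig;
use LLPhant\Chat\OpenAIChat;

// Configure the client; the key can also come from the OPENAI_API_KEY env variable.
$config = new OpenAIConfig();
$config->apiKey = 'sk-...';

$chat = new OpenAIChat($config);

// Send a single message and print the model's answer.
echo $chat->generateText('What is the capital of France?');

// To use a local model via Ollama instead, swap the client:
// $config = new \LLPhant\OllamaConfig();
// $config->model = 'qwen2.5:14b';
// $chat = new \LLPhant\Chat\OllamaChat($config);
```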

One important thing to know at this stage is that you can alter how the LLM behaves using a system message/prompt. An example looks like this:
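A sketch of such a system prompt in LLPhant; the prompt wording here is illustrative (judging by the response quoted below, a pirate persona):

```php
<?php

require 'vendor/autoload.php';

use LLPhant\OpenAIConfig;
use LLPhant\Chat\OpenAIChat;

$config = new OpenAIConfig();
$config->apiKey = 'sk-...';
$chat = new OpenAIChat($config);

// The system message shapes every answer that follows.
$chat->setSystemMessage('You are a pirate. Always answer in pirate speak.');

echo $chat->generateText('Welcome the readers of the PHP at Scale newsletter.');
```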

Now the response will be something like this:
Ahoy there, seafaring scalers of PHP land! May the sea breeze fill your sails and the code be always smooth underfoot as we delve into the latest tidings of PHP at scale. Keep a weather eye open for new updates, tips, and treasures that await ye!
Thanks to this, you can enforce certain behaviour on the LLM. If you build something user-facing into your product, this is the first step to blocking unwanted usage. Unfortunately, open-source models are easily tricked into doing unrelated things 😀.
I also tend to add useful data to the system prompt. For example, you can pass CSV or JSON data that the LLM then summarizes: say, a list of reviews that the user can ask questions about.
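A sketch of that idea, inlining a small JSON list of reviews into the system prompt (the data and prompt are made up for illustration):

```php
<?php

require 'vendor/autoload.php';

use LLPhant\OpenAIConfig;
use LLPhant\Chat\OpenAIChat;

$config = new OpenAIConfig();
$config->apiKey = 'sk-...';
$chat = new OpenAIChat($config);

// Illustrative review data; in a real product this would come from your database.
$reviews = json_encode([
    ['rating' => 5, 'text' => 'Great quality, arrived quickly.'],
    ['rating' => 2, 'text' => 'Nice fabric, but the size runs small.'],
]);

$chat->setSystemMessage(
    "Answer questions about this product using only the reviews below.\n" . $reviews
);

echo $chat->generateText('What do customers say about sizing?');
```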
Tools
Now, the above already might be useful, but tools/functions are the thing that will improve the integration A LOT. The LLM calls them to run a specific action or to get more data.
Just imagine scenarios in which your users discuss something with an LLM-based chatbot and it is able to perform actions for them — add products to the basket, create reports, run batch actions on tasks that match criteria, etc.
Using tools is quite easy:
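A sketch using the `FunctionInfo`/`Parameter` classes from LLPhant’s README (which uses a very similar mailer example; the class names may vary by version):

```php
<?php

require 'vendor/autoload.php';

use LLPhant\OpenAIConfig;
use LLPhant\Chat\OpenAIChat;
use LLPhant\Chat\FunctionInfo\FunctionInfo;
use LLPhant\Chat\FunctionInfo\Parameter;

// A plain class whose method the LLM is allowed to call.
class Mailer
{
    public function sendMail(string $subject, string $body, string $email): void
    {
        echo "The email has been sent to {$email}\n";
        echo "with the subject \"{$subject}\" and the body:\n{$body}\n";
    }
}

$config = new OpenAIConfig();
$config->apiKey = 'sk-...';
$chat = new OpenAIChat($config);

// Describe the function so the model knows when and how to call it.
$subject = new Parameter('subject', 'string', 'the subject of the mail');
$body = new Parameter('body', 'string', 'the body of the mail');
$email = new Parameter('email', 'string', 'the recipient email address');

$function = new FunctionInfo(
    'sendMail',
    new Mailer(),
    'Send an email to a recipient',
    [$subject, $body, $email]
);
$chat->addFunction($function);
$chat->setSystemMessage(
    'You deliver information by email. Once you can answer the question, send a mail.'
);

$chat->generateText('Who is Marie Curie in one line? My email is example@example.com');
```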

And running the code returns:
The email has been sent to example@example.com
with the subject "Who is Marie Curie?" and the body:
Marie Curie was a pioneering Polish-French physicist and chemist who received two Nobel Prizes.
Embeddings
Using tools is not always enough and might not cover all scenarios. A good example is a case where the LLM should have some specific, project-based knowledge. Let’s say you would like to build an assistant that can answer questions based on your help articles. It would be very hard and inefficient to handle this flow using tools/functions. What you can do instead is use embeddings. Embeddings require a bit more code, so the article linked below walks through a complete example.
Embedding requires some preliminary steps:
Your content needs to be split into chunks, typically a few sentences or a short paragraph each.
The chunks are run through the embedding process, which transforms the text into high-dimensional vector representations that capture semantic meaning.
The resulting vectors are then stored in the database.
When a new discussion is started, you run the same process on the question, but instead of storing the vector in the database, you search for similar vectors. This returns the stored chunks of text that are most related to the question.
Then you pass the related chunks of text together with the question to the LLM. Thanks to that, the LLM gets input knowledge it can use to answer the question.
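The steps above map onto LLPhant roughly like this. This is only a rough sketch; the class names are taken from the library’s README (the `QuestionAnswering` helper bundles the search-and-ask part) and may vary by version:

```php
<?php

require 'vendor/autoload.php';

use LLPhant\Chat\OpenAIChat;
use LLPhant\Embeddings\DataReader\FileDataReader;
use LLPhant\Embeddings\DocumentSplitter\DocumentSplitter;
use LLPhant\Embeddings\EmbeddingGenerator\OpenAI\OpenAI3SmallEmbeddingGenerator;
use LLPhant\Embeddings\VectorStores\Memory\MemoryVectorStore;
use LLPhant\Query\SemanticSearch\QuestionAnswering;

// 1. Read the content and split it into chunks.
$reader = new FileDataReader(__DIR__ . '/help-articles.txt');
$documents = DocumentSplitter::splitDocuments($reader->getDocuments(), 500);

// 2. Run the chunks through the embedding process.
$embeddingGenerator = new OpenAI3SmallEmbeddingGenerator();
$embeddedDocuments = $embeddingGenerator->embedDocuments($documents);

// 3. Store the resulting vectors (in memory here; use a real vector DB in production).
$vectorStore = new MemoryVectorStore();
$vectorStore->addDocuments($embeddedDocuments);

// 4 + 5. Embed the question, fetch similar chunks, and pass both to the LLM.
$qa = new QuestionAnswering($vectorStore, $embeddingGenerator, new OpenAIChat());
echo $qa->answerQuestion('How do I reset my password?');
```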
You can check out this article to learn more.
Model Context Protocol - MCP
So far we have investigated how we can add AI to our products; now let’s turn it around and check how we can add our products to LLMs. MCP has recently been gaining a lot of traction, and it gets more interesting every week. That said, MCP usage is a bit limited as of today — it works in Cursor and Claude Desktop, but OpenAI has already announced that they will also support it. This means MCP is most useful for technical people right now, but I expect this to change in the upcoming weeks/months.
With that said, what is MCP? Basically, it is a protocol that standardizes the way different services/tools are integrated with LLMs. A bit like functions, but instead of being provided alongside the question, they are configured inside the LLM client itself. I have a couple of MCP servers connected to my Claude Desktop, allowing me to e.g. run API calls against a SaaS we develop (creating tasks in the inbox).
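For context, plugging an MCP server into Claude Desktop boils down to a small entry in its claude_desktop_config.json; the server name and path below are made up for illustration:

```json
{
  "mcpServers": {
    "my-saas": {
      "command": "node",
      "args": ["/path/to/my-mcp-server/index.js"]
    }
  }
}
```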
If you already have an API ready, the easiest way to play with MCP is probably to use Zapier MCP. Before that, you can have a look at some examples, or browse the many YouTube videos showcasing how it integrates tools like Blender for 3D modelling.
Unfortunately, I have not tested any dedicated PHP libraries for writing MCP servers yet (there are not many of them), but the protocol itself is definitely worth checking out.
You can also use MCP on the client side, hooking servers up similarly to how you use tools. There are already several catalogues of available MCP servers, like this one or this one.
AI is more than LLMs, and there are sometimes better tools than AI 😉
Although the global focus has been on LLMs in recent months, it is good to keep in mind that there is more to AI, and another technique might suit your needs better. So before you jump into a specific use case, consider whether e.g. a classic neural network would be a better fit, or whether AI is not required at all and something simpler will work better. For example, there are great forecasting tools that do not require AI at all 😉.
—
That’s it for this month, I hope you found that interesting and were not annoyed by the change of format. I’ll be back with the usual format next month.
—
All the links used in this edition.
A free, open-source tool that simplifies running large language models (LLMs) locally on your computer — Ollama.
A comprehensive PHP Generative AI Framework using OpenAI GPT 4. Inspired by Langchain — LLPhant.
Leverage Generative AI in your PHP E-Commerce website with Qdrant and LLPhant.
The code of this edition is available as a gist here.
This page showcases various Model Context Protocol (MCP) servers that demonstrate the protocol’s capabilities and versatility.
Why is this newsletter for me?
If you are passionate about well-crafted software products and despise poor software design, this newsletter is for you! With a focus on mature PHP usage, best practices, and effective tools, you'll gain valuable insights and techniques to enhance your PHP projects and keep your skills up to date.
I hope this edition of PHP at Scale is informative and inspiring. I aim to provide the tools and knowledge you need to excel in your PHP development journey. As always, I welcome your feedback and suggestions for future topics. Stay tuned for more insights, tips, and best practices in our upcoming issues.
May thy software be mature!