From the Developer's Perspective: Getting into AI Chaos - Can It Boost My Productivity?
Until now, I had never used ChatGPT, GitHub Copilot, Amazon CodeWhisperer, or any of the well-known alternatives – not even once. Is it finally time to change that? Not exactly – let’s see what my approach was instead.
It is hard to find someone who hasn’t heard of these tools or hasn’t used them – they became popular fast. I chose a different way: I built a fully local, offline, free, and mostly open-source setup.
After a few days of use, it has already helped me with some work:
- suggested useful code completions: "if" conditions in a permission model, errors thrown in request body validation, data model construction, debugging,
- warned me about a possible bug related to a paginated response I was processing,
- advised me on how to convert my algorithm into a one-liner,
- helped me produce a more sales-like version of a technical explanation.
Let's dive into the story behind it and explore how you can set it up as well.
What was my motivation?
Why am I “getting into” it at all?
Remaining competitive in the market matters, and losing track of this rapidly evolving sector is not a wise direction. Developers claim that AI assistants increase their productivity, and I don’t want to fall behind.
Why so late?
While I understand the importance of staying up-to-date and not getting stuck in the previous century, I’m not the kind of person who immediately jumps onto mainstream trends. Usually, I observe, think, and only later find my own way in. By then, flaws have often been fixed and optimizations implemented.
Why not just use any of the well-known and reliable services?
The work-related part of my usage involves proprietary client code, so voluntarily sending data to an external service I don’t fully control is a no-go. Despite most of them claiming SOC 2 and ISO compliance and asserting that they don’t process the data in any way – I’m not buying it. Given the lawsuits filed against tech giants in recent years, I have tremendous trust issues with these highly commercially driven mainstream services.
What did my research reveal?
Researching current AI possibilities led me to Large Language Models (LLMs). My focus was mainly (but not exclusively) on work-related tasks. The results showed that I needed a set of tools rather than an all-in-one solution.
I've defined the following expectations and requirements:
- Easy to use, preferably with a GUI and minimal or no console
- Integration with the IDE (without the need to leave its window)
- Code completion
- Actions on code selection, including at least common ones like explanation, writing tests, and finding bugs
- Custom chat prompts
- Chat functionality preferably with support for media (images)
- Freedom to choose any LLM
- Uncensored information output
My local setup:
IDE and OS: JetBrains WebStorm IDE running on Windows 10
Hardware: i7-6700HQ, 32 GB RAM, GTX 1070 8 GB VRAM
Half of what I have is sufficient, so don’t worry if your machine has less memory – more just means more options. You also don’t need the same IDE and OS; the tools support almost every common platform.
I run everything natively and directly – without WSL2 or Docker. With the largest models, hardware utilization reaches 31.8/32 GB RAM and 7.8/8 GB VRAM, with smooth and stable usage all day long. Under Docker, getting GPU offloading to work was a pain, and the hardware’s potential was never fully realized inside the virtualized environment.
Software I currently use:
Here is my final setup. If you are interested in everything I’ve tried, you can find it at the end of the article.
LM Studio (https://lmstudio.ai/) – The best all-in-one GUI so far. The EXE file installs everything – a GUI with a running engine behind it. It works flawlessly, is very easy to use, supports both text and images (vision enabled), uses the GGUF model format only, provides seamless GPU utilization, and includes a local server (OpenAI-compatible API) so other software can connect to the local instance.
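That local server makes the setup scriptable from anywhere, not just the IDE. As a minimal sketch – assuming the server is started inside LM Studio and listens on its default port 1234 – plain standard-library HTTP is enough to talk to the loaded model:

```python
import json
import urllib.request

# LM Studio's default local server address; adjust the port if you changed it.
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(prompt: str) -> dict:
    """Build a minimal OpenAI-style chat completion payload."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """Send the prompt to the locally loaded model and return its reply."""
    data = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Requires the LM Studio server to be running:
# print(ask("Explain what request body validation is in one sentence."))
```

The same endpoint is what the IDE plugins below connect to when pointed at a local host address.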
Despite it being local, free, and privacy-focused, telemetry was something I wanted to rule out, since it isn’t open source. I created a man-in-the-middle local proxy, routed all of the process’s traffic through it, and sniffed it to confirm that no data leaves the machine – it didn’t disappoint; only version-update checks go out. For extra peace of mind, you may also want to block its outbound traffic in the firewall.
For IDE plugins, I use a combination of a few options:
CodeGPT (https://github.com/carlrobertoh/CodeGPT) – This powerful open-source plugin connects to a local (OpenAI-like) server when configured with a local host address and a dummy API key. While it doesn’t provide code completion at the time of writing, it excels in performing operations on selected code and offers a simple chat inside the IDE.
Tabby (https://github.com/TabbyML/tabby) – An open-source code completion plugin with strong potential. It uses local models but can be challenging to get running. My satisfaction with its output is lower compared to the JetBrains plugin mentioned above. It worked best with its default model (StarCoder), while my favorite Deepseek Coder produced a pure mess or didn’t work at all. I like it, but it needs some more work.
When it comes to models, the comprehensive open model library at https://huggingface.co/ offers everything you might need. I mostly download quantized models from TheBloke – there is a GGUF version of almost everything – and I choose the 4-bit (Q4_K_M) variants.
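A repo page typically lists a dozen quantization variants, with the level encoded in the filename (e.g. `deepseek-coder-6.7b-instruct.Q4_K_M.gguf`). A small illustrative helper – my own sketch, not part of any tool above – for picking the variant you want out of a repo’s file list:

```python
import re
from typing import List, Optional

# TheBloke's GGUF uploads encode the quantization tag in the filename,
# e.g. "deepseek-coder-6.7b-instruct.Q4_K_M.gguf" or "...Q8_0.gguf".
QUANT_RE = re.compile(r"\.(Q\d+_[A-Z0-9_]+)\.gguf$", re.IGNORECASE)

def quant_level(filename: str) -> Optional[str]:
    """Extract the quantization tag (e.g. 'Q4_K_M') from a GGUF filename."""
    match = QUANT_RE.search(filename)
    return match.group(1).upper() if match else None

def pick_variant(filenames: List[str], wanted: str = "Q4_K_M") -> List[str]:
    """Keep only the files quantized at the wanted level."""
    return [f for f in filenames if quant_level(f) == wanted.upper()]
```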
Among my favorite base models in different fields are:
- For a lightweight programming model: https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct
- General model: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
- Uncensored general and advanced programming or computer science: https://huggingface.co/cognitivecomputations/dolphin-2.7-mixtral-8x7b (or its older versions 2.5 and 2.6)
It is essential to remember that good prompts do wonders – the most important being the system prompt (pre-prompt), which defines the character and behavior of the model.
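In the OpenAI-style chat format these tools speak, the system prompt is simply the first message of the conversation. A sketch – the prompts themselves are my own illustrative examples – of how the same question can be steered into two very different registers:

```python
# The system prompt is the first message in the chat payload; it defines
# the character and behavior of the model for the whole conversation.
def with_system_prompt(system: str, user: str) -> list:
    """Compose an OpenAI-style message list led by a system prompt."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

question = "Explain why the pagination fix matters."

# Persona 1: terse senior reviewer.
reviewer = with_system_prompt(
    "You are a strict senior code reviewer. Answer tersely and technically.",
    question,
)

# Persona 2: sales-friendly explainer (the trick behind a sales-like rewrite).
sales = with_system_prompt(
    "You rewrite technical explanations as persuasive, client-friendly copy.",
    question,
)
```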
Software I’ve tried but currently not using:
Disclaimer: The information below reflects only my subjective experience at the time of writing. Due to the active development of these tools, the situation can change very quickly.
- Ollama (https://github.com/jmorganca/ollama) – Considered the most popular console LLM runner; native Windows support is expected soon (WSL2 is already supported). However, it requires a separate GUI.
- Oobabooga Text Generation Web UI (https://github.com/oobabooga/text-generation-webui) – Despite being the most popular GUI, I had a terrible experience trying to run it due to constant errors, mostly Python-related. It is also not as user-friendly as LM Studio.
- GPT4All (https://github.com/nomic-ai/gpt4all) – It was challenging to get it running, and the user experience was poor. Loading the basic instruct model caused an instant crash without any error message.
- LoLLMs-WebUI (https://github.com/ParisNeo/LoLLMs-WebUI) – While it supports a wide range of formats, including images and sounds, it didn’t work well for me and ran CPU-only despite claiming GPU support. It was also challenging to use, similar to Text Generation Web UI.
- Faraday (https://faraday.dev) – Focused on AI characters, similar to LM Studio but with more bugs. It regularly gets stuck during operations, requiring a restart.
- Codeium (https://plugins.jetbrains.com/plugin/20540-codeium-ai-autocomplete-and-chat-for-python-js-ts-java-go) – Requires login and uses online services.
- CodiumAI (https://plugins.jetbrains.com/plugin/21206-codiumai--integrity-agent-powered-by-gpt-3-5-4) – Requires login and uses online services.
- Sourcery (https://plugins.jetbrains.com/plugin/12631-sourcery) – Promises limited functionality for open-source code without online services, but I couldn’t get it working without logging in.
What is my conclusion?
Despite not using any commercial tools like ChatGPT, I’m grateful that such revolutionary tools have been released to the public. In my opinion, this has naturally expanded the community interested in open models, making every day a race to produce the best one.
Since the research and setup took me several days and nights, this article is my two cents given back. Hopefully, someone can reuse my setup or at least find inspiration in it. It’s too soon to rate the impact on my productivity, but the experience so far is very promising. Don’t forget to keep up with innovations.