LM Studio

A desktop application that allows you to run large language models (LLMs) locally on your computer, without needing an internet connection or cloud services. - Home / ChatGPT

Settings

Model settings can be accessed and tailored per model.

Memory

  • context length can have a huge impact on memory consumption when raised above 8k tokens

Performance

  • the CPU thread count can be chosen in the custom model preferences
  • as can the number of GPU layers to offload

Server

In the prompt UI (green terminal icon):

  • enable Server 🟢 (status: Running) and configure the server settings
    • Serve on Local Network 🟠 -> allows remote access
    • Server port: 1234
    • Make sure a model is loaded

Test that it works

$ curl http://yves-lab:1234/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "qwen3",
      "messages": [
        {"role": "user", "content": "Hello, respond with one word"}
      ]
    }'
Answer:

{
  "id": "chatcmpl-udbu6gq8kshmddzqmhcrwq",
  "object": "chat.completion",
  "created": 1765637850,
  "model": "qwen/qwen3-vl-8b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello",
        "reasoning_content": "",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 2,
    "total_tokens": 16
  },
  "stats": {},
  "system_fingerprint": "qwen/qwen3-vl-8b"
}
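Since the server exposes an OpenAI-compatible endpoint, any HTTP client can talk to it. A minimal sketch using only the Python standard library (hostname `yves-lab`, port 1234, and the model name are taken from the curl example above; the `chat` helper is my own wrapper, not an LM Studio API):

```python
import json
import urllib.request


def build_payload(model, user_content):
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }


def chat(base_url, model, user_content):
    """POST the payload to the LM Studio server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(model, user_content)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage, with the server above running and reachable:
#   chat("http://yves-lab:1234", "qwen3", "Hello, respond with one word")
```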

Tools

Enable LLMs to interact with external functions and APIs.
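In the OpenAI-compatible request format, tools are declared as JSON schemas alongside the messages; when the model decides to use one, it responds with `tool_calls` (the empty `tool_calls` array is visible in the response above). A hedged sketch of such a request body — the `get_weather` tool is a made-up illustration, not a built-in LM Studio function:

```python
import json

# Hypothetical tool the model may call; the schema follows the
# OpenAI "function" tool format accepted by OpenAI-compatible servers.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def tool_request(model, user_content):
    """Chat completion request body advertising the tool to the model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
        "tools": [WEATHER_TOOL],
    }


print(json.dumps(tool_request("qwen3", "What's the weather in Lyon?"), indent=2))
```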

Skills

LLMs need prompts, and prompts can get very big very quickly. So-called “skills” are simply a mechanism to extend the prompt dynamically.
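Under that reading, a skill is just extra text merged into the prompt when it looks relevant. A toy sketch of the idea — the registry layout and the trigger-word matching are my own assumptions, not an LM Studio feature:

```python
# Toy "skills" registry: each skill has trigger words and prompt text
# that gets appended to the system prompt only when needed.
SKILLS = {
    "git": {
        "triggers": {"commit", "branch", "rebase"},
        "prompt": "You know git; prefer porcelain commands and explain flags.",
    },
    "sql": {
        "triggers": {"select", "join", "query"},
        "prompt": "You write portable ANSI SQL and warn about dialect quirks.",
    },
}


def extend_prompt(base_system_prompt, user_message):
    """Append the prompt text of every skill whose trigger words appear."""
    words = set(user_message.lower().split())
    extras = [
        skill["prompt"]
        for skill in SKILLS.values()
        if skill["triggers"] & words
    ]
    return "\n".join([base_system_prompt, *extras])


print(extend_prompt("You are a helpful assistant.", "how do I rebase this"))
```

This keeps the base prompt small while still letting specialised instructions kick in on demand.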

Models

Devstral 2

Second-generation Devstral for agentic coding. Built for tool use to explore codebases, edit multiple files, and power software engineering agents with newly added vision support.

see also

Plan then Execute

localworkflow

LangFuse

Goose

  • Goose - an open-source AI agent designed primarily to help developers by automating engineering tasks: generating code, executing commands, debugging, running workflows, integrating with tools (e.g. version control, APIs, file system), etc.

LocalAGI

  • LocalAGI / github - a self-hostable agent orchestration platform. It lets you build & run autonomous / semi-autonomous “agents” (or even teams of agents) on your own hardware (CPU or GPU), without needing cloud APIs or external services.
    • User Guide - using LocalAGI, a self-hosted AI agent platform that runs 100% locally. It covers creating and managing agents, working with agent groups, interacting with agents, monitoring agent activities, and using browser agents.

see also

Written on November 26, 2025, Last update on
LLM ghidra reverse