LM Studio

A desktop application that allows you to run large language models (LLMs) locally on your computer, without needing an internet connection or cloud services. - Home / ChatGPT

Settings

Model settings can be accessed and tailored per model.

Memory

  • context length can have a huge impact on memory consumption when raised above 8k tokens

Performance

  • the CPU thread count can be chosen in the custom model preferences
  • as can the number of GPU layers to offload

Server

In the prompt UI (green terminal icon):

  • enable Server 🟢 (status: Running) and configure the server settings
    • Serve on Local Network 🟠 -> allows remote access
    • Server port: 1234
    • Make sure a model is loaded

Test that it works

$ curl http://yves-lab:1234/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "qwen3",
      "messages": [
        {"role": "user", "content": "Hello, respond with one word"}
      ]
    }'
Answer:

{
  "id": "chatcmpl-udbu6gq8kshmddzqmhcrwq",
  "object": "chat.completion",
  "created": 1765637850,
  "model": "qwen/qwen3-vl-8b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello",
        "reasoning_content": "",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 2,
    "total_tokens": 16
  },
  "stats": {},
  "system_fingerprint": "qwen/qwen3-vl-8b"
}
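Since the server exposes an OpenAI-compatible endpoint, any HTTP client can talk to it. A minimal sketch using only the Python standard library (hostname `yves-lab`, port 1234, and the model name are taken from the curl example above; the `chat` helper is my own wrapper, not an LM Studio API):

```python
import json
import urllib.request


def build_payload(model, user_content):
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }


def chat(base_url, model, user_content):
    """POST the payload to the LM Studio server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(model, user_content)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage, with the server above running and reachable:
#   chat("http://yves-lab:1234", "qwen3", "Hello, respond with one word")
```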

Tools

Enable LLMs to interact with external functions and APIs.
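In the OpenAI-compatible request format, tools are declared as JSON schemas alongside the messages; when the model decides to use one, it responds with `tool_calls` (the empty `tool_calls` array is visible in the response above). A hedged sketch of such a request body — the `get_weather` tool is a made-up illustration, not a built-in LM Studio function:

```python
import json

# Hypothetical tool the model may call; the schema follows the
# OpenAI "function" tool format accepted by OpenAI-compatible servers.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def tool_request(model, user_content):
    """Chat completion request body advertising the tool to the model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
        "tools": [WEATHER_TOOL],
    }


print(json.dumps(tool_request("qwen3", "What's the weather in Lyon?"), indent=2))
```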

Skills

LLMs need prompts, and prompts can get very big very quickly. So-called “skills” are simply a mechanism to extend the prompt dynamically.
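Under that reading, a skill is just extra text merged into the prompt when it looks relevant. A toy sketch of the idea — the registry layout and the trigger-word matching are my own assumptions, not an LM Studio feature:

```python
# Toy "skills" registry: each skill has trigger words and prompt text
# that gets appended to the system prompt only when needed.
SKILLS = {
    "git": {
        "triggers": {"commit", "branch", "rebase"},
        "prompt": "You know git; prefer porcelain commands and explain flags.",
    },
    "sql": {
        "triggers": {"select", "join", "query"},
        "prompt": "You write portable ANSI SQL and warn about dialect quirks.",
    },
}


def extend_prompt(base_system_prompt, user_message):
    """Append the prompt text of every skill whose trigger words appear."""
    words = set(user_message.lower().split())
    extras = [
        skill["prompt"]
        for skill in SKILLS.values()
        if skill["triggers"] & words
    ]
    return "\n".join([base_system_prompt, *extras])


print(extend_prompt("You are a helpful assistant.", "how do I rebase this"))
```

This keeps the base prompt small while still letting specialised instructions kick in on demand.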

Models

Devstral 2

Second-generation Devstral for agentic coding. Built for tool use to explore codebases, edit multiple files, and power software engineering agents with newly added vision support.

see also

Plan then Execute

localworkflow

LangFuse

Goose

  • Goose - an open-source AI agent designed primarily to help developers by automating engineering tasks: generating code, executing commands, debugging, running workflows, integrating with tools (e.g. version control, APIs, file system), etc.

LocalAGI

  • LocalAGI / github - a self-hostable agent orchestration platform. It lets you build & run autonomous / semi-autonomous “agents” (or even teams of agents) on your own hardware (CPU or GPU), without needing cloud APIs or external services.
    • User Guide - using LocalAGI, a self-hosted AI agent platform that runs 100% locally. It covers creating and managing agents, working with agent groups, interacting with agents, monitoring agent activities, and using browser agents.

see also

Written on November 26, 2025, Last update on
LLM ghidra reverse