meloqui

meloqui logo and tagline banner

Melodic speech for LLMs

Enterprise-ready multi-provider LLM chat SDK with a unified interface for OpenAI, Anthropic, Google, and Ollama (local models).

Features

Stage 1 - Foundation ✅

Stage 2 - Core Features ✅

Stage 3 - Enterprise Features ✅

Stage 4 - Advanced Features ✅

Stage 5 - Extensibility ✅

Stage 6 - Local Models ✅

Coming Soon

Stage 7 - Future Enhancements

Documentation

Installation

npm install meloqui

Quick Start

Basic Chat

import { ChatClient } from 'meloqui';

// Create a client (API key from OPENAI_API_KEY env variable)
const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4'
});

// Send a message
const response = await client.chat('Hello! What is TypeScript?', {
  temperature: 0.7,
  maxTokens: 150
});

console.log(response.content);
// Output: TypeScript is a strongly typed programming language...

Streaming Responses

import { ChatClient } from 'meloqui';

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4'
});

// Stream responses in real-time
for await (const chunk of client.stream('Tell me a story')) {
  process.stdout.write(chunk.content);
}

Conversation History

import { ChatClient } from 'meloqui';

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  conversationId: 'user-123' // Enable automatic history tracking
});

await client.chat('My name is Alice');
await client.chat('What is my name?'); // Remembers context

// Retrieve full history
const history = await client.getHistory();
console.log(history); // [{ role: 'user', content: 'My name is Alice' }, ...]

Persistent Storage

import { ChatClient, FileStorage } from 'meloqui';

const storage = new FileStorage('./conversations');

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  conversationId: 'user-123',
  storage // Conversations persist across sessions
});

await client.chat('Remember: my favorite color is blue');

// Later, in a new session...
const newClient = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  conversationId: 'user-123',
  storage // Loads previous conversation
});

await newClient.chat('What is my favorite color?'); // Remembers!

Ollama (Local Models)

Run models locally with Ollama for privacy and offline use.

import { ChatClient } from 'meloqui';

// Use Ollama with local models (no API key needed)
const client = new ChatClient({
  provider: 'ollama',
  model: 'llama3.2'  // or mistral, codellama, phi, etc.
});

const response = await client.chat('Explain quantum computing');
console.log(response.content);

Remote Ollama Server:

const client = new ChatClient({
  provider: 'ollama',
  model: 'llama3.2',
  baseUrl: 'http://my-server:11434'
});

Requirements: Ollama must be running locally (default: http://localhost:11434) or be accessible over the network. Install it with brew install ollama (macOS) or see ollama.ai for other platforms.
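
Once installed, pull a model and make sure the server is running. The commands below use the standard Ollama CLI:

# Download a model, then start the local server
# (the Ollama desktop app may already be running it for you)
ollama pull llama3.2
ollama serve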

Enterprise Features

Retry Logic

Automatic retry with exponential backoff for resilience against transient failures.

import { ChatClient } from 'meloqui';

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  retryConfig: {
    maxAttempts: 3,           // Maximum retry attempts
    initialBackoffMs: 1000,   // Initial backoff delay
    maxBackoffMs: 10000,      // Maximum backoff delay
    backoffMultiplier: 2      // Exponential multiplier
  }
});

// Automatically retries on network errors, rate limits, and transient failures
const response = await client.chat('Your message');

Features:

- Retries on network errors, rate limits, and other transient failures
- Exponential backoff with a configurable multiplier
- Configurable maximum attempts and backoff delay caps

Example: See examples/with-retry.ts
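
The schedule these options imply is easy to work out by hand: each delay is the initial backoff scaled by the multiplier, capped at the maximum. A sketch of that arithmetic (not the SDK's internal code):

// Hypothetical helper showing the delay before retry attempt n,
// using the retryConfig values from the example above
function backoffDelayMs(attempt: number): number {
  const initialBackoffMs = 1000;
  const backoffMultiplier = 2;
  const maxBackoffMs = 10000;
  return Math.min(initialBackoffMs * backoffMultiplier ** (attempt - 1), maxBackoffMs);
}

// Attempts 1-5 wait 1000, 2000, 4000, 8000, 10000 ms (the last capped by maxBackoffMs)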

Rate Limiting

Token bucket rate limiting to prevent hitting API limits.

import { ChatClient } from 'meloqui';

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  rateLimitConfig: {
    requestsPerMinute: 60,    // Max requests per minute
    tokensPerMinute: 90000    // Max tokens per minute
  }
});

// Requests automatically queued to stay within limits
const response = await client.chat('Your message');

Features:

- Token bucket algorithm with separate request and token budgets
- Requests queue automatically instead of failing when a limit is reached

Default Limits:

Example: See examples/with-rate-limiting.ts
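
For intuition, the token bucket idea fits in a few lines. This is a simplified model of the algorithm, not the SDK's implementation:

// Minimal token bucket: capacity refills continuously at a fixed rate
class TokenBucket {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(private capacity: number, private refillPerMs: number) {
    this.tokens = capacity;
  }

  tryConsume(count = 1): boolean {
    const now = Date.now();
    // Refill proportionally to elapsed time, never above capacity
    this.tokens = Math.min(this.capacity, this.tokens + (now - this.lastRefill) * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens < count) return false;
    this.tokens -= count;
    return true;
  }
}

// 60 requests per minute = one token per second
const bucket = new TokenBucket(60, 60 / 60_000);
if (bucket.tryConsume()) {
  // safe to send a request
}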

Logging

Structured logging with configurable log levels for observability.

import { ChatClient, ConsoleLogger } from 'meloqui';

// Create logger with desired level
const logger = new ConsoleLogger({
  level: 'debug'  // 'debug' | 'info' | 'warn' | 'error'
});

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  logger
});

// All operations are logged with context
const response = await client.chat('Your message');

Log Levels: debug, info, warn, and error, from most to least verbose.

Logged Events:

Custom Loggers:

Implement the ILogger interface for custom logging:

import { ILogger } from 'meloqui';

class CustomLogger implements ILogger {
  debug(message: string, context?: Record<string, unknown>): void {
    // Your implementation
  }
  info(message: string, context?: Record<string, unknown>): void {
    // Your implementation
  }
  warn(message: string, context?: Record<string, unknown>): void {
    // Your implementation
  }
  error(message: string, error?: Error, context?: Record<string, unknown>): void {
    // Your implementation
  }
}

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  logger: new CustomLogger()
});

Example: See examples/with-logging.ts
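
As one concrete ILogger implementation, a logger that emits one JSON object per line is convenient for log aggregation. The class below is illustrative; JsonLogger is not part of the SDK:

import { ILogger } from 'meloqui';

class JsonLogger implements ILogger {
  // Serialize every entry as a single JSON line
  private write(level: string, message: string, context?: Record<string, unknown>, error?: Error): void {
    console.log(JSON.stringify({
      level,
      message,
      ...context,
      ...(error ? { error: error.message } : {})
    }));
  }

  debug(message: string, context?: Record<string, unknown>): void { this.write('debug', message, context); }
  info(message: string, context?: Record<string, unknown>): void { this.write('info', message, context); }
  warn(message: string, context?: Record<string, unknown>): void { this.write('warn', message, context); }
  error(message: string, error?: Error, context?: Record<string, unknown>): void { this.write('error', message, context, error); }
}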

Tool System

Register and execute tools/functions for LLM integration.

import { ToolRegistry } from 'meloqui';

// Create registry
const registry = new ToolRegistry();

// Register a tool
registry.registerTool(
  'getWeather',
  async (location: string) => {
    // Your implementation
    return { temperature: 72, condition: 'sunny' };
  },
  {
    description: 'Get current weather for a location'
  }
);

// Execute directly
const result = await registry.executeTool('getWeather', 'Paris');
console.log(result); // { temperature: 72, condition: 'sunny' }

// Get all tools for LLM context
const tools = registry.getTools();

// Use with ChatClient for automatic tool calling
const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  tools: registry
});
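
With a registry attached, a prompt that needs the tool can be answered end to end. Illustrative only; the exact tool-call flow depends on the provider and model:

// The model decides to call getWeather, and the registry executes it
const response = await client.chat('What is the weather in Paris right now?');
console.log(response.content); // e.g. "It's currently 72°F and sunny in Paris."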

Features:

- Register tool handlers with descriptive metadata
- Execute tools directly, without going through an LLM
- Retrieve the full tool list to pass as LLM context
- Automatic tool calling when a registry is attached to a ChatClient

Tool Interface:

interface Tool {
  name: string;
  handler: ToolHandler;
  metadata: {
    description?: string;
    [key: string]: unknown;
  };
}

Example: See examples/with-tools.ts for both simple and agentic tool calling demonstrations.

Custom Provider Plugins

Create your own provider to use any LLM:

import {
  ChatClient,
  ProviderPlugin,
  Message,
  ChatOptions,
  ChatResponse,
  StreamChunk,
  Tool
} from 'meloqui';

class MyCustomProvider implements ProviderPlugin {
  readonly name = 'my-provider';
  readonly capabilities = {
    chat: true,
    streaming: true,
    toolCalling: false,
    vision: false,
    audio: false
  };

  async chat(messages: Message[], options?: ChatOptions): Promise<ChatResponse> {
    // Your implementation here
    return { content: 'Response', role: 'assistant' };
  }

  async *stream(messages: Message[], options?: ChatOptions): AsyncIterable<StreamChunk> {
    // Your streaming implementation
    yield { content: 'Chunk', role: 'assistant' };
  }

  supportsTools() { return this.capabilities.toolCalling; }
  supportsStreaming() { return this.capabilities.streaming; }

  async chatWithTools(messages: Message[], tools: Tool[], options?: ChatOptions): Promise<ChatResponse> {
    // Tool calling implementation (if supported)
    return { content: 'Response', role: 'assistant' };
  }

  async *streamWithTools(messages: Message[], tools: Tool[], options?: ChatOptions): AsyncIterable<StreamChunk> {
    // Streaming with tools (if supported)
    yield { content: 'Chunk', role: 'assistant' };
  }
}

// Use your custom provider
const provider = new MyCustomProvider();
const client = new ChatClient({ provider, model: 'my-model' });

const response = await client.chat('Hello!');

Features:

- Implement the ProviderPlugin interface to plug in any LLM backend
- Declare capabilities (chat, streaming, tool calling, vision, audio) explicitly
- The client validates declared capabilities before each operation (see below)

Capability Validation:

The client validates capabilities before operations:

// If your provider has streaming: false
const client = new ChatClient({ provider, model: 'model' });
await client.stream('Hello'); // Throws CapabilityError
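
To fail gracefully, catch the error and fall back to a plain chat call. This sketch assumes CapabilityError is exported from meloqui; check your version's exports:

import { CapabilityError } from 'meloqui'; // assumed export

try {
  for await (const chunk of client.stream('Hello')) {
    process.stdout.write(chunk.content);
  }
} catch (err) {
  if (err instanceof CapabilityError) {
    // Provider declared streaming: false, so use non-streaming chat instead
    const response = await client.chat('Hello');
    console.log(response.content);
  } else {
    throw err;
  }
}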

Configuration

Basic Configuration

import { ChatClient } from 'meloqui';

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',
  apiKey: 'your-api-key' // Optional: defaults to OPENAI_API_KEY env var
});

Advanced Configuration

const client = new ChatClient({
  provider: 'openai',
  model: 'gpt-4',

  // Authentication (recommended over apiKey)
  auth: {
    type: 'api-key',
    apiKey: 'your-api-key'
  },

  // Custom base URL (for proxies or local models)
  baseUrl: 'https://custom-endpoint.com/v1',

  // Conversation tracking
  conversationId: 'user-123-session-456'
});

Chat Options

const response = await client.chat('Your message', {
  temperature: 0.7,      // Sampling temperature (0-2)
  maxTokens: 1000,       // Maximum tokens to generate
  topP: 0.9,             // Nucleus sampling parameter
  model: 'gpt-3.5-turbo' // Override default model
});

API Reference

ChatClient

Main client for interacting with LLM providers.

Constructor

new ChatClient(config: ChatConfig)

Parameters:

- config (ChatConfig): Client configuration, including provider and model, plus optional apiKey, auth, baseUrl, conversationId, storage, logger, tools, retryConfig, and rateLimitConfig

Methods

chat(message: string, options?: ChatOptions): Promise<ChatResponse>

Send a chat message and get a response.

Parameters:

- message (string): The user message to send
- options (ChatOptions, optional): Generation options such as temperature and maxTokens

Returns:

- Promise<ChatResponse>: The assistant's reply

Example:

const response = await client.chat('Hello!', { temperature: 0.7 });
console.log(response.content);

stream(message: string, options?: ChatOptions): AsyncIterable<StreamChunk>

Stream a chat response for real-time output.

Parameters:

- message (string): The user message to send
- options (ChatOptions, optional): Generation options

Returns:

- AsyncIterable<StreamChunk>: Response chunks as they arrive

Example:

for await (const chunk of client.stream('Tell me a story')) {
  process.stdout.write(chunk.content);
}

getHistory(): Promise<Message[]>

Retrieve the full conversation history for the current conversation.

Returns:

- Promise<Message[]>: All messages in the current conversation, in order

Example:

const history = await client.getHistory();
history.forEach(msg => {
  console.log(`${msg.role}: ${msg.content}`);
});

clearHistory(): Promise<void>

Clear the conversation history for the current conversation.

Example:

await client.clearHistory();

Types

ChatResponse

interface ChatResponse {
  content: string;           // The generated response
  role: 'assistant';         // Always 'assistant'
  metadata?: {
    model?: string;          // Model that generated response
    tokensUsed?: number;     // Total tokens consumed
    finishReason?: string;   // Why generation stopped
  };
}
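
Because metadata and its fields are optional, read them defensively. A usage sketch based on the interface above:

const response = await client.chat('Summarize this document');

if (response.metadata?.tokensUsed !== undefined) {
  console.log(`Tokens consumed: ${response.metadata.tokensUsed}`);
}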

ChatOptions

interface ChatOptions {
  temperature?: number;      // 0-2, higher = more random
  maxTokens?: number;        // Max tokens to generate
  topP?: number;             // Nucleus sampling (0-1)
  model?: string;            // Override default model
}

Message

interface Message {
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
  name?: string;             // For tool messages
  toolCallId?: string;       // For tool responses
}
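
For example, a tool result flowing back into the conversation would be represented like this (values are illustrative):

import { Message } from 'meloqui';

const toolResult: Message = {
  role: 'tool',
  content: '{"temperature":72,"condition":"sunny"}',
  name: 'getWeather',
  toolCallId: 'call_abc123' // hypothetical ID issued by the provider
};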

Examples

See the examples directory for complete working examples:

Core Features

- Basic chat
- Streaming responses
- Conversation history
- Persistent storage

Enterprise Features

- Retry logic
- Rate limiting
- Structured logging
- Tool system

To run examples:

# Set your API key
export OPENAI_API_KEY=your-key-here

# Core examples
npm run example:basic      # Basic chat
npm run example:streaming  # Streaming responses
npm run example:history    # Conversation history
npm run example:storage    # Persistent storage

# Enterprise examples
npm run example:retry      # Retry logic
npm run example:rate-limit # Rate limiting
npm run example:logging    # Structured logging
npm run example:tools      # Tool system (no API key needed)

Development

Prerequisites

- Node.js and npm

Setup

# Install dependencies
npm install

# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Type checking
npm run typecheck

# Linting
npm run lint

# Build
npm run build

Project Structure

meloqui/
├── src/
│   ├── client/          # ChatClient implementation
│   ├── providers/       # Provider implementations (OpenAI, etc.)
│   ├── types/           # TypeScript type definitions
│   └── index.ts         # Public API exports
├── examples/            # Usage examples
├── dist/                # Compiled output
└── tests/               # Test files

Running Tests

# All tests
npm test

# Watch mode
npm run test:watch

# Coverage
npm test -- --coverage

Architecture

The SDK follows a layered architecture:

ChatClient (Public API)
    ↓
Provider Interface (IProvider)
    ↓
Provider Implementations (OpenAI, Anthropic, etc.)
    ↓
LangChain Integration
    ↓
LLM APIs

Design Principles

- One unified interface across providers
- Layered architecture with a clear provider abstraction
- Extensibility through provider plugins, custom loggers, and custom storage
- Test-driven development

Roadmap

Stage 1: Foundation ✅ (Complete)

Stage 2: Core Features ✅ (Complete)

Stage 3: Enterprise Features ✅ (Complete)

Stage 4: Advanced Features ✅ (Complete)

Stage 5: Extensibility ✅ (Complete)

Stage 6: Local Models ✅ (Complete)

Stage 7: Future Enhancements

Contributing

Contributions are welcome! We value all contributions, from bug reports to new features.

Before contributing, please read CONTRIBUTING.md.

Quick Start for Contributors

  1. Fork and clone the repository
  2. Install dependencies: npm install
  3. Create a feature branch
  4. Write tests first (TDD)
  5. Implement functionality
  6. Ensure all tests pass
  7. Submit a Pull Request

See CONTRIBUTING.md for detailed guidelines.

License

MIT License - see LICENSE for details.

Support