
Helicone

Core Stack · Version 3.0.0 · Last Updated 2024-01-08 · Difficulty: Beginner · Reading Time: 2 min

Open-source observability platform for LLM applications


Helicone is an open-source observability platform specifically designed for LLM applications. It provides comprehensive monitoring, cost tracking, and performance analytics for your AI systems.

Key Features

  • LLM-Specific Monitoring: Purpose-built for language model applications
  • Cost Tracking: Monitor and optimize your LLM API costs
  • Request/Response Logging: Complete visibility into LLM interactions
  • Performance Analytics: Track latency, throughput, and success rates
  • Open Source: Self-hostable with full control over your data

Installation

Helicone works as a proxy in front of the OpenAI API, so the examples below only need the OpenAI Python SDK:

pip install openai

Quick Start

# Requires the openai>=1.0 SDK. Requests are routed through the Helicone
# proxy, which authenticates you via the Helicone-Auth header and logs
# every request that passes through.
from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://oai.hconeai.com/v1",
    default_headers={
        "Helicone-Auth": "Bearer your-helicone-api-key",
        "Helicone-Cache-Enabled": "true",  # optional: cache identical requests
    },
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
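
Once requests flow through the proxy, they appear in the Helicone dashboard with latency, token counts, and cost attached; no further instrumentation is needed.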

Use Cases

  • LLM Cost Monitoring: Track spending across different models and providers
  • Performance Optimization: Identify bottlenecks and optimize response times
  • Usage Analytics: Understand user patterns and application behavior
  • Debugging LLM Applications: Trace issues through complete request/response logs

Best Practices

  1. Enable Caching: Use Helicone’s caching to reduce costs and improve performance (see the sketch after this list)
  2. Set Up Alerts: Configure alerts for cost thresholds and error rates
  3. Use Custom Properties: Tag requests with user IDs or session information
  4. Monitor Rate Limits: Track API rate limit usage to avoid throttling
  5. Regular Analysis: Review analytics regularly to optimize your LLM usage
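
A minimal sketch of practice 1, assuming the openai>=1.0 SDK and the same Helicone-Cache-Enabled header used in the Quick Start; repeated identical requests can then be answered from Helicone's cache instead of reaching the OpenAI API.

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://oai.hconeai.com/v1",
    default_headers={
        "Helicone-Auth": "Bearer your-helicone-api-key",
        "Helicone-Cache-Enabled": "true",  # serve repeats from the proxy cache
    },
)

# The first call reaches OpenAI; an identical repeat can be served from
# Helicone's cache, cutting both cost and latency.
for _ in range(2):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "What is Helicone?"}],
    )
    print(response.choices[0].message.content)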

Integration Examples

With LangChain

from langchain.llms import OpenAI

# Point LangChain's OpenAI wrapper at the Helicone proxy and attach the
# Helicone headers to every request. The app name is passed as a custom
# property so requests can be grouped per application.
llm = OpenAI(
    openai_api_key="your-openai-api-key",
    openai_api_base="https://oai.hconeai.com/v1",
    headers={
        "Helicone-Auth": "Bearer your-helicone-api-key",
        "Helicone-Property-App": "my-langchain-app",
    },
)

response = llm("What is machine learning?")
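
The same header mechanism carries Helicone's other options here too; for example, adding "Helicone-Cache-Enabled": "true" to the headers dict above enables proxy caching for LangChain calls as well.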

Custom Properties

from openai import OpenAI

client = OpenAI(
    api_key="your-openai-api-key",
    base_url="https://oai.hconeai.com/v1",
    default_headers={"Helicone-Auth": "Bearer your-helicone-api-key"},
)

# Tag the request with custom properties via Helicone-Property-* headers.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        "Helicone-Property-User": "user-123",
        "Helicone-Property-Session": "session-456",
        "Helicone-Property-Environment": "production",
    },
)
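
Custom properties appear as filterable fields in the Helicone dashboard, letting you segment cost and usage by user, session, or environment.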


Alternatives

Weights & Biases

LangSmith

LLM application observability platform (Difficulty: Intermediate)

Key Strengths:
  • LangChain native integration
  • Excellent debugging tools

Best For:
  • LangChain application debugging
  • LLM chain optimization

Quick Decision Guide

Choose Helicone for the recommended stack with proven patterns and comprehensive support.
Choose LangSmith if you need LangChain application debugging or similar specialized requirements.