A class that enables calls to the Ollama API to access large language models in a chat-like fashion. It extends the SimpleChatModel class and implements the OllamaInput interface.

Example

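// Import paths assume the langchain 0.0.x package layout.
import { ChatOllama } from "langchain/chat_models/ollama";
import { ChatPromptTemplate } from "langchain/prompts";

const prompt = ChatPromptTemplate.fromMessages([
  [
    "system",
    `You are an expert translator. Format all responses as JSON objects with two keys: "original" and "translated".`,
  ],
  ["human", `Translate "{input}" into {language}.`],
]);

// Connect to an Ollama server and request JSON-formatted output.
const model = new ChatOllama({
  baseUrl: "http://api.example.com",
  model: "llama2",
  format: "json",
});

// Pipe the prompt into the model to form a runnable sequence.
const chain = prompt.pipe(model);

const result = await chain.invoke({
  input: "I love programming",
  language: "German",
});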
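
Because the example asks the model for JSON with the keys "original" and "translated", the message content returned by the chain is expected to be a JSON string. The following is a minimal sketch of validating such a string before use. The `Translation` interface and `parseTranslation` helper are hypothetical names introduced for illustration and are not part of the library.

```typescript
// Hypothetical shape matching the two keys requested in the system prompt.
interface Translation {
  original: string;
  translated: string;
}

// Parse a JSON string and check that both expected keys hold strings.
// This helper is an illustration only; it is not provided by the library.
function parseTranslation(content: string): Translation {
  const parsed: unknown = JSON.parse(content);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Expected a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  const original = record["original"];
  const translated = record["translated"];
  if (typeof original !== "string" || typeof translated !== "string") {
    throw new Error('Expected string values for "original" and "translated"');
  }
  return { original, translated };
}
```

Validating the parsed value is worthwhile because a model can return well-formed JSON that still lacks the requested keys.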

Hierarchy

  • SimpleChatModel
      • ChatOllama

Implements

  • OllamaInput

Constructors

Properties

CallOptions: OllamaCallOptions
ParsedCallOptions: Omit<OllamaCallOptions, never>
baseUrl: string = "http://localhost:11434"
caller: AsyncCaller

The async caller should be used by subclasses to make any async calls, which will thus benefit from the concurrency and retry logic.
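
The idea of routing async calls through a caller that adds concurrency limits and retries can be sketched as follows. This is a simplified stand-in written for illustration, not the library's AsyncCaller; the `SketchCaller` class and its `maxConcurrency` and `maxRetries` settings are assumptions, and it retries immediately without any backoff.

```typescript
// Simplified stand-in for an async caller with a concurrency cap and retries.
// SketchCaller is illustrative only and is not the library's AsyncCaller.
class SketchCaller {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrency: number;
  private readonly maxRetries: number;

  constructor(maxConcurrency = 2, maxRetries = 3) {
    this.maxConcurrency = maxConcurrency;
    this.maxRetries = maxRetries;
  }

  // Take a free slot, or wait until a running call hands one over.
  private async acquire(): Promise<void> {
    if (this.active < this.maxConcurrency) {
      this.active += 1;
      return;
    }
    await new Promise<void>((resolve) => this.waiting.push(resolve));
  }

  // Hand the slot to the next waiter if there is one, otherwise free it.
  private release(): void {
    const next = this.waiting.shift();
    if (next) {
      next();
    } else {
      this.active -= 1;
    }
  }

  // Run fn, retrying on failure up to maxRetries additional times.
  async call<T>(fn: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      let lastError: unknown;
      for (let attempt = 0; attempt <= this.maxRetries; attempt += 1) {
        try {
          return await fn();
        } catch (error) {
          lastError = error;
        }
      }
      throw lastError;
    } finally {
      this.release();
    }
  }
}
```

A subclass that sends every request through one such caller gets the same limits on all of its calls without repeating the retry loop at each call site.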

model: string = "llama2"
verbose: boolean

Whether to print out response text.

callbacks?: Callbacks
embeddingOnly?: boolean
f16KV?: boolean
format?: StringWithAutocomplete<"json">
frequencyPenalty?: number
logitsAll?: boolean
lowVram?: boolean
mainGpu?: number
metadata?: Record<string, unknown>
mirostat?: number
mirostatEta?: number
mirostatTau?: number
numBatch?: number
numCtx?: number
numGpu?: number
numGqa?: number
numKeep?: number
numThread?: number
penalizeNewline?: boolean
presencePenalty?: number
repeatLastN?: number
repeatPenalty?: number
ropeFrequencyBase?: number
ropeFrequencyScale?: number
stop?: string[]
tags?: string[]
temperature?: number
tfsZ?: number
topK?: number
topP?: number
typicalP?: number
useMLock?: boolean
useMMap?: boolean
vocabOnly?: boolean

Accessors

  • get callKeys(): string[]

    Keys that the language model accepts as call options.

    Returns string[]

Methods

  • Makes a single call to the chat model.

    Parameters

    • messages: BaseMessageLike[]

      An array of BaseMessageLike values (for example, BaseMessage instances).

    • Optional options: string[] | OllamaCallOptions

      The call options or an array of stop sequences.

    • Optional callbacks: Callbacks

      The callbacks for the language model.

    Returns Promise<BaseMessage>

    A Promise that resolves to a BaseMessage.
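
The options parameter accepts either a full call options object or a bare array of stop sequences. One way such a union could be normalized is sketched below. The `SketchCallOptions` type and `normalizeOptions` function are hypothetical and only illustrate the idea; the library's internal handling may differ.

```typescript
// Hypothetical, trimmed-down call options type used only for this sketch.
interface SketchCallOptions {
  stop?: string[];
  timeout?: number;
}

// Accept either a bare array of stop sequences or a full options object
// and always return an options object.
function normalizeOptions(
  options?: string[] | SketchCallOptions
): SketchCallOptions {
  if (options === undefined) return {};
  return Array.isArray(options) ? { stop: options } : options;
}
```

With this shape, passing `["\n"]` and passing `{ stop: ["\n"] }` are treated the same way.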

  • Makes a single call to the chat model with a prompt value.

    Parameters

    • promptValue: BasePromptValue

      The value of the prompt.

    • Optional options: string[] | OllamaCallOptions

      The call options or an array of stop sequences.

    • Optional callbacks: Callbacks

      The callbacks for the language model.

    Returns Promise<BaseMessage>

    A Promise that resolves to a BaseMessage.

  • Generates chat results for each list of input messages.

    Parameters

    • messages: BaseMessageLike[][]

      An array of arrays of BaseMessageLike values, one inner array per conversation.

    • Optional options: string[] | OllamaCallOptions

      The call options or an array of stop sequences.

    • Optional callbacks: Callbacks

      The callbacks for the language model.

    Returns Promise<LLMResult>

    A Promise that resolves to an LLMResult.
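
Since the messages parameter is an array of arrays, each inner array can be read as one conversation, with the result holding one entry per conversation. The sketch below shows that batching shape with local stand-in types. `SketchMessage`, `SketchResult`, and `sketchGenerate` are hypothetical, and the real result type carries more fields than shown here.

```typescript
// Local, simplified shapes for illustration; the real types carry more fields.
type SketchMessage = { role: string; content: string };
type SketchResult = { generations: { text: string }[][] };

// Stand-in for a batched generate call: one generations entry per conversation.
// The reply callback plays the role of the model.
async function sketchGenerate(
  conversations: SketchMessage[][],
  reply: (conversation: SketchMessage[]) => Promise<string>
): Promise<SketchResult> {
  const generations = await Promise.all(
    conversations.map(async (conversation) => [
      { text: await reply(conversation) },
    ])
  );
  return { generations };
}
```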

  • Generates results for the input prompt values.

    Parameters

    • promptValues: BasePromptValue[]

      An array of BasePromptValue instances.

    • Optional options: string[] | OllamaCallOptions

      The call options or an array of stop sequences.

    • Optional callbacks: Callbacks

      The callbacks for the language model.

    Returns Promise<LLMResult>

    A Promise that resolves to an LLMResult.

  • Returns the parameters for an Ollama API call, including the model, format, and options.

    Parameters

    Returns {
        format: undefined | StringWithAutocomplete<"json">;
        model: string;
        options: {
            embedding_only: undefined | boolean;
            f16_kv: undefined | boolean;
            frequency_penalty: undefined | number;
            logits_all: undefined | boolean;
            low_vram: undefined | boolean;
            main_gpu: undefined | number;
            mirostat: undefined | number;
            mirostat_eta: undefined | number;
            mirostat_tau: undefined | number;
            num_batch: undefined | number;
            num_ctx: undefined | number;
            num_gpu: undefined | number;
            num_gqa: undefined | number;
            num_keep: undefined | number;
            num_thread: undefined | number;
            penalize_newline: undefined | boolean;
            presence_penalty: undefined | number;
            repeat_last_n: undefined | number;
            repeat_penalty: undefined | number;
            rope_frequency_base: undefined | number;
            rope_frequency_scale: undefined | number;
            stop: undefined | string[];
            temperature: undefined | number;
            tfs_z: undefined | number;
            top_k: undefined | number;
            top_p: undefined | number;
            typical_p: undefined | number;
            use_mlock: undefined | boolean;
            use_mmap: undefined | boolean;
            vocab_only: undefined | boolean;
        };
    }

    An object containing the parameters for an Ollama API call.
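
The keys of the options object are snake_case counterparts of the camelCase properties listed earlier (for example, repeatLastN and repeat_last_n). A minimal sketch of that naming correspondence follows. The `toSnakeCase` and `toOllamaOptions` helpers are hypothetical; the library may well map each key explicitly rather than converting names generically.

```typescript
// Convert a camelCase property name to the snake_case key style used in the
// options object above, e.g. "repeatLastN" -> "repeat_last_n".
// Only a lowercase letter or digit followed by an uppercase letter gets an
// underscore, so "useMLock" becomes "use_mlock" and "f16KV" becomes "f16_kv".
function toSnakeCase(key: string): string {
  return key.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// Build a snake_case options object from camelCase fields.
// Hypothetical helper for illustration; not the library's implementation.
function toOllamaOptions(
  fields: Record<string, unknown>
): Record<string, unknown> {
  const options: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    options[toSnakeCase(key)] = value;
  }
  return options;
}
```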

  • Create a new runnable sequence that runs each individual runnable in series, piping the output of one runnable into another runnable or runnable-like.

    Type Parameters

    • NewRunOutput

    Parameters

    Returns RunnableSequence<BaseLanguageModelInput, Exclude<NewRunOutput, Error>>

    A new runnable sequence.
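
Piping the output of one step into the next can be illustrated with a small composition helper. This is a sketch under simplifying assumptions: `SketchRunnable` and `sketchPipe` are hypothetical names, and the library's runnable interface has more methods than the single `invoke` shown here.

```typescript
// Minimal runnable-like shape for illustration; not the library's Runnable.
interface SketchRunnable<In, Out> {
  invoke(input: In): Promise<Out>;
}

// Compose two steps so the output of the first becomes the input of the second.
function sketchPipe<A, B, C>(
  first: SketchRunnable<A, B>,
  second: SketchRunnable<B, C>
): SketchRunnable<A, C> {
  return {
    invoke: async (input: A) => second.invoke(await first.invoke(input)),
  };
}
```

Because the composed value has the same shape as its parts, it can itself be piped into a further step.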

  • Predicts the next message based on a text input.

    Parameters

    • text: string

      The text input.

    • Optional options: string[] | OllamaCallOptions

      The call options or an array of stop sequences.

    • Optional callbacks: Callbacks

      The callbacks for the language model.

    Returns Promise<string>

    A Promise that resolves to a string.

  • Predicts the next message based on the input messages.

    Parameters

    • messages: BaseMessage[]

      An array of BaseMessage instances.

    • Optional options: string[] | OllamaCallOptions

      The call options or an array of stop sequences.

    • Optional callbacks: Callbacks

      The callbacks for the language model.

    Returns Promise<BaseMessage>

    A Promise that resolves to a BaseMessage.

  • Stream all output from a runnable, as reported to the callback system. This includes all inner runs of LLMs, Retrievers, Tools, etc. Output is streamed as Log objects, which include a list of jsonpatch ops that describe how the state of the run has changed in each step, and the final state of the run. The jsonpatch ops can be applied in order to construct state.

    Parameters

    Returns AsyncGenerator<RunLogPatch, any, unknown>
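
Applying jsonpatch ops in order to construct state can be sketched with a small reducer. The code below handles only the "add" and "replace" operations of JSON Patch (RFC 6902) and is lenient about missing keys; `SketchOp`, `applyOp`, and `applyPatches` are hypothetical names, and the sample paths in the usage note are made up for illustration.

```typescript
// Subset of JSON Patch (RFC 6902) operations, enough for this illustration.
type SketchOp = { op: "add" | "replace"; path: string; value: unknown };

type Container = Record<string, unknown> | unknown[];

// Apply one op to a state tree in place, resolving the JSON Pointer path.
function applyOp(state: Record<string, unknown>, op: SketchOp): void {
  const keys = op.path
    .split("/")
    .slice(1)
    .map((k) => k.replace(/~1/g, "/").replace(/~0/g, "~"));
  const last = keys.pop();
  if (last === undefined) {
    throw new Error("Root-level ops are not supported in this sketch");
  }
  // Walk down to the parent container of the target location.
  let target: Container = state;
  for (const key of keys) {
    target = (target as Record<string, unknown>)[key] as Container;
  }
  if (Array.isArray(target)) {
    if (last === "-") target.push(op.value); // "-" appends to an array
    else if (op.op === "add") target.splice(Number(last), 0, op.value);
    else target[Number(last)] = op.value;
  } else {
    target[last] = op.value;
  }
}

// Fold a list of ops, in order, into a freshly built state object.
function applyPatches(ops: SketchOp[]): Record<string, unknown> {
  const state: Record<string, unknown> = {};
  for (const op of ops) applyOp(state, op);
  return state;
}
```

For example, adding an empty array at "/chunks" and then appending two values with "/chunks/-" yields a state whose chunks array holds both values in arrival order.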

  • Default implementation of transform, which buffers input and then calls stream. Subclasses should override this method if they can start producing output while input is still being generated.

    Parameters

    Returns AsyncGenerator<BaseMessageChunk, any, unknown>
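
The buffer-then-stream behaviour described above can be sketched with async generators. Both functions are hypothetical stand-ins that operate on plain strings instead of message chunks; `sketchStream` simply yields one chunk per word.

```typescript
// Stand-in for a stream method: yields one chunk per word of the input.
async function* sketchStream(input: string): AsyncGenerator<string> {
  for (const word of input.split(" ")) {
    yield word;
  }
}

// Buffer-then-stream transform: consume the whole input generator first,
// then stream output for the concatenated input.
async function* sketchTransform(
  inputs: AsyncGenerator<string>
): AsyncGenerator<string> {
  let buffered = "";
  for await (const piece of inputs) {
    buffered += piece;
  }
  yield* sketchStream(buffered);
}
```

No output is produced until the input generator is exhausted, which is why an implementation that can start earlier would override this behaviour.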
