OpenLLM

🦾 OpenLLM is an open platform for operating large language models (LLMs) in production. It enables developers to run inference with any open-source LLM, deploy to the cloud or on-premises, and build AI applications on top.

Installation

Install openllm from PyPI:

%pip install --upgrade --quiet openllm

Launch OpenLLM server locally

To start an LLM server, use the openllm start command. For example, to start a Phi-3 server, run the following command from a terminal:

openllm start microsoft/Phi-3-mini-4k-instruct --trust-remote-code
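Model servers can take a while to load weights before they accept requests, so a client script may want to wait until the port is open. The following is a minimal sketch using only the Python standard library. It assumes the server listens on localhost port 3000 (the address used in this page's wrapper example); `wait_for_port` is a hypothetical helper, not part of OpenLLM or LangChain, and it only checks that something accepts TCP connections, not that the model is fully loaded.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout.

    Hypothetical helper for illustration; it does not speak HTTP and does not
    verify that the model behind the port has finished loading.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused) on failure
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # nothing is listening yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A script could call `wait_for_port("localhost", 3000)` before creating the LangChain wrapper and exit with a clear message if it returns `False`.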

Wrapper

from langchain_community.llms import OpenLLMAPI

server_url = "http://localhost:3000" # Replace with remote host if you are running on a remote server
llm = OpenLLMAPI(server_url=server_url)

Optional: Local LLM Inference

You may also choose to initialize an LLM managed by OpenLLM locally from the current process. This is useful for development purposes and lets developers quickly try out different types of LLMs.

When moving LLM applications to production, we recommend deploying the OpenLLM server separately and accessing it via the server_url option demonstrated above.
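One way to keep the same code working in development and production is to read the server address from the environment. The sketch below is illustrative only: the variable name `MY_APP_OPENLLM_URL` and the helper `resolve_server_url` are hypothetical and not defined by OpenLLM or LangChain, and the default is the local address used earlier on this page.

```python
import os
from urllib.parse import urlparse

# Local address used in the wrapper example on this page.
DEFAULT_SERVER_URL = "http://localhost:3000"


def resolve_server_url(env_var: str = "MY_APP_OPENLLM_URL") -> str:
    """Return the server URL from the environment, or the local default.

    The environment variable name is a hypothetical, application-level choice.
    """
    url = os.environ.get(env_var, DEFAULT_SERVER_URL).rstrip("/")
    parsed = urlparse(url)
    # Reject values that are not absolute http(s) URLs, e.g. a bare hostname.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"{env_var} must be an http(s) URL, got {url!r}")
    return url
```

The returned string can then be passed as `server_url` when constructing `OpenLLMAPI`.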

To load an LLM locally via the LangChain wrapper:

from langchain_community.llms import OpenLLM

llm = OpenLLM(
    model_id="microsoft/Phi-3-mini-4k-instruct",
    temperature=0.94,
    repetition_penalty=1.2,
)

Integrate with an LLMChain

from langchain_core.prompts.prompt import PromptTemplate

template = "What is a good name for a company that makes {product}?"

prompt = PromptTemplate.from_template(template)

chain = prompt | llm

generated = chain.invoke(dict(product="mechanical keyboard"))
print(generated)
iLkb
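Conceptually, the chain above performs two steps: it fills the template with the supplied variables, then passes the resulting prompt string to the model. The plain-Python analogue below illustrates that flow. It is not LangChain's implementation; `make_chain` and `echo_llm` are hypothetical names, and `echo_llm` is a stand-in that returns the prompt it received instead of calling a model.

```python
from typing import Callable


def make_chain(template: str, llm: Callable[[str], str]) -> Callable[[dict], str]:
    """Compose two steps: fill the template, then pass the prompt to llm."""

    def invoke(variables: dict) -> str:
        prompt_text = template.format(**variables)  # step 1: fill the template
        return llm(prompt_text)  # step 2: call the model

    return invoke


def echo_llm(prompt_text: str) -> str:
    # Stand-in for a real model: returns the prompt it received.
    return f"[model saw] {prompt_text}"


chain = make_chain("What is a good name for a company that makes {product}?", echo_llm)
result = chain(dict(product="mechanical keyboard"))
print(result)
# prints: [model saw] What is a good name for a company that makes mechanical keyboard?
```

Swapping `echo_llm` for a function that calls a real model changes only step 2, which is why the same prompt can be reused with either the server-backed or the local wrapper.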
