LMQL is a high-level, front-end language for text generation. This means that LMQL is not specific to any particular text generation model. Instead, we support a wide range of text generation models on the backend, including OpenAI model, llama.cpp and HuggingFace Transformers.
To load models in LMQL, you can use the
lmql.model(...) function which gives you an
lmql.model("openai/gpt-3.5-turbo-instruct") # OpenAI API model lmql.model("random", seed=123) # randomly sampling model lmql.model("llama.cpp:<YOUR_WEIGHTS>.gguf") # llama.cpp model lmql.model("local:gpt2") # load a `transformers` model in-process lmql.model("local:gpt2", cuda=True, load_in_4bit=True) # load a `transformers` model in process with additional arguments lmql.model("gpt2") # access a `transformers` model hosted via `lmql serve-model`
LMQL supports multiple inference backends, each of which has its own set of parameters. For more details on how to use and configure the different backends, please refer to one of the following sections:
Specifying The Model
After creating an
lmql.LLM object, you can pass it to a query program to specify the model to use during execution. There are two ways to do this:
Option A: Specifying the Model Externally
You can specify the model and its parameters externally, i.e. separately from the actual program code:
import lmql # uses 'chatgpt' by default def tell_a_joke(): '''lmql """A great good dad joke. A indicates the punchline Q:[JOKE] A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and \ STOPS_AT(PUNCHLINE, "\n") ''' tell_a_joke() # uses chatgpt tell_a_joke(model=lmql.model("openai/text-davinci-003")) # uses text-davinci-003
tell_a_joke query will use ChatGPT by default, but can still be configured to use a different model by passing it as an argument to the query function on invocation.
Option B: Queries with
You can specify the model as part of the query itself. For this, you can use
from in combination with the indented syntax. This can be particularly useful, if your choice of model is intentional and should be part of your program.
argmax "This is a query with a specified 'from'-clause: [RESPONSE]" from "openai/text-ada-001"
Here, we specify
"openai/text-ada-001" directly, but the shown snippet is equivalent to the use of
Note, that the
from keyword is only available with the indented standalone syntax as shown here, where the decoder keywords has to be provided explicitly.
To specify the model when running in the playground, you can use the model dropdown available in the top right of the program editor, to set and override the
model parameter of your query program:
Adding New Model Backends
Due to the modular design of LMQL, it is easy to add support for new models and backends. If you would like to propose or add support for a new model API or inference engine, please reach out to us via our Community Discord or via email@example.com.