Supercharge your prompting with constraints, a debugger, decoders, 🤗 Transformers, templates, retrieval, interaction, distributions, token masking, and control flow.

LMQL is a programming language for language model interaction.

Examples: 👴 Tell A Joke · 🌴 Packing List · 🧠 Chain-Of-Thought · 👩‍🔬 Meta Prompting · 🧮 Calculator · 🌎 Wikipedia Search · 📖 Key-Value Memory · 📊 Distributions · 🗣️ Chatbot

LMQL

argmax
"""A list of good dad jokes. A indicates the punchline Q: How does a penguin build its house? A: Igloos it together. Q: Which knight invented King Arthur's Round Table? A: Sir Cumference. Q:[JOKE]
A:[PUNCHLINE]"""
from "openai/text-davinci-003"
where len(JOKE) < 120 and STOPS_AT
(JOKE, "?") and STOPS_AT(PUNCHLINE, "\n") and len(PUNCHLINE) > 1

Model Output

A list of good dad jokes. A indicates the punchline
Q: How does a penguin build its house?
A: Igloos it together.
Q: Which knight invented King Arthur's Round Table?
A: Sir Cumference.
Q:
JOKE What did the fish say when it hit the wall?
A:
PUNCHLINE Dam!

LMQL

sample(temperature=0.8)
   "A list of things not to forget when
     going to the sea (not travelling): \n"
   "- Sunglasses \n"
   for i in range(4):
"- [THING] \n" from 'openai/text-ada-001' where THING in set
(["Volleyball", "Sunscreen", "Bathing Suite"])

Model Output

A list of things not to forget when going to the sea (not travelling):
- Sunglasses
-
THING Sunscreen
-
THING Volleyball
-
THING Sunscreen
-
THING Volleyball

LMQL

# zero-shot cot based on https://arxiv.org/pdf/2205.11916.pdf
argmax
   """Q: 
It was Sept. 1st, 2021 a week ago. What is the date 10 days ago in MM/DD/YYYY? Answer Choices: (A) 08/29/2021 (B) 08/28/2021 (C) 08/29/1925 (D) 08/30/2021 (E) 05/25/2021 (F) 09/19/2021 A: Let's think step by step."""
"[REASONING]\n" "Therefore, among A through F, the answer is[RESULT]" assert RESULT == "A"
from "openai/text-davinci-003" where RESULT in ["A", "B", "C", "D", "E", "F"]

Model Output

Q: It was Sept. 1st, 2021 a week ago. What is the date 10 days ago in MM/DD/YYYY?
Answer Choices: (A) 08/29/2021 (B) 08/28/2021 (C) 08/29/1925 (D) 08/30/2021 (E) 05/25/2021 (F) 09/19/2021
A: Let's think step by step.
REASONING

Sept. 1st, 2021 was a week ago, so 10 days ago would be 8 days before that, which is August 23rd, 2021.

Therefore, the answer is (A) 08/23/2021.

Therefore, among A through F, the answer is
RESULT A

LMQL

# metaprompting based on https://arxiv.org/pdf/2102.07350.pdf
beam(n=2)
"Q: What are Large Language Models?\n\n" "A good person to answer this question would be[EXPERT]\n\n" expert_name = EXPERT.rstrip(".\n")
"For instance,{expert_name}
would answer[ANSWER]"
from "openai/text-davinci-001" where STOPS_AT(EXPERT, ".") and STOPS_AT(EXPERT, "\n") and STOPS_AT(ANSWER, ".")

Model Output

Q: What are Large Language Models?

A good person to answer this question would be
EXPERT a natural language processing (NLP) engineer.

For instance, a natural language processing (NLP) engineer would answer
ANSWER that large language models are a type of machine learning algorithm that are used to predict the next word in a sentence.

LMQL

import re
from lmql.demo import gsm8k_samples

def calc(expr):
   expr = re.sub(r"[^0-9+\-*/().]", "", expr)
   return eval(expr)

argmax(openai_chunksize=64, max_len=2048)
   QUESTION = "Josh decides to try flipping a house. He buys a house for $80,000 and then puts in $50,000 in repairs. This increased the value of the house by 150%. How much profit did he make?"
   # few shot samples
   "{gsm8k_samples()}"
   # prompt template
   "Q: {QUESTION}\n"
   "Let's think step by step.\n"
   for i in range(4):
      "[REASON_OR_CALC]"
      if REASON_OR_CALC.endswith("<<"):
         " [EXPR]"
         " {calc(EXPR)}>>"
      elif REASON_OR_CALC.endswith("So the answer"):
         break
   "is[RESULT]"
from
   'openai/text-davinci-003'
where
   STOPS_AT(REASON_OR_CALC, "<<") and STOPS_AT(EXPR, "=") and
   STOPS_AT(REASON_OR_CALC, "So the answer")

Model Output

...en puts in $50,000 in repairs. This increased the value of the house by 150%. How much profit did he make?
Let's think step by step.
REASON_OR_CALC Josh bought the house for $80,000 and put in $50,000 in repairs.
The value of the house increased by 150%, so the new value of the house is $80,000 + 150% of $80,000 = <<
EXPR 80,000 + (80,000*1.5) = 200000.0>> REASON_OR_CALC $200,000.
The profit Josh made is the difference between the new value of the house and the amount he spent on it, which is $200,000 - $80,000 - $50,000 = <<
EXPR 200,000 - 80,000 - 50,000 = 70000>> REASON_OR_CALC $70,000.
So the answer
is RESULT $70,000.

LMQL

async def wikipedia(q):
   from lmql.http import fetch
   try:
      q = q.strip("\n '.")
      pages = await fetch(f"https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles={q}&origin=*", "query.pages")
      return list(pages.values())[0]["extract"][:280]
   except:
      return "No results"

argmax
   "Q: From which countries did the Norse originate?\n"
   "Action: Let's search Wikipedia for the term '[TERM]\n"
   result = await wikipedia(TERM)
   "Result: {result}\n"
   "Final Answer:[ANSWER]"
from
   "openai/text-davinci-003"
where
   STOPS_AT(TERM, "'")

Model Output

Q: From which countries did the Norse originate?
Action: Let's search Wikipedia for the term '
TERM Norse'.
Result: Norse is a demonym for Norsemen, a medieval North Germanic ethnolinguistic group ancestral to modern Scandinavians, defined as speakers of Old Norse from about the 9th to the 13th centuries.
Norse may also refer to:
Final Answer:
ANSWER The Norse originated from North Germanic countries, including Denmark, Norway, Sweden, and Iceland.

LMQL

# simple kv storage
storage = {}
def assign(key, value): storage[key] = value; return f'{{{key}: "{value}"}}'
def get(key): return storage.get(key)

argmax(n=1, openai_chunksize=128, max_len=2048, step_budget=4*2048)
   """In your reasoning you can use actions. You do this as follows:
   `action_name(<args>) # result: <inserted result>`
   To remember things, you can use 'assign'/'get':
   - To remember something: `assign("Alice", "banana") # result: "banana"`
   - To retrieve a stored value: `get("Alice") # result: "banana"`
   Always tail calls with " # result".
   Using these actions, let's solve the following question.
   Q: Alice, Bob, and Claire are playing a game. At the start of the game, they are each holding a ball: Alice has a black ball, Bob has a brown ball, and Claire has a blue ball. \n\nAs the game progresses, pairs of players trade balls. First, Bob and Claire swap balls. Then, Alice and Bob swap balls. Finally, Claire and Bob swap balls. At the end of the game, what ball does Alice have?
   A: Let's think step by step.\n"""
   for i in range(32):
      "[REASONING]"
      if REASONING.endswith("# result"):
         cmd = REASONING.rsplit("`", 1)[-1]
         cmd = cmd[:-len("# result")]
         "{eval(cmd)}`\n"
      else:
         break
   """Therefore at the end of the game, Alice has the[OBJECT]"""
   assert "blue ball." in OBJECT
from
   "openai/text-davinci-003"
where
   STOPS_AT(REASONING, "# result") and STOPS_AT(REASONING, "Therefore, ") and
   STOPS_AT(OBJECT, ".") and STOPS_AT(OBJECT, ",")

Model Output

...he end of the game, what ball does Alice have?
A: Let's think step by step.
REASONING
At the start of the game:
`assign("Alice", "black") # result
{Alice: "black"}`
REASONING `assign("Bob", "brown") # result {Bob: "brown"}`
REASONING `assign("Claire", "blue") # result {Claire: "blue"}`
REASONING
After Bob and Claire swap balls:
`assign("Bob", "blue") # result
{Bob: "blue"}`
REASONING `assign("Claire", "brown") # result {Claire: "brown"}`
REASONING
After Alice and Bob swap balls:
`assign("Alice", "blue") # result
{Alice: "blue"}`
REASONING `assign("Bob", "black") # result {Bob: "black"}`
REASONING
After Claire and Bob swap balls:
`assign("Claire", "black") # result
{Claire: "black"}`
REASONING `assign("Bob", "brown") # result {Bob: "brown"}`
REASONING
At the end of the game, Alice has a blue ball:
`get("Alice") # result
blue`
Therefore at the end of the game, Alice has the
OBJECT blue ball.

LMQL

argmax
   """Review: We had a great stay. Hiking
     in the mountains was fabulous and the
     food is really good.\n
   Q: What is the underlying sentiment of
     this review and why?\n
   A:[ANALYSIS]\n
   Based on this, the overall sentiment of
     the message can be considered to be[CLASSIFICATION]"""
from 
   "openai/text-davinci-003"
distribution
   CLASSIFICATION in [" positive", " neutral", " negative"]

Model Output

Review: We had a great stay. Hiking in the mountains was fabulous and the food is really good.

Q: What is the underlying sentiment of this review and why?

A:
ANALYSIS The underlying sentiment of this review is positive because the reviewer enjoyed their stay, the hiking, and the food.

Based on this, the overall sentiment of the message can be considered to be
CLASSIFICATION
P(CLASSIFICATION) =
- positive 0.9998711120293567
- neutral 0.00012790777085508993
- negative 9.801997880775052e-07

LMQL

argmax 
   "{:system}
You are a marketing chatbot for the language model query language (LMQL)."
for i in range(10): "{:user} {await input()
}"
"{:assistant} [ANSWER]" from "chatgpt"

Model Output

<lmql:system/> You are a marketing chatbot for the language model query language (LMQL).
<lmql:user/> What is the best way to interact with LLMs?
<lmql:assistant/> ANSWER The best way to interact with LLMs (Language Model Models) is through a query language like LMQL. LMQL allows you to easily and efficiently query large language models and retrieve the information you need. With LMQL, you can specify the input text, the output format, and the model you want to use, all in a single query. This makes it easy to integrate LLMs into your applications and workflows, and to get the most out of these powerful language models. Additionally, LMQL provides a standardized way of interacting with LLMs, which makes it easier for developers and data scientists to collaborate and share their work.

LMQL is a project by the SRI Lab at ETH Zürich.

Getting Started with LMQL

LMQL is available in a web-based Playground IDE or can be installed via the Python package manager:

Run Locally

pip install lmql
To run LMQL locally, read the Getting Started section of the documentation.
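Once installed, a minimal first query in the same syntax as the examples above could look as follows (a sketch for illustration; the model name is just one supported backend and can be swapped for any other):

argmax
   "Say 'this is a test':[RESPONSE]"
from
   "openai/text-ada-001"
where
   len(RESPONSE) < 25

Such a query can be pasted directly into the Playground; see the documentation for how to run queries from Python.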

This is an early release of LMQL and we welcome your feedback and contributions.
Reach out via our Community Discord, GitHub Issues, or Twitter.

Scripted Prompts

LMQL generalizes natural language prompting, making it more expressive while remaining accessible. For this, LMQL builds on top of Python, allowing users to express natural language prompts that also contain code. The resulting queries can be directly executed on language models like OpenAI's GPT models. Fixed answer templates and intermediate instructions allow the user to steer the LLM's reasoning process.

Learn more about Scripting

for i in range(n):
   "[THOUGHT]"
   if THOUGHT.endswith("<<"):
      " [EXPR]"
      " {calc(EXPR)}>>"
   elif THOUGHT.endswith("So the answer"):
      break
"is[RESULT]"

Output Constraints

In LMQL, users can specify high-level, logical constraints over the language model output. These constraints are then automatically converted into token-level prediction masks, which can be enforced eagerly during text generation. This allows many constraints to be enforced strictly, making it impossible for the model to generate content that does not satisfy the requirements. This simplifies multi-part prompting and integration, as it provides better guarantees about the output format.

Learn more about Constraints
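For illustration, here is a small sketch (not one of the examples above) that combines the constraint types shown on this page: a length bound, a stopping phrase, and a fixed set of allowed values. Model and variable names are placeholders.

argmax
   "Name one planet of the solar system:[PLANET]\n"
   "Is it an inner or an outer planet? It is an[CATEGORY] planet."
from
   "openai/text-davinci-003"
where
   len(PLANET) < 30 and STOPS_AT(PLANET, "\n") and
   CATEGORY in [" inner", " outer"]

During generation, the in constraint becomes a token mask, so only continuations that can still complete to " inner" or " outer" remain possible.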

Playground and Debugger

LMQL includes a Playground IDE for query development. It lets users inspect the interpreter state, validation results and model results at any point during generation, e.g. to examine the different hypotheses explored during beam search.

Efficiency

Using novel, partial evaluation semantics, LMQL evaluates and controls the LM decoding process on a token level, leading to significant efficiency gains over existing approaches. Compared with 🤗 Transformers' generate(), LMQL can save up to 80% in consumed tokens by optimizing multi-part prompts end-to-end, while maintaining a high-level, declarative syntax.
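As an intuition for where the savings come from (a schematic sketch rather than a benchmark): a template with several holes runs as a single, end-to-end optimized LMQL query, whereas with generate() each hole typically means a separate call that re-submits the growing prompt prefix.

argmax
   "A short packing list with exactly three items:\n"
   "1.[ITEM_ONE]"
   "2.[ITEM_TWO]"
   "3.[ITEM_THREE]"
from
   "openai/text-davinci-003"
where
   STOPS_AT(ITEM_ONE, "\n") and STOPS_AT(ITEM_TWO, "\n") and STOPS_AT(ITEM_THREE, "\n")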

Frontend/Backend Separation

LMQL provides a high-level frontend to interact with language models, making query code portable and model-agnostic. This is achieved by abstracting over model-specific implementation details like batching, decoding and tokenization.

The actual language model runs out-of-process or even remotely, allowing for easy development and quick prototyping.
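As a sketch of what this looks like in practice (model names are illustrative and the exact serving command may differ between versions), the same query body can target a local 🤗 Transformers model simply by changing the from clause, while the model itself runs in a separate process:

# in a separate terminal, serve a local 🤗 Transformers model (illustrative model name):
#   lmql serve-model gpt2-medium

argmax
   "Hello[WHO]"
from
   "gpt2-medium"   # swap in "openai/text-davinci-003" without changing the query body
where
   len(WHO) < 10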

Read The Research Paper
Prompting Is Programming: A Query Language For Large Language Models
Accepted at ACM SIGPLAN PLDI'23
SRIlab @ ETH Zürich, Switzerland

Large language models have demonstrated outstanding performance on a wide range of tasks such as question answering and code generation. On a high level, given an input, a language model can be used to automatically complete the sequence in a statistically-likely way. Based on this, users prompt these models with language instructions or examples, to implement a variety of downstream tasks. Advanced prompting methods can even imply interaction between the language model, a user, and external tools such as calculators. However, to obtain state-of-the-art performance or adapt language models for specific tasks, complex task- and model-specific programs have to be implemented, which may still require ad-hoc interaction.

Based on this, we present the novel idea of Language Model Programming (LMP). LMP generalizes language model prompting from pure text prompts to an intuitive combination of text prompting and scripting. Additionally, LMP allows constraints to be specified over the language model output. This enables easy adaptation to many tasks, while abstracting language model internals and providing high-level semantics.

To enable LMP, we implement LMQL (short for Language Model Query Language), which leverages the constraints and control flow from an LMP prompt to generate an efficient inference procedure that minimizes the number of expensive calls to the underlying language model.

We show that LMQL can capture a wide range of state-of-the-art prompting methods in an intuitive way, especially facilitating interactive flows that are challenging to implement with existing high-level APIs. Our evaluation shows that we retain or increase the accuracy on several downstream tasks, while also significantly reducing the required amount of computation or cost in the case of pay-to-use APIs (26-85% cost savings).

Read Full Paper

Experimental Results
Compared to standard decoding using 🤗 Transformers' generate() function, LMQL allows for high-level control and requires fewer tokens to be processed.

Chain-Of-Thought reasoning with LMQL vs. standard decoding.
Query statistics of using LMQL for interactive language model querying vs. standard decoding.
*We estimate cost savings based on the current token price of $0.02/1K tokens of the GPT-3 davinci model.

BibTeX


@article{beurer2022prompting,
  title={Prompting Is Programming: A Query Language For Large Language Models},
  author={Beurer-Kellner, Luca and Fischer, Marc and Vechev, Martin},
  journal={{PLDI} '23},
  year={2022}
}