Learn how to get started with LMQL and write your first program.
2. Write Your First Query
A very simple Hello World LMQL query looks like this:
"Say 'this is a test':[RESPONSE]" where len(TOKENS(RESPONSE)) < 25
Say this is a test: RESPONSE This is a test
Note: You can click Open In Playground to run and experiment with this query.
This simple LMQL program consists of a single prompt statement and an associated where clause:
"Say 'this is a test'[RESPONSE]": Prompts are constructed using so-called prompt statements that look like top-level strings in Python. Template variables like
[RESPONSE]are automatically completed by the model. Apart from single-line textual prompts, LMQL also support multi-part and scripted prompts, e.g. by allowing control flow and branching behavior to control prompt construction. To learn more, see Scripted Prompting.
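Conceptually, a prompt statement runs the model on everything before the template variable and binds the completion to that variable. The following plain-Python sketch illustrates this idea only; generate is a hypothetical stand-in for a model call, not the actual LMQL runtime:

```python
# Conceptual sketch of template-variable completion (plain Python,
# NOT the LMQL runtime). `generate` is a hypothetical stand-in for
# an LLM call; here it simply returns a fixed string.
def generate(prompt, max_tokens):
    return "This is a test"

def run_prompt(template):
    # Everything before the template variable becomes the model prompt.
    prefix, _, _ = template.partition("[RESPONSE]")
    response = generate(prefix, max_tokens=25)
    # The completed value is bound to the variable name for later use.
    return prefix + response, {"RESPONSE": response}

text, bindings = run_prompt("Say 'this is a test':[RESPONSE]")
print(bindings["RESPONSE"])  # the model's completion for [RESPONSE]
```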
where len(TOKENS(RESPONSE)) < 25: In this second part of the statement, users can specify logical, high-level constraints on the output. LMQL uses novel evaluation semantics for these constraints to automatically translate high-level constraints like len(TOKENS(RESPONSE)) < 25 into (sub)token masks that can be eagerly enforced during text generation. To learn more, see Constraints.
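To give an intuition for eager enforcement, the toy sketch below shows how a token-length constraint could be turned into a mask over the vocabulary at each decoding step. The vocabulary and masking logic are illustrative stand-ins, not LMQL's actual implementation:

```python
# Toy sketch of constraint-to-token-mask translation (illustrative
# only, NOT LMQL's implementation).
EOS = "<eos>"
VOCAB = ["This", "is", "a", "test", EOS]

def allowed_tokens(tokens_so_far, max_tokens):
    # Enforcing len(TOKENS(RESPONSE)) < max_tokens eagerly: once the
    # response is one token away from the limit, only the end-of-
    # sequence token remains valid, so the constraint can never be
    # violated mid-generation.
    if len(tokens_so_far) >= max_tokens - 1:
        return [EOS]
    return VOCAB

print(allowed_tokens(["This", "is"], max_tokens=25))  # full vocabulary
print(allowed_tokens(["x"] * 24, max_tokens=25))      # only the EOS token
```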
3. Going Further
Building on your first query above, you may want to add more complex logic, e.g. a second part to the prompt. You may also want to employ a different decoding algorithm, e.g. to sample multiple trajectories of your program or to use a different model.
Let's extend our initial query, to allow for these changes:
sample(temperature=1.2)

"Say 'this is a test':[RESPONSE]" where len(TOKENS(RESPONSE)) < 25

if "test" not in RESPONSE:
    "You did not say 'test', try again:[RESPONSE]" where \
        len(TOKENS(RESPONSE)) < 25
else:
    "Good job"
This LMQL program goes beyond what we have seen so far in a few ways:
sample(temperature=1.2): Here, we specify the decoding algorithm to use for text generation. In this case, we use sample decoding with a slightly increased temperature (>1.0). Above, we implicitly relied on deterministic argmax decoding, which is the default in LMQL. To learn more about the different decoding algorithms supported by LMQL (e.g. best_k), please see Decoders.
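The difference between argmax and temperature sampling can be sketched over a toy logit vector. This is a general illustration of the two decoding strategies, not LMQL's decoder implementation:

```python
import math
import random

# Toy illustration of argmax vs. temperature sampling over next-token
# logits (NOT LMQL's decoder implementation).
def softmax(logits, temperature=1.0):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

# argmax decoding: deterministically pick the highest-scoring token.
argmax_choice = max(range(len(logits)), key=lambda i: logits[i])

# sample decoding: draw from the softmax distribution. A temperature
# above 1.0 flattens the distribution, so lower-probability tokens are
# sampled more often, yielding more varied output.
probs = softmax(logits, temperature=1.2)
random.seed(0)  # seeded only to make this sketch reproducible
sample_choice = random.choices(range(len(logits)), weights=probs)[0]

print(argmax_choice, sample_choice, probs)
```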
Prompt Program: The main body of the program remains the prompt. As before, we use prompt statements here, however, now we also make use of control-flow and branching behavior.
On each LLM call, the concatenation of all prompt statements executed so far forms the prompt used to generate a value for the currently active template variable, such as
RESPONSE. This means the LLM is always aware of the full prompt context when generating a value for a template variable.
After a prompt statement has been executed, the contained template variables are automatically exposed to the surrounding program context. This allows you to react to model output and incorporate the results in your program logic. To learn more about this form of interactive prompting, please see Scripted Prompting.
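The retry logic of the extended query above can be sketched in plain Python, to show how exposed template variables drive control flow. Here, generate is again a hypothetical stand-in for a sampled model call, hard-coded so the first attempt misses the target phrase:

```python
# Plain-Python sketch of the branching prompt program above (NOT LMQL).
# `generate` is a hypothetical stand-in for a sampled model call; it is
# hard-coded so the first sample misses the target phrase.
def generate(prompt):
    return "Hello there" if "try again" not in prompt else "this is a test"

prompt = "Say 'this is a test':"
response = generate(prompt)
prompt += response

if "test" not in response:
    # The generated value is exposed to the surrounding program, so we
    # can inspect it and extend the prompt with a follow-up instruction.
    prompt += " You did not say 'test', try again:"
    response = generate(prompt)
    prompt += response

print(response)
```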
These basic steps should get you started with LMQL. If you need more inspiration before writing your own queries, you can explore the examples included with the Playground IDE or showcased on the LMQL Website.