2.1 Flows: Orchestrating Multi-Step Interactions

So far, we have called ai.generate() directly from our application code for single-turn interactions. Genkit also provides a higher-level abstraction called Flows, which lets us orchestrate multi-step interactions with language models.

Flows provide a number of benefits out of the box:

Type-safe inputs and outputs using Zod schemas (Runtime validation and TypeScript types)
Streaming support, allowing us to send partial responses as they are generated by the model
Developer UI Support for testing and debugging flows
Easy deployment into different environments, e.g. Express, Cloud Functions, and other serverless platforms

A flow is a wrapper around a JavaScript/TypeScript function and, when called, behaves like one. So why do we need it? The wrapper adds a number of quality-of-life improvements for working with Large Language Models. For instance, inputs and outputs can be defined using Zod schemas, which gives us runtime validation and TypeScript type inference.

For example, in the previous example, where we summarized a web page, we can re-write it as a flow like so:

// ai, z, c, and convert are set up as in the previous example
const summarizeWebPage = ai.defineFlow(
  {
    name: 'SummarizeWebPage',
    inputSchema: z.object({
      url: z.string().url().describe('The URL of the web page to summarize'),
    }),
    outputSchema: z.object({
      summary: z.string().describe('A concise summary of the web page'),
    }),
  },
  async ({ url }) => {
    const html = await c.fromURL(url);

    // extract the main content from the HTML
    const htmlContent = html('body').text();

    // convert HTML to plain text, using html-to-text
    const content = convert(htmlContent);

    console.log({ content });

    const { text } = await ai.generate({
      prompt: `
      Summarize the following web page, in a concise manner.

      Context:
      ${content}
    `,
    });

    return { summary: text };
  },
);

A few things to note here:

We define the flow using ai.defineFlow(), which takes a configuration object and an async function - the function that implements the flow logic and will be called when executing the flow.
The configuration object contains the name of the flow, an inputSchema and an outputSchema, both defined using Zod. This allows Genkit to validate the inputs and outputs at runtime, and also infer TypeScript types for us.
- The flow will enforce that the input to the function matches the inputSchema, throwing an error if it does not.
- Similarly, the output of the function must match the outputSchema.
The function itself is similar to what we had before, but now it receives the inputs as parameters, destructured from an object.
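
To make the validation behaviour concrete, here is a dependency-free sketch of what a flow wrapper does conceptually. It is not Genkit's implementation: `Schema`, `urlInput`, `summaryOutput`, `defineFlowSketch`, and `summarizeSketch` are hypothetical names, the hand-rolled `parse` functions stand in for Zod schemas, and the model call is faked so the example runs without an API key.

```typescript
// A hypothetical, minimal stand-in for a Zod schema: parse() returns a typed
// value or throws if the data does not match.
interface Schema<T> {
  parse(data: unknown): T;
}

// Hand-rolled check for { url: string } (a stand-in for the Zod inputSchema).
const urlInput: Schema<{ url: string }> = {
  parse(data) {
    const url = (data as { url?: unknown } | null)?.url;
    if (typeof url !== 'string') throw new Error('input.url must be a string');
    new URL(url); // throws if the string is not a valid URL
    return { url };
  },
};

// Hand-rolled check for { summary: string } (a stand-in for the Zod outputSchema).
const summaryOutput: Schema<{ summary: string }> = {
  parse(data) {
    const summary = (data as { summary?: unknown } | null)?.summary;
    if (typeof summary !== 'string') throw new Error('output.summary must be a string');
    return { summary };
  },
};

// Wraps fn so that its input is validated before it runs and its output after.
function defineFlowSketch<I, O>(
  config: { name: string; inputSchema: Schema<I>; outputSchema: Schema<O> },
  fn: (input: I) => Promise<O>,
): (input: unknown) => Promise<O> {
  return async (rawInput) => {
    const input = config.inputSchema.parse(rawInput);
    const output = await fn(input);
    return config.outputSchema.parse(output);
  };
}

// The "model call" is faked here; a real flow would call ai.generate().
const summarizeSketch = defineFlowSketch(
  { name: 'SummarizeWebPage', inputSchema: urlInput, outputSchema: summaryOutput },
  async ({ url }) => ({ summary: `Summary of ${url}` }),
);
```

Calling `summarizeSketch({ url: 'not a url' })` rejects before the wrapped function ever runs, which is the same guarantee the flow's inputSchema provides.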

Zod Schemas

Genkit leverages Zod schemas to define the structure of inputs and outputs for flows. This provides several advantages:

Runtime Validation: Inputs and outputs are validated at runtime, ensuring that they conform to the expected structure.
Type Inference: TypeScript types are automatically inferred from the Zod schemas, providing type safety and autocompletion in your IDE.
Clear Documentation: The schemas serve as clear documentation of what the flow expects and returns. This helps developers and, as we will see later with tool calling, LLMs as well.

Zod is a very powerful schema definition library, and we recommend checking out the Zod documentation to learn more about its capabilities.

Streaming LLM Responses

One of the powerful features of Genkit Flows is built-in support for streaming LLM responses. This allows you to send partial responses to the client as they are generated by the model, providing a more responsive user experience.

LLMs take their sweet time to generate responses, and in many cases, waiting for the entire response to be generated before sending it to the client can lead to a sluggish user experience. With streaming, you can start sending parts of the response as soon as they are available.

This is particularly useful for long-form content generation, where users can start reading the content while the rest is still being generated. This is a much better user experience compared to waiting for the entire content to be ready.

Flows support streaming out of the box. Inside the flow, use the ai.generateStream() method instead of ai.generate() when invoking the LLM; it returns a stream that you can iterate over to get the partial responses. The flow function also accepts a second parameter, a callback function, which sends each partial response to the caller. Here is how we can modify the previous flow to support streaming:

const summarizeWebPage = ai.defineFlow(
  {
    name: 'SummarizeWebPage',
    inputSchema: z.object({
      url: z.string().url(),
    }),
    outputSchema: z.object({
      summary: z.string(),
    }),
  },
  async ({ url }, streamFn) => {
    const html = await c.fromURL(url);

    // extract the main content from the HTML
    const htmlContent = html('body').text();

    // convert HTML to plain text, using html-to-text
    const content = convert(htmlContent);

    console.log({ content });

    const { stream, response } = ai.generateStream({
      prompt: `
      Summarize the following web page, in a concise manner.

      Context:
      ${content}
    `,
    });

    for await (const chunk of stream) {
      streamFn(chunk);
    }

    const { text } = await response;

    return { summary: text };
  },
);

A few caveats to note here:

The model you are using must support streaming. Not all models do, especially older ones. Make sure to check the model documentation to see if it supports streaming.
The caller of the flow must also support receiving streamed responses. This is usually done via Server-Sent Events (SSE) or WebSockets. We will see how to do this later for different environments and frontend frameworks.
Please also note that streaming is optional. You can still use flows without streaming if you prefer the traditional request-response model.
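
The pattern used in the flow above, forwarding each chunk to the caller while still returning the complete result, can be tried without a model. In this sketch `fakeModelStream`, `summarizeWithStreaming`, and `streamingDemo` are hypothetical names, and an async generator stands in for ai.generateStream(), so it illustrates the control flow only, not Genkit's API.

```typescript
// Hypothetical stand-in for a model stream: yields text chunks one at a time.
async function* fakeModelStream(chunks: string[]): AsyncGenerator<string> {
  for (const chunk of chunks) {
    yield chunk;
  }
}

// Forwards every chunk to sendChunk as it arrives, then returns the full text.
// When the caller does not want streaming, sendChunk is simply omitted.
async function summarizeWithStreaming(
  source: AsyncIterable<string>,
  sendChunk?: (chunk: string) => void,
): Promise<{ summary: string }> {
  let text = '';
  for await (const chunk of source) {
    sendChunk?.(chunk);
    text += chunk;
  }
  return { summary: text };
}

// Usage: collect the chunks a streaming caller would receive.
async function streamingDemo(): Promise<{ received: string[]; summary: string }> {
  const received: string[] = [];
  const { summary } = await summarizeWithStreaming(
    fakeModelStream(['Genkit ', 'flows ', 'support streaming.']),
    (chunk) => received.push(chunk),
  );
  return { received, summary };
}
```

Because the callback is optional, the same function serves both streaming and plain request-response callers, which mirrors the point that streaming is opt-in.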

Genkit Developer UI

One of the most powerful Genkit features is the Developer UI, which provides a visual interface for testing and debugging flows. The Dev UI allows you to easily invoke flows with different inputs, view the outputs, and inspect the execution details.

This is incredibly useful for development and debugging, as it allows you to quickly iterate on your flows and see how they behave with different inputs.

Please note that the Genkit Dev UI also supports other Genkit abstractions such as prompts and tools. However, in this section, we will focus on its capabilities for flows. We will explore its other features in later sections, as we cover those abstractions.

To start the Dev UI, you can use the following command:

npx genkit start -o -- npx tsx ./path/to/file.ts

Here, ./path/to/file.ts is the entry point of your Genkit application, and the -o flag opens the Dev UI in your default web browser. The npx tsx ./path/to/file.ts part runs your TypeScript code directly, compiling it on the fly, so you can use the Dev UI without building your project first. If you also want code changes to be picked up without restarting the Dev UI server, run tsx in watch mode (npx tsx --watch ./path/to/file.ts).

Once the Dev UI is running, you can navigate to the “Flows” section to see a list of all defined flows. You can select a flow to view its details, including the input and output schemas.

[TODO: Insert screenshot of Dev UI Flows section here]

From here, you can enter different inputs to test the flow. The Dev UI will validate the inputs against the defined schema and display any validation errors. Once the inputs are valid, you can invoke the flow and view the outputs.

[TODO: Insert screenshot of Dev UI Flow invocation here]

Deploying Flows from Code

Once you have defined a flow, you can invoke it from your application code just like a regular function. Genkit takes care of all the underlying details, including input validation, output validation, and streaming (if enabled).

For example, to invoke the SummarizeWebPage flow we defined earlier, call the function returned by ai.defineFlow(), assumed here to be assigned to a constant named summarizeWebPage:

const { summary: text } = await summarizeWebPage({ url });

This will call the flow with the provided input, validate the input and output, and return the result. This can be called in any JavaScript/TypeScript server environment, such as Node.js, Deno, Cloud Functions, Serverless, etc.

We can also invoke the flow in a streaming manner, like so:

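const { stream, output } = summarizeWebPage.stream({ url });
for await (const chunk of stream) {
  // send chunk to client via SSE or WebSocket
}
const { summary } = await output; // the final, validated result

The call returns an object with a stream, which you can iterate over to get the partial responses as the model generates them, and an output promise that resolves to the final result once the flow completes.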
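
For the "send chunk to client" step, Server-Sent Events are one option. The sketch below covers only the wire format, in which each event is a `data:` line followed by a blank line. `toSseFrame` and `pipeToSse` are hypothetical helpers, and `write` stands in for something like `res.write` on a Node.js HTTP response that has been given the `text/event-stream` content type.

```typescript
// Serializes one JSON-serializable chunk as a Server-Sent Events frame.
// JSON.stringify escapes newlines, so the payload fits on a single data: line.
function toSseFrame(chunk: unknown): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Writes every chunk of an async stream as an SSE frame and returns how many
// frames were written.
async function pipeToSse(
  stream: AsyncIterable<unknown>,
  write: (frame: string) => void,
): Promise<number> {
  let count = 0;
  for await (const chunk of stream) {
    write(toSseFrame(chunk));
    count += 1;
  }
  return count;
}
```

Keeping the formatting in a pure function makes it easy to test separately from the HTTP layer.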

On top of that, Genkit provides easy deployment options for flows into different environments, such as Express and Firebase Functions. For Express, you can use the @genkit-ai/express npm package as shown below:

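// ...other imports
import express from 'express';
import { expressHandler } from '@genkit-ai/express';

const app = express();
app.use(express.json()); // parse JSON request bodies before they reach the handler
app.post('/summarize', expressHandler(summarizeWebPage));
app.listen(3000);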

This will create an Express route that invokes the summarizeWebPage flow when a POST request is made to /summarize. You can then call this endpoint from your frontend application as shown below:

Curl example:

  curl --request POST \
    --url http://localhost:3000/summarize \
    --header 'content-type: application/json' \
    --data '{
    "data": {
      "url": "https://genkit.dev/docs/devtools/"
    }
  }'

TypeScript (fetch):

  // fetch is available globally in browsers and in Node.js 18+
  const url = 'http://localhost:3000/summarize';
  const options = {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    // the flow input goes under the "data" field
    body: JSON.stringify({ data: { url: 'https://genkit.dev/docs/devtools/' } }),
  };

  try {
    const response = await fetch(url, options);
    const data = await response.json();
    console.log(data.result); // the flow output is under the "result" field
  } catch (error) {
    console.error(error);
  }

A few things to note here:

The request body must contain a data field, which holds the input to the flow.
The response will contain the output of the flow, under the result field.
Error handling is also done automatically, with validation errors and other exceptions being returned in the response, along with appropriate HTTP status codes.
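
The request and response envelope described above can be captured in a few small helpers. This is a sketch, not part of Genkit: `toFlowRequestBody`, `unwrapFlowResponse`, and `callFlow` are hypothetical names, and treating any body without a `result` field as a failure is an assumption, since the exact shape of error responses is not shown here.

```typescript
// Wraps flow input in the { data: ... } envelope the endpoint expects.
function toFlowRequestBody<I>(input: I): string {
  return JSON.stringify({ data: input });
}

// Extracts the flow output from a parsed response body of the form { result: ... }.
// Assumption: a body without a result field signals an error.
function unwrapFlowResponse<O>(body: unknown): O {
  if (typeof body === 'object' && body !== null && 'result' in body) {
    return (body as { result: O }).result;
  }
  throw new Error(`Flow call failed: ${JSON.stringify(body)}`);
}

// Hypothetical caller combining both helpers with the global fetch
// (available in browsers and in Node.js 18+).
async function callFlow<I, O>(endpoint: string, input: I): Promise<O> {
  const response = await fetch(endpoint, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: toFlowRequestBody(input),
  });
  return unwrapFlowResponse<O>(await response.json());
}
```

With the Express route from earlier running, a call might look like `await callFlow<{ url: string }, { summary: string }>('http://localhost:3000/summarize', { url: 'https://genkit.dev/docs/devtools/' })`.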

On top of serving a single flow, the @genkit-ai/express package also provides a way to serve multiple flows from a single server, using the flow name to route the requests. This is done using the startFlowServer() function, where you provide an array of flows to serve, among other options. We can adapt the previous example to serve multiple flows, as shown below:

import { startFlowServer } from '@genkit-ai/express';
// ... other imports

const summarizeWebPage = ai.defineFlow(
  {
    name: 'SummarizeWebPage', // <-- the name is important here, as it will be used for routing
    inputSchema: z.object({
      url: z.string().url(),
    }),
    outputSchema: z.object({
      summary: z.string(),
    }),
  },
  async ({ url }, streamFn) => {
    // ... flow implementation, the same as before
  },
);

// ... define other flows

// start the flow server with multiple flows
startFlowServer({
  flows: [
    // list of flows to serve
    summarizeWebPage,
  ],
  port: 3000,
});

A few things to note here:

Each flow must have a unique name, as this will be used for routing the requests. So, if our server is running on http://localhost:3000, we can invoke the SummarizeWebPage flow by making a POST request to http://localhost:3000/SummarizeWebPage, and similarly for other flows based on their names.
The request and response formats remain the same as before, with the input provided in the data field of the request body, and the output returned in the result field of the response body.
The startFlowServer() function also accepts other options, such as:
- cors - to enable CORS for the flow endpoints - important for frontend applications that will be calling the flows from a different origin.
- pathPrefix - to serve all flow routes under a common prefix - for example, if you want all flow routes to be prefixed with /api/flows, you can set the pathPrefix option to /api/flows.
- jsonParserOptions - to customize the JSON body parser used for parsing the request bodies.

This makes it easy to deploy multiple flows in a single server, providing a clean and organized way to manage your Genkit-powered backend.

Here is the curl example for calling the flow endpoint:

curl --request POST \
  --url http://localhost:3000/SummarizeWebPage \
  --header 'content-type: application/json' \
  --data '{
  "data": {
    "url": "https://genkit.dev/docs/devtools/"
  }
}'

With this, from our frontend application, we can easily send HTTP requests to the flow endpoints to invoke the flows and get the results. We can also use Genkit’s client library to run and stream the flows directly from the frontend. Here is a preview:

import { runFlow, streamFlow } from 'genkit/beta/client';

// Running the flow: the endpoint URL and the flow input go in one options object
const { summary } = await runFlow({
  url: 'http://localhost:3000/SummarizeWebPage',
  input: { url: 'https://genkit.dev/docs/devtools/' },
});

// Streaming the flow
const streamResult = streamFlow({
  url: 'http://localhost:3000/SummarizeWebPage',
  input: { url: 'https://genkit.dev/docs/devtools/' },
});

// once we have the streamResult, we can iterate over the stream to get
// the partial responses, which we can send to the UI as they arrive, like so:
for await (const chunk of streamResult.stream) {
  console.log(chunk);
}

console.log(await streamResult.output);

Don’t worry, we will explore this in the next section, after we work on our next recipe.

Recap

In this section, we explored Genkit Flows, a powerful abstraction for orchestrating multi-step interactions with language models. We saw how flows provide type-safe inputs and outputs using Zod schemas, built-in support for streaming LLM responses, and easy deployment into different environments such as Express.

We also explored the Genkit Developer UI, which provides a visual interface for testing and debugging flows. The Dev UI allows you to easily invoke flows with different inputs, view the outputs, and inspect the execution details.

Next, we are going to work on a recipe that demonstrates how to build a multi-step interaction using Genkit Flows. We will see how to define a flow that orchestrates multiple calls to the LLM, handles inputs and outputs, and supports streaming.
