Action: Extract Info From Text
Updated over 2 months ago

Overview

The Extract Info From Text action pulls specific pieces of information out of a larger body of text or another data source. Like the Generate Text action, it is versatile and fits a wide range of Workflows and use cases: extracting answers to specific questions, pulling metadata such as titles and descriptions from web pages, or isolating key sections or bullet points. You direct the model by providing a prompt and background context that describe exactly what to extract. The action is often combined with other actions, such as web scraping or text generation, to build complex, multi-step Workflows.

Usage Examples

  • Extract answers to specific questions - Use the Extract Info From Text action to pull specific answers from a block of text. For example, you can input a text blob and a query like, “What are the real-world applications mentioned?” The model will return just the relevant bullet points or sentences that answer the question, without any additional text.

  • Retrieve metadata from web pages - If you need to extract SEO metadata like title tags and descriptions from a web page, you can use this action. After scraping the page content, input the relevant text and prompt the model to extract the metadata elements, such as “Extract the meta title and meta description from this HTML content.”

  • Summarize key points from text - Extract specific sections or key points from longer texts. For example, if you have a detailed blog post, you can prompt the action to “Extract the key bullet points summarizing the main ideas discussed.” The output will be a concise list of the main takeaways.

  • Extract data elements from structured or semi-structured text - If you need to pull out data points like dates, names, or locations from a text, you can craft a prompt to extract these specific elements. For example, “Extract all the dates mentioned in this meeting transcript” would return a list of the relevant dates.

Inputs

  • Prompt - The instructions that tell the model how to perform the extraction task. Include both the query (the specific question or information you want to extract) and the text (the main content you want to extract it from).

    Extract the answer to the question from the text and return just the answer, nothing else. Blog text: In the real world, large language models are being used for: • Content generation (articles, stories, scripts, etc.) • Summarization • Question answering • Language translation • Code generation ...
  • Background - The additional context or system prompt you provide to help the model better understand how to perform the extraction task.

    You are an AI assistant that excels at extracting specific information from text based on a query.
  • Model - The specific large language model that executes the extraction task. Choosing the right model can significantly affect the accuracy of the extracted information. Different models may excel at different types of extraction tasks (e.g. GPT-4o for structured information such as code/HTML, Anthropic Claude 3 Opus for long text).
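
As a rough illustration of how the Prompt input can combine the instruction, the query, and the text, here is a minimal Python sketch. The function name build_extraction_prompt and the "Question:" / "Text:" labels are hypothetical choices for this example; they are not part of the action itself, which only needs the final prompt string.

```python
# Hypothetical helper: assembles a Prompt string from its three parts.
# The labels "Question:" and "Text:" are illustrative, not required by the action.
DEFAULT_INSTRUCTION = (
    "Extract the answer to the question from the text "
    "and return just the answer, nothing else."
)


def build_extraction_prompt(query: str, text: str,
                            instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Combine the instruction, the query, and the source text into one prompt."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if not text.strip():
        raise ValueError("text must not be empty")
    return (
        f"{instruction.strip()}\n\n"
        f"Question: {query.strip()}\n\n"
        f"Text:\n{text.strip()}"
    )


prompt = build_extraction_prompt(
    query="What are the real-world applications mentioned?",
    text=(
        "In the real world, large language models are being used for: "
        "content generation, summarization, question answering."
    ),
)
print(prompt)
```

Keeping the query and the text in clearly separated, labeled parts of the prompt makes it easier to spot a missing piece when an extraction returns something unexpected.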

Advanced Inputs

This action does not have any advanced inputs.

Outputs

The primary output of the Extract Info From Text action is the specific information or data that the model extracts from the provided text based on your prompt and context. The output format and content will vary depending on the task:

  • Extracted Answers - Direct answers or information extracted from the text in response to a query. For example, if asked, “What are the real-world applications mentioned?” the model will output just the relevant bullet points or sentences.

  • Extracted Metadata - Specific elements like title tags, meta descriptions, or other metadata from structured data sources such as HTML.

  • Extracted Summaries or Key Points - The action can also be used to generate concise summaries or lists of key points from a longer text, based on the specified prompt.

  • Extracted Data Elements - Specific data points like dates, names, locations, or other details pulled from unstructured or semi-structured text.

The output is highly variable and flexible, designed to meet your specific extraction needs. As with the Generate Text action, the quality and relevance of the output will depend on factors such as the quality of the prompt, the amount of context provided, the choice of model, and the temperature setting.
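
If a later step needs the extracted bullet points as separate items, the text output can be split after the fact. The sketch below is one possible approach in Python, assuming the model returned one item per line with an optional bullet or number marker; parse_bullets is a hypothetical helper, and the actual output format depends on your prompt and model.

```python
import re

# Matches a leading bullet ("•", "-", "*") or a number marker ("1." / "1)")
# followed by whitespace. Assumes one extracted item per line.
_BULLET = re.compile(r"^\s*(?:[•\-*]|\d+[.)])\s+")


def parse_bullets(output: str) -> list[str]:
    """Split a bulleted text output into a list of clean items."""
    items = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between bullets
        items.append(_BULLET.sub("", line))
    return items


sample_output = "• Content generation\n• Summarization\n\n- Code generation"
print(parse_bullets(sample_output))
```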

Troubleshooting

  • Extract action not working as expected - The extract action is designed to pull out specific information from a given text. If it's not working as expected, double-check that the prompt and background are properly formatted to extract the desired information. The action relies heavily on these inputs, so ensuring they are clear and specific is crucial.

  • Model selection issues - Different models excel at different tasks. If you are having trouble extracting certain types of information (e.g. code snippets, long-form answers), try switching to a model better suited for that purpose. For example, GPT-4o may work better for HTML extraction, while Anthropic Claude 3 Opus could be a better fit for large text blocks.

  • Difficulty extracting metadata - When trying to extract metadata like titles and descriptions from web pages, make sure the scraping step is working properly first. The extract action can only operate on the text provided, so if the scrape didn't capture the metadata, the extract will fail. Verify the scrape URL and settings.

  • Hallucinations or inaccuracies - Inaccurate outputs can occur if the model lacks sufficient context or if the prompt is unclear. To mitigate this, provide as much relevant context as possible, follow prompt best practices, and use a low temperature setting during testing.
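
When metadata extraction fails, it can help to confirm that the scraped content actually contains the tags before adjusting the prompt. The following is a minimal sketch using Python's standard html.parser module, run outside the Workflow on a copy of the scraped HTML. The class MetadataCheck and the function missing_metadata are hypothetical names for this example, and the check only looks for a title element and a meta tag named "description".

```python
from html.parser import HTMLParser


class MetadataCheck(HTMLParser):
    """Records the <title> text and the meta description, if present."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def missing_metadata(html: str) -> list[str]:
    """Return the names of metadata elements that the scraped HTML lacks."""
    parser = MetadataCheck()
    parser.feed(html)
    parser.close()
    missing = []
    if not parser.title.strip():
        missing.append("title")
    if not (parser.description or "").strip():
        missing.append("meta description")
    return missing


scraped = (
    '<html><head><title>Demo</title>'
    '<meta name="description" content="A page."></head>'
    '<body>Hi</body></html>'
)
print(missing_metadata(scraped))
```

An empty list means both elements were captured, so the problem is more likely in the prompt or model choice. A non-empty list points back to the scrape step, for example a scrape that returned only body text.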

Related Actions

  • Extract Data From Text - This action allows you to extract specific data points or a list of data elements from a given text. It is ideal for pulling structured information from unstructured sources, making it a complementary tool to the Extract Info From Text action.

  • Scrape Webpage - The Scrape Webpage action allows you to extract text or HTML from a website by providing its URL. This enables you to programmatically scrape websites on a schedule or on-demand, with options to control the output format and behavior. Common use cases include extracting data from websites for analysis, monitoring changes over time, or repurposing web content for other applications.

  • Generate Text - The Generate Text action is the most versatile and flexible action, serving as a "pocket knife" that allows you to execute any custom prompt against any supported model. It gives you full control over how to leverage the model's capabilities to perform a wide variety of text generation tasks. As such, the Generate Text action is frequently used in many different types of Workflows where you need granular control over the model's inputs and outputs.
