Worked with Perplexity to create prompts for training and evaluating LLMs

Overview

SoftAge created prompts for training and evaluating large language models (LLMs).

Prompts are critical for guiding LLMs toward desired outputs and for shaping their capabilities. We partnered with Perplexity to create a diverse, high-quality dataset of prompts.

Challenge

Maintaining prompt diversity is essential for effective LLM training. Diverse prompts expose the model to a wide variety of scenarios, fostering a broader range of skills and helping reduce bias. However, keeping prompts 100% natural and authentic became difficult: after generating a large volume of prompts, our workforce began to reach its creative limits.

Solution

To address this challenge, SoftAge implemented a two-pronged approach:

  • In-House Quality Control Tool: We developed a proprietary tool to identify plagiarism and AI-generated content within the prompt dataset. This ensured the prompts were original and reflected human creativity.
  • Creative Inspiration Process: We built a process for curating high-quality, creative websites and content across various domains. The curated material served as a source of inspiration for our workforce, helping them generate fresh and diverse prompts for Perplexity's LLMs.
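
The quality control tool itself is proprietary and its internals are not described here. As a rough illustration of the kind of originality check such a tool might run, the sketch below flags near-duplicate prompts by comparing word-level shingles with Jaccard similarity. The function names, the shingle size, and the 0.5 threshold are all assumptions made for this example, and the sketch does not attempt AI-generated content detection.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a prompt (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short prompts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_near_duplicates(prompts, threshold=0.5, k=3):
    """Return (i, j, score) for each pair of prompts whose similarity meets the threshold.

    The threshold and shingle size are illustrative defaults, not tuned values.
    """
    sets = [shingles(p, k) for p in prompts]
    flagged = []
    for i, j in combinations(range(len(prompts)), 2):
        score = jaccard(sets[i], sets[j])
        if score >= threshold:
            flagged.append((i, j, score))
    return flagged


if __name__ == "__main__":
    sample = [
        "Explain how photosynthesis works in simple terms for a child",
        "Explain how photosynthesis works in simple terms for a teenager",
        "Write a haiku about the sea at dawn",
    ]
    for i, j, score in find_near_duplicates(sample):
        print(f"Prompts {i} and {j} look similar (score {score:.2f})")
```

A pairwise comparison like this is quadratic in the number of prompts, so it suits small batches; a larger dataset would likely need an approximate method such as MinHash, which is outside the scope of this sketch.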

Result

Through this collaboration, SoftAge created a diverse dataset spanning more than 50 domains and covering hundreds of LLM use cases. Perplexity's LLMs benefited from the richness and variety of the prompts.