OpenAI Is Asking Contractors to Upload Work From Past Jobs to Evaluate the Performance of AI Agents

The AI Training Data Arms Race: OpenAI's Quest for Human Baseline

The quality of training and evaluation data has become a crucial factor in the performance of artificial intelligence (AI) models. To bridge the gap between AI capabilities and real-world work, OpenAI, a leading AI research organization, is trying to establish a human baseline for a range of tasks. It is collecting real-world tasks from human contractors, along with the work those contractors produced, to serve as a benchmark for evaluating the performance of AI models.

The Project's Objective

OpenAI's project aims to create a comprehensive dataset of real-world tasks, which will enable the company to measure the performance of its AI models against human professionals across various industries. This human baseline will serve as a reference point for evaluating the capabilities of AI models, with the ultimate goal of achieving artificial general intelligence (AGI), an AI system that outperforms humans at most economically valuable tasks.

The Role of Contractors

To achieve this objective, OpenAI has hired a team of contractors across various occupations to collect real-world tasks modeled after those they have performed in their full-time jobs. The contractors are asked to describe tasks they have done in their current or past jobs and upload real examples of work they have produced. The examples should be concrete outputs, such as Word documents, PDFs, PowerPoint presentations, Excel spreadsheets, images, or code repositories.

The Task Request and Deliverable

According to OpenAI's presentation, real-world tasks have two components: the task request and the task deliverable. The task request is what a person's manager or colleague told them to do, while the task deliverable is the actual work they produced in response to that request. The company emphasizes that the examples contractors share should reflect real, on-the-job work that they have actually done.

Example of a Task

One example outlined in the OpenAI presentation is a task from a Senior Lifestyle Manager at a luxury concierge company for ultra-high-net-worth individuals. The task request is to prepare a short, two-page PDF draft of a 7-day yacht trip overview to the Bahamas for a family who will be traveling there for the first time. The request includes additional details about the family's interests and what the itinerary should look like. The "experienced human deliverable" then shows what the contractor in this case would upload: a real Bahamas itinerary created for a client.
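
As a rough illustration of the two-part structure described above, a task could be stored as a record that pairs the request with the uploaded deliverable. This is only a sketch: the class name TaskRecord, its fields, and the file name are hypothetical and are not taken from OpenAI's presentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskRecord:
    """One real-world task: the request plus the work produced in response.

    All names here are hypothetical; they are not OpenAI's schema.
    """
    occupation: str   # role of the person who did the work
    request: str      # what a manager or colleague asked for
    deliverable_files: List[str] = field(default_factory=list)  # uploaded outputs

    def is_complete(self) -> bool:
        # A usable record needs both components: a non-empty request
        # and at least one deliverable file.
        return bool(self.request.strip()) and len(self.deliverable_files) > 0


# The Bahamas example, with an invented file name for the deliverable.
example = TaskRecord(
    occupation="Senior Lifestyle Manager",
    request="Prepare a short, 2-page PDF draft of a 7-day yacht trip "
            "overview to the Bahamas for a first-time visiting family.",
    deliverable_files=["bahamas_itinerary.pdf"],
)
```

Keeping the request and the deliverable in one record is what would let an evaluator hand the same request to an AI model and compare its output with the human one.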

Data Scrubbing and Confidentiality

OpenAI instructs contractors to delete corporate intellectual property and personally identifiable information from the work files they upload. Under a section labeled "Important reminders," OpenAI tells workers to "remove or anonymize any: personal information, proprietary or confidential data, material nonpublic information (e.g., internal strategy, unreleased product details)." The company also provides advice on how to delete confidential information using a ChatGPT tool called "Superstar Scrubbing."
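
To give a sense of what automated scrubbing involves, and why it is hard to do reliably, here is a minimal sketch that redacts email addresses and phone-like numbers with regular expressions. It is not the "Superstar Scrubbing" tool, whose workings the source does not describe; the function name scrub, the patterns, and the placeholder tokens are all assumptions made for illustration.

```python
import re

# Simplified, assumed patterns: they catch common email and phone formats only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


sample = "Contact jane.doe@example.com or +1 (555) 123-4567."
cleaned = scrub(sample)
```

Pattern matching of this kind can catch formatted identifiers, but it cannot recognize client names, internal strategy, or unreleased product details, which are the categories the lawyer quoted below is most concerned about.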

Risks and Implications

Evan Brown, an intellectual property lawyer with Neal & McDevitt, warns that AI labs that receive confidential information from contractors at this scale could be subject to trade secret misappropriation claims. Contractors who offer documents from their previous workplaces to an AI company, even scrubbed, could be at risk of violating their previous employers' nondisclosure agreements or exposing trade secrets.

The AI Training Data Arms Race

The documents reveal one strategy AI labs are using to prepare their models to excel at real-world tasks. Firms like OpenAI, Anthropic, and Google are hiring armies of contractors who can generate high-quality training data in order to develop AI agents capable of automating enterprise work. This has created a lucrative sub-industry within the AI training world, with Handshake valued at $3.5 billion in 2022 and Surge reportedly valued at $25 billion in fundraising talks last summer.

Forward-Looking Thoughts

As the AI training data arms race continues, it is essential to consider the implications of this trend. The collection and use of real-world tasks and data from human contractors raise important questions about data ownership, confidentiality, and intellectual property. As AI models become increasingly sophisticated, it is crucial to ensure that they are developed and deployed in a responsible and transparent manner. The future of AI development will depend on our ability to balance the benefits of AI with the risks and challenges associated with its development and deployment.

Conclusion

OpenAI's project to establish a human baseline is intended to show how its models compare with human professionals as the company pursues AGI. Because that baseline is built from work contractors produced for current and past employers, its legal and ethical footing depends on how thoroughly confidential and proprietary material is removed before it is uploaded.

Source: https://www.wired.com/story/openai-contractor-upload-real-work-documents-ai-agents/