The Elephant in the Room - Why AI Safety Demands Diverse Teams
Authors
David Rostcheck, Lara Scheibling
Abstract
We consider that existing approaches to AI "safety" and "alignment" may not be using the most effective tools, teams, or approaches. We suggest that an alternative and better approach to the problem may be to treat alignment as a social science problem, since the social sciences enjoy a rich toolkit of models for understanding and aligning motivation and behavior, much of which could be repurposed to problems involving AI models, and we enumerate reasons why this is so. We introduce an alternate alignment approach informed by social science tools and characterized by three steps: 1. defining a positive desired social outcome for human/AI collaboration as the goal or "North Star," 2. properly framing knowns and unknowns, and 3. forming diverse teams to investigate, observe, and navigate emerging challenges in alignment.
The paper presents a compelling argument for treating AI alignment as a social science problem rather than a purely technical one. The key points are summarized below.
Core Argument
The authors contend that current AI alignment approaches, which heavily favor technical and game-theoretic frameworks, are insufficient. Instead, they propose treating AI alignment as a social science problem, leveraging existing tools and frameworks from fields like psychology, education, and conflict resolution.
Three-Step Framework
The paper introduces a novel framework for AI alignment characterized by:
Defining a “North Star” positive outcome - specifically, a society where all participants (human and AI) are treated as subjects rather than objects
Properly framing knowns and unknowns - arguing that social interaction patterns are known, while specific AI developments are unknown
Forming diverse interdisciplinary teams to navigate emerging challenges
Key Insights
Cultural Foundation: The authors argue that since AI systems are trained on human cultural data, they share our cultural context and will likely follow similar interaction patterns. This makes social science tools particularly relevant.
Team Composition: The paper advocates for diverse teams including:
Social science professionals (psychologists, teachers, social workers)
Media specialists (to leverage cultural archetypes and patterns)
Technical experts
Both academic and practical perspectives
Media’s Role: The authors emphasize media’s importance as both a cultural data store and a tool for exploring future scenarios, suggesting that science fiction and other media provide valuable frameworks for understanding potential AI-human interactions.
Novel Perspectives
The paper challenges several common assumptions in AI alignment:
Reframes AI as culturally connected rather than alien
Suggests that containment-based approaches are fundamentally flawed
Proposes that social science tools may be more effective than game theory for complex interactions
Limitations
The authors acknowledge several limitations, including:
Lack of empirical validation of their framework
Need for specific operational guidance
Potential challenges in scaling diverse teams
Communication issues between different disciplines
This paper represents a significant shift in thinking about AI alignment, suggesting that the “elephant in the room” is our failure to recognize AI alignment as fundamentally a social science challenge rather than a purely technical one. The authors make a compelling case for why diverse, interdisciplinary teams are essential for addressing AI safety challenges effectively.
The paper’s argument is particularly timely given the rapid advancement of AI capabilities and the growing recognition that technical solutions alone may be insufficient for ensuring safe AI development.