Computational Social Listening Lab | UPenn

Evaluating Large Language Models for Cyberbullying Behavior

Penn researchers developed “evaluator agents”: specialized AI systems that test large language models for potential cyberbullying behavior by generating nuanced, demographically diverse prompts. The study uncovered reasoning blind spots in some leading language models that could lead to harmful, cyberbullying-adjacent outputs, a finding with implications for responsible AI deployment. Featured: Shreya Havaldar, Eric Wong, Lyle Ungar.
