Ep. 12: The Cyborg Behavioral Scientist

About this audio content

Tomaino, Cooke, and Hoover used ChatGPT-4, Bing Copilot, and Google Gemini to execute an entire research project from initial idea to final manuscript. They documented what these systems accomplished and where they failed across six stages of academic work. The paper is a reflective, empirical probe into the limits of AI as a research collaborator. It offers a clear-eyed diagnosis of what's currently possible, what's still missing, and why the human researcher remains essential not just for quality, but for meaning.

TL;DR? AI can mimic scientific work convincingly while fundamentally misunderstanding what makes it meaningful.

Article: Tomaino, G., Cooke, A. D. J., & Hoover, J. (2025). AI and the advent of the cyborg behavioral scientist. Journal of Consumer Psychology, 35, 297–315. Available at SSRN.

Detailed notes

1. Purpose and setup

The paper sets out to examine whether Large Language Models (LLMs) can meaningfully perform the tasks involved in behavioural science research. Rather than speculate, the authors designed a practical test: conduct an entire behavioural research project using AI tools (ChatGPT-4, Bing Copilot, and Google Gemini) at every stage where possible. Their goal was to document what these systems can do, where they fall short, and what that tells us about the evolving relationship between AI and human thought in knowledge production. They call this the "cyborg behavioural scientist" model, in which AI and human roles are blended but human intervention is kept to a minimum wherever feasible.

They assessed AI performance across six canonical research stages:

* Ideation
* Literature Review
* Research Design
* Data Analysis
* Extensions (e.g., follow-up studies)
* Manuscript Writing

2. Ideation

The ideation phase tested whether LLMs could generate viable research questions. The authors used prompting sequences to elicit possible topics in consumer behaviour and asked the AIs to propose empirical research directions (a sketch of such a sequence appears after this section).

Findings:

* The AIs offered broad, somewhat vague suggestions (e.g. "digital consumption and mental health") that lacked the specificity required for testable hypotheses.
* When asked to generate more focused ideas within a chosen theme ("ethical consumption"), the outputs improved. The researchers selected a concept called "ethical fatigue": the idea that overexposure to ethical branding messages could dull their persuasive effect.
* Getting from general territory to a research-ready idea required multiple rounds of human-guided refinement. The AIs could not identify research gaps or develop theoretically sound rationales.

Conclusion: LLMs can function as brainstorming partners, surfacing domains of interest and initial directions, but they lack the epistemic grip to generate research questions that are original, tractable, and well-positioned within the literature.
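For illustration only: the paper does not publish its prompts, and the authors worked through the chat interfaces rather than an API. Still, a minimal sketch of a broad-then-narrow prompting sequence, assuming the OpenAI Python SDK, conveys the workflow; the model name, prompts, and the ask() helper are all assumptions, not the authors' materials.

```python
# Minimal sketch of a broad-then-narrow ideation prompting sequence.
# Assumes the OpenAI Python SDK (v1+); model name and prompts are
# illustrative, not the authors' actual materials.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    """Send a single user prompt and return the model's reply text."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Step 1: elicit broad topic areas in consumer behaviour.
topics = ask("Suggest five open research topics in consumer behaviour.")

# Step 2: narrow one theme into empirically testable directions.
ideas = ask(
    "Within the theme of ethical consumption, propose three specific, "
    "empirically testable hypotheses suitable for an experiment.\n\n"
    f"Context from the previous step:\n{topics}"
)
print(ideas)
```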
3. Literature review

Once a topic was selected, the authors asked the AIs to identify relevant literature, assess novelty, and suggest theoretical foundations.

Findings:

* The AIs failed to access or cite relevant academic literature. Most references were hallucinated, incorrect, or drawn from superficial sources.
* The models often praised the research idea without offering critical evaluation or theoretical positioning.
* The inability to access closed-access journals was a major barrier. Even when articles were available, the AIs rarely retrieved or interpreted them meaningfully.

Conclusion: AI cannot currently perform reliable literature reviews: its lack of access, weak interpretive depth, and tendency to hallucinate references make this stage unsuitable for unsupervised delegation.

4. Research design

The AIs were tasked with designing an experiment to test the hypothesis that ethical branding becomes less effective when consumers are overexposed to it.

Findings:

* The AI-generated designs were broadly plausible but flawed. Some included basic confounds (e.g. varying both message frequency and content type simultaneously).
* With human corrections (e.g. balancing exposure conditions, clarifying manipulations), the designs became usable.
* Stimuli generation (e.g. ethical vs. non-ethical brand statements) was one of the AIs' strongest areas: responses were realistic, targeted, and ready for use.
* The AIs failed to produce usable survey files in Qualtrics' native format (QSF). ChatGPT attempted it, but the output didn't meet the schema's requirements.

Conclusion: AI shows potential as a design assistant, especially for stimulus creation and structural ideas, but technical execution remains limited, and human researchers must ensure validity, feasibility, and proper implementation.

5. Data analysis

Here, the authors uploaded actual survey data and asked the AIs to perform statistical analysis.

Findings:

* Gemini could not handle data uploads, so only ChatGPT and Bing were tested.
* Both AIs recognised the appropriate statistical test (ANOVA) and produced plausible-looking outputs (a sketch of such an analysis follows below).
* However, the reported statistics...
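To make the delegated task concrete, here is a minimal sketch of the kind of analysis described: a one-way ANOVA comparing a persuasion rating across exposure conditions. The file name and column names are hypothetical stand-ins, not the authors' actual data.

```python
# Minimal sketch of the kind of analysis delegated to the AIs:
# a one-way ANOVA on ratings across exposure conditions.
# "survey_data.csv" and the column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("survey_data.csv")

# One array of ratings per exposure condition.
groups = [
    group["persuasion_rating"].to_numpy()
    for _, group in df.groupby("exposure_condition")
]

f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```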