<p dir="ltr">We explore the potential of Generative Artificial Intelligence (AI) agents, built on open-access, locally hosted Large Language Models (LLMs), to replicate human survey behaviour and mode-choice preferences in scenario-based travel surveys. The aim is to establish performance and validation benchmarks for using AI agents in travel behaviour analysis, agent-based simulations, and related applications. To this end, we developed a systematic scientific approach for assessing the performance of seven open-access foundational LLMs, ranging from one billion to seventy billion parameters; the approach generalizes to creating and validating Generative AI agents in other applications.</p><p dir="ltr">The AI agents were developed with a zero-shot learning approach, using both an unrestricted sociodemographic, static prompting strategy and a dynamic, restricted sociodemographic prompting strategy. Their responses were validated against a human benchmark dataset to evaluate how effectively and reliably they capture and replicate nuanced travel behaviour.</p>
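<p dir="ltr">A minimal sketch of what zero-shot sociodemographic persona prompting for a mode-choice scenario could look like; the attribute names, scenario wording, and prompt template below are illustrative assumptions, not the authors' actual prompts or data.</p>

```python
# Hypothetical sketch: compose a zero-shot mode-choice prompt from a
# sociodemographic persona. Attribute names and wording are assumptions.

def build_zero_shot_prompt(persona: dict, scenario: dict) -> str:
    """Build a zero-shot prompt asking an LLM agent to choose a travel mode."""
    persona_lines = "\n".join(f"- {key}: {value}" for key, value in persona.items())
    options = ", ".join(scenario["modes"])
    return (
        "You are a survey respondent with the following profile:\n"
        f"{persona_lines}\n\n"
        f"Trip scenario: {scenario['description']}\n"
        f"Available modes: {options}.\n"
        "Answer with exactly one mode and nothing else."
    )

# Illustrative persona and scenario (not from the study's dataset).
persona = {"age": 34, "income": "middle", "car ownership": "yes"}
scenario = {
    "description": "10 km commute during the morning peak, light rain",
    "modes": ["car", "bus", "rail", "cycle"],
}
prompt = build_zero_shot_prompt(persona, scenario)

# In a full pipeline, `prompt` would be sent to a locally hosted LLM and the
# returned mode compared against the matching human benchmark response.
```

<p dir="ltr">In a dynamic, restricted variant of this strategy, the persona dictionary would be limited to a constrained attribute set and varied per respondent before each query.</p>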