Booking.com is moving agentic AI from pilot projects to production-grade services. The company outlines a structured path that turns experimentation into operational value, starting with partner-to-guest messaging, and sets out five practical lessons for deploying agents at scale.

Across the enterprise landscape, leaders frequently report that they have tested agents but have yet to ship anything customers can rely on. The resulting gap between ambitious proofs of concept and day-to-day business impact raises an obvious concern: pilots consume time and resources, yet organizations cannot afford endless trials that fail to reach production. That reality informs the approach taken by Booking.com’s director of data and machine learning platform, Huy Dao, who is tasked with delivering tangible value from AI, including agentic systems designed for real customer needs rather than abstract experimentation.

Dao frames the company’s broader ambition as the “connected trip,” an effort to treat flights, hotels, and attractions as parts of a single, coordinated experience. Achieving that requires working across varied data sources and processes. Within that context, the team’s first agentic application focuses on a concrete problem: enabling timely, accurate communication between lodging partners and guests.

The initiative progressed by tightly aligning technology choices with the service surface used by partners every day. Rather than building stand-alone tools, the team placed the agent directly inside the existing web-based partner portal so that staff could use it where they already manage messages. That decision underscores a central theme in Booking.com’s rollout: let the use case dictate the implementation, and integrate new capabilities at the points of real work.

Technology Overview

Booking.com’s agentic services sit on a data and computing foundation assembled to support applied AI at scale. The platform includes Snowflake as a core data layer, ThoughtSpot for analytics, Astronomer and Airflow for orchestration, Immuta for access control, Arize for machine-learning observability, and AWS for cloud infrastructure. The team evaluates and uses AI models from multiple providers, including OpenAI, Amazon Bedrock, and Google Gemini, matching capabilities to specific tasks as needed.

The partner-to-guest system itself was developed internally in Python. To help the agent reason about diverse inquiries, the team employed LangGraph, an open-source agentic framework. This combination of a curated data stack, model flexibility, and a reasoning layer is designed to support reliable response generation while preserving oversight and traceability across the pipeline.
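To make the "reasoning layer" idea concrete, the flow can be pictured as a small graph of steps that share state, in the style LangGraph popularizes. The sketch below is a hand-rolled stand-in written in plain Python, not the actual LangGraph API or Booking.com's implementation; node names, the routing rules, and the keyword-based intent check are all illustrative assumptions.

```python
from typing import Callable, Dict

# A LangGraph-style state graph in miniature: each node reads and updates a
# shared state dict, and per-node routers decide which node runs next.
State = Dict[str, str]

class AgentGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[[State], State]] = {}
        self.edges: Dict[str, Callable[[State], str]] = {}
        self.entry = ""

    def add_node(self, name: str, fn: Callable[[State], State]) -> None:
        self.nodes[name] = fn

    def add_edge(self, src: str, router: Callable[[State], str]) -> None:
        self.edges[src] = router

    def run(self, state: State) -> State:
        node = self.entry
        while node != "END":
            state = self.nodes[node](state)
            node = self.edges[node](state)
        return state

def classify(state: State) -> State:
    # Crude keyword intent detection; a production agent would call an LLM here.
    state["intent"] = "parking" if "parking" in state["question"].lower() else "other"
    return state

def draft(state: State) -> State:
    state["draft"] = f"Suggested reply for a {state['intent']} question."
    return state

graph = AgentGraph()
graph.add_node("classify", classify)
graph.add_node("draft", draft)
graph.add_edge("classify", lambda s: "draft")
graph.add_edge("draft", lambda s: "END")
graph.entry = "classify"

result = graph.run({"question": "Is parking available?"})
```

The value of the graph shape is that each step stays individually observable and testable, which matches the article's emphasis on oversight and traceability across the pipeline.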

How It Works

The first step centered on the business challenge most visible to partners: responding to guest questions quickly and accurately. Before the agentic rollout, the process could stall if hotel staff needed to gather more information or were unavailable. That delay—potentially hours—was a pain point for both guests waiting on answers and partners managing inboxes.

To address that, the team launched a trusted assistant that keeps the human firmly in the loop. Called Smart Messenger, this agentic layer collects relevant partner, property, and reservation information and proposes a response that staff can review and send. The intent is not to replace partner judgment but to compress the work required for a high-quality reply from minutes to a near-instant confirmation when the suggested answer is acceptable.
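The human-in-the-loop contract described above can be reduced to a simple invariant: nothing is sent without explicit staff approval. The sketch below illustrates that contract under stated assumptions; the data structures, field names, and lookup logic are invented for illustration and do not reflect Smart Messenger's internals.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    guest_question: str
    proposed_reply: str
    approved: bool = False
    sent: bool = False

def propose_reply(question: str, property_facts: dict) -> Draft:
    # Stand-in for the agent's context-gathering and generation step:
    # find the relevant property fact and turn it into a suggested reply.
    topic = next((k for k in property_facts if k in question.lower()), None)
    text = property_facts.get(topic, "We will check and get back to you shortly.")
    return Draft(guest_question=question, proposed_reply=text)

def review_and_send(draft: Draft, staff_approves: bool) -> Draft:
    # The human stays in the loop: only an explicit approval triggers sending.
    draft.approved = staff_approves
    draft.sent = staff_approves
    return draft

facts = {
    "breakfast": "Breakfast is served from 7 to 10 am.",
    "parking": "Free parking is available on site.",
}
d = propose_reply("What time is breakfast?", facts)
d = review_and_send(d, staff_approves=True)
```

Keeping send authority in `review_and_send` rather than in the drafting step is what compresses the partner's work to a near-instant confirmation without surrendering judgment.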

As confidence increases, the system supports a second phase: delegation. With Auto-Reply, partners define custom replies and allow the agent to send immediate responses to common questions—such as parking availability—without manual intervention. This setup is particularly useful outside business hours, ensuring guests receive timely information even when staff are offline.
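The delegation phase amounts to a routing decision: if a question matches a partner-defined reply, answer instantly; otherwise hand it to a person, with an acknowledgment outside business hours. A minimal sketch of that routing follows; the topic keywords, canned replies, and business-hours window are assumptions for illustration, not Auto-Reply's actual rules.

```python
from datetime import time

# Partner-defined replies for routine topics (illustrative examples).
AUTO_REPLIES = {
    "parking": "Yes, we offer free on-site parking for all guests.",
    "check-in": "Check-in opens at 3 pm; early check-in is available on request.",
}

def route_message(question: str, now: time) -> tuple:
    """Return (channel, reply): 'auto' for an instant answer, 'human' otherwise."""
    q = question.lower()
    for topic, reply in AUTO_REPLIES.items():
        if topic in q:
            return ("auto", reply)
    # Outside an assumed 8am-8pm staffed window, acknowledge receipt
    # rather than leaving the guest waiting in silence.
    if now < time(8) or now > time(20):
        return ("human", "Thanks for your message; staff will reply in the morning.")
    return ("human", "")

channel, reply = route_message("Do you have parking?", time(23, 0))
```

Because the matcher only fires on predefined topics, the automated surface stays exactly as wide as the partner's explicit configuration, which is what makes the delegation step trustworthy.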

Booking.com reports that early experiments with the agentic approach yielded a 73% increase in partner satisfaction compared to prior messaging tools. The agent learns from historical interactions and user feedback, adjusting its outputs to improve accuracy and relevance over time. Those gains carry operational benefits: if guests receive the information they need in the first exchange, they have less reason to escalate to customer support, reducing overall support volume and associated costs.
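One simple way to picture "learning from historical interactions and user feedback" is as a score per candidate reply that partner accept/reject signals update, so better-performing replies surface first. The running-average update below is chosen purely for illustration; the article does not describe the actual learning mechanism.

```python
from collections import defaultdict

class ReplyRanker:
    """Rank candidate replies by their observed acceptance rate."""

    def __init__(self) -> None:
        self.score = defaultdict(lambda: 0.5)  # neutral prior per reply id
        self.count = defaultdict(int)

    def record(self, reply_id: str, accepted: bool) -> None:
        # Incremental mean of accept (1.0) / reject (0.0) outcomes.
        self.count[reply_id] += 1
        n = self.count[reply_id]
        outcome = 1.0 if accepted else 0.0
        self.score[reply_id] += (outcome - self.score[reply_id]) / n

    def best(self, candidates: list) -> str:
        return max(candidates, key=lambda r: self.score[r])

ranker = ReplyRanker()
for outcome in (True, True, False):   # two accepts, one reject
    ranker.record("reply_a", outcome)
ranker.record("reply_b", False)       # one reject
top = ranker.best(["reply_a", "reply_b"])
```

Whatever the real mechanism, the operational logic in the paragraph holds either way: better first replies mean fewer escalations to customer support.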

Industry Impact

The company’s experience offers a blueprint for organizations grappling with how to turn agentic AI from an idea into a dependable service. The playbook begins with a clearly scoped use case, adds a data platform capable of supporting that use case end to end, and implements the agent in a way that respects existing workflows. It then advances through measured phases that keep humans in control while enabling targeted automation where trust and outcomes justify it.

Equally important, the effort highlights the operational realities that can surface only at production scale. Dao emphasizes that performance issues such as latency emerge during live use and must be managed by simplifying architecture and platform choices. That observation reinforces the need to treat productionization as an engineering problem, not just a modeling exercise. Monitoring, access controls, and orchestration—supported here by Arize, Immuta, and Airflow—are integral to making agentic behavior dependable day after day.

The reported satisfaction improvement suggests that well-scoped agents can lift both user experience and partner efficiency. While the focus is on messaging, the underlying approach—use case first, platform second—can guide other enterprise teams seeking to move beyond pilots. The emphasis on integrating capabilities inside existing portals and tools is particularly instructive for organizations that want adoption without adding new interfaces to learn.

Future Implications

Dao expects continued development over the next 24 months, with investment directed toward generative and agentic AI that raises the quality of the travel experience rather than experimentation for its own sake. The aim is to deliver interactions that meet contemporary expectations shaped by conversational systems elsewhere, with responses that feel immediate, informed, and context-aware.

That future path returns to the lessons that grounded the initial rollout. First, select a business challenge that matters today. Second, assemble a data and orchestration stack that enables rapid iteration without compromising governance or observability. Third, test the use case carefully, keeping humans in control while the technology proves its value. Fourth, as confidence grows, delegate routine tasks to the agent where predefined responses make sense. Finally, keep scanning for opportunities to reuse the platform and refine the service as real-world constraints—such as latency—surface in production.

Underpinning these steps is a conviction that AI is not a passing trend but a practical tool that can reshape how work gets done. For Booking.com, that means using agentic systems to close the gap between a guest’s question and a reliable answer, within the partner workflows that already exist. The result is a measured approach to agent deployment—focused, observable, and tied to business outcomes—that other organizations can adapt as they seek to turn pilots into production services.