KDD Paper in Production: How Tsinghua × Meituan “Learned” Order Pooling from Skilled Couriers, Boosting Peak Efficiency by 45-55%

Meal delivery has an underestimated efficiency lever: order pooling, in which one courier delivers several orders headed in similar directions at the same time. Done well, couriers deliver 2-3 more orders per hour and platform capacity utilization rises sharply. Done poorly, every order in the batch risks arriving late. This “many-to-one” matching problem is NP-hard, and at Meituan’s scale of 70 million daily orders, solving it exactly in real time is impractical.

A joint team from Meituan and Tsinghua University (Yile Liang, Jiuxia Zhao, Donghui Li, Jie Feng, Chen Zhang, and colleagues) proposed an elegant alternative: rather than solving for the theoretically optimal pooling, learn the efficient pooling patterns that experienced couriers (“skilled couriers”) arrive at “by intuition,” using their actual delivery trajectories. The system, named SCDN, is deployed in Meituan’s production dispatch system. Online tests show a 45-55% improvement in courier efficiency during the lunch peak while maintaining on-time delivery rates.

Core Insight: Skilled Couriers Are the Best “Algorithm”

The paper’s starting point is supremely practical. Among Meituan’s 6.24 million couriers (1 million+ active daily), significant skill stratification exists. Experienced couriers know their zones intimately — commercial district layouts, elevator wait times, residential compound entry points, even how fast different restaurants prepare food. They’ve naturally developed efficient order pooling strategies — which restaurants’ orders can be picked up together, which neighborhoods can be delivered sequentially. This “tacit knowledge” is embedded in their delivery trajectories.

The problem: this knowledge is unstructured and personal — platform algorithms can’t directly utilize it. Traditional pooling algorithms based on simple distance and time-window calculations miss the “environmental awareness” that skilled couriers possess. Two restaurants that seem far apart might be on different floors of the same commercial building; two seemingly opposite delivery addresses might be efficiently combined via a shortcut that doesn’t appear on navigation apps.

Technical Approach: From Trajectories to Knowledge Graph — The SCDN Framework

The Skilled Courier Delivery Network (SCDN) framework converts tacit knowledge into computable system capability in three steps:

Step 1: Build a heterogeneous attributed network. All skilled couriers’ historical delivery trajectories are transformed into a graph network — nodes represent commercial districts, restaurants, and delivery locations; edges represent actual courier movements between them. Edges carry rich attributes: movement frequency, time-period distribution, and pooling patterns (which nodes are frequently visited by the same courier in the same trip). This network is essentially an “experience knowledge graph” of the entire city’s delivery environment.
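
As a minimal sketch of Step 1, the snippet below turns a few trip records into the three edge attributes named above: movement frequency, time-period distribution, and same-trip co-visits. The `trips` records, the node IDs, and the `build_network` helper are hypothetical illustrations, not the paper’s data schema or code.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical trip records: each trip is an ordered list of
# (node_type, node_id, hour_of_day) stops made by one skilled courier.
trips = [
    [("restaurant", "R1", 11), ("restaurant", "R2", 11),
     ("dropoff", "D1", 12), ("dropoff", "D2", 12)],
    [("restaurant", "R1", 12), ("restaurant", "R2", 12), ("dropoff", "D1", 12)],
    [("restaurant", "R3", 18), ("dropoff", "D3", 18)],
]


def build_network(trips):
    """Aggregate trips into typed nodes and attributed edges."""
    node_types = {}                      # node_id -> node type (heterogeneous nodes)
    move_freq = Counter()                # (src, dst) -> number of observed moves
    move_hours = defaultdict(Counter)    # (src, dst) -> histogram of departure hours
    co_visits = Counter()                # sorted (a, b) -> trips containing both nodes

    for trip in trips:
        for node_type, node_id, _hour in trip:
            node_types[node_id] = node_type
        # Consecutive stops become directed movement edges.
        for (_, src, hour), (_, dst, _) in zip(trip, trip[1:]):
            move_freq[(src, dst)] += 1
            move_hours[(src, dst)][hour] += 1
        # Any two nodes served in the same trip count as one pooling co-visit.
        unique_ids = sorted({node_id for _, node_id, _ in trip})
        for a, b in combinations(unique_ids, 2):
            co_visits[(a, b)] += 1

    return node_types, move_freq, move_hours, co_visits


node_types, move_freq, move_hours, co_visits = build_network(trips)
print(move_freq[("R1", "R2")], dict(move_hours[("R1", "R2")]), co_visits[("D1", "R1")])
```

Keeping the co-visit counts separate from the movement counts preserves the distinction between “traveled directly between” and “served in the same trip,” which is the signal the pooling step relies on.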

Step 2: Enhanced GATNE embedding. Building on GATNE (a heterogeneous network embedding method from prior Tsinghua KDD work), the paper enhances it for food delivery scenarios. Through graph representation learning, each network node is encoded into a low-dimensional vector. The key innovation: these vectors encode skilled couriers’ “environmental knowledge.” If two locations are frequently visited together by skilled couriers in the same delivery trip, their vectors will be close in embedding space — even if geographically distant.
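
The proximity property in Step 2 can be illustrated without GATNE itself. The sketch below is a deliberately simple stand-in, not the paper’s method: each location is represented by the trips it appears in, and cosine similarity over those incidence vectors is compared with straight-line distance. The trips, coordinates, and function names are invented for illustration.

```python
import math

# Hypothetical data: each trip is the set of locations one courier served together,
# and coords_km gives made-up planar coordinates in kilometers.
trips = [{"A", "B"}, {"A", "B", "C"}, {"A", "B"}, {"C", "D"}, {"D", "E"}]
coords_km = {
    "A": (0.0, 0.0),
    "B": (3.0, 4.0),   # 5 km from A, but always pooled with A
    "C": (0.3, 0.4),   # 0.5 km from A, but rarely pooled with A
    "D": (6.0, 8.0),
    "E": (6.0, 9.0),
}


def incidence_vector(location, trips):
    """1.0 for every trip that includes the location, else 0.0."""
    return [1.0 if location in trip else 0.0 for trip in trips]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0


def distance_km(p, q):
    """Straight-line distance between two locations."""
    return math.dist(coords_km[p], coords_km[q])


vectors = {loc: incidence_vector(loc, trips) for loc in coords_km}
sim_ab = cosine(vectors["A"], vectors["B"])
sim_ac = cosine(vectors["A"], vectors["C"])

print(f"A-B: {distance_km('A', 'B'):.1f} km apart, co-visit similarity {sim_ab:.2f}")
print(f"A-C: {distance_km('A', 'C'):.1f} km apart, co-visit similarity {sim_ac:.2f}")
```

In this toy data, A and B are far apart on the map yet score as near-identical, while the geographically adjacent A and C score much lower. A learned embedding such as the one described above aims to capture the same kind of relationship in far fewer dimensions.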

Step 3: Real-time pruning and pooling. The low-dimensional vectors sharply shrink the pooling search space. When a new order arrives, the system computes the vector similarity between that order’s location and the locations of currently unassigned orders, an operation that completes in milliseconds. The dispatcher thereby replaces an exhaustive combinatorial search with an approximate nearest-neighbor lookup that narrows each order to a short list of pooling candidates, which makes high-quality pooling feasible under real-time constraints.
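
The pruning idea in Step 3 can be sketched as a top-k similarity lookup. The example below assumes node embeddings already exist and uses an exact brute-force scan with cosine similarity. The embeddings, the `pooling_candidates` helper, and the `k` and `min_sim` parameters are all hypothetical, and a system at production scale would more likely use an approximate nearest-neighbor index than a full scan.

```python
import heapq
import math

# Hypothetical 3-dimensional embeddings of the locations of unassigned orders.
unassigned = {
    "order_1": [0.9, 0.1, 0.0],
    "order_2": [0.8, 0.2, 0.1],
    "order_3": [0.0, 1.0, 0.0],
    "order_4": [0.1, 0.0, 0.9],
}


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pooling_candidates(new_vec, unassigned, k=2, min_sim=0.8):
    """Return up to k order IDs most similar to new_vec, best first.

    Orders scoring below min_sim are dropped, so the list may be shorter than k.
    """
    scored = ((cosine(new_vec, vec), order_id) for order_id, vec in unassigned.items())
    top = heapq.nlargest(k, scored)
    return [order_id for sim, order_id in top if sim >= min_sim]


new_order_vec = [1.0, 0.1, 0.0]   # embedding of the incoming order's location
candidates = pooling_candidates(new_order_vec, unassigned)
print(candidates)
```

Only the shortlisted orders would then go through the more expensive pooling evaluation, which is where the reduction in search space comes from.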

“Scale-Effect Hotspots”: A Strategic Discovery Beyond Order Pooling

An additional finding carries significant operational value. By analyzing embedding vector clustering patterns, the system automatically identified “Scale-Effect Hotspot Areas” — zones where high commercial density, order directional convergence, and delivery route overlap naturally favor high-density pooled delivery.

The strategic significance extends beyond algorithms: platforms can deliberately cultivate and strengthen these hotspot areas’ scale effects through capacity scheduling, courier incentives, and merchant acquisition strategies. Examples include raising courier quotas in hotspot zones, giving zone merchants traffic boosts, and optimizing pickup flow design. This turns passive “hotspot discovery” into active “hotspot cultivation,” converting algorithmic insights into operational strategy.

Production Deployment Results: 45-55% Peak Efficiency Gain

Unlike many academic papers limited to simulation, SCDN is deployed in Meituan’s production dispatch system. Online A/B test results:

Courier efficiency up 45-55% during lunch peak (11:00-13:00)
Order pooling quality and coverage dramatically improved — more orders successfully pooled with better combined routes
On-time delivery rate maintained — efficiency gains didn’t sacrifice user experience
All stakeholders (couriers, consumers, platform) satisfied

A 45-55% efficiency improvement is staggering. It means that during the lunch peak, couriers complete roughly half again as many deliveries per hour on average. With 1 million+ daily active couriers and a 2-hour lunch peak, a gain of that size applied across the active fleet would release capacity equivalent to hundreds of thousands of additional couriers, with no additional labor cost.
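
The capacity estimate can be checked with back-of-envelope arithmetic. The sketch below uses only the figures quoted in this article and assumes, hypothetically, that every daily active courier works the lunch peak and that the gain applies uniformly, so the result is best read as an upper bound. The function name is illustrative.

```python
# Figures quoted in the article; the uniform-gain assumption is ours.
ACTIVE_COURIERS = 1_000_000      # "1 million+" daily active couriers, used as a floor
GAIN_PERCENT_RANGE = (45, 55)    # reported lunch-peak efficiency gain, in percent


def courier_equivalents(couriers: int, gain_percent: int) -> int:
    """Extra baseline couriers needed to match the same output without the gain.

    A courier who is gain_percent more productive does the work of
    (1 + gain_percent / 100) baseline couriers, so the surplus across the
    fleet is couriers * gain_percent / 100. Integer math avoids float noise.
    """
    return couriers * gain_percent // 100


low, high = (courier_equivalents(ACTIVE_COURIERS, g) for g in GAIN_PERCENT_RANGE)
print(f"{low:,} to {high:,} courier-equivalents during the peak window")
```

Under these assumptions the range is 450,000 to 550,000 courier-equivalents, consistent with “hundreds of thousands.” The true figure would be lower if only part of the active fleet works the lunch peak.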

Implications for the Logistics Industry

1. Frontline workers’ experience is the most undervalued data asset. The paper’s core approach — extracting knowledge from skilled workers’ behavior — applies to any labor-intensive logistics scenario. Veteran warehouse workers’ “intuition” about bin layouts, experienced drivers’ route “knowledge,” senior dispatchers’ situational “anticipation” — all this tacit knowledge can be converted into system capability through similar methods. The key: first record behavioral data, then extract knowledge patterns.

2. Graph representation learning is a powerful weapon for logistics optimization. Logistics networks — delivery, warehouse, or supply chain — are inherently graph-structured. Embedding nodes and edges into low-dimensional spaces efficiently captures hidden relationships that traditional methods miss. Logistics companies should track graph neural network (GNN) development closely.

3. Operations strategy should grow from algorithmic insights. The “Scale-Effect Hotspot” discovery demonstrates a new operational methodology: first use data and algorithms to discover patterns, then formulate strategies accordingly — rather than the traditional “decide strategy first, find data to validate.” This “data-driven operations” mindset is redefining logistics management methodology.

4. Latency is the first constraint in production deployment. The paper chose linear vector operations over complex deep models for real-time pooling decisions — reflecting deep engineering pragmatism. At 260 orders per minute, any computation exceeding 100ms means delays and backlogs. Academia’s “best performance” and industry’s “best performance-to-latency ratio” are entirely different objectives.

Source: Liang, Y., Zhao, J., Li, D., Feng, J., Zhang, C., Ding, X., Hao, J., & He, R. “Harvesting Efficient On-Demand Order Pooling from Skilled Couriers: Enhancing Graph Representation Learning for Refining Real-time Many-to-One Assignments.” KDD (ACM SIGKDD Conference on Knowledge Discovery and Data Mining). | Meituan + Tsinghua University
