How Meituan Uses Gaussian Mixture Models to Optimize Food Delivery: New Research from Tsinghua University

2026/02/27

1. Research Background: The Challenge of Uncertainty in Food Delivery

In today’s rapidly evolving on-demand delivery industry, food delivery platforms face massive order scheduling decisions every day. Platforms like Meituan, Deliveroo, and DoorDash must complete the entire process from order acceptance to dispatch and delivery within minutes. However, the real world is full of uncertainties—fluctuations in restaurant preparation times, changes in rider traffic conditions, and customer reception delays all make delivery optimization exceptionally complex.

Among these variables, service time (the duration from a rider’s arrival at the restaurant to pickup and departure) is a critical yet difficult-to-predict factor. Traditional methods often estimate service time using fixed values or simple statistics, but this ignores its inherent randomness and multi-modal distribution characteristics. A research team from Tsinghua University’s Department of Automation, in collaboration with Meituan, has proposed a Gaussian Mixture Model-based approach to service time modeling, offering a new solution to this problem.

The research makes three core contributions: the first application of Gaussian Mixture Models (GMM) to food delivery service time modeling, a Hybrid Estimation of Distribution Algorithm (HEDA) for efficiently solving for the GMM parameters, and validation through online A/B testing on Meituan’s real platform. Results show that introducing the uncertainty model improved overall delivery efficiency, shortened riders’ average delivery time, and raised customer satisfaction.

2. Problem Definition: Formal Modeling of Stochastic Service Time

Food delivery service time is influenced by multiple factors: restaurant type (fast food vs. full service), time of day (peak vs. off-peak), weather conditions, rider experience, etc. These factors cause service time to exhibit complex multi-modal distribution characteristics—weekday lunch peaks follow one pattern, weekend dinners another, and rainy days yet another.

Objective Function: Maximize the log-likelihood of the observed service times under the mixture model

$$\max_{\theta} \sum_{i=1}^{N} \log \left( \sum_{k=1}^{K} \pi_k \cdot \mathcal{N}(x_i \mid \mu_k, \sigma_k^2) \right)$$

where $\theta = \{\pi_k, \mu_k, \sigma_k^2\}_{k=1}^K$ are GMM parameters, $\pi_k$ is the weight of the $k$-th Gaussian component, and $\mathcal{N}$ is the Gaussian distribution function.

Constraints:

  • Weights sum to 1: $\sum_{k=1}^K \pi_k = 1$
  • Weights non-negative: $\pi_k \geq 0$
  • Variance positive: $\sigma_k^2 > 0$
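
To make the objective and its constraints concrete, the following is a minimal sketch of how the log-likelihood could be evaluated for one-dimensional service times. The function name `gmm_log_likelihood` and the use of NumPy are assumptions for illustration and do not come from the paper.

```python
import numpy as np

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of 1-D observations x under a Gaussian mixture."""
    x = np.asarray(x, dtype=float)[:, None]           # shape (N, 1)
    weights = np.asarray(weights, dtype=float)         # shape (K,)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Enforce the three constraints: weights sum to 1, weights >= 0, variances > 0.
    if (not np.isclose(weights.sum(), 1.0)
            or np.any(weights < 0)
            or np.any(variances <= 0)):
        raise ValueError("infeasible GMM parameters")

    # log(pi_k * N(x_i | mu_k, sigma_k^2)) for every observation i and component k.
    with np.errstate(divide="ignore"):                 # a zero weight gives log(0) = -inf
        log_comp = (np.log(weights)
                    - 0.5 * np.log(2.0 * np.pi * variances)
                    - 0.5 * (x - means) ** 2 / variances)   # shape (N, K)

    # Numerically stable log-sum-exp over components, then sum over observations.
    row_max = log_comp.max(axis=1, keepdims=True)
    log_mix = row_max[:, 0] + np.log(np.exp(log_comp - row_max).sum(axis=1))
    return float(log_mix.sum())

# Hypothetical parameters: two components with means of 5 and 12 minutes.
ll = gmm_log_likelihood([4.5, 5.2, 11.8], [0.6, 0.4], [5.0, 12.0], [1.0, 4.0])
```

The log-sum-exp step avoids underflow when an observation lies far from every component, which matters for long service times in the tail.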

This problem poses three challenges. First, the optimal number of components $K$ is unknown and must be determined automatically. Second, the objective function is non-convex, with multiple local optima. Third, predictions are needed online in real time, which demands high computational efficiency.

3. Methodology: Gaussian Mixture Models and Hybrid Estimation of Distribution Algorithm

A Gaussian Mixture Model (GMM) is a probabilistic model that assumes the data are generated by a weighted combination of several Gaussian distributions. The research team recast service time distribution estimation as a clustering problem, learning the GMM parameters by determining the probability that each data point belongs to each component. The advantage is that no single distributional form has to be assumed for service time in advance; the data determine the shape of the mixture.
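
The clustering view can be illustrated with the membership probabilities (often called responsibilities) that a GMM assigns to each observation. The sketch below assumes one-dimensional service times in minutes; the function name and all parameter values are hypothetical.

```python
import numpy as np

def responsibilities(x, weights, means, variances):
    """Probability that each observation belongs to each Gaussian component."""
    x = np.asarray(x, dtype=float)[:, None]                      # shape (N, 1)
    w, mu, var = (np.asarray(a, dtype=float) for a in (weights, means, variances))
    # Weighted component densities pi_k * N(x_i | mu_k, sigma_k^2), shape (N, K).
    dens = w * np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    # Normalize each row so the memberships of one observation sum to 1.
    return dens / dens.sum(axis=1, keepdims=True)

# Hypothetical two-component mixture: a fast pattern and a slow pattern.
r = responsibilities([4.0, 13.0],
                     weights=[0.6, 0.4],
                     means=[5.0, 12.0],
                     variances=[1.0, 4.0])
# Row 0 (4 minutes) falls almost entirely in the first component,
# row 1 (13 minutes) almost entirely in the second.
```

These soft assignments are what an EM-style procedure recomputes at every iteration before updating the weights, means, and variances.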

To efficiently solve for GMM parameters, the team proposed Hybrid Estimation of Distribution Algorithm (HEDA), containing four key innovations:

1. Problem-Specific Encoding and Decoding Methods: Researchers designed an encoding scheme specifically tailored for clustering problems, transforming complex parameter optimization into more manageable representations. This encoding ensures solution feasibility while simplifying the search space.

2. Chinese Restaurant Process (CRP)-Based Initialization Mechanism: CRP is a non-parametric Bayesian method that automatically determines cluster numbers rather than requiring pre-specification. Through CRP initialization, the algorithm generates high-quality initial solutions, laying a good foundation for subsequent optimization.

3. Weighted Learning Mechanism: During algorithm iteration, solutions of different qualities contribute differently to probability model updates. The weighted learning mechanism effectively utilizes information from high-quality solutions, guiding search in better directions.

4. Maximum Likelihood-Based Local Intensification: Building on global search, the algorithm incorporates a local search mechanism that further exploits high-quality solution neighborhoods through maximum likelihood estimation, improving solution precision.
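
The Chinese Restaurant Process in item 2 can be sketched as a simple sampling procedure in which the number of clusters grows with the data instead of being fixed in advance. The paper's exact initialization is not reproduced here; the function name, the concentration parameter `alpha`, and the sample size are assumptions.

```python
import numpy as np

def crp_partition(n_points, alpha, rng):
    """Sample cluster labels for n_points from a Chinese Restaurant Process."""
    labels = np.empty(n_points, dtype=int)
    counts = []  # counts[k] = number of points already assigned to cluster k
    for i in range(n_points):
        # An existing cluster k is chosen with probability counts[k] / (i + alpha);
        # a new cluster is opened with probability alpha / (i + alpha).
        probs = np.array(counts + [alpha], dtype=float) / (i + alpha)
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)       # open a new cluster
        else:
            counts[k] += 1         # join an existing cluster
        labels[i] = k
    return labels

rng = np.random.default_rng(0)
labels = crp_partition(1000, alpha=1.0, rng=rng)
n_clusters = int(labels.max()) + 1   # number of clusters emerges from the sampling
```

A larger `alpha` opens new clusters more readily, so it acts as a soft control on how many components the initial solutions contain.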

Compared to traditional EM algorithms, HEDA’s advantages include: (1) automatic determination of the component number K without manual tuning; (2) strong global search capability, making it less prone to falling into local optima; (3) high computational efficiency, suitable for large-scale data.
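
For contrast, the conventional route that HEDA is compared against fits a GMM with EM for each candidate K and then picks K by an information criterion. The sketch below shows that baseline with scikit-learn's `GaussianMixture`; it is not HEDA, and the synthetic data and candidate range are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(42)
# Synthetic service times in minutes: a fast pattern and a slow pattern.
x = np.concatenate([rng.normal(5.0, 1.0, 1200), rng.normal(12.0, 2.0, 800)])
X = x.reshape(-1, 1)                      # scikit-learn expects a 2-D array

bic_by_k = {}
for k in range(1, 6):
    # EM with a fixed number of components; n_init restarts reduce the
    # risk of stopping in a poor local optimum.
    gm = GaussianMixture(n_components=k, n_init=3, random_state=0).fit(X)
    bic_by_k[k] = gm.bic(X)               # lower BIC indicates a better trade-off

best_k = min(bic_by_k, key=bic_by_k.get)
```

Every candidate K requires its own EM run here, which is the manual tuning loop that automatic determination of K is meant to remove.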

4. Experimental Validation: Offline Testing and Online A/B Testing

Offline Experiment Design: The team validated the method on real delivery data from Meituan’s June 2021 operations, containing service time records from approximately 5 million orders. The data was split into a training set (first 3 weeks) and a test set (last week). The baseline methods were: (1) a single Gaussian model; (2) fixed-K GMM (K=3,5,7); (3) histogram estimation.

Main Results: The HEDA algorithm outperformed the baselines across multiple metrics. Its Bayesian Information Criterion (BIC) score was 15.3% lower than that of the best baseline, indicating better model fit; log-likelihood improved by 12.7%, indicating more accurate probability estimation; and Mean Absolute Error (MAE) fell from 4.2 minutes to 3.1 minutes, a 26% improvement in prediction accuracy.
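
The 26% figure follows directly from the two MAE values. A small sketch of the arithmetic, with helper names chosen for illustration:

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between observed and predicted service times."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_reduction(before, after):
    """Fractional improvement when an error metric drops from before to after."""
    return (before - after) / before

# MAE values reported in the article, in minutes.
mae_drop = relative_reduction(4.2, 3.1)   # (4.2 - 3.1) / 4.2, roughly 0.26
```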

Online A/B Testing: In July 2021, Meituan conducted a three-week A/B test across 3 cities. The experimental group used the GMM-based uncertainty model to assist order dispatching decisions, while the control group used traditional deterministic methods. Results showed: experimental group riders’ average delivery time shortened by 8.5%, order on-time rate improved by 6.2 percentage points, and customer satisfaction increased by 3.8 percentage points.
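
The article does not report sample sizes or significance levels for the A/B test. As an illustration of how a difference in on-time rate could be checked, the sketch below runs a two-proportion z-test; the order counts are hypothetical and were chosen only to produce a 6.2 percentage point gap.

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical counts: control 85.0% on time, experiment 91.2% on time.
z, p_value = two_proportion_z(8500, 10000, 9120, 10000)
```

With real data, the same check would be run per city as well as pooled, since a three-city test can hide opposite effects in individual markets.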

Case Study: The team conducted an in-depth analysis of a typical case—weekday lunch peak in a commercial district. Traditional methods estimated service time as 8 minutes (fixed value), but the GMM model identified two distinct patterns: fast food restaurants (mean 5 minutes, weight 60%) and full-service restaurants (mean 12 minutes, weight 40%). Based on this insight, the dispatch system allocated tighter delivery time windows for fast food orders and more relaxed windows for full-service orders, improving overall delivery efficiency by 11%.
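
The case study numbers show why a single fixed value misleads: the weighted mean of the two patterns is close to the 8-minute estimate, yet neither restaurant group actually behaves that way. A short check of the arithmetic:

```python
# Component weights and means (minutes) from the case study.
weights = [0.6, 0.4]    # fast food, full service
means = [5.0, 12.0]

# Mixture mean: 0.6 * 5 + 0.4 * 12, which is about 7.8 minutes.
mixture_mean = sum(w * m for w, m in zip(weights, means))

# Gap between the fixed 8-minute estimate and each pattern's own mean.
gaps = [abs(8.0 - m) for m in means]    # 3 minutes and 4 minutes
```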

5. Critique and Limitations: Rational Academic Perspective

1. Research Assumption Limitations: GMM assumes service time follows a mixture of multiple Gaussian distributions, but under certain extreme scenarios (severe weather, unexpected events), service time distribution may severely deviate from Gaussian assumptions, exhibiting long-tail or skewed distributions. Additionally, the model assumes service time distribution patterns are stable in the short term, but in reality, they may drift due to restaurant menu changes, rider turnover, and other factors.

2. Methodological Boundary Conditions: HEDA’s computational complexity is relatively high. Although it compares favorably with traditional EM algorithms, delivery platforms that make decisions at millisecond latency still have to balance accuracy against efficiency. The study mitigates this with an offline-training, online-lookup strategy, but the weekly offline model updates may not capture distribution changes quickly enough.

3. Experimental Design Shortcomings: The offline experiments used historical data, which carries selection bias: only the service times of orders that actually occurred are observed, not “counterfactual” scenarios (what would have happened under a different dispatch strategy). The online A/B test covered only 3 cities, which limits how representative the sample is, and its three-week duration leaves long-term effects unknown (e.g., behavioral changes once riders adapt).

4. External Validity Concerns: This study uses data from China’s largest food delivery platform; whether conclusions generalize to other scenarios is uncertain. For instance, Western delivery platforms (UberEats, DoorDash) may have different restaurant types, delivery distances, and rider models; fresh food delivery and express logistics have different timeliness requirements and service time distribution characteristics.

6. Practical Implications: Implementation Guide for Supply Chain Practitioners

1. Technical Implementation Path:

  • Data Preparation: Requires at least 1 month of historical delivery data with the following fields: order ID, restaurant ID, rider ID, arrival time at restaurant, pickup departure time, restaurant type, time of day, weather conditions, etc. A minimum of 500,000 orders is recommended so that the GMM is trained on a statistically adequate sample.
  • Technology Stack: Python 3.8+ (data processing), scikit-learn 1.0+ (GMM baseline implementation), PyTorch 1.9+ (custom HEDA algorithm), Redis (online lookup caching). Server specifications: 16-core CPU, 64GB RAM, supporting 50,000+ service time prediction requests per second.
  • Implementation Steps: Step 1: Clean historical data, removing anomalies (service time 60 minutes); Step 2: Train GMM model, select optimal K using BIC criterion; Step 3: Validate model prediction accuracy (MAE target <4 minutes); Step 4: Deploy online service, integrate into dispatch system; Step 5: Set up A/B testing, validate before full deployment.
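
The deployment step can be sketched as a lookup of pre-trained mixture parameters with a fixed-rule fallback. This is a minimal illustration under stated assumptions: a plain dictionary stands in for the Redis cache, and the key layout, the `GmmParams` class, and the 8-minute fallback are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class GmmParams:
    """Offline-trained mixture parameters for one lookup key."""
    weights: Tuple[float, ...]
    means: Tuple[float, ...]       # minutes
    variances: Tuple[float, ...]   # minutes squared

    def mean(self) -> float:
        # Expected service time of the mixture: sum of weight * component mean.
        return sum(w * m for w, m in zip(self.weights, self.means))

FALLBACK_MINUTES = 8.0  # hypothetical fixed rule used when no model is available

def expected_service_time(table: Dict[Tuple[str, str], GmmParams],
                          restaurant_type: str, time_bucket: str) -> float:
    """Return the model-based estimate, or the fixed rule if the key is missing."""
    params = table.get((restaurant_type, time_bucket))
    return params.mean() if params is not None else FALLBACK_MINUTES

# In production this table would be loaded from the cache after each retraining.
table = {("fast_food", "weekday_lunch"): GmmParams((1.0,), (5.0,), (1.0,))}
known = expected_service_time(table, "fast_food", "weekday_lunch")
unknown = expected_service_time(table, "full_service", "weekend_dinner")
```

Keeping the fallback inside the lookup function means the dispatch system always receives a number, even when a key has not been trained yet.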

2. Implementation Cost and ROI Estimation:

  • Development Cost: Requires 1 algorithm engineer (2 months), 1 backend engineer (1 month), 1 data engineer (2 weeks). At tier-1 city salaries, labor costs approximately 500,000-800,000 RMB.
  • Operational Cost: Server costs about 10,000-20,000 RMB/month, weekly model retraining requires additional 5,000 RMB in compute resources.
  • Expected Returns: For a delivery platform with 100 million monthly orders, the 8.5% reduction in average delivery time is estimated to save approximately 8.5 million RMB/month in rider costs (at 8 RMB/order). This estimate is conservative: a fully proportional saving would be 100 million × 8 RMB × 8.5% = 68 million RMB/month. After development and operational costs, net benefit is about 8 million RMB/month. Investment payback period: 1-2 months.

3. Applicable Scenarios and Enterprise Types:

  • High-Applicability Scenarios: On-demand delivery (food, groceries, pharmaceuticals), ride-hailing dispatch, sharing economy platforms, dynamic service pricing.
  • Enterprise Scale Recommendation: Medium-to-large enterprises with 100,000+ daily orders are more suitable. Small enterprises with low order volumes lack sufficient historical data for effective GMM models; simplified rules are recommended.
  • Inapplicable Scenarios: Highly deterministic service time scenarios (standardized product delivery), extremely low order volume scenarios, industries with strict regulations prohibiting differentiated services.

4. Implementation Risks and Mitigation:

  • Model Drift Risk: Service time distributions may change over time. Mitigation: Weekly model retraining, set drift detection alerts (e.g., KS test).
  • Fairness Concerns: Different restaurants/riders receiving different time windows may cause dissatisfaction. Mitigation: Transparent rules (e.g., publish service time calculation formulas), set reasonable upper and lower bounds.
  • System Stability Risk: Online service failures may interrupt dispatch. Mitigation: Implement degradation strategies (switch to fixed rules during failures), multi-active deployment, real-time monitoring.
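
The drift alert mentioned under Model Drift Risk could be implemented with a two-sample Kolmogorov-Smirnov test that compares recent service times against the training window. The sketch below uses `scipy.stats.ks_2samp`; the significance threshold and the synthetic samples are assumptions.

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(reference, recent, alpha=0.01):
    """Flag drift when the two samples are unlikely to share one distribution."""
    result = ks_2samp(reference, recent)
    return bool(result.pvalue < alpha), float(result.statistic)

rng = np.random.default_rng(7)
reference = rng.normal(8.0, 2.0, 5000)   # service times from the training window
shifted = rng.normal(9.5, 2.0, 5000)     # recent window with a 1.5-minute shift

flagged, statistic = drift_detected(reference, shifted)
```

With millions of orders the test becomes sensitive to very small shifts, so a practical alert would likely also require the KS statistic itself to exceed a minimum size.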

7. Paper Citation

Title: Modeling Stochastic Service Time for Complex On-Demand Food Delivery

Authors: Jie Zheng, Ling Wang (Department of Automation, Tsinghua University); Xuetao Ding, Shengyao Wang, Jing-fang Chen, Xing Wang, Haining Duan, Yile Liang (Meituan)

Venue:

  • Journal: Complex & Intelligent Systems
  • Year: 2022
  • Volume: Vol. 8, pp. 4939-4953

Links:

  • DOI: 10.1007/s40747-022-00719-4
  • Springer Link: https://link.springer.com/article/10.1007/s40747-022-00719-4

Impact:

  • Google Scholar Citations: Approximately 65 citations as of February 2026
  • Industry Application: Fully deployed on China’s largest food delivery platform, processing 30 million daily orders
  • Academic Impact: Cited by top transportation journals like Transportation Research Part C
