Reinforcement Learning and Gradient Boosting for Dynamic Pricing in Configure-Price-Quote Systems: A Multi-Vertical Empirical Study
Corresponding author: Rajesh Soma
Abstract
Most enterprise Configure, Price, Quote (CPQ) deployments still run on deterministic rule engines designed for a simpler era of product catalogs and stable pricing environments. As catalogs expand and buyer expectations shift, these systems increasingly become a source of friction rather than velocity. This study builds and evaluates ML-CPQ, a six-layer Machine Learning (ML) system that tackles CPQ's three core bottlenecks (configuration accuracy, pricing intelligence, and approval latency) using a combination of gradient boosting, Proximal Policy Optimization (PPO) reinforcement learning, and transformer-based Natural Language Processing (NLP). The system was trained and tested on a synthetic dataset of 14,200 sales quotes, constructed from calibrated statistical distributions derived from published CPQ failure-mode rates and practitioner benchmarks and spanning the manufacturing, enterprise SaaS, and telecommunications verticals. ML-CPQ was compared against a representative rule-based baseline for each vertical. The improvements were substantial and consistent: quote generation time dropped by 51.7%, the configuration error rate declined from 8.0% to 2.89%, approval cycle time shortened by 61.9%, and average revenue per closed-won deal increased by 4.6%. These results are reported in detail, including vertical-level breakdowns and an ablation study that isolates each component's contribution. In addition, the study documents practical obstacles in data quality, model explainability, and sales team adoption, which are often underreported relative to headline performance numbers.
Keywords:
configure price quote, CPQ automation, machine learning, dynamic pricing, reinforcement learning, sales automation
License
Copyright (c) 2026 Rajesh Soma

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
