Modares Mechanical Engineering

Reinforcement Learning and Sliding Mode Hybrid Controller for an Enhanced Inverted Pendulum on Cart System

Document Type: Original Article

Authors
Institute of Intelligent Control Systems, K. N. Toosi University of Technology, Tehran, Iran
DOI: 10.48311/mme.2025.117163.82870
Abstract
This paper presents a comprehensive control study of an inverted pendulum on a cart enhanced with torsional spring-damper dynamics and cart damping. A Linear Quadratic Regulator (LQR) was designed from a linearized model, while Model Predictive Control (MPC) and Reinforcement Learning (RL) controllers were augmented with Monte Carlo simulations to evaluate robustness and sensitivity. In the linear regime, all methods performed comparably, with stabilization times under 20 seconds and an overshoot variation of 3%. To address the nonlinear dynamics, a hybrid strategy combining Sliding Mode Control (SMC) and RL was developed; it reduced settling time and improved the ability to maintain stability under nonlinear behavior and large initial angles, up to 120-150°. The proposed SMC-RL framework achieved a success rate in stabilizing the system from diverse initial conditions, significantly outperforming the standalone controllers in transient response and adaptability. System stability was formally verified through Lyapunov analysis and empirically confirmed by Monte Carlo simulations, which showed consistent performance with minimal standard deviation across 80 randomized trials.