Adaptive charging techniques for Li-ion batteries using Reinforcement Learning
Abstract
Li-ion batteries have become a promising technology in recent years and are used everywhere, from low-end devices like mobile phones to high-end ones like electric vehicles. In most applications, the discharge of a battery is user-dependent, whereas the charging process can be optimized. The research in the literature has largely focused on techniques that fully charge a battery from zero initial charge under different optimality criteria such as charging time, temperature rise, and energy loss. However, these techniques are not strictly applicable to real-life scenarios, where the charging process rarely starts with zero initial state of charge (SOC). Moreover, depending on the specific requirements, there may be time constraints on charging. Considering these requirements, the key objective of this dissertation is to obtain a charging profile that maximizes the charge gain for an arbitrary initial SOC within a specified time limit. To meet this objective, reinforcement learning (RL) combined with multi-stage constant-current (MSCC) charging is chosen for adaptive and generalized charging of Li-ion batteries. We use the deep deterministic policy gradient (DDPG), one of the popular RL algorithms, because of its ability to perform well in continuous state-action spaces and partially observable environments. Incorporating reinforcement learning to find optimal charging profiles opens the door to many potential high-end applications of Li-ion batteries.
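To illustrate the MSCC formulation, the minimal Python sketch below shows how a trained policy (here a placeholder standing in for the DDPG actor) could select one constant current per stage within a charging time limit. It assumes a gym-style environment object exposing reset, step, and a step size dt (a matching sketch of such an environment follows the next paragraph); the stage duration and current bounds are illustrative assumptions, not the dissertation's actual configuration.

```python
import numpy as np

# Hedged sketch of multi-stage constant-current (MSCC) charging driven by a
# learned policy. `policy` stands in for a trained DDPG actor; `stage_s`,
# `i_min`, and `i_max` are assumed values chosen only for illustration.
def mscc_charge(env, policy, initial_soc, time_limit_s,
                stage_s=60.0, i_min=0.0, i_max=4.0):
    state = env.reset(initial_soc)
    t = 0.0
    while t < time_limit_s:
        # The actor outputs one continuous action per stage: a constant
        # charging current, clipped to the allowed range.
        current = float(np.clip(policy(state), i_min, i_max))
        stage_end = min(t + stage_s, time_limit_s)
        while t < stage_end:
            state, _, _ = env.step(current)
            t += env.dt
    return state[0]  # final SOC after the time limit expires
```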
The Li-ion battery is modeled by an equivalent RC circuit, which serves as the environment for the algorithm. Initially, we consider a naive model in which the initial SOC and charging time limit are fixed. The model is built in Simulink with appropriate state, reward, and action functions, and a charging profile that maximizes charge gain is obtained. The hyperparameters are analyzed and tuned to train the agent. The obtained results are compared with the simulated standard constant-current constant-voltage (CC-CV) charging method, which serves as the baseline model. The model is then extended to accommodate different initial SOCs. Two methods, namely the selective and generic policy approaches, are proposed to train the agent, which is then deployed for varying initial SOCs. Experiments show that the generic policy approach performs consistently better across different initial SOCs. The model is also trained for different charging time limits, and its performance is analyzed. It is further extended to minimize energy loss by using an appropriate reward function.
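The actual environment is built in Simulink; purely as a hedged illustration of the idea, the following Python sketch implements a first-order RC equivalent-circuit environment with Coulomb-counting SOC dynamics and a per-step charge-gain reward. All parameter values and the linear open-circuit-voltage curve are assumptions for demonstration, not the identified cell parameters used in the dissertation.

```python
import numpy as np

# Hedged sketch of a first-order RC equivalent-circuit battery environment.
# Capacity, R0, R1, C1, and the OCV curve are illustrative assumptions.
class BatteryEnv:
    def __init__(self, capacity_ah=2.0, r0=0.05, r1=0.02, c1=1000.0, dt=1.0):
        self.q = capacity_ah * 3600.0   # capacity in coulombs
        self.r0, self.r1, self.c1 = r0, r1, c1
        self.dt = dt                    # step size in seconds
        self.soc, self.v1 = 0.0, 0.0   # v1: voltage across the RC pair

    def reset(self, initial_soc):
        self.soc, self.v1 = initial_soc, 0.0
        return np.array([self.soc, self.v1])

    def ocv(self, soc):
        # Placeholder linear open-circuit-voltage curve; a real model would
        # use a lookup table identified from cell measurements.
        return 3.0 + 1.2 * soc

    def step(self, current):
        # Coulomb counting for SOC; first-order RC dynamics for v1.
        self.soc = min(1.0, self.soc + current * self.dt / self.q)
        self.v1 += self.dt * (current / self.c1 - self.v1 / (self.r1 * self.c1))
        v_term = self.ocv(self.soc) + self.v1 + current * self.r0
        # Reward here is the charge gained this step; an energy-loss variant
        # could additionally penalize the ohmic loss current**2 * r0 * dt.
        reward = current * self.dt / self.q
        return np.array([self.soc, self.v1]), reward, v_term
```

As the final comment suggests, switching the optimization target from charge gain to energy loss amounts to changing the reward term, which is how the extension described above could be realized.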
When charging for half an hour, the proposed generic model performs significantly better at low initial SOCs and almost on par with the baseline model at higher initial SOCs. For shorter charging time limits, the model consistently outperforms the baseline irrespective of the initial SOC. The proposed model is also robust to the frequency of communication between the battery management system and the smart charger.