Various kinds of information are collected from IoT devices over the network. As such devices become more familiar to users, services are required to take the influence of users into account. However, in an environment where people with various preferences coexist, it is difficult to set actuator parameters that build consensus among all users. The conventional method minimizes power consumption under constraints on user stress. However, its calculation overhead grows as the number of devices and users increases. In this study, we propose a device control method based on consensus building with reinforcement learning. In the proposed method, the state is reduced by applying reinforcement learning, which lowers the calculation overhead. Our evaluation shows that the proposed method obtains device parameters that improve the reward by a factor of 1.5 compared with the conventional method. Moreover, it achieves a reward value equal to 98.6% of the optimum value.
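To make the idea of learning a consensus setting concrete, the following is a minimal sketch and not the method evaluated in this study. The abstract does not name a specific reinforcement learning algorithm, so tabular Q-learning with a uniformly random behaviour policy is swapped in for illustration. The single actuator, the setpoint range, the user preferences, the power weight, and the reward shape (negative user stress minus a weighted power term) are all assumptions introduced here.

```python
import random

# All values below are hypothetical; none are taken from the paper.
SETPOINTS = list(range(18, 29))   # candidate actuator settings (e.g. deg C)
PREFERENCES = [22, 24, 27]        # each user's preferred setting
POWER_WEIGHT = 0.1                # weight of the power term in the reward
ACTIONS = (-1, 0, 1)              # lower, keep, or raise the setting by one step


def reward(setpoint):
    """Negative cost: total user stress plus a weighted power term."""
    stress = sum(abs(setpoint - p) for p in PREFERENCES)
    # Assumption: settings further below the maximum consume more power.
    power = POWER_WEIGHT * (max(SETPOINTS) - setpoint)
    return -(stress + power)


def step(state, action):
    """Apply an action, clipping at the ends of the setpoint range."""
    nxt = min(max(state + action, 0), len(SETPOINTS) - 1)
    return nxt, reward(SETPOINTS[nxt])


def train(episodes=5000, steps=20, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning; actions are sampled uniformly at random (off-policy)."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in SETPOINTS]
    for _ in range(episodes):
        state = rng.randrange(len(SETPOINTS))
        for _ in range(steps):
            a = rng.randrange(len(ACTIONS))
            nxt, r = step(state, ACTIONS[a])
            # Standard Q-learning update toward the bootstrapped target.
            q[state][a] += alpha * (r + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def greedy_setpoint(q, start=0, steps=20):
    """Follow the greedy policy from a start state and return the final setting."""
    state = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _ = step(state, ACTIONS[a])
    return SETPOINTS[state]


q_table = train()
learned = greedy_setpoint(q_table)
# Brute-force optimum; feasible only because this toy problem is tiny.
best = max(SETPOINTS, key=reward)
print("learned setting:", learned, "exhaustive optimum:", best)
```

In this toy setting the exhaustive search over `SETPOINTS` is trivial, but its cost grows combinatorially once several devices are controlled jointly, which is the overhead the abstract attributes to the conventional method. The learned policy, by contrast, only needs a table lookup per step after training.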