Adaptive streaming improves user-perceived quality by adjusting the streaming bitrate to network conditions, trading lower video bitrates for shorter stall times. Existing adaptation approaches, e.g., rate-based and buffer-based ones, either rely heavily on accurate bandwidth prediction or can be overly conservative about video bitrates. In this work, we propose a reinforcement learning approach to choose the segment quality during playback. This approach uses only buffer state information and optimizes for a measure of user-perceived streaming quality. Simulation results show that our proposed approach achieves better quality of experience (QoE) than rate-based and buffer-based approaches, as well as other reinforcement learning approaches.
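
To make the idea concrete, the following is a minimal illustrative sketch of buffer-only quality selection with tabular Q-learning. It is not the method evaluated in this work: the bitrate ladder, segment duration, buffer discretization, stall penalty, synthetic bandwidth model, and all function names are assumptions chosen only for illustration.

```python
import random

# All constants below are hypothetical, chosen for illustration only.
BITRATES_KBPS = [300, 750, 1850, 4300]  # assumed bitrate ladder
SEGMENT_SEC = 4.0                        # assumed segment duration
MAX_BUFFER_SEC = 40.0                    # assumed buffer capacity
N_BUCKETS = 10                           # number of discrete buffer states
STALL_PENALTY = 4.3                      # assumed weight on stall seconds


def buffer_state(buffer_sec):
    """Map a buffer level in seconds to a discrete state index."""
    idx = int(buffer_sec / MAX_BUFFER_SEC * N_BUCKETS)
    return min(max(idx, 0), N_BUCKETS - 1)


def step(buffer_sec, action, bandwidth_kbps):
    """Download one segment and return (new_buffer_sec, stall_sec)."""
    download_sec = BITRATES_KBPS[action] * SEGMENT_SEC / bandwidth_kbps
    # Playback stalls for however long the download outlasts the buffer.
    stall_sec = max(0.0, download_sec - buffer_sec)
    new_buffer = max(0.0, buffer_sec - download_sec) + SEGMENT_SEC
    return min(new_buffer, MAX_BUFFER_SEC), stall_sec


def reward(action, stall_sec):
    """Toy QoE proxy: bitrate in Mbps minus a penalty per stalled second."""
    return BITRATES_KBPS[action] / 1000.0 - STALL_PENALTY * stall_sec


def train(episodes=200, segments=50, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Learn a Q-table indexed by [buffer state][quality level]."""
    rng = random.Random(seed)
    n_actions = len(BITRATES_KBPS)
    q = [[0.0] * n_actions for _ in range(N_BUCKETS)]
    for _ in range(episodes):
        buffer_sec = 0.0
        for _ in range(segments):
            s = buffer_state(buffer_sec)
            if rng.random() < eps:  # explore
                a = rng.randrange(n_actions)
            else:                   # exploit the current estimate
                a = max(range(n_actions), key=lambda i: q[s][i])
            # Synthetic bandwidth; the agent never observes this value.
            bandwidth = rng.uniform(500.0, 5000.0)
            buffer_sec, stall = step(buffer_sec, a, bandwidth)
            s_next = buffer_state(buffer_sec)
            target = reward(a, stall) + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


def choose_quality(q, buffer_sec):
    """Greedy quality level for the current buffer level."""
    s = buffer_state(buffer_sec)
    return max(range(len(q[s])), key=lambda i: q[s][i])
```

In this sketch, a call such as `choose_quality(train(), 20.0)` returns an index into the assumed bitrate ladder. The agent observes only the discretized buffer level, so bandwidth affects learning solely through the resulting buffer dynamics and stalls.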