Scaling web applications such as e-commerce sites in the cloud by adding or removing servers is an important practice for handling workload variations, with the goal of achieving both high quality of service (QoS) and high resource efficiency. Through extensive scaling experiments with an n-tier application benchmark (RUBBoS), we observe that scaling only hardware resources, without appropriately adapting the soft resource allocations (e.g., thread or connection pool sizes) of each server, causes significant performance degradation of the overall system by either under- or over-utilizing the bottleneck resource. We develop a dynamic concurrency management (DCM) framework that integrates soft resource allocation into system scaling management. DCM introduces a model that determines a near-optimal concurrency setting for each tier of the system based on a combination of operational queuing laws and online analysis of fine-grained measurement data. We implement DCM as a two-level actuator that scales both hardware and soft resources in an n-tier system on the fly without interrupting the running system. Our experimental results demonstrate that, under six realistic bursty workload traces, DCM achieves significantly more stable performance and higher resource efficiency than state-of-the-art hardware-only scaling solutions (e.g., Amazon EC2-AutoScale).
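
To illustrate how an operational queuing law can relate measurements to a concurrency setting, the sketch below applies Little's law (N = X · R: average number of requests in a tier equals throughput times residence time). It is not DCM's actual model, which this abstract does not specify; the function names, the even split of concurrency across servers, and the sample numbers are all assumptions made for illustration.

```python
import math


def concurrency_from_littles_law(throughput_rps: float, residence_time_s: float) -> int:
    """Estimate the concurrency a tier must sustain via Little's law, N = X * R.

    throughput_rps   -- measured tier throughput X, in requests per second
    residence_time_s -- measured time R a request spends in the tier, in seconds
    Both inputs are hypothetical measurements; the result is rounded up.
    """
    if throughput_rps < 0 or residence_time_s < 0:
        raise ValueError("throughput and residence time must be non-negative")
    return math.ceil(throughput_rps * residence_time_s)


def pool_size_per_server(tier_concurrency: int, servers: int) -> int:
    """Split a tier's concurrency evenly across its servers (an assumption)."""
    if servers < 1:
        raise ValueError("a tier needs at least one server")
    return math.ceil(tier_concurrency / servers)


# Hypothetical numbers: a tier serving 200 req/s with 0.25 s residence time.
tier_n = concurrency_from_littles_law(200.0, 0.25)
before = pool_size_per_server(tier_n, servers=4)  # pool size with 4 servers
after = pool_size_per_server(tier_n, servers=5)   # pool size after adding one
```

Under these assumed numbers the per-server pool size changes when a server is added, which is the kind of soft resource adjustment that a hardware-only scaling policy would leave untouched.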