With the advances in network speeds a single processor cannot cope anymore with the growing number of data streams from a single network card. Multicore processors come at a rescue but traditional SMP OSes, which integrate the software network stack, scale only to a certain extent, limiting an application’s ability to serve more connections while increasing the number of cores. On the other hand, kernel bypass solutions seem to scale better, but limit resource flexibility and control. We propose attacking these problems with a distributed OS design, using multiple network stacks (one per kernel) and relying on multi-queue hardware and hardware flow steering. This creates a single-socket abstraction among kernels while minimizing inter-core communication. We introduce our design, consisting of a distributed network stack, a distributed device driver, and a load-balancing algorithm. We compare our prototype, NetPopcorn, with Linux, Affinity Accept, FastSocket. NetPopcorn accepts between 5 to 8 times more connections and reduces the tail latency compared to these competitors. We also compare NetPopcorn with mTCP and observe that for high core counts, mTCP accepts only 18% more connections yet with higher tail latency than NetPopcorn.