Breadth-first search (BFS) is an essential
graph traversal strategy widely used in many computing
applications. Because of its irregular data access patterns,
BFS has become a non-trivial problem hard to parallelize
efficiently. In this paper, we introduce a parallelization
strategy that allows the load balancing of computation
resources as well as the execution of graph traversals in
hybrid environments composed of CPUs and GPUs. To
achieve that goal, we use a fine-grained task-based parallelization
scheme and the OmpSs programming model. We
obtain processing rates up to 2.8 billion traversed edges
per second with a single GPU and a multi-core processor.
Our study shows high processing rates are achievable
with hybrid environments despite the GPU communication
latency and memory coherence.