Sorting isn't generally an operation you can easily run in parallel, because where each element ends up depends on how it compares with the others. If I may, this sounds like it might be premature optimization; have you actually confirmed via benchmarking that sorting is the slow part of your code? It's probably faster than you'd think.
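
If it helps, here is a minimal sketch of how you might check that before parallelizing anything. I'm assuming Python and a list of random floats because I don't know your language or data; `time_sort` is a made-up helper name, not something from a library, so swap in your own data and sort call.

```python
import random
import time


def time_sort(n, repeats=5):
    """Return the best wall-clock time in seconds to sort n random floats."""
    # Stand-in data: replace with a sample of your real input.
    data = [random.random() for _ in range(n)]
    best = float("inf")
    for _ in range(repeats):
        sample = list(data)  # fresh unsorted copy for every run
        start = time.perf_counter()
        sample.sort()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        print(f"n={n:>9,}: {time_sort(n):.4f} s")
```

Compare those numbers with the total run time of your program. If the sort is only a small fraction of it, the bottleneck is somewhere else and a profiler will tell you more than a parallel sort would.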