If a patch touches lots of files, the action can take quite a while to complete; we sometimes see run times upward of 40 minutes. An option to parallelize across input files would therefore be greatly appreciated! As I understand it, some form of diagnostic deduplication already happens, so the same mechanism could probably be applied to duplicate results from parallel execution as well (i.e. for diagnostics generated in shared headers).
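
To make the idea concrete, here is a rough Python sketch of what I have in mind. It is not based on the action's actual code: `Diagnostic`, `analyze_in_parallel`, and the `analyze_file` callback are all hypothetical names, and I am assuming that a diagnostic can be identified by its file, position, check name, and message. Each input file is analyzed in its own worker, and identical diagnostics reported by several workers (for example from a shared header) are kept only once.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, NamedTuple


class Diagnostic(NamedTuple):
    """Hypothetical diagnostic record; the fields are an assumption."""
    path: str
    line: int
    column: int
    check: str
    message: str


def analyze_in_parallel(
    files: Iterable[str],
    analyze_file: Callable[[str], List[Diagnostic]],
    max_workers: int = 4,
) -> List[Diagnostic]:
    """Run analyze_file on each file concurrently and drop duplicate diagnostics."""
    seen = set()
    unique: List[Diagnostic] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so the output is deterministic.
        for diagnostics in pool.map(analyze_file, files):
            for diag in diagnostics:
                if diag not in seen:  # same location, check, and message
                    seen.add(diag)
                    unique.append(diag)
    return unique


def fake_analyze(path: str) -> List[Diagnostic]:
    """Stand-in for the real per-file analysis (hypothetical)."""
    shared = Diagnostic("include/shared.h", 10, 5, "some-check", "example warning")
    own = Diagnostic(path, 1, 1, "some-check", "example warning")
    return [shared, own]


result = analyze_in_parallel(["a.cpp", "b.cpp"], fake_analyze)
for diag in result:
    print(f"{diag.path}:{diag.line}:{diag.column}: {diag.message} [{diag.check}]")
```

In this toy run both files report the same warning in `include/shared.h`, and it appears only once in the merged output. A thread pool seems sufficient if the per-file work is mostly spent waiting on an external analyzer process, but that is an assumption on my part.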