Given these factors, how can you estimate the optimal number of threads for your algorithm? There is no definitive answer, because it depends on the specific problem, code, and system you are working with. A general rule of thumb is to start with the number of physical cores as a baseline and then adjust based on the workload characteristics and the thread overhead. If your workload is CPU-bound with little communication or synchronization among tasks, a thread count equal to the number of cores is usually close to the best you can do; going slightly above it can help on processors with simultaneous multithreading, but beyond that the extra threads mostly add context-switching overhead. If your workload is I/O-bound, threads spend much of their time waiting rather than computing, so a thread count well above the number of cores often improves throughput until the disk, network, or remote service becomes the bottleneck. If your tasks communicate or synchronize heavily, for example by competing for the same locks, you might want to reduce the number of threads, sometimes below the number of cores, to avoid contention and overhead. In every case, treat these as starting points: use profilers or benchmarks to measure your code with different numbers of threads and pick the count that performs best on your target system.
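One way to run such an experiment is sketched below in Python. It is a minimal illustration under stated assumptions, not a definitive benchmark harness: the helper names (`simulated_io_task`, `time_with_threads`, `benchmark_thread_counts`) are hypothetical, the workload is a 10 ms sleep standing in for an I/O wait, and `os.cpu_count()` reports logical CPUs, which may be higher than the number of physical cores.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_io_task(_item):
    """Hypothetical stand-in for real work: pretend to wait on I/O for 10 ms."""
    time.sleep(0.01)


def time_with_threads(task, items, n_threads):
    """Return the wall-clock seconds needed to run task over items with n_threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() consumes every result, so an exception in any task surfaces here.
        list(pool.map(task, items))
    return time.perf_counter() - start


def benchmark_thread_counts(task, items, candidates, repeats=3):
    """Map each candidate thread count to its best (minimum) time over several runs."""
    return {
        n: min(time_with_threads(task, items, n) for _ in range(repeats))
        for n in candidates
    }


if __name__ == "__main__":
    # Logical CPU count, used only as a rough baseline (an assumption of this sketch).
    cores = os.cpu_count() or 1
    candidates = sorted({1, max(1, cores // 2), cores, 2 * cores, 4 * cores})
    timings = benchmark_thread_counts(simulated_io_task, range(64), candidates)
    for n, seconds in timings.items():
        print(f"{n:>3} threads: {seconds:.3f} s")
    print("fastest:", min(timings, key=timings.get))
```

To try it on your own code, replace `simulated_io_task` with a function that performs one unit of your real work. Taking the minimum over a few repeats reduces noise from other processes. Note that in standard CPython builds, threads running pure-Python CPU-bound code are serialized by the global interpreter lock, so for that kind of workload you would likely need to repeat the measurement with a process pool instead.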
Parallelizing algorithms can be a great way to optimize your code and make use of your hardware resources. However, it also requires careful consideration of the factors that affect the optimal number of threads. By understanding these factors and experimenting with different numbers of threads, you can find the best balance between parallelization and overhead for your algorithm.