Performance tuning the Sitecore Commerce Connect Product Sync
I work with Sitecore Commerce Connect on a daily basis. The framework is extensible and the architecture is robust. Depending on the number of products in the external system, I sometimes have to tune the performance of the Product Sync.
Based on the number of products, I decide whether to use Solr or Lucene. Once that is sorted out, it is time to find the pipelines that take up most of the sync time. Below are the things I have tweaked to tune the performance of the product sync:
Product in-stock and out-of-stock locations: At the end of the sync, the stock locations are indexed for each product. If you are not dealing with stock locations in Sitecore, you can disable the pipeline "<commerce.inventory.stockStatusForIndexing>".
Index Rebuild: You can disable the master index rebuild on sync completion (good for incremental syncs) and rebuild only the commerce master index. The setting is "ProductSynchronization.ProductIndexes". You can rebuild the master index later as part of your site's regular index rebuild.
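As a rough sketch, a patch file for this setting might look like the one below. The index name in the value is an assumption for illustration only; check the value that ships in your Commerce Connect configuration and keep just the product index you want rebuilt after a sync.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Assumed index name: replace it with the commerce product index used in your solution -->
      <setting name="ProductSynchronization.ProductIndexes">
        <patch:attribute name="value">commerce_products_master_index</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```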
Incremental Sync: Implement incremental sync for daily product updates. If you have a lot of products in the External Commerce System (ECS) and not all of them are updated daily, consider implementing an incremental sync that updates a Sitecore product only if it has changed in the ECS. I have a module on the Marketplace that is related to this piece. Refer to this blog post.
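The core idea of an incremental sync can be sketched as a simple filter. This is a minimal illustration, not the Marketplace module's implementation: ExternalProduct, LastModifiedUtc and IncrementalSyncFilter are hypothetical names, and it assumes the ECS exposes a last-modified timestamp for each product.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a product record returned by the external commerce system.
public class ExternalProduct
{
    public string ExternalId { get; set; }
    public DateTime LastModifiedUtc { get; set; }
}

public static class IncrementalSyncFilter
{
    // Returns the external IDs of products modified at or after the previous sync start.
    // A null lastSyncStartUtc means no sync has run yet, so every product is selected.
    public static List<string> SelectChangedIds(
        IEnumerable<ExternalProduct> products, DateTime? lastSyncStartUtc)
    {
        return products
            .Where(p => !string.IsNullOrEmpty(p.ExternalId))
            .Where(p => lastSyncStartUtc == null || p.LastModifiedUtc >= lastSyncStartUtc.Value)
            .Select(p => p.ExternalId)
            .ToList();
    }
}
```

The resulting IDs could then be handed to a list-based sync such as SynchronizeProductList, which the Integration Guide mentions, instead of running a full sync.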
Getting a list of products from Sitecore: Check how much time is spent getting the product list from Sitecore. The Product Sync process uses the processor "Sitecore.Commerce.Pipelines.Products.GetSitecoreProductList.GetSitecoreProductList" to generate the list of products in Sitecore. Depending on whether you use Solr or Lucene, this step can become a bottleneck if it takes a long time to return the list. When I was using Solr as my index, I had to write custom code to read the IDs directly from Solr, and it was much faster. You can replace the processor with your own and write a custom Solr or Lucene query to read the external IDs from Sitecore.
public override void Process(ServicePipelineArgs args)
{
    // Record when this sync run started.
    SetDatePropertyValue("LastProductSyncStart", DateTime.UtcNow);
    this.StatusUpdater.Update("Preparing Sitecore Product List");

    // Read the products from the index and keep only those that have an external ID.
    List<ProductSyncEntity> list = GetSitecoreProducts(this.productRepository.Template);
    list = list.Where(x => x.ExternalId != null).ToList();
    args.Request.Properties["SitecoreProductIds"] = (object) list.Select(x => x.ExternalId.FirstOrDefault(y => !y.IsEmptyOrNull())).ToList();

    Log.Info(string.Format("{0} products found in sitecore ", (object) list.Count), (object) this);
    this.StatusUpdater.Update(string.Format("{0} products found in sitecore ", (object) list.Count));

    // Record when reading the Sitecore product list finished.
    SetDatePropertyValue("SitecoreReadProductsEndProperty", DateTime.UtcNow);
}
In the snippet above, I have used a database property ("LastProductSyncStart") to store the date and time of the sync run. These properties are used later in the sync process to determine when the sync last ran. The function GetSitecoreProducts gets the list of products from Sitecore (from either Lucene or Solr).
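The SetDatePropertyValue helper is not shown in this post. One way such a helper could store and read back a timestamp is sketched below. SyncPropertyStore and GetDatePropertyValue are hypothetical names, and an in-memory dictionary stands in for the Sitecore database properties, so treat it as an illustration of the round-trip date format rather than the actual implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Stand-in for a persistent key/value property store (hypothetical; the
// code in this post keeps these values in Sitecore database properties).
public class SyncPropertyStore
{
    private readonly IDictionary<string, string> store = new Dictionary<string, string>();

    // Saves the timestamp in the round-trip ("o") format so the UTC kind survives.
    // Expects a value whose Kind is Utc, such as DateTime.UtcNow.
    public void SetDatePropertyValue(string key, DateTime valueUtc)
    {
        store[key] = valueUtc.ToString("o", CultureInfo.InvariantCulture);
    }

    // Returns null when the property has never been written or cannot be parsed.
    public DateTime? GetDatePropertyValue(string key)
    {
        string raw;
        if (!store.TryGetValue(key, out raw))
        {
            return null;
        }

        DateTime parsed;
        bool ok = DateTime.TryParseExact(raw, "o", CultureInfo.InvariantCulture,
            DateTimeStyles.RoundtripKind, out parsed);
        return ok ? parsed : (DateTime?)null;
    }
}
```

Storing the value as an invariant round-trip string avoids culture-dependent parsing problems when the property is read back on a later sync run.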
// Note: the template parameter is not used here; the template name is hard-coded below.
private List<ProductSyncEntity> GetSitecoreProducts(string template)
{
    var products = new List<ProductSyncEntity>();
    using (var searchContext = CreateSearchContext())
    {
        var predicate = PredicateBuilder.True<ProductSyncEntity>();
        predicate = predicate.And(i => i.TemplateName == "Product");
        // Custom code specific to my project
        predicate = predicate.And(i => i.IsVariant == false);

        var totalcount = 0;
        var currentPage = 0;
        var pageSize = 2000;

        // Always read the first page, then continue while there are results left.
        // Using "<" avoids one extra empty query when the total is an exact multiple of pageSize.
        while (currentPage == 0 || (currentPage * pageSize) < totalcount)
        {
            var query =
                searchContext.GetQueryable<ProductSyncEntity>().Where(predicate);
            query = query.Page(currentPage, pageSize);
            var results = query.GetResults();
            if (results != null)
            {
                totalcount = results.TotalSearchResults;
                products.AddRange(results.Hits.Select(x => x.Document).ToList());
            }
            currentPage++;
        }
    }
    return products;
}
My ProductSyncEntity looks like this:
public class ProductSyncEntity : BaseSearchEntity
{
    [IndexField("externalid_tm")]
    public List<string> ExternalId { get; set; }

    [IndexField("_name")]
    public string ProductSku { get; set; }

    [IndexField("is_variant_b")]
    public bool IsVariant { get; set; }
}
Multi-threading: You can increase the number of threads for the initial load. The setting is:
<setting name="ProductSynchronization.NumberOfThreads" value="8" />
Initially, I was reluctant to increase the number of threads because of this disclaimer in the Commerce Connect Integration guide:
"Due to issues in Sitecore CMS, using more than 1 thread can result in a SQL server deadlock situation, which is why the default configuration only specifies 1 thread."
Later, I decided to give it a try, and I was able to go up to 6 threads without any issue. It drastically improved the performance of the sync.
Delayed Bucket Synchronization: I haven't had a chance to enable this file and test it out, but the documentation says that delayed sync can save some time.
From the Sitecore Commerce Connect Integration guide:
"Doing a bulk synchronization by calling SynchronizeProducts or SynchronizeProductList can cause the creation of many new items that needs to be synchronized. When doing bulk synchronizing, it is faster to delay the bucket synchronization until all new product items have been processed. To further reduce the time spent synchronizing the products bucket, a temporary bucket is used for new product items. The temporary bucket is synchronized after all products have been processed and the bucket content is moved to the main bucket. That will eliminate the time spent touching all existing items in the bucket, which could be significant, e.g. adding a 1.000 new product items to a bucket with 1.000.000 product items, will touch 1.001.000 items to make sure they have not changed. "
Sync only what you want: You can remove the pipelines for data you don't want to sync. For example, remove the Specification sync pipeline if there are no specifications to synchronize. For resources, it is better to reference the resource by its URI than to download it into the Sitecore Media Library and then reference it from the product. For more info, refer to this blog post.
Sync Artifacts: If the artifacts do not need to be synchronized together with the products, remove the artifact sync processor from the "Synchronize All Products" pipeline. You can then schedule Sync Artifacts weekly and Sync Products daily.
<commerce.synchronizeProducts.synchronizeProducts>
  <processor type="Sitecore.Commerce.Pipelines.Products.SynchronizeProducts.RunSynchronizeArtifacts, Sitecore.Commerce">
    <patch:delete />
  </processor>
</commerce.synchronizeProducts.synchronizeProducts>
Hope you enjoyed reading this blog.
References: https://doc.sitecore.net/Sitecore-Commerce/82 Integration Guide