Cache processing instance (proof of concept)

25 May 2016 at 00:00 by Ruud van Falier

Many of the Sitecore solutions I encounter during my work as a Sitecore Consultant face the same performance issue as a result of publishing items:
when a content editor publishes an item, the website performance goes down for a short period of time.
In this blog post I will describe why that happens and explain a proof of concept I made that solves this issue.


I'm assuming that you know about the Sitecore HTML cache: once a component has been rendered, the result is cached and the component will not be rendered again as long as the cache parameters stay the same.
Most sites rely heavily on this mechanism. Without caching, a site could potentially perform several hundred times slower.

When an item gets published, this cache needs to be cleared in order for our changes to become visible on the site.
This is done by the HtmlCacheClearer after a publish has finished.
Ideally, this would only clear the cache that is affected by the changed items, but Sitecore is not capable of detecting which cache items are affected.
That makes sense considering that components can call other items from code-behind and there is no detectable relationship between the component and those items.
So what happens when an item gets published? Right, the entire HTML cache gets cleared.

This means that even if we publish just one item, the site will end up without any cache and therefore will perform poorly for the first several requests.
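
To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model and not Sitecore's API: the class `HtmlCacheModel`, the component names and the `lang` parameter are all made up for illustration. The only point is that entries are keyed by rendering plus cache parameters, and that clearing happens for everything at once.

```python
class HtmlCacheModel:
    """Toy stand-in for an HTML cache (hypothetical, not Sitecore's API)."""

    def __init__(self):
        self._entries = {}
        self.render_count = 0

    def get_or_render(self, rendering, params, render):
        # The cache key combines the rendering with its cache parameters.
        key = (rendering, tuple(sorted(params.items())))
        if key not in self._entries:
            self.render_count += 1          # expensive path: actually render
            self._entries[key] = render()
        return self._entries[key]

    def clear_all(self):
        # No per-item invalidation: everything goes at once.
        self._entries.clear()


cache = HtmlCacheModel()
components = ["header", "menu", "footer"]  # made-up component names


def request_page():
    for name in components:
        cache.get_or_render(name, {"lang": "en"}, lambda n=name: f"<div>{n}</div>")


request_page()   # cold cache: all three components render
request_page()   # warm cache: nothing renders
renders_before_publish = cache.render_count

cache.clear_all()  # publishing a single item wipes the whole cache
request_page()     # every component has to render again
renders_after_publish = cache.render_count - renders_before_publish

print(renders_before_publish, renders_after_publish)  # prints "3 3"
```

In this model the first request after the clear pays the full rendering cost again, no matter how small the published change was.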


Before I go into the solution I came up with, I have to mention a little disclaimer:

  • This is a proof of concept and not production-ready!
  • The solution will require an extra Sitecore instance and thus an extra license!

Alright, you're still here? Then let's get into it!
I've thought about this problem for quite some time and came up with several potential solutions, of which this one seemed the best approach.
All of the solutions were based on the same core principle though: rebuilding the initial cache should happen on a separate instance, so that it does not affect the website performance.

The first thing I did was replace the <mvc.renderRendering> pipeline processors that are responsible for getting a rendering from the cache and storing a rendered result in the cache.
This allowed me to point them to my own caching provider, which (for this proof of concept) is basically just a wrapper around the default HtmlCache object with some additional features.
The new caching provider allows serialization and deserialization of its contents: serialization means we can convert the contents of our cache to JSON, and deserialization means we can replace the contents of our cache based on that same JSON.
This is crucial to the solution!
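
The idea of a cache that can be dumped to JSON and restored from it can be sketched in a few lines. This is an illustration in Python and not the actual code of the proof of concept: the class name `SerializableHtmlCache`, its methods and the example cache keys are assumptions of mine, and a plain dictionary stands in for the wrapped HtmlCache object.

```python
import json


class SerializableHtmlCache:
    """Hypothetical wrapper around a key/HTML store (a dict stands in for HtmlCache)."""

    def __init__(self):
        self._entries = {}

    def get(self, key):
        return self._entries.get(key)

    def set(self, key, html):
        self._entries[key] = html

    def serialize(self):
        # Convert the full cache contents to a JSON string.
        return json.dumps(self._entries, sort_keys=True)

    def deserialize(self, payload):
        # Replace (not merge) the cache contents with what the JSON describes.
        self._entries = dict(json.loads(payload))


# One cache is filled, the other one takes over its contents via JSON.
source = SerializableHtmlCache()
source.set("header|lang=en", "<header>Home</header>")   # made-up keys and markup
source.set("footer|lang=en", "<footer>2016</footer>")

target = SerializableHtmlCache()
target.set("stale|lang=en", "<div>old</div>")
target.deserialize(source.serialize())

print(target.get("header|lang=en"))  # prints "<header>Home</header>"
print(target.get("stale|lang=en"))   # prints "None"
```

Replacing the contents wholesale, instead of merging, means stale entries disappear in the same step in which the fresh ones arrive.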

Next was introducing some logic that uses the Event Queue to allow a different instance to handle the cache creation and transfer the created cache back to the other instances.
Creating the cache happens by simply requesting "warmup" pages on the cache processing instance: pages that will result in important HTML cache.
Note that this means that not all the HTML cache is rebuilt on the cache processing instance, but just the cache that is needed to ensure the site performs well.
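
As a rough sketch of that flow, the Python below simulates it inside a single process. Everything in it is an assumption made for illustration: the event names `publish:end` and `cache:rebuilt`, the `EventQueue` and `Instance` classes, the warmup URLs, and the simplification that whole pages are cached rather than individual renderings. It also assumes that the delivery instances keep serving their old cache until the rebuilt one arrives.

```python
import json
from collections import deque

WARMUP_PAGES = ["/", "/products"]  # hypothetical warmup URLs


def render_page(url, content_version):
    # Stand-in for an expensive page render.
    return f"<html>{url} v{content_version}</html>"


class Instance:
    def __init__(self, name):
        self.name = name
        self.html_cache = {}


class EventQueue:
    """In-process stand-in for a queue shared between instances."""

    def __init__(self):
        self._events = deque()
        self._handlers = {}

    def subscribe(self, event_name, handler):
        self._handlers.setdefault(event_name, []).append(handler)

    def raise_event(self, event_name, payload=None):
        self._events.append((event_name, payload))

    def process(self):
        while self._events:
            event_name, payload = self._events.popleft()
            for handler in self._handlers.get(event_name, []):
                handler(payload)


def wire_up(queue, processor, delivery_instances):
    def on_publish_end(content_version):
        # Only the cache processing instance pays the rendering cost.
        processor.html_cache = {
            url: render_page(url, content_version) for url in WARMUP_PAGES
        }
        queue.raise_event("cache:rebuilt", json.dumps(processor.html_cache))

    def make_swap(instance):
        def on_cache_rebuilt(payload):
            # Delivery instances swap in the ready-made cache.
            instance.html_cache = json.loads(payload)
        return on_cache_rebuilt

    queue.subscribe("publish:end", on_publish_end)
    for instance in delivery_instances:
        queue.subscribe("cache:rebuilt", make_swap(instance))


queue = EventQueue()
processor = Instance("cache-processing")
web1, web2 = Instance("web1"), Instance("web2")
wire_up(queue, processor, [web1, web2])

web1.html_cache = {"/": render_page("/", 1)}  # old cache, still being served
queue.raise_event("publish:end", 2)           # a publish finishes
queue.process()

print(web1.html_cache["/"])  # prints "<html>/ v2</html>"
```

The delivery instances never render the warmup pages themselves in this model; they only receive the finished result.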
This schematic describes what happens:


I think the proof of concept is best explained with a demonstration, so after finishing it I quickly threw together this little demonstration video.
The video is certainly not my best work :) but it describes the solution well enough.
Also make sure to check out the code, which should hopefully clarify things further.

What's next?

I really hope to get some feedback from the community on this solution.
If you guys think that this could be a solid solution, I'm willing to develop it further into a production-ready module.
You can reach out to me on Slack, Twitter or by sending me an email.