In my previous post I explained how to extend Sitecore’s Experience Database so that you can store custom data against each Contact (visitor) of your site. I also recently posted about updating an xDB Contact with basic information. This post will explain how to distinguish Contacts from one another and how to merge Contacts, and their data, once you have determined that they are the same User.
To help understand how this can be applied for real clients I’ll use the following example. We have a website where Users can order products online. In the purchase path we have a form to collect the base information we need to fulfil their order: name, contact info, address, email address etc. We will also use this information to identify them in xDB and update their Contact to record who they are. This will occur when the User clicks Next on the form, which saves us running the risk of them not making it to the end of the purchase path.
Before we start, a lesson learned from a war wound: when interacting with the Contact, use the Sitecore.Analytics.Tracker.Current.Session.Contact property, not Sitecore.Analytics.Tracker.Current.Contact. You want to make changes to the Contact loaded into Session so that it’s correctly written to xDB on the Session.End event.
Identifying Sitecore Contacts
Sitecore xDB holds unidentified and identified Contacts. The difference is that identified Contacts contain information that can be used to distinguish them from other Contacts, typically an email address, but it can also be a username or UserId.
Identifying Contacts is important as it allows the Contact to be tracked across multiple devices and browsers. There is also the added benefit of the email address being valuable to clients. Sitecore can identify Contacts out-of-the-box using Email Experience Manager (EXM) and Webforms for Marketers (WFFM), when they open an email in the former and complete a form in the latter.
Identifying Contacts can also be done programmatically and is really simple to implement. Get the Contact (Sitecore will always ensure a User has one), then set its identifier to the email address of the User. Be sure to set Contact.Identifiers.IdentificationLevel to Known.
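The steps above can be sketched roughly as follows. This is a minimal sketch rather than production code: the class and method names (ContactIdentifier, MarkAsKnown) are made up for illustration, and it assumes a Sitecore 8 solution where the tracker is running for the current request.

```csharp
using Sitecore.Analytics;
using Sitecore.Analytics.Model;

// Hypothetical helper class; the names here are illustrative only.
public static class ContactIdentifier
{
    public static void MarkAsKnown(string email)
    {
        // The tracker is not available on every request (e.g. when analytics is disabled).
        if (Tracker.Current == null || Tracker.Current.Session == null)
        {
            return;
        }

        // Always work with the Contact loaded into Session (see the note at the top of this post).
        var contact = Tracker.Current.Session.Contact;
        if (contact == null || string.IsNullOrEmpty(email))
        {
            return;
        }

        // Set the identifier and flag the Contact as Known.
        contact.Identifiers.Identifier = email;
        contact.Identifiers.IdentificationLevel = ContactIdentificationLevel.Known;
    }
}
```

You would call something like ContactIdentifier.MarkAsKnown(model.Email) from the handler behind the form’s Next button, where model.Email is whatever your form posts back.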
But what about .Identify()? (As asked by Martina)
Yes, you can use .Identify(), passing in the email address of the User, to identify the Contact. However, it is pretty resource-heavy. It not only identifies the current Contact, but also searches for existing Contacts with the same identifier and then kicks off the MergeContacts pipeline. So the following section details how to implement .Identify() without the massive overhead.
Anyways… Once the User’s Session ends, typically 20 minutes after their last action on the site, the Contact will be updated in xDB with its identifier.
That’s all well and good, but what if the User already has a Contact that has been identified in xDB but is now visiting on a new device? They would have two Contacts: one previously identified and a new one that Sitecore loaded into Session. We would have just updated the new Contact with a duplicate identifier, and all the data we have collected would be split between the two Contacts. Therefore the correct approach to identifying Contacts is to first check whether the User already has an identified Contact.
So, whenever we interact with a Contact we should always check if a Contact already exists with that identifier. I have found the best way to implement this is creating a ContactFactory that is solely responsible for retrieving Contacts that can be acted upon by other areas of code.
The first step is to check whether the identified Contact already exists in the session. If it does not, use the ContactRepository in the API to try to load a Contact with that identifier. If no Contact exists with that identifier then we’re clear to use the unidentified one in session. However, if one is returned, we try to load that Contact using the ContactManager in the API and return it.
Now that we have a consistent way to retrieve Contacts, one that takes into account whether the User has previously been identified in xDB, we can use it throughout the code with a single call.
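The retrieval steps described above could look something like this. Treat it as a sketch written from memory of the Sitecore 8 API, so please check the calls (LoadContactReadOnly, TryLoadContact and the "tracking/..." configuration paths) against your version. Falling back to the session Contact when the stored Contact cannot be locked is my own assumption, not a rule.

```csharp
using System;
using Sitecore.Analytics;
using Sitecore.Analytics.Data;
using Sitecore.Analytics.DataAccess;
using Sitecore.Analytics.Tracking;
using Sitecore.Configuration;

public class ContactFactory
{
    public Contact GetContact(string identifier)
    {
        if (string.IsNullOrEmpty(identifier))
        {
            throw new ArgumentException("An identifier is required.", "identifier");
        }

        Contact sessionContact = Tracker.Current.Session.Contact;

        // 1. The identified Contact may already be the one loaded into session.
        if (string.Equals(sessionContact.Identifiers.Identifier, identifier, StringComparison.OrdinalIgnoreCase))
        {
            return sessionContact;
        }

        // 2. Otherwise look for an existing Contact with that identifier in xDB.
        var repository = Factory.CreateObject("tracking/contactRepository", true) as ContactRepository;
        Contact matchedContact = repository != null ? repository.LoadContactReadOnly(identifier) : null;

        // 3. Nothing found: we are clear to use the unidentified Contact in session.
        if (matchedContact == null)
        {
            return sessionContact;
        }

        // 4. A Contact exists: try to load (and lock) it through the ContactManager.
        var manager = Factory.CreateObject("tracking/contactManager", true) as ContactManager;
        if (manager == null)
        {
            return sessionContact;
        }

        LockAttemptResult<Contact> result = manager.TryLoadContact(matchedContact.ContactId);

        // Assumption: fall back to the session Contact if the lock cannot be obtained.
        return result.Status == LockAttemptStatus.Success ? result.Object : sessionContact;
    }
}
```

Calling code then only needs `var contact = new ContactFactory().GetContact(email);` and never has to care where the Contact came from.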
Some of you may have picked up on this already. If we retrieve a previously identified Contact from xDB and start using that, then all the interactions and data (pages viewed, goals completed, engagement plans etc.) that the User generated while still linked to their unidentified Contact will be lost. The solution: merge the Contacts.
Fortunately the Sitecore API can merge Contacts for us. The first option is the MergeContacts method on the ContactRepository.
BUT! Although the data does get transferred from the dying Contact to the survivor, Sitecore.Analytics.Tracker still uses the dying Contact and keeps assigning its interactions to it.
Therefore, to merge Contacts, use the Identify method, passing in the identifier of the Contact, e.g. the email address. The method will merge the unidentified Contact into the identified Contact. We would call it in our ContactFactory.cs just before the matchedContact variable is returned.
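As a rough illustration of the Identify call described above, here is a minimal sketch. The wrapper class and method name (ContactMerger, IdentifyAndMerge) are hypothetical; the only Sitecore call it relies on is Tracker.Current.Session.Identify.

```csharp
using Sitecore.Analytics;

// Hypothetical wrapper; in the ContactFactory this would be a single line
// placed just before the matched Contact is returned.
public static class ContactMerger
{
    public static void IdentifyAndMerge(string email)
    {
        // Nothing to do if tracking is not running for this request.
        if (Tracker.Current == null || Tracker.Current.Session == null)
        {
            return;
        }

        if (string.IsNullOrEmpty(email))
        {
            return;
        }

        // Identify merges the unidentified session Contact into the
        // Contact already stored against this identifier.
        Tracker.Current.Session.Identify(email);
    }
}
```

Because Identify is the expensive call, it makes sense to reach this point only after the ContactFactory has confirmed that a matching identified Contact actually exists.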
However, if we have extended xDB to have custom Facets to hold extra data, the Pipeline which Sitecore.Analytics.Tracker.Current.Session.Identify ultimately triggers will not know how to merge our custom data.
Merging Contacts with custom facets
When you have extended xDB you need to consider how Sitecore will merge the data of two Contacts. Merging of Contacts is handled by the MergeContacts pipeline within the Sitecore.Analytics.Pipelines namespace. You will need to replace the existing MergeContactFacets processor with a custom one that inherits from MergeContactProcessor. As a reminder, this is done with a configuration patch.
The existing MergeContactFacets processor already does the work of copying all the default Facets, so let’s not reinvent the wheel: copy its decompiled code as the starting point.
Having copied that code into our custom processor, we need to extend it to merge our custom Facets. Inside the Process method we must capture the scenario where the Facet being processed is one of our custom Facets, by checking for a match on the Facet name.
I think most developers will want to avoid using the ModelUtilities.DeepCopyFacet() method in the API, as it completely clears the data in the surviving Contact’s Facet before copying the dying Contact’s data in. Ultimately it’s for you as a developer to understand your data, determine whether it should be overwritten, merged or discarded, and then develop code to achieve that.
So in my example I want to create a DeepCopyFacet() method of our own that doesn’t clear the data and instead appends all the data from the dying Contact’s Facet to the survivor’s. The method needs to handle the transferring of data stored in MongoDB Attributes, Elements, Dictionaries and nested Collections, which is the hardest part. The easiest part of all this is actually moving the data from one Facet to the other.
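To show the shape of such a processor, here is a skeleton. It is a sketch based on my recollection of the Sitecore 8 pipeline types (MergeContactProcessor, MergeContactArgs with its Dying and Survivor Contacts), so verify it against the decompiled code in your own version. The class name, the facet name "Customer Orders" and the MergeCustomFacet helper are all hypothetical.

```csharp
using Sitecore.Analytics.Model.Framework;
using Sitecore.Analytics.Pipelines.MergeContacts;
using Sitecore.Diagnostics;

public class MergeCustomContactFacets : MergeContactProcessor
{
    // Hypothetical facet name; use the name registered in your own configuration.
    private const string CustomFacetName = "Customer Orders";

    public override void Process(MergeContactArgs args)
    {
        Assert.ArgumentNotNull(args, "args");

        foreach (string facetName in args.Dying.Facets.Keys)
        {
            IFacet source = args.Dying.Facets[facetName];
            IFacet destination = args.Survivor.Facets[facetName];

            if (facetName == CustomFacetName)
            {
                // Our own facet: apply our own merge rules.
                MergeCustomFacet(source, destination);
                continue;
            }

            // Default facets: keep the copy logic taken from the
            // stock MergeContactFacets processor here.
        }
    }

    private static void MergeCustomFacet(IFacet source, IFacet destination)
    {
        // Intentionally empty in this skeleton: cast both facets to your
        // custom facet interface and merge their members as your data requires.
    }
}
```

Matching on the facet name keeps the custom rules isolated, so every other Facet still goes through the behaviour copied from the stock processor.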
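The append-rather-than-overwrite idea is easier to see without the Sitecore model framework in the way, so the following sketch uses plain C# stand-ins. OrderHistoryFacet and FacetMerger are hypothetical types, not Sitecore classes, and keeping the survivor’s value when both Contacts hold the same attribute is my assumption about what you would usually want.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a custom facet; NOT a Sitecore type.
public class OrderHistoryFacet
{
    public Dictionary<string, string> Attributes { get; } = new Dictionary<string, string>();
    public List<string> Orders { get; } = new List<string>();
}

public static class FacetMerger
{
    // Appends the dying facet's data to the survivor without clearing the survivor first.
    public static void AppendMerge(OrderHistoryFacet dying, OrderHistoryFacet survivor)
    {
        if (dying == null) throw new ArgumentNullException(nameof(dying));
        if (survivor == null) throw new ArgumentNullException(nameof(survivor));

        foreach (var pair in dying.Attributes)
        {
            // Assumption: the survivor's value wins when both facets hold the same key.
            if (!survivor.Attributes.ContainsKey(pair.Key))
            {
                survivor.Attributes[pair.Key] = pair.Value;
            }
        }

        // Collections are appended in full; the survivor's existing entries stay first.
        survivor.Orders.AddRange(dying.Orders);
    }
}
```

In a real processor the same two decisions (which side wins on a conflict, and whether collections are appended or de-duplicated) have to be made for every Attribute, Element, Dictionary and Collection in the Facet.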
And that’s it!
So that’s identifying a Contact as soon as they hand over that piece of unique information, and merging two Contacts together to consistently track Users without losing any interactions made or custom data saved. This is a big topic so I’ll add some useful links below for further reading.
Next post in this series will be how to extend the Sitecore Experience Profile. I was planning on doing it before this topic but it seemed more logical to cover this first.