How to Identify and Merge Contacts in Sitecore xDB

In my previous post I explained how to extend Sitecore’s Experience Database so that you can store custom data against each Contact (visitor) of your site. I also recently posted about updating an xDB Contact with basic information. This post explains how to uniquely identify Contacts and how to merge Contacts, and their data, once you have determined that they belong to the same User.

To help understand how this can be applied for real clients I’ll use the following example. We have a website where Users can order products online. In the purchase path we have a form to collect the basic information we need to fulfill their order: name, contact info, address, email address etc. We will also use this information to identify them in xDB and update their Contact to reflect who they are. This will happen when the User clicks Next on the form, which saves us running the risk of them not making it to the end of the purchase path.

Before we start, a lesson learned from a war wound. When interacting with the Contact, use the one at Sitecore.Analytics.Tracker.Current.Session.Contact, not Sitecore.Analytics.Tracker.Current.Contact. You want to make changes to the Contact loaded into Session so it’s correctly written to xDB on the Session.End event.

Identifying Sitecore Contacts

Sitecore xDB holds unidentified and identified Contacts. The difference is that identified Contacts contain information that can be used to distinguish them from other Contacts, typically an email address, although it can also be a username or UserId.

Identifying Contacts is important as it allows the Contact to be tracked across multiple devices and browsers. There is also the added benefit of the email address being valuable to clients. Sitecore can identify Contacts out-of-the-box using Email Experience Manager (EXM) and Webforms for Marketers (WFFM): when they open an email in the former and complete a form in the latter.

Identifying Contacts can also be done programmatically and is really simple to implement. Get the Contact (Sitecore will always ensure a User has one), then set the identifier to the email address of the User. Be sure to set Contact.Identifiers.IdentificationLevel to Known.


// Get the Contact in session
var contact = Tracker.Current.Session.Contact;
// Set the Contact's identification level to Known if not already set
if (contact.Identifiers.IdentificationLevel != ContactIdentificationLevel.Known)
{
    contact.Identifiers.IdentificationLevel = ContactIdentificationLevel.Known;
}
// Set the email address entered into the form as the Identifier of the Contact
contact.Identifiers.Identifier = orderFormModel.EmailAddress;

But what about .Identify()? (As asked by Martina)

Yes, you can use .Identify(), passing in the email address of the User, to identify the Contact. However, it is pretty resource-heavy. It not only identifies the current Contact, but also searches for existing Contacts with the same identifier and then kicks off the MergeContacts pipeline. So the following section details how to use .Identify() without the massive overhead.

Anyways… Once the User’s Session ends, typically 20 minutes after their last action on the site, the Contact will be updated in xDB with its identifier.

That’s all well and good, but what if the User already has a Contact that has been identified in xDB and is now visiting on a new device? They would have two Contacts: one previously identified and a new one Sitecore loaded into Session. We would have just updated the new Contact with a duplicate identifier, and all the data we have collected would be split between two Contacts. Therefore the correct approach to identifying Contacts is to first check if the User already has an identified Contact.

So, whenever we interact with a Contact we should always check if a Contact already exists with that identifier. I have found the best way to implement this is creating a ContactFactory that is solely responsible for retrieving Contacts that can be acted upon by other areas of code.
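
To make the problem concrete, here is a toy sketch in plain C#. It is only an illustration under stated assumptions: ToyContact and ToyContactStore are hypothetical names, the "store" is an in-memory list rather than xDB, and nothing here calls the Sitecore API. It contrasts stamping the identifier on whatever Contact is in session with looking up an existing identified Contact first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an xDB Contact; not a Sitecore type.
public class ToyContact
{
    public Guid Id = Guid.NewGuid();
    public string Identifier;
}

public static class ToyContactStore
{
    // Naive approach: stamp the identifier on whichever Contact is in session.
    public static ToyContact IdentifyNaively(List<ToyContact> store, ToyContact sessionContact, string email)
    {
        sessionContact.Identifier = email;
        if (!store.Contains(sessionContact))
        {
            store.Add(sessionContact);
        }
        return sessionContact;
    }

    // Lookup-first approach: reuse an already identified Contact when one exists.
    public static ToyContact IdentifyWithLookup(List<ToyContact> store, ToyContact sessionContact, string email)
    {
        ToyContact existing = store.FirstOrDefault(c =>
            string.Equals(c.Identifier, email, StringComparison.OrdinalIgnoreCase));
        return existing ?? IdentifyNaively(store, sessionContact, email);
    }
}
```

With the naive method, a second device ends up as a second record carrying the same identifier. With the lookup-first method the existing record is returned instead, which is the behaviour the ContactFactory below aims for against the real API.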

The first step is to check if the identified Contact already exists in the session. If it does not, use ContactRepository in the API to try to load a Contact with that identifier. If no Contact exists with that identifier then we’re clear to use the unidentified one in session (or, if the Contact in session already carries a different identifier, to create a new Contact). However, if one is returned we try to load and lock that Contact using the ContactManager in the API and return it.


private ContactRepository contactRepository = Sitecore.Configuration.Factory.CreateObject("tracking/contactRepository", true) as ContactRepository;
private ContactManager contactManager = Sitecore.Configuration.Factory.CreateObject("tracking/contactManager", true) as ContactManager;

public Contact GetContact(string emailAddress)
{
    // The identified Contact is already loaded into session, so use it
    if (IsContactInSession(emailAddress))
    {
        return Tracker.Current.Session.Contact;
    }

    // Look for a previously identified Contact in xDB
    var existingContact = contactRepository.LoadContactReadOnly(emailAddress);
    Contact matchedContact;
    if (existingContact != null)
    {
        // Try to load and lock the existing Contact so we can act on it
        LockAttemptResult<Contact> lockResult = contactManager.TryLoadContact(existingContact.ContactId);
        switch (lockResult.Status)
        {
            case LockAttemptStatus.Success:
                Contact lockedContact = lockResult.Object;
                lockedContact.ContactSaveMode = ContactSaveMode.AlwaysSave;
                matchedContact = lockedContact;
                break;
            case LockAttemptStatus.NotFound:
                matchedContact = Tracker.Current.Session.Contact;
                break;
            default:
                throw new Exception(this.GetType() + " Contact could not be locked - " + emailAddress);
        }
    }
    // if Contact in session's identifier is set and not equal to email address create new Contact
    else if (Tracker.Current.Session.Contact.Identifiers.Identifier != null
        && !Tracker.Current.Session.Contact.Identifiers.Identifier.Equals(emailAddress, StringComparison.InvariantCultureIgnoreCase))
    {
        matchedContact = contactRepository.CreateContact(ID.NewID);
        matchedContact.System.Value = 0;
        matchedContact.System.VisitCount = 0;
        matchedContact.ContactSaveMode = ContactSaveMode.AlwaysSave;
    }
    // Contact in session's identifier is null use that Contact
    else
    {
        matchedContact = Tracker.Current.Session.Contact;
    }

    // If the matched Contact is not the same as the Contact Sitecore loaded into session
    // or the Contact in session is not identified
    // then we need to call .Identify(), more on this later
    if (!matchedContact.ContactId.Equals(Tracker.Current.Session.Contact.ContactId)
        || (matchedContact.ContactId.Equals(Tracker.Current.Session.Contact.ContactId) && Tracker.Current.Session.Contact.Identifiers.Identifier == null))
    {
        Tracker.Current.Session.Identify(emailAddress);
        // Contact has been updated via Identify so update the reference
        matchedContact = Tracker.Current.Session.Contact;
    }
    return matchedContact;
}

private bool IsContactInSession(string emailAddress)
{
    var tracker = Tracker.Current;
    // True only when an active tracker holds a Contact whose identifier matches the email address
    return tracker != null &&
        tracker.IsActive &&
        tracker.Session != null &&
        tracker.Session.Contact != null &&
        tracker.Session.Contact.Identifiers != null &&
        tracker.Session.Contact.Identifiers.Identifier != null &&
        tracker.Session.Contact.Identifiers.Identifier.Equals(emailAddress, StringComparison.InvariantCultureIgnoreCase);
}

Now that we have a consistent way to retrieve Contacts, one that takes into account whether the User has previously been identified in xDB, we can use it throughout the code as simply as:


Contact contact = ContactFactory.GetContact(orderFormModel.EmailAddress);

Some of you may have picked up on this already. If we retrieve a previously identified Contact from xDB and start using that, then all the interactions and data (pages viewed, goals completed, engagement plans etc.) the User generated while still linked to their unidentified Contact will be lost. The solution: merge the Contacts.

Merging Contacts

Fortunately the Sitecore API is there to merge contacts. The first method is to use MergeContacts in the ContactRepository as shown below.


ContactRepository contactRepository = Sitecore.Configuration.Factory.CreateObject("tracking/contactRepository", true) as ContactRepository;
// The data will be transferred from the dyingContact to the survivingContact
contactRepository.MergeContacts(survivingContact, dyingContact);

BUT! The data does get transferred from the dying Contact to the survivor, however Sitecore.Analytics.Tracker still uses the dying Contact and keeps assigning the session’s interactions to it.

Therefore, to merge contacts use the method Identify passing in the identifier of the Contact e.g. Email address. The method will merge the unidentified Contact into the identified Contact. So we would implement it in our ContactFactory.cs just before the matchedContact variable is returned, see below.


// if the matched Contact by email address is not the same Contact in session
if (!matchedContact.ContactId.Equals(Tracker.Current.Session.Contact.ContactId))
{
    // call .Identify to merge the session into the identified Contact
    Sitecore.Analytics.Tracker.Current.Session.Identify(emailAddress);
}

However, if we have extended xDB to have custom Facets to hold extra data, the Pipeline which Sitecore.Analytics.Tracker.Current.Session.Identify ultimately triggers will not know how to merge our custom data.

Merging Contacts with custom facets

When you have extended xDB you need to consider how Sitecore will merge the data of two Contacts. Merging of Contacts is handled by the MergeContacts pipeline within the Sitecore.Analytics.Pipelines.MergeContacts namespace. You will need to replace the existing MergeContactFacets processor with a custom one that extends MergeContactProcessor. A reminder of how to do that:


<?xml version="1.0"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <mergeContacts>
        <processor patch:instead="*[@type='Sitecore.Analytics.Pipelines.MergeContacts.MergeContactFacets, Sitecore.Analytics']"
                   type="ISlayTitans.CMS.Pipelines.Analytics.MergeContacts.MergeFacets, ISlayTitans.CMS" />
      </mergeContacts>
    </pipelines>
  </sitecore>
</configuration>

The existing MergeContactFacets processor already does the work of copying all the default Facets, so let’s not reinvent the wheel and instead copy the decompiled code.


internal class MergeContactFacets : MergeContactProcessor
{
    public override void Process(MergeContactArgs args)
    {
        Assert.ArgumentNotNull((object) args, "args");
        foreach (string name in args.DyingContact.Facets.Keys)
        {
            IFacet source = args.DyingContact.Facets[name];
            IFacet destination = (IFacet) null;
            try
            {
                destination = args.SurvivingContact.GetFacet<IFacet>(name);
            }
            catch (FacetNotAvailableException ex)
            {
                // Facet is not available on the surviving Contact, destination stays null
            }
            if (destination != null && destination.IsEmpty)
                ModelUtilities.DeepCopyFacet(source, destination);
        }
    }
}

So, having copied the above into our custom processor, we need to extend it to merge our custom Facets. Inside the Process method we must capture the scenario where the Facet being processed is one of our custom Facets by checking for a match on the Facet name.


// Check if the name of the Facet is the name of our custom Facet
if (name.Equals(KeyInteractionsFacet.FacetName, StringComparison.InvariantCultureIgnoreCase)
    && destination != null && !source.IsEmpty)
{
    // Our own DeepCopyFacet, which appends rather than overwrites
    DeepCopyFacet(source, destination);
}
else if (destination != null && destination.IsEmpty && !source.IsEmpty)
{
    ModelUtilities.DeepCopyFacet(source, destination);
}

I think most developers will want to avoid using the ModelUtilities.DeepCopyFacet() method in the API as it completely clears the data in the surviving Contact’s Facet before copying the dying Contact’s data in. Ultimately it’s up to you as a developer to understand your data, determine whether it should be overwritten, merged or discarded, and then develop code to achieve that.
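
As a rough illustration of those three choices, here is a small self-contained sketch. It deliberately uses plain dictionaries and lists in place of real Facets, and the names MergeStrategy and FacetMergeSketch are hypothetical, not part of the Sitecore API. It only shows how the overwrite, append and keep-the-survivor decisions differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names for illustration; none of this is Sitecore API.
public enum MergeStrategy { Overwrite, Append, KeepSurvivor }

public static class FacetMergeSketch
{
    // Facet data is modelled as named lists of values; real Facets are richer than this.
    public static void Merge(
        IDictionary<string, List<string>> survivor,
        IDictionary<string, List<string>> dying,
        MergeStrategy strategy)
    {
        foreach (KeyValuePair<string, List<string>> entry in dying)
        {
            List<string> existing;
            bool survivorHasData = survivor.TryGetValue(entry.Key, out existing) && existing.Count > 0;

            if (!survivorHasData || strategy == MergeStrategy.Overwrite)
            {
                // Nothing to protect, or we were told to replace it
                survivor[entry.Key] = new List<string>(entry.Value);
            }
            else if (strategy == MergeStrategy.Append)
            {
                // Keep the survivor's values and add the dying Contact's after them
                existing.AddRange(entry.Value);
            }
            // KeepSurvivor: leave the survivor's values untouched
        }
    }
}
```

Whichever strategy suits your data, the point is that the decision is made per Facet rather than applied blindly to everything.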

So in my example I want to create a DeepCopyFacet() method of our own that doesn’t clear the data and instead appends the dictionary and collection entries from the dying Contact’s Facet to the survivor’s (simple attribute values are still overwritten). The method needs to handle transferring data stored in MongoDB as Attributes, Elements, Dictionaries and nested Collections, which is the hardest part. The easiest part of all this is actually moving the data from one to the other, see below.


public static void DeepCopyFacet(IFacet source, IFacet destination)
{
    Assert.ArgumentNotNull((object)source, "source");
    Assert.ArgumentNotNull((object)destination, "destination");
    CopyElement((IElement)source, (IElement)destination);
}

private static void CopyElement(IElement source, IElement destination)
{
    foreach (IModelMember modelMember in (IEnumerable<IModelMember>)source.Members)
    {
        // Attribute: copy the value across
        IModelAttributeMember modelAttributeMember = modelMember as IModelAttributeMember;
        if (modelAttributeMember != null)
        {
            ((IModelAttributeMember)destination.Members[modelMember.Name]).Value
                = modelAttributeMember.Value;
        }
        else
        {
            // Dictionary: create each key on the destination and copy its element
            IModelDictionaryMember dictionaryMember1 = modelMember as IModelDictionaryMember;
            if (dictionaryMember1 != null)
            {
                IModelDictionaryMember dictionaryMember2
                    = (IModelDictionaryMember)destination.Members[modelMember.Name];
                foreach (string key in (IEnumerable<string>)dictionaryMember1.Elements.Keys)
                {
                    IElement destination1 = dictionaryMember2.Elements.Create(key);
                    CopyElement(dictionaryMember1.Elements[key], destination1);
                }
            }
            else
            {
                // Collection: append a new element on the destination for each source element
                IModelCollectionMember collectionMember1 = modelMember as IModelCollectionMember;
                if (collectionMember1 != null)
                {
                    IModelCollectionMember collectionMember2
                        = (IModelCollectionMember)destination.Members[modelMember.Name];
                    for (int index = 0; index < collectionMember1.Elements.Count; ++index)
                    {
                        IElement destination1 = collectionMember2.Elements.Create();
                        CopyElement(collectionMember1.Elements[index], destination1);
                    }
                }
                else
                {
                    // Nested element: recurse into it
                    IModelElementMember modelElementMember1 = modelMember as IModelElementMember;
                    if (modelElementMember1 != null)
                    {
                        IModelElementMember modelElementMember2
                            = (IModelElementMember)destination.Members[modelElementMember1.Name];
                        CopyElement(modelElementMember1.Element, modelElementMember2.Element);
                    }
                }
            }
        }
    }
}

And that’s it!

We have covered how to identify a Contact as soon as they hand over that piece of unique information, and how to merge two Contacts together so you can consistently track Users without losing any interactions made or custom data saved. This is a big topic so I’ll add some useful links below for further reading.

The next post in this series will cover how to extend the Sitecore Experience Profile. I was planning on doing it before this topic but it seemed more logical to cover this first.

Create and Save Contacts to and from xDB – Brian Pedersen

Identifying Contacts – Sitecore Docs

Merge Contacts – Sitecore Docs

 

9 thoughts on “How to Identify and Merge Contacts in Sitecore xDB”

  2. Just a heads up, seems like Sitecore now do the merging of custom facets without the need for a custom processor. I’m not 100% sure which version this was changed in but we are using 8.1 update 2 and the custom processor is not needed for us to merge custom facets :).
