Andy Uzick's Sitecore Blog: March 2013

Tuesday, March 12, 2013

The Heart of the Meta

This article describes Arke Systems' Meta Tag Manager shared source module, available for free download from the Sitecore Marketplace.

Update 27-March-2013: Several changes have been made to this module, including adding new tag types (properties), adding support for all tag types in the pipeline, and creating "Custom Tags" to replace the less intuitive "Dynamic Processors". See the documentation file in the Sitecore Marketplace for details.

It's almost inevitable that at some point in a solution's lifespan, there comes a need to add various types of meta tags to the head of your pages. Sometimes it's a site-wide tag, like the one that turns off the silly IE image toolbar (thankfully a legacy issue now). Sometimes it's something you may need at the page level, like instructing robots not to crawl the page. Sometimes the tag values are static, like a robot directive. Sometimes they're dynamic, like page type (template), GUID or other info needed by crawlers or page script. Sometimes the need is simple; sometimes there are very complex tagging requirements for analytics or search crawling.

I've found that over time, solutions wind up with meta tags being handled in dozens of ways ... sometimes in markup, sometimes in code-behind, sometimes in base layouts, sometimes in renderings, sometimes in handlers. Discovering the provenance of a tag you find in the final HTML can be challenging.

Recently, a client was concerned about the SEO ramifications of having identical pages at more than one URL on their Sitecore site (for example, with and without "/en" in the path). Having pages like this (or worse, internal links formed differently) can decrease search ranking. If the same page is at two URLs, and there are links to both, the "link juice" is diluted across those URLs.

One way to overcome this is to issue a "canonical URL" meta tag. If you emit a meta tag with the same canonical URL on both pages, it informs the search engine which URL is "authoritative" for that page. The search engine will (theoretically) award all of the link juice to the canonical URL.

If you are using a common base layout class, the simplest thing to do would be to have the code-behind of that class hang the canonical meta tag in the page header. But in this case, the client did not have a common base layout.

This kind of thing has come up often enough that I was motivated to create a MetaTag management module. Different chunks of code need to create meta tags, and it’s not always convenient or even possible to limit the timing of meta tag generation to the layout.

This article only covers some highlights from the module. For more detail, see the full documentation at the Sitecore Marketplace.

Overview

Tags can be added four ways:

By adding processors to the InjectMetaTags pipeline.
In the content editor, by adding tag definitions to a GlobalTags folder (these tags appear site-wide).
In the content editor, by selecting pre-defined tags in a field that can be added to any template (these tags appear on individual pages).
In code, at any point in the page lifecycle. This allows code in layouts, sublayouts and renderings to add meta tags “ad hoc”.

Several tag "types" are supported to simplify creation of tags in code or configuration:

Static tags: Tags where both the name and value are static, such as “robots noindex”.
Custom tags: Tags where either the name, value or both are generated by custom logic.
Method tags: Tags where the name is static, and the value is derived from an existing static method.
Property tags: Tags where the name is static, and the value is derived from an existing static property.

Method and Property tags allow creation of dynamic tags entirely in configuration, with no coding. These tags derive their values at runtime from existing methods or properties.

There seem to be two subspecies of meta tag markup: those that use a “property” attribute and those that use “name”. Both are still effectively name/value pairs. The “name” style tag seems to be more common, but notable sets of commonly-used tags (like OpenGraph tags) use the “property” structure. This module allows both styles.
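To make the difference between the two styles concrete, here is a minimal sketch of rendering one name/value pair either way. The TagStyle enum and MetaTagMarkup class are hypothetical names invented for this illustration; they are not the module's own types, and the module's actual markup may differ in detail.

```csharp
using System.Net;

// Hypothetical stand-ins for illustration only; not the module's own types.
public enum TagStyle { Name, Property }

public static class MetaTagMarkup
{
    // Renders one name/value pair using either the "name" or the "property" attribute.
    public static string Render(TagStyle style, string key, string value)
    {
        string attribute = style == TagStyle.Property ? "property" : "name";

        // Encode both parts so that quotes or ampersands cannot break the markup.
        return string.Format(
            "<meta {0}=\"{1}\" content=\"{2}\" />",
            attribute,
            WebUtility.HtmlEncode(key),
            WebUtility.HtmlEncode(value));
    }
}
```

With this sketch, Render(TagStyle.Name, "robots", "noindex") produces <meta name="robots" content="noindex" />, while passing TagStyle.Property with an OpenGraph key such as "og:title" produces the "property" form of the same structure.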

Architecture

The module uses a collection of tag description objects (not markup) that persist through the page lifecycle. Tags are added to this collection in various ways, and are flushed to the head section of the page after pre-render. To flush the tags to the page, each is formed into a custom MetaTag control.

No modifications to the solution’s code or markup are required; all of the artifacts and event hooks are created using pipeline processors added to the httpRequestBegin and insertRenderings pipelines.

Adding Tags From Code

Coders can add tags to the page at any point in the page lifecycle, using provided convenience methods.

    PageTags.AddTag(MetaTagType.name, "MyTag", "This is my tag.");

Managing Tags in the Content Editor

Site administrators can create site-wide tags by adding "StaticMetaTag" and "DynamicMetaTag" items to the appropriate folder in the Modules section of the content tree. Items added to the "GlobalTags" folder are emitted on every page; items added to the "OptionalTags" folder are available for content authors to use selectively in their pages (see below).

The definition of a StaticMetaTag is simple; just specify the tag type ("name" or "property"), and the key and value.

To define a dynamic tag, you must first create a simple "Dynamic Processor" class that exposes properties for TagType, Key and Value. Sample Dynamic Processors are provided that emit tags for the Item ID, Template Key and Machine Name. Once this class is defined, you can add it by creating a DynamicMetaTag item and supplying the class signature for the dynamic processor.

namespace Arke.SharedSource.MetaTags.DynamicProcessors
{
  public class ItemID : IDynamicProcessor
  {
    public MetaTagType TagType { get; set; }
    public string Key { get; set; }
    public string Value { get; set; }
 
    public ItemID()
    {
      TagType = MetaTagType.name;
      Key = "ItemID";
      Value = Sitecore.Context.Item.ID.ToString();
    }
  }
}

A base template is provided, which can be added to the base templates of any site template. This gives the content author an opportunity to include the pre-defined tags (those defined in the OptionalTags folder) in any given page.

Managing Tags in the InjectMetaTags Pipeline

The process for gathering, forming and injecting tags is itself managed with a custom pipeline. Developers can add custom tags by inserting processors into this pipeline. They can also change the behavior of the process by replacing existing processors.

For more information on how to manage processes in your solutions by creating custom pipelines, see "Put That in Your Pipe and Process It".

In addition to the processors that gather the meta tags defined in content (global and page-scoped), there are other "demonstrator" processors included in the module. One shows how to implement a fully custom meta tag processor, using the "canonical" meta tag as an example.

There are also processors that allow you to leverage your "Dynamic Processors" or any static method (including any Sitecore static method) to add a global meta tag, entirely in configuration, without writing a single line of code. This is accomplished by adding processor nodes that have child nodes that define the class signatures and names of the processor or static method to be used.

Here's an example of a meta tag pipeline in config (I've shortened the type names for readability):

<Arke.MetaTags.InjectMetaTags>
  <processor type="Arke...CheckContextItem, Arke.SharedSource.MetaTags" />
  <processor type="Arke...CheckHeader, Arke.SharedSource.MetaTags" />
  <processor type="Arke...CanonialUrl, Arke.SharedSource.MetaTags" />
  <processor type="Arke...MethodTag, Arke.SharedSource.MetaTags">
    <TypeSignature>Sitecore.Context, Sitecore.Kernel</TypeSignature>
    <MethodName>GetSiteName</MethodName>
    <TagName>sc_site</TagName>
    <TagType>name</TagType>
  </processor>
  <processor type="Arke...GlobalTags, Arke.SharedSource.MetaTags" />
  <processor type="Arke...ItemTags, Arke.SharedSource.MetaTags" />
  <processor type="Arke...FlushMetaTags, Arke.SharedSource.MetaTags" />
</Arke.MetaTags.InjectMetaTags>

The source code for this module is available at the Sitecore Marketplace. You can also download the documentation file.

Saturday, March 9, 2013

Put That in Your Pipe and Process It

A teacher once told me "When faced with a problem you don't understand, solve any part of it that you do understand, and then step back and look at the problem again." It was supposed to be advice about math, but it is good advice for troubleshooting any problem. It also taught me that complex problems and processes are best handled in small, discrete pieces.

If you read my blog about renderings, you've heard me lecture on the importance of granularity. When coding our solutions, we often implement complex, sequential processes. Sitecore gives us a great tool for making these processes granular and extensible: pipelines.

A pipeline is a framework for defining a process in a series of steps that are handled by individual processors. Developers implement the steps in code following a specified pattern, and define the ordering of these steps in a config file. The pipeline execution engine manages this functionality and exposes a number of supporting methods and properties.

Sitecore exposes many of its internal processes via pipelines, allowing us to do all kinds of magic by adding and modifying processors. Lots of cool enhancements to Sitecore have been shared around the community that leverage these open pipelines. Why don't we open our solutions up to future developers (and our future selves) by using this same technique?

Just like Sitecore has done with many of its processes, we can take our solution's processes, break them down into a series of steps, and implement them with a pipeline. This will allow us to extend or modify the process later (perhaps even using classes in a different assembly) through configuration. This makes our solutions more transparent and more flexible. It leaves an open architecture that allows future developers to change the behavior of the process without changing the original code.

The example code in this article is taken from the MetaTag Manager Module available from the Sitecore Marketplace.

The solution I'll use as an example is the Meta Tag Manager open source module, which is designed to manage meta tags. There are a number of "common" circumstances that influence the insertion of meta tags in that module: managed content items, logic for specialty tags, and a context object that code can use to insert meta tags. These each require their own set of code to manage. Knowing this, and knowing that I might want to implement additional meta tag logic in the future, I used a pipeline to manage this task.

When to Use a Pipeline

Pipelines are useful tools when you are creating sequential processes that represent significant functionality in your application. This might be something like a data consumer that periodically imports data (via a scheduled task) from an external system, or a process for generating markup as part of your page construction (as in the example pipeline in this article), or a process associated with application start-up, publishing, or any other part of your solution.

Sitecore uses pipelines for most of the processes needed to render every page; this tells me that they should be reasonably efficient. Custom pipelines are most useful for complex processes that may need to be extended or modified later, with the convenience of not having to modify the original code.

Anatomy of a Pipeline

The Sitecore pipeline architecture exists to allow you to define a series of steps in a config file that, when executed in order, implement a process within the solution. The pipeline can be invoked from anywhere in your code where you might otherwise call one or more methods to achieve the same purpose.

The pieces of a pipeline are...

The pipeline definition. This is declared in a config file.
One or more pipeline processors. These represent the steps that could be used in the pipeline, and are implemented in code.
(Optional) A PipelineArgs class, which is implemented in code and is used to allow the steps in the pipeline to pass data down the line.
The Sitecore.Pipelines namespace, which contains the API (including the CorePipeline class) needed to develop and invoke pipelines and pipeline processors.

In a pipeline, each step is defined using a class signature in the config file. When the pipeline is run, Sitecore uses reflection to create an instance of each processor class, and calls the Process method of that class.
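The idea of "type name in config, instance created by reflection, Process called" can be sketched in a few lines of plain C#. This is a simplified illustration, not Sitecore's actual pipeline engine: SketchArgs, StepOne, StepTwo and SketchPipeline are hypothetical names made up for this example, and the list of strings stands in for the processor nodes in the config file.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; Sitecore's real pipeline engine is more elaborate.
public class SketchArgs
{
    // Records which steps ran, in order.
    public List<string> Log = new List<string>();
}

public class StepOne
{
    public void Process(SketchArgs args) { args.Log.Add("StepOne"); }
}

public class StepTwo
{
    public void Process(SketchArgs args) { args.Log.Add("StepTwo"); }
}

public static class SketchPipeline
{
    // Each entry is a type name, as it might appear in a processor node's "type" attribute.
    public static void Run(IEnumerable<string> processorTypeNames, SketchArgs args)
    {
        foreach (string typeName in processorTypeNames)
        {
            // Resolve the type from its name and create an instance via reflection.
            Type type = Type.GetType(typeName, true);
            object processor = Activator.CreateInstance(type);

            // Find the Process method that accepts the args object, and call it.
            var process = type.GetMethod("Process", new[] { typeof(SketchArgs) });
            if (process == null)
            {
                throw new InvalidOperationException(typeName + " has no Process(SketchArgs) method.");
            }
            process.Invoke(processor, new object[] { args });
        }
    }
}
```

Passing the assembly-qualified names of StepOne and StepTwo to SketchPipeline.Run calls each step in the order given, which is why reordering or removing entries changes the behavior of the process without touching the step classes.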

Pipeline Args

It is usually helpful to be able to maintain some context during the execution of the pipeline, so that processors can pass data to each other. Sitecore instantiates a PipelineArgs class (or a custom class that inherits from PipelineArgs) when a pipeline is run. This instance is passed to each processor's Process method, allowing each step access to the PipelineArgs.

This PipelineArgs class exposes some useful methods to each processor. One important one is the AbortPipeline method. If a processor step detects that the pipeline should be terminated without calling the subsequent steps, it can call this method.

The PipelineArgs class also contains a SafeDictionary property called CustomData. Pipeline processors can add objects to this dictionary in order to pass information down the line as subsequent processors are invoked.
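Here is a rough sketch of what passing data through a general-purpose dictionary looks like. To keep it self-contained, DictionaryArgs uses a plain Dictionary as a stand-in for the CustomData property, and GatherStep and FlushStep are hypothetical processors invented for this example.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for an args object that carries a general-purpose dictionary.
public class DictionaryArgs
{
    public Dictionary<string, object> CustomData = new Dictionary<string, object>();
}

// An early step stores data under a string key.
public class GatherStep
{
    public void Process(DictionaryArgs args)
    {
        args.CustomData["tagNames"] = new List<string> { "robots", "sc_site" };
    }
}

// A later step must know the key and cast the value back to the expected type.
public class FlushStep
{
    public int Flushed { get; private set; }

    public void Process(DictionaryArgs args)
    {
        object raw;
        if (!args.CustomData.TryGetValue("tagNames", out raw))
        {
            return; // nothing was gathered
        }

        var names = raw as List<string>;
        Flushed = names == null ? 0 : names.Count;
    }
}
```

Both steps have to agree on the string key and on the type stored behind it, and the compiler cannot check either.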

Rather than use the CustomData dictionary, I prefer to create a custom "args" class that inherits from Sitecore's PipelineArgs. I can then define properties that do not require casting to be used by pipeline steps. This is also a convenient place to put convenience or utility methods for the processor steps.

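using System.Collections.Generic;
using Sitecore.Pipelines;

namespace Arke.SharedSource.MetaTags
{
  // Custom args class: carries the tag collection from step to step.
  public class InjectMetaTagsPipelineArgs : PipelineArgs
  {
    // Tags gathered so far; each processor may add to this list.
    public List<MetaTagItem> MetaTags;

    public InjectMetaTagsPipelineArgs()
    {
      this.MetaTags = new List<MetaTagItem>();
    }
  }
}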

When using a custom PipelineArgs class, it is a good idea to create an interface that our pipeline processors will implement. This ensures that each step's Process method accepts the custom args type, rather than casting from the base PipelineArgs.

namespace Arke.SharedSource.MetaTags.Pipelines
{
  public interface IInjectMetaTagsPipelineProcessor
  {
    void Process(InjectMetaTagsPipelineArgs args);
  }
}

Pipeline Processors

The steps in your pipeline are implemented with pipeline processors, representing the steps in the process that the pipeline implements. Each is implemented with a class that implements a Process method, which accepts a PipelineArgs argument. The pipeline runner instantiates these classes in turn, and calls the Process method, passing in the PipelineArgs object.

This is where you implement the logic for that step of the process. The processor may read from and/or write to the PipelineArgs object. It can abort the remainder of the pipeline with the AbortPipeline method.

For example, this processor runs early in the pipeline to make sure there is a context item, and aborts the pipeline if necessary:

namespace Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags
{
  public class CheckContextItem : IInjectMetaTagsPipelineProcessor
  {
    public void Process(InjectMetaTagsPipelineArgs args)
    {
      if (Sitecore.Context.Item == null)
      {
        Tracer.Warning("Not injecting meta tags; no context item");
        args.AbortPipeline();
      }
    }
  }
}

This processor adds a "canonical" meta tag:

namespace Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags
{
  public class CanonialUrl : IInjectMetaTagsPipelineProcessor
  {
    public void Process(InjectMetaTagsPipelineArgs args)
    {
      Tracer.Info("Adding Canonical URL");
      args.MetaTags.Add(new MetaTagItem(MetaTagType.name, "canonical", Settings.GetAuthorativeUrl()));
    }
  }
}

Architecting the Steps in a Pipeline

What pipeline steps you create, and what they do, is up to you. Remember that pipelines typically define a process in your solution. Sitecore has pipelines that define the processes for things like insertRenderings and publishItem. The steps in these pipelines do things like detecting if the process is appropriate for the current item or context, building up or transforming data, moving data from one place to another, generating markup, cleaning up, logging messages and saving performance data.

When I'm building a process to be handled by a pipeline, I tend to think about Stephen Covey's advice about analysis and synthesis. To analyze means to break apart, to synthesize means to put together. A pipeline process often begins by analyzing (checking and transforming the input data, context and environment), and then ends with synthesizing the desired output.

What steps your pipeline should take depends on the nature of the task at hand. It is common to have the initial steps determine if the process should execute depending on the context or availability of data or objects, and to instantiate objects and data that will be needed by subsequent steps. The next steps actually perform the process in discrete steps, and the last steps clean up or do some logging.

Try to break the pipeline into very discrete steps. If you find that one of your steps is becoming "spaghetti," break it into multiple steps. Take logic that makes decisions and move it to separate steps. If there is data to be gathered that is used by subsequent steps, put the data gatherers into separate steps. This allows future developers to modify or extend the data-gathering, decision-making and output-producing steps, or to insert their own steps in between.

The pipeline in the example solution starts by checking if meta tags can be injected into the page (by validating the existence of a context item and a head section in the page). It then has a step for every group of tags that might be injected (depending on the nature of the tags), adding them to a collection stored in the PipelineArgs. Finally, it flushes the tags to the page.

By separating the sets of tags to be included, the application administrator can exclude steps (for example, the canonical URL tag) by removing them from the config. A developer can add logic to include a different set of tags by adding a step to this pipeline, which might call a class in an entirely separate assembly.

The Config File

The pipeline steps are defined in .config. This might be directly in web.config, or in an external config file. I'd recommend an external config file ... see this blog post.

Each step in the pipeline is defined with a processor node, which simply declares the type signature for that step of the pipeline.

A pipeline processor node can also have child nodes. The tag name in these nodes must map to property names on the class for that processor. Sitecore will inject the value (the text) of each node into the corresponding property in the class. This allows you to create more "general purpose" processor steps and hand properties to them at runtime.
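As a rough illustration of how a "general purpose" step might use injected values, the sketch below copies child-node text into matching properties and then uses those properties to call a static method by name. Everything here is an assumption made for illustration: ConfigBinder only imitates the injection that Sitecore performs, SiteInfo is a stand-in for a class such as Sitecore.Context, and MethodTagSketch is not the module's actual MethodTag source.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for a class whose static method supplies the tag value.
public static class SiteInfo
{
    public static string GetSiteName() { return "website"; }
}

// Sketch of a configurable processor. The property names mirror the child nodes
// in the config (TypeSignature, MethodName, TagName); the body is an assumption.
public class MethodTagSketch
{
    public string TypeSignature { get; set; }
    public string MethodName { get; set; }
    public string TagName { get; set; }

    // Looks up a public static, parameterless method by name and returns its result as text.
    public string ResolveValue()
    {
        Type type = Type.GetType(TypeSignature, true);
        MethodInfo method = type.GetMethod(
            MethodName, BindingFlags.Public | BindingFlags.Static, null, Type.EmptyTypes, null);
        if (method == null)
        {
            throw new MissingMethodException(TypeSignature, MethodName);
        }
        object result = method.Invoke(null, null);
        return result == null ? string.Empty : result.ToString();
    }
}

// Imitates the injection step: each child node's text is assigned to the
// property with the same name, if the processor has one.
public static class ConfigBinder
{
    public static void Apply(object processor, IDictionary<string, string> childNodes)
    {
        foreach (KeyValuePair<string, string> node in childNodes)
        {
            PropertyInfo property = processor.GetType().GetProperty(node.Key);
            if (property != null && property.CanWrite)
            {
                property.SetValue(processor, node.Value, null);
            }
        }
    }
}
```

Because the type and method names arrive as plain strings, the same processor class can be declared several times in the config with different child nodes, each producing a different tag.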

<MetaTags.InjectMetaTags>
  <processor type="Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags.CheckContextItem, Arke.SharedSource.MetaTags" />
  <processor type="Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags.CheckHeader, Arke.SharedSource.MetaTags" />
  <processor type="Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags.CanonialUrl, Arke.SharedSource.MetaTags" />
  <processor type="Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags.MethodTag, Arke.SharedSource.MetaTags">
    <TypeSignature>Sitecore.Context, Sitecore.Kernel</TypeSignature>
    <MethodName>GetSiteName</MethodName>
    <TagName>sc_site</TagName>
    <TagType>name</TagType>
  </processor>
  <processor type="Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags.GlobalTags, Arke.SharedSource.MetaTags" />
  <processor type="Arke.SharedSource.MetaTags.Pipelines.InjectMetaTags.FlushMetaTags, Arke.SharedSource.MetaTags" />
</MetaTags.InjectMetaTags>

Invoking the Pipeline

It's really very simple to invoke a pipeline. You simply call Sitecore.Pipelines.CorePipeline.Run("MyPipelineName", args), where MyPipelineName is the name of the pipeline in the config file, and args is an instance of PipelineArgs (or a custom class derived from it).

  InjectMetaTagsPipelineArgs args = new InjectMetaTagsPipelineArgs();
  Sitecore.Pipelines.CorePipeline.Run(Settings.PIPELINE_NAME, args);

Where to invoke the pipeline depends on what the pipeline does. Sometimes it needs to be invoked from a timed task, sometimes from another pipeline, sometimes from other places. If I have a situation where I need to extend an existing Sitecore pipeline with a complex task, I'll create my own pipeline and invoke it from Sitecore's pipeline. That makes maintenance much easier, because I can define it all in another config file and don't need to patch Sitecore's pipeline more than once.

First, we wire up the appropriate existing pipeline:

  <pipelines>
    <insertRenderings>
      <processor
        type="Arke.SharedSource.MetaTags.Pipelines.InsertRenderings.InjectMetaTags, Arke.SharedSource.MetaTags"
        patch:after="processor[@type='Sitecore.Pipelines.InsertRenderings.Processors.AddRenderings, Sitecore.Kernel']"
      />
    </insertRenderings>
  </pipelines>

The method we're wiring exists only to fire our custom pipeline:

  public void Process(InsertRenderingsArgs args)
  {
    Assert.ArgumentNotNull(args, "args");
 
    // Don't wire this up when in the sitecore shell
    if (Sitecore.Context.Site.Name.Equals("shell", StringComparison.InvariantCultureIgnoreCase))
    {
      Tracer.Warning("Meta tags not emitted in the shell site.");
      return;
    }
 
    // We'll wire up a handler for after prerender is complete, so any meta tags added
    // by controls (via Arke.SharedSource.MetaTags.PageTags) will be available.
    Sitecore.Context.Page.Page.PreRenderComplete += new EventHandler(RunPipeline);
 
  }

  void RunPipeline(object sender, EventArgs e)
  {
    Profiler.StartOperation("Adding Global MetaTags.");
    try
    {
      InjectMetaTagsPipelineArgs args = new InjectMetaTagsPipelineArgs();
      Sitecore.Pipelines.CorePipeline.Run(Settings.PIPELINE_NAME, args);
    }
    catch (Exception ex)
    {
      Sitecore.Diagnostics.Log.Error("InjectMetaTags failed", ex, "InjectMetaTags");
    }
    Profiler.EndOperation();
  }

Our pipeline in turn declares the steps that would otherwise have to be patched into Sitecore's pipeline, as shown in the earlier config example.

Next time you're building a process for your solution, consider using Sitecore's built-in tools for pipeline management. It may take a little more time now, but it'll make life easier later, when new requirements come down the pipe.

References:

John West: All About Pipelines in the Sitecore ASP.NET CMS
Andy Uzick (Arke Systems): MetaTag Manager Module available from the Sitecore Marketplace