Let's fix Sitecore Fast Query field querying

Fixing Sitecore Fast Query – querying by field ID

Sitecore Fast Query is a XPath like query syntax for retrieving and filtering items from the Sitecore database, which uses the database engine to execute queries. This means a Fast Query is directly translated to a SQL query against one of the Sitecore SQL databases. The key tables/views used in the SQL queries are Items table and Fields view, which queries VersionedFields, UnversionedFields, and SharedFields tables.

TL;DR – Are you only interested in the code to fix Sitecore Fast Query and be able to query by field ID, see this gist.

Motivation

So you might be thinking: why use a Sitecore Fast Query at all? Sitecore has Sitecore Search based on Lucene or Solr indexes! Well, sometimes you require up-to-date data, and can’t rely on the sometimes stale data in a Lucene index.

Fast Query example

Let’s look at an example of a Sitecore Fast Query, and the SQL generated.

Fast Query:

fast://*[@@templateid='{C29492D2-6F79-463C-A7FB-B318FAB90639}' and @#Postal Code#='43870']

Generated SQL:

exec sp_executesql N'SELECT DISTINCT
[i].[ID] [ID], [i].[ParentID] [ParentID]
FROM [Items] [i] 
LEFT OUTER JOIN (
	SELECT [Fields].*
	from [Fields]
	INNER JOIN [Items] ON [Fields].[FieldID] = [Items].[ID] AND lower([Items].[Name]) = ''postal code''
) [Fields1] ON [i].[ID] = [Fields1].[ItemId]
WHERE (([i].[TemplateID] = @value1) and (coalesce([Fields1].[Value], '''') LIKE @value2))',N'@value1 nvarchar(38),@value2 nvarchar(5)',@value1=N'{C29492D2-6F79-463C-A7FB-B318FAB90639}',@value2=N'43870'

The query searches for all items of a specific templateid which has a Postal Code field with value ‘43870’. As you can see the SQL query uses a subselect on Items and Fields to get the field details for the Postal Code field. We should be able to eliminate the join on Items table, if we know the ID of the field we are querying. But how do you specify a field ID instead of name in a Sitecore Fast Query? Turns out no documentation exists – like so many other Sitecore features… So we turn to the best documentation tool for Sitecore, a decompiler, such as dotPeek. My colleague Carsten had already asked him self the same question, and had the answer.

How to query by field ID

Use @f012901edc1e449b79077eedc19cab81d as field name. Notice the leading f followed by a GUID, your field ID. The implementation in Sitecore.Data.DataProviders.Sql.FastQuery.FieldTranslator expects a field name of 33 characters, including the f, which means that you can not include dashes in your guid. Tip: Use the Sitecore.Data.ID.ToShortID() method to convert your field id into the proper format.

Now, one would expect that the query above could be re-written to:

fast://*[@@templateid='{C29492D2-6F79-463C-A7FB-B318FAB90639}' and @f012901edc1e449b79077eedc19cab81d='43870']

Unfortunately this does not work, and throws an error.

Fixing querying by field ID – take 1

When using the XPath Builder tool in Sitecore and the query above, you get the error: Index (zero based) must be greater than or equal to zero and less than the size of the argument list. You might recognize the text, it’s the exception message thrown from string.Format. Running the Fast Query manually through the Sitecore.Data.Database.SelectSingleItem() method, you will see that exception is thrown deep inside Sitecore.Data.DataProviders.Sql.SqlDataApi – if I remember correctly. Having a closer look at the implementation of Sitecore.Data.DataProviders.Sql.FastQuery.FieldTranslator one sees that a IDFieldInfo class is used to build the field querying part – the part in the SQL subselect above. The implementation of IDFieldInfo is quite simple, the important part is this line:

context.AddSubquery(context.SqlApi.Format("(SELECT * FROM {0}Fields{1} WHERE {0}FieldID{1}='") + context.SqlApi.FormatRawID(fieldId) + "')", "Fields")

The fieldId variable is a Sitecore.Data.ID instance, and we all know how the string representation of an ID looks: {3db0ff43-3912-4fe0-9f75-9e26e2051a22}. But hey! Curly brackets have a special meaning in string.Format and should be escaped using {{ and }}. So how does the FormatRawID() method format an ID? Yup, you guessed it: id.ToString().

So if we can fix the FormatRawID() method in Sitecore.Data.DataProviders.Sql.SqlDataApi class, we should have a working Sitecore Fast Query. Fortunately the SqlDataApi class is public abstract and the method protected virtual – two minutes later, we have a new implementation of FormatRawID(). Actually Sitecore has two implementations of the SqlDataApi class, one for SQL Server and one for Oracle. Since we are working with a SQL Server, so we inherit from Sitecore.Data.SqlServer.SqlServerDataApi. But how do we instruct Sitecore to use our new SqlServerDataApi implementation? A quick scan through Sitecore’s web.config reveals:

<dataApis>
  <!-- Data api for accessing SQL Server databases. -->
  <dataApi name="SqlServer" type="Sitecore.Data.SqlServer.SqlServerDataApi, Sitecore.Kernel">
    <param connectionStringName="$(1)" />
  </dataApi>
</dataApis>

Wow! This is exactly what we need! So we change the type attribute to point to our new class, and expect to have a working Sitecore Fast Query. But our query still fails, and the debug breakpoint in FormatRawID() is never reached. Turns out Sitecore doesn’t use the config setting for anything – the setting has the exact same value in the web.config for Oracle.

Fixing querying by field ID – take 2

Instead you have to replace the entire SqlDataProvider, in our case the Sitecore.Data.SqlServer.SqlServerDataProvider. But this class has the following constructor:

public SqlServerDataProvider(string connectionString)
  : base((SqlDataApi) new SqlServerDataApi(connectionString))
{
}

Wow, hardcoded to SqlServerDataApi! Great job, Sitecore. So we are stuck with the default SqlServerDataApi. Instead we must override the Sitecore.Data.DataProviders.Sql.FastQuery.QueryToSqlTranslator which parses a Fast Query to steps/opcodes using Sitecore.Data.Query.QueryParser and translates these to SQL using Sitecore.Data.DataProviders.Sql.FastQuery.IOpcodeTranslator implementations. The previously mentioned class FieldTranslator is such an implementation. Below is a reimplementation of FieldTranslator.RenderFieldByGuid with the only change to use KvIDFieldInfo instance instead of IDFieldInfo. KvIDFieldInfo includes the fix to proper escape the field ID in our Fast Query.

public class KvFieldTranslator : FieldTranslator
{
	protected override string RenderFieldByGuid(Sitecore.Data.ID fieldID, ITranslationContext context)
	{
		context.Data["complex-fields"] = true;
		var info = context.Fields[fieldID] as IFieldInfo;
		if (info == null)
		{
			info = new KvIDFieldInfo(fieldID, context);
			context.Fields[fieldID] = info;
		}

		return this.RenderField(info, context);
	}
}

public class KvIDFieldInfo : IFieldInfo
{
	public string Alias
	{
		get;
		private set;
	}

	public KvIDFieldInfo(Sitecore.Data.ID fieldId, ITranslationContext context)
	{
		this.Alias = context.AddSubquery(context.SqlApi.Format("(SELECT {0}ItemId{1}, {0}Value{1} FROM {0}Fields{1} WHERE {0}FieldID{1}='") + context.SqlApi.Safe(fieldId.ToGuid().ToString()) + "')", "Fields");
	}
}

Now we need to use our custom KvFieldTranslator for query opcodes of type Sitecore.Data.Query.FieldElement.

public class KvQueryToSqlTranslator : Sitecore.Data.DataProviders.Sql.FastQuery.QueryToSqlTranslator
{
	public KvQueryToSqlTranslator(Sitecore.Data.DataProviders.Sql.SqlDataApi api)
		: base(api)
	{
		((Sitecore.Data.DataProviders.Sql.FastQuery.BasicTranslatorFactory)this._factory).Register(typeof(FieldElement), new KvFieldTranslator());
	}
}

And lastly a custom SqlServerDataProvider which instantiates our KvQueryToSqlTranslator.

public class KvSqlServerDataProvider : Sitecore.Data.SqlServer.SqlServerDataProvider
{
	public KvSqlServerDataProvider(string connectionString)
		: base(connectionString)
	{
	}

	protected override Sitecore.Data.DataProviders.Sql.FastQuery.QueryToSqlTranslator CreateSqlTranslator()
	{
		return new KvQueryToSqlTranslator(this.Api);
	}
}

Instruct Sitecore to use our data provider:

<dataProviders>
  <main type="Kraftvaerk.KvSqlServerDataProvider, Kraftvaerk">
    <param connectionStringName="$(1)" />      
    <Name>$(1)</Name>
  </main>
</dataProviders>

Now our Sitecore Fast Query with field ID works as expected, and generates the following SQL query:

exec sp_executesql N'SELECT DISTINCT
[i].[ID] [ID], [i].[ParentID] [ParentID]
FROM [Items] [i]
LEFT OUTER JOIN (
	SELECT [ItemId], [Value]
	FROM [Fields]
	WHERE [FieldID]=''67e0f992-ae49-4113-8408-e355d5be6b97''
) [Fields1] ON [i].[ID] = [Fields1].[ItemId]
WHERE (([i].[TemplateID] = @value1) and (coalesce([Fields1].[Value], '''') LIKE @value2))',N'@value1 nvarchar(38),@value2 nvarchar(5)',@value1=N'{C29492D2-6F79-463C-A7FB-B318FAB90639}',@value2=N'43870'

So our query now only contains a subquery on Fields view for a fixed field id, which should be faster than the previous query. But there is plenty of room for further improvement. In future post I’ll examine the execution plans for the two queries, and show how to optimize them with a few indexes in the Sitecore SQL database.

2 thoughts on “Fixing Sitecore Fast Query – querying by field ID”

  1. Hi Andreas,

    Nice post, quite an extensive work around to make Sitecore Fast Query perform.

    I would like to warn against using fast queries in general. They often lead to performance issues. Since they go directly on the sql database all Sitecore caching layers are skipped. On a large website with many requests you do not want to query the db all the time. You can read this warning from my colleague Alan http://blog.coates.dk/2014/11/19/sitecore-fast-query-syntax-can-kill-your-sql-server-or-website/

    The right thing would be to setup a custom search index with a synchronous maintenance plan. Then the lucene index is updated synchronously with save/publish on all CD servers.

    Last I will advice strongly against creating custom clustered indexes on the Sitecore databases. These WILL break Sitecore updates.

    Just some advice learned from experience.

    Merry Christmas.

    –Anders

  2. Oh and by the way, you could perhaps also use the link database for the scenario you are describing. That is if the postal code is a referenced item and not a value.

    See this post http://laubplusco.net/sitecore-item-extensions-get-referrers-as-items/

    Simply get the postal code item that you are searching for and then use the link database to get all referrers, that is all items pointing to this item, and then filter this list by template id.

    If there can be thousands or more items pointing to the postal code item then this is not a good approach. But if you know it is just a few and always will be then using the link database is even faster than Lucene or fast query and it utilize the Sitecore caches so it scales for large production environments.

Leave a Reply

Your email address will not be published. Required fields are marked *