Debugging whitespace issues in MarkLogic

I’ve just resolved a very confusing bug in some of our MarkLogic XQuery code. It ultimately turned out to be very simple to fix – just adding an ‘indent=no’ parameter to the XSLT.

However, there were three unexpected bits of behaviour that I encountered along the way, that made fixing it much harder than it should have been.

First unexpected thing – MarkLogic was defaulting to ‘indent=yes’ in its XSLT processing.

Second unexpected thing – MarkLogic is applying its indent setting to the results of <xsl:copy-of select=”.”/>. I expected <textarea><xsl:copy-of select=”.”/></textarea> to give me an exact copy of the input, but it doesn’t, because indent is applied to it. Saxon doesn’t apply indent here, even if indent is set to yes.

Third unexpected thing – MarkLogic’s QConsole can’t be trusted to display whitespace accurately. I displayed two almost identical documents in exactly the same way, via doc(‘blah.xml’), in the console, and one of them was indented, and the other one wasn’t. These were the before-and-after documents for our ‘splitDocument’ function, so I naturally thought that splitDocument was applying indentation. It was only when I wrote Scala code to retrieve the XML that I saw what was happening.

Lessons:

1) Always turn on indent=’no’ in stylesheets used in MarkLogic
2) Don’t trust QConsole to debug whitespace issues. Instead, use code.