Simpler merging of XML files

Merging XML files is a common task when modding. For example, STALKER stores language strings in XML files, and comparing or merging these files is often tedious. This article shows how to sort the files first, which makes them much easier to compare and merge.


WinMerge is a great tool for comparing and merging text files - provided the two files you're working with have the same content in mostly the same order. You often run into situations where the source and destination files have been re-ordered by some well-meaning person who didn't think of the consequences for those who later have to merge the changes back into their copies.

The solution is to sort the two files before merging them. This brings both files into a common ordering so they can be easily compared and merged. This article shows how to do this for XML files, which are pretty common in modding. It uses a S.T.A.L.K.E.R language file for weapon descriptions as an example, since translating these can be an annoyance if the entries have been moved around a lot.
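To illustrate why sorting helps, here is a minimal Python sketch using the standard `difflib` module. The string ids are hypothetical stand-ins for entries in two language files, and a line-based diff is only a rough approximation of what a tool like WinMerge shows.

```python
import difflib

# Hypothetical ids standing in for entries in two copies of a language file.
original = ["wpn_ak74", "wpn_pm", "wpn_svd", "wpn_toz34"]
reordered = ["wpn_toz34", "wpn_svd", "wpn_new_gun", "wpn_ak74", "wpn_pm"]

def changed_lines(a, b):
    """Count the lines a line-based diff reports as added or removed."""
    diff = difflib.unified_diff(a, b, lineterm="", n=0)
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )

# Compare the files as they are, then again after sorting both.
unsorted_changes = changed_lines(original, reordered)
sorted_changes = changed_lines(sorted(original), sorted(reordered))

print("changes before sorting:", unsorted_changes)
print("changes after sorting:", sorted_changes)
```

After sorting, the only reported difference is the one entry that was genuinely added, which is the change you actually care about when merging.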

First you have to ensure the XML is valid. Unfortunately, these files often contain problems like "&" being written directly, which isn't valid XML; it should be written as "&amp;". To find such problems, load the file into a site like this one: Xmlvalidation.com and it will tell you where and what the problem is. For our weapons example the only issue was a bunch of &s in words like H&K, which need to be rewritten to H&amp;K. This is easy to do as a find and replace in Notepad++. When validating XML, I fix the first problem shown by the validator everywhere it shows up in the file, then reload the file into the site and repeat the process until it's valid XML.
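If you would rather script this step, the following is a rough Python sketch of the same idea using only the standard library. It is an alternative to the website and Notepad++ workflow described above, not a replacement for it; the sample XML and the helper names `escape_bare_ampersands` and `first_xml_error` are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Matches "&" that does not already start an entity such as &amp; or &#38;
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

def escape_bare_ampersands(text):
    """Rewrite every bare "&" as "&amp;", leaving existing entities alone."""
    return BARE_AMP.sub("&amp;", text)

def first_xml_error(text):
    """Return None if text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None

# Hypothetical entry with one bare ampersand and one that is already escaped.
broken = (
    '<string_table><string id="wpn_mp5">'
    "<text>H&K MP5, fixed &amp; tested</text>"
    "</string></string_table>"
)
fixed = escape_bare_ampersands(broken)

print(first_xml_error(broken))  # a parser message pointing at the bad "&"
print(first_xml_error(fixed))   # None once the file parses cleanly
```

The parser stops at the first problem it finds, so the fix-and-recheck loop described above still applies for any other kinds of errors.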

Next, sort the file using a tool like this one: Xmlsorter.codeplex.com. This is a good tool, and all it needs to run is .NET, which most people already have on their machines. The tool itself is simple to use: you specify an input file, where to store the sorted output, and finally how to do the sort. Here's an example using a ui file from STALKER where I wanted to sort the file by its tags:

[Screenshot: Sort by Tag]
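For readers without the tool to hand, sorting by tag can be approximated in a few lines of Python with the standard `xml.etree.ElementTree` module. This is a sketch of the general idea, not a description of how the XML sorter works internally; the element names in the sample are invented, and sorting every nesting level is an assumption on my part.

```python
import xml.etree.ElementTree as ET

def sort_children_by_tag(element):
    """Recursively reorder child elements alphabetically by tag name."""
    for child in element:
        sort_children_by_tag(child)
    # sorted() is stable, so children sharing a tag keep their relative order.
    element[:] = sorted(element, key=lambda child: child.tag)

# Hypothetical ui-style fragment with its tags out of order.
root = ET.fromstring(
    "<w><texture/><auto_static/><button><text/><font/></button></w>"
)
sort_children_by_tag(root)

print([child.tag for child in root])
print([child.tag for child in root.find("button")])
```

Run the same function over both files before comparing them so that each ends up in the same order.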


For some STALKER files we want to sort on the "id" attribute rather than the tag, because our XML looks like this:

  <string id="ammo-11.43x23-fmj">
    <text>.45ACP rounds</text>
  </string>
  <string id="ammo-11.43x23-fmj_descr">
    <text>This cartridge was developed by John Browning in 1904 for use in his prototype Colt semi automatic .45 pistol. It was later adopted by the United States Ordnance Department, in 1911. The .45 caliber has since enjoyed popularity for more than a hundred years thanks to its ballistic performance and a relatively small propellant charge. As a result, the bullet is relatively slow but is highly accurate and provides considerable stopping power.</text>
  </string>

Here all entries have the same tag, but differ on the id attribute, so uncheck the "Sort by tag name" option, select the "Sort by specific attributes" option and click on the ... box to get this:

[Screenshot: Sort by Attribute]


The only attribute in this file is called "id", but if there were more they would all be listed here. Now when you sort, the order will be on the values in the id attribute and not the tags themselves. Perfect for many STALKER language files.
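The same attribute-based ordering can be sketched in Python with the standard `xml.etree.ElementTree` module. This is only an illustration of the idea, not the tool's own implementation; the `string_table` root element and the shortened entry text are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def sort_children_by_attribute(parent, attribute="id"):
    """Reorder the direct children of parent by the value of one attribute."""
    # Children missing the attribute sort first, under an empty key.
    parent[:] = sorted(parent, key=lambda child: child.get(attribute, ""))

# Two entries in the wrong order; the root element name is assumed.
source = """<string_table>
  <string id="ammo-11.43x23-fmj_descr"><text>Description</text></string>
  <string id="ammo-11.43x23-fmj"><text>.45ACP rounds</text></string>
</string_table>"""

root = ET.fromstring(source)
sort_children_by_attribute(root)
ids = [child.get("id") for child in root]

print(ids)
```

To save the result, `ET.ElementTree(root).write(...)` with an output path of your choice should work; check that you pass the same encoding the original file declares.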

Sorting before comparison/merging can be a very useful tool, one that can save you a lot of time. Hope this was useful.

Comments
Hegel

Very useful. Thanks for this.

epic40k

This is great thank you! :)

alanberserkrose

I use diffchecker.
