Question
I need to write a program that compares several large (100 MB+) XML files, finds the differences, and outputs JSON. I have written the comparison code in C++ and it performs well, but now that I have reached the part that works with the XML data and converts it to JSON, I keep hitting walls.
Would it make sense to port the XML comparison code to Node.js? It may run slightly slower, but it would make converting XML to JSON and modifying the contents a lot easier.
Any input on this would be appreciated.
Explanation / Answer
It depends on how often the task has to run. If it is a one-time job, or, say, a once-a-year job, do not worry about speed; reduce the complexity of your code instead (regardless of the language you plan to use).
Otherwise, if such XML files are created every day or every hour and require JSON conversion, I would even prefer C, since standard libraries might be troublesome.
Node.js is good for network applications; your problem, on the other hand, looks like a good old CLI application. For interpreted languages, you might want to try Ruby or Python.
A quick search might even turn up good libraries or open-source implementations.
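To give a rough idea of the interpreted-language route, here is a minimal Python sketch that uses only the standard library. It assumes a layout the question does not actually describe: each record is an `<item>` element with a unique `id` attribute and flat child elements. The names `load_records` and `diff_records` are made up for illustration, and a real file would need its own tag and key names.

```python
import io
import json
import xml.etree.ElementTree as ET


def load_records(source, tag="item", key="id"):
    """Stream an XML source and return {key attribute: {child tag: text}}.

    The <item>/id layout is an assumption; adjust tag and key to the real data.
    """
    records = {}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            records[elem.get(key)] = {child.tag: child.text for child in elem}
            elem.clear()  # drop the element's children once they are copied
    return records


def diff_records(old, new):
    """Return added, removed and changed records as a JSON-friendly dict."""
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": {k: old[k] for k in old.keys() - new.keys()},
        "changed": {
            k: {"old": old[k], "new": new[k]}
            for k in old.keys() & new.keys()
            if old[k] != new[k]
        },
    }


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the large files.
    old_xml = io.BytesIO(
        b'<items><item id="1"><name>a</name></item>'
        b'<item id="2"><name>b</name></item></items>'
    )
    new_xml = io.BytesIO(
        b'<items><item id="2"><name>c</name></item>'
        b'<item id="3"><name>d</name></item></items>'
    )
    result = diff_records(load_records(old_xml), load_records(new_xml))
    print(json.dumps(result, indent=2, sort_keys=True))
```

`iterparse` reads the file incrementally, so the raw XML tree does not have to be held in memory, although the extracted records still are. If that turns out to be too slow or too memory-hungry for files of this size, keeping the existing C++ comparison and only handing its output to a script for the JSON step is another option.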
Do not forget to sample your data for testing. Hope this helps.
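One possible way to get such a sample is to copy the first few records of a large file into a small one. The sketch below again assumes hypothetical `<item>` records under an `<items>` root, and `sample_records` is an illustrative name rather than an existing tool.

```python
import io
import xml.etree.ElementTree as ET


def sample_records(source, limit, tag="item", root_tag="items"):
    """Return a small XML string holding the first `limit` <tag> elements.

    The tag names are assumptions; adjust them to the real file layout.
    """
    sample_root = ET.Element(root_tag)
    for _event, elem in ET.iterparse(source, events=("end",)):
        if len(sample_root) >= limit:
            break  # enough records collected; stop reading the source
        if elem.tag == tag:
            sample_root.append(elem)
    return ET.tostring(sample_root, encoding="unicode")


if __name__ == "__main__":
    big_xml = io.BytesIO(b'<items><item id="1"/><item id="2"/><item id="3"/></items>')
    print(sample_records(big_xml, limit=2))
```

Taking the head of the file is the simplest choice, but it is not a random sample, so records that only appear later in the file will not be covered by tests built on it.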