
[BUG] A Bazillion bugs in blueprint.xml

Posted: Tue Oct 02, 2012 12:33 am
by tazardar
Hi,

I'm developing a ship editor for FTL and am currently trying to read the "xml" file. There are literally a hundred errors in there that prevent it from being recognized as proper XML. Here is a complete list of my findings:

1. A comment before the declaration. This is not valid.

2. In the systemBlueprints, the <title> tag is wrongly closed with a </type>. Occurs multiple times.

3. Comments start with "<!--------". This is not valid. Comments cannot contain "--" and cannot end with a "-". Occurs MULTIPLE times.

4. Somewhere the <speed> tag is wrongly closed with a </image> tag. Occurs multiple times.

5. Somewhere the <title> tag is wrongly closed with a </ship> tag. Occurs multiple times.

6. HUGE chunks of "-" in the comments, in VARYING AMOUNTS. This took forever to fix.

7. A <tooltip> tag is wrongly closed with a </desc>-->. That's not even a tag. Occurs only once, IIRC in the "failsafe" ship.

8. A <shields ...> tag is closed with a </slot> tag after the regular slots.

9. A space " " is missing between some start="false" and img=... Same thing with start="true". Occurs multiple times.

10. A comment is opened in the middle of the file but never closed. "<!-- sardonyx"

11. Some <shipBlueprint> tags are closed with a </ship> tag.

Also, in every <shipname>.xml, the <gib1> tag is closed with </gib2>

Took me a few hours to find everything. There is probably more in the event_x.xml and autoBlueprint.xml.

I already wrote some code that fixes blueprint.xml and <shipname>.xml, which would serve as a temporary fix.
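For anyone who wants to try something similar, here is a minimal Python sketch of that kind of pre-processing. It is not the code mentioned above; the function names (`clean_comments`, `fix_missing_spaces`, `hoist_declaration`, `repair`) and the sample document are made up for illustration, and it only covers findings 1, 3, 6 and 9. Unclosed comments and mismatched tags are left alone.

```python
import re
import xml.etree.ElementTree as ET

COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
# A closing attribute quote followed directly by the next attribute name.
MISSING_SPACE = re.compile(r'(="[^"<>]*")(?=[A-Za-z_])')
DECLARATION = re.compile(r"<\?xml\s[^>]*\?>")

def clean_comments(text):
    """Collapse dash runs inside comments and trim dashes at both ends."""
    def fix(match):
        body = re.sub(r"-{2,}", "-", match.group(1)).strip("-")
        return "<!--" + body + "-->"
    return COMMENT.sub(fix, text)

def fix_missing_spaces(text):
    """Turn start="true"img="x" into start="true" img="x"."""
    return MISSING_SPACE.sub(r"\1 ", text)

def hoist_declaration(text):
    """Move the <?xml ...?> declaration to the very start of the text."""
    match = DECLARATION.search(text)
    if match and match.start() > 0:
        text = match.group(0) + "\n" + text[:match.start()] + text[match.end():]
    return text

def repair(text):
    return hoist_declaration(fix_missing_spaces(clean_comments(text)))

# Invented sample that imitates the reported problems (not real game data).
broken = (
    '<!-- Copyright notice -->\n'
    '<?xml version="1.0"?>\n'
    '<blueprints>\n'
    '<!-------- SYSTEMS ---------->\n'
    '<systemBlueprint name="pilot" start="true"img="room_pilot"/>\n'
    '</blueprints>\n'
)

repaired = repair(broken)
root = ET.fromstring(repaired)  # a strict parser now accepts the text
print(root[0].attrib)
```

The regular expressions work on the raw text, so an attribute value that happens to contain something like `="x"y` would also be touched; that seemed an acceptable risk for a temporary fix.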

Greets, TAz

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Wed Oct 03, 2012 11:18 pm
by wayward
Good work, TAz! I suggest you fix those files, test them out, and send them to the developers. Since you've already done most of the heavy lifting, it would be nice not to make them do it all over again :)

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Thu Oct 04, 2012 2:34 am
by Kaerius
Not all of those are actual bugs. In fact, several are likely just weird coding practice, and may be engine-related (the game expects them that way). The <!---> comment ones are unlikely to cause problems; I've seen that done the same way elsewhere (Dungeons of Dredmor, specifically).

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Thu Oct 04, 2012 6:56 pm
by Icehawk78
Kaerius wrote: Not all of those are actual bugs. In fact, several are likely just weird coding practice, and may be engine-related (the game expects them that way). The <!---> comment ones are unlikely to cause problems; I've seen that done the same way elsewhere (Dungeons of Dredmor, specifically).

They may not be in-game bugs, because the XML parser the game uses isn't W3C compliant (and is thus able to ignore "quirks" like these). That doesn't make the XML itself valid, and it makes modding quite a bit more difficult to do the way they'd prefer (i.e. distributing as few actual game assets as possible): without valid XML that external parsers can read, mod-makers will be forced to distribute entire events/chunks of shipBlueprint XML, etc.

If Justin or Matthew are interested, I already have a Ruby script that I can run across the entire base data.dat folder of XML files and that will "repair" a large number of these errors, and I can send a "repaired" file. (I didn't notice the mismatched start/end tags, so I can't guarantee those will be fixed, but running it on boss_1.xml appears to have fixed that file for me.) Since that would obviously be a massive resource redistribution, I 1) don't want to even bother running it until I hear that they're interested, and 2) don't want to post the end product publicly.

If you are, let me know, and I can pass along the script. (For that matter, I can post the script itself right now if you'd rather see what it's doing and run it on your own stuff, rather than just accepting the post-processed data.dat.)

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Thu Oct 04, 2012 8:13 pm
by Hissatsu
I've had some experience with the XML format, and it is a weird format because, for some reason, it has zealous parsers that will refuse to parse a file even if it is parseable (for example, nothing prevents parsing <!----- ---> but since it violates schema, the parser will spit an error and stop instead of continuing). This is ridiculous because obviously, whenever human input is involved (and XML is by design a human-readable text format, thus human-editable in a basic text editor as well), such a strict form of parsing should not be used - only fatal errors should stop parsing.

What this means is that convenient, popular XML parsers will spit errors, while custom parsers built for a specific project will often ignore them. So this is not the only game with XML files that are not schema-compliant.

I was told to use something like XmlTidy (a library originally for fixing HTML document syntax, but it can fix XML as well) to fix the document first before trying to open it with a parser. So yeah, if you want to be able to parse those bad files right now, this is probably the better direction. However, I don't think it can fix unmatching tags...
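On the unmatching tags: when the closing tag is merely misnamed (as in the <title>...</type> case from the first post), a small stack-based pass can repair it. Below is a rough Python sketch, assuming every closing tag is present and only its name is wrong. It has nothing to do with XmlTidy; `fix_closing_tags` and the sample document are invented for illustration, and attribute values containing ">" are not handled.

```python
import re
import xml.etree.ElementTree as ET

TOKEN = re.compile(
    r"<!--.*?-->"                       # comments are matched so they get skipped
    r"|</\s*(?P<close>[\w:.-]+)\s*>"    # closing tag
    r"|<(?P<open>[A-Za-z_][\w:.-]*)(?:\s[^<>]*?)?(?P<selfclose>/)?>",  # opening tag
    re.DOTALL,
)

def fix_closing_tags(text):
    """Rename each closing tag to match the most recently opened element."""
    stack = []

    def replace(match):
        if match.group("open"):
            # Self-closing elements never need a closing tag.
            if not match.group("selfclose"):
                stack.append(match.group("open"))
            return match.group(0)
        if match.group("close"):
            if not stack:
                # Nothing is open: leave it for the real parser to report.
                return match.group(0)
            return "</" + stack.pop() + ">"
        return match.group(0)  # a comment, returned unchanged

    return TOKEN.sub(replace, text)

# Invented sample that imitates the reported mismatches (not real game data).
broken = (
    "<blueprints>"
    "<systemBlueprint name='pilot'><title>Piloting</type>"
    "<speed>3</image></systemBlueprint>"
    "<shipBlueprint name='kestrel'><shields power='2'/></ship>"
    "</blueprints>"
)

fixed = fix_closing_tags(broken)
root = ET.fromstring(fixed)
print(fixed)
```

The obvious weakness is that the pass trusts the opening tags completely: if a closing tag is missing or duplicated rather than misnamed, it will silently produce a wrong tree instead of an error.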

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Thu Oct 04, 2012 9:12 pm
by Icehawk78
Hissatsu wrote: I've had some experience with the XML format, and it is a weird format because, for some reason, it has zealous parsers that will refuse to parse a file even if it is parseable (for example, nothing prevents parsing <!----- ---> but since it violates schema, the parser will spit an error and stop instead of continuing). This is ridiculous because obviously, whenever human input is involved (and XML is by design a human-readable text format, thus human-editable in a basic text editor as well), such a strict form of parsing should not be used - only fatal errors should stop parsing.

What this means is that convenient, popular XML parsers will spit errors, while custom parsers built for a specific project will often ignore them. So this is not the only game with XML files that are not schema-compliant.

I was told to use something like XmlTidy (a library originally for fixing HTML document syntax, but it can fix XML as well) to fix the document first before trying to open it with a parser. So yeah, if you want to be able to parse those bad files right now, this is probably the better direction. However, I don't think it can fix unmatching tags...

Well, yes and no. XML is designed to be readable, but it does have a very strict syntax that is often ignored. Thus, while it is made to be readable, that doesn't mean it's made to be abused.

However, regardless of your feelings on how XML parsers "should" or "shouldn't" work, the simple fact of the matter is that a majority of them don't work with the files in the game, even though the game itself reads them fine. Additionally, the game itself doesn't have a custom XML parser of any sort - it's simply using TinyXML, which is built solely for speed in C++ and, as a result, has some non-standard "quirks" in how it handles code.

Also, I don't think anyone was saying that it's just FTL that has non-compliant XML, or even that there's anything inherently wrong with that. The problem only arises because XML modding seems to be the big thing of the day that everyone wants to play with, and since there are no built-in modding tools (and I don't think there need to be, either - the community seems perfectly fine developing our own), having W3C compliant XML for the default game files would make working with them much easier and more robust.

I've never used XmlTidy, but the Ruby script I'd mentioned does essentially what you suggested - it fixes common problems in the raw strings before passing them to the XML parser. However, ideally it would be better (not to mention faster) if that didn't need to be done for all of the base game files.

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Thu Oct 04, 2012 10:22 pm
by boa13
Icehawk78 wrote:(for example, nothing prevents parsing <!----- ---> but since it violates schema, parser will spit error and stop instead of continuing). This is ridiculous because obviously (...)

No, it is not "obvious" or "ridiculous"; it is the whole purpose of XML, and the reason for its success in business settings. An invalid file is invalid: there is no gray area and no temptation to reinvent that wheel. And it doesn't "violate schema", by the way, but the core XML specification. Sure, the text file could still be parsed (and actually is parsed by the game), but the fact that workarounds, converters, etc. are being discussed in this thread shows the utility of a strict standard.
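To see that these are well-formedness errors rather than schema errors, it is enough to feed a few snippets to a conforming parser with no schema involved at all. A small sketch using Python's standard library (the helper name `well_formedness_error` and the snippets are my own, modelled on the problems listed in the first post):

```python
import xml.etree.ElementTree as ET

def well_formedness_error(text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None

samples = {
    "plain comment": "<a><!-- sardonyx --></a>",
    "dash-heavy comment": "<a><!----- ---></a>",
    "mismatched tags": "<a><title>x</type></a>",
    "comment before declaration": '<!-- c --><?xml version="1.0"?><a/>',
}

for label, text in samples.items():
    print(label, "->", well_formedness_error(text) or "well-formed")
```

Only the first snippet is accepted. No DTD or schema is loaded anywhere, so the other three are rejected purely on the rules of the XML specification itself.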

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Thu Oct 04, 2012 11:47 pm
by Derakon
Let me put it this way: the reason why web development is such a colossal pain in the neck today is because 10-15 years ago web browsers were accepting HTML that violated the schema (using invalid syntax or not-technically-valid tags).

Partly this was because the standards weren't evolving fast enough to meet current needs, and partly this was because allowing noncompliant HTML to render made it possible for people who didn't really know what they were doing to make webpages. Arguably the latter reason is a misfeature -- if you want to make webpages, you should learn to do it properly rather than make a mess that will explode on you down the road; the former is a non-issue for XML as it doesn't have rapidly changing requirements.

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Fri Oct 05, 2012 5:26 am
by Hissatsu
Well, I disagree; let's leave it at that. I just had the same problem with Artemis (great game btw!), which also parses XML without being strict. I was making an editor for that game, and since I used .NET, parsing was strict, and I couldn't parse anything that came from mission makers because pretty much every single file had violations of the standard.

I'm an IT specialist myself, and I understand all the advantages of strict standards, but I think computers are made to serve mankind, not the other way around. Therefore, no matter what, a computer must do all it can to parse a human command and serve, and only if it cannot should it stop and say "unable to comply". If I say <b>Bold<i>ItalicBold</b>Italic</i>, it is obvious what formatting I want, and there is no problem understanding that. If <!---- --> is parseable, the computer should parse it, and so on.

It's like imagining that on a Star Trek bridge, while the ship is in a critical situation, the onboard computer lectures the captain on proper English pronunciation instead of complying with his orders, even if those were given hastily (given the direness of the situation).

Re: [BUG] A Bazillion bugs in blueprint.xml

Posted: Fri Oct 05, 2012 4:04 pm
by Icehawk78
boa13 wrote:
Icehawk78 wrote:(for example, nothing prevents parsing <!----- ---> but since it violates schema, parser will spit error and stop instead of continuing). This is ridiculous because obviously (...)

For reference, that wasn't me - I quoted someone else who said that.