Wednesday, July 1, 2015

Invisible differences in XML files - breaking self-made XML parser

I have two programs that interact with a well-defined XML file. The first program (Model) reads it in, parses it, and uses content from the file to direct the running of a model. The second program (Controller) opens and rewrites the XML file, allowing different settings to be run in the Model.

Model is written in C++, developed in VS2010 and VS2012, has no GUI, and uses a home-made (is this the correct term?) XML parser that has worked for many years without fail. I just checked the SVN history for the files that make it up, and nothing has changed since 2013. Controller is written in C#, in VS2012, with a GUI whose drop-downs set the content of the XML file, and uses the XmlDocument class to read in, edit, and write out the XML file.

Suddenly, the Controller no longer produces XML files that Model can read. When Model tries to read the XML file, it reads the first character it encounters as '-17'. As far as I have been able to tell, this means that it doesn't recognize it as a UTF-8 character. This causes Model to cout the error and then crash. An older XML file (which looks identical to the ones written by Controller) reads in fine.
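A note on the '-17' value: when char is signed, -17 is the byte 0xEF, which happens to be the first byte of the three-byte UTF-8 byte order mark (EF BB BF). Whether the Controller's output really begins with a BOM is an assumption that has not been confirmed here. The sketch below shows how a parser could test for and skip one; the function names are hypothetical and are not part of the Model's parser.

```cpp
#include <string>

// Hypothetical helper: byte value (0..255) of a char, whether or not
// plain char is signed on this platform.
int byteValue(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: true when the buffer begins with the UTF-8
// byte order mark EF BB BF.
bool startsWithUtf8Bom(const std::string& data) {
    return data.size() >= 3 &&
           static_cast<unsigned char>(data[0]) == 0xEF &&
           static_cast<unsigned char>(data[1]) == 0xBB &&
           static_cast<unsigned char>(data[2]) == 0xBF;
}

// Hypothetical helper: returns the data without a leading BOM, if any.
std::string stripUtf8Bom(const std::string& data) {
    return startsWithUtf8Bom(data) ? data.substr(3) : data;
}
```

If the file contents are read into a std::string before parsing, passing them through stripUtf8Bom first would leave BOM-free files untouched and remove the mark from files that have one.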

Below are examples of the files - ignore the content inside the elements please.

Older file:

<?xml version="1.0" encoding="UTF-8" ?> 
<Config>
<Mode value="false" Id="Modeflag" />
<Timestep OutputTimestep="Hourly"  CalibrationTimestep="Daily" />
<InitialInput SubCatchmentNumber="1" ModelCalibration="true" SnowSimulation="false" VegSimulation="Method 1" CatchmentNumber="1" FractionalCatchmentArea="1" />
<InputResource Name="All" Location="C:\Hydro_Code\Hydro_Test_Files\Inputs\V5HarborBrookWeek_Rain" Id="Directory" />
<SimulationScheme SchemeForCatchmentNo="8" Infiltration="true" ChannelRouting="false" Saturation="true" TopographicIndex="true" KDecayWithSoilDepthExp="false" SoilTopoIndex="false" KDecayInPower="true" />
<SnowInput InputCatchmentNumber="1" TempIndexMethod_Hourly="false" RadiationTempIndex_With_SnowInterception="true" EnergyBudgetMethod_With_SnowInterception="false" />
<SnowInputResource Name="All" Location="C:\Hydro_Code\Hydro_Test_Files\Inputs\V5HarborBrookWeek_Rain" Id="SnowDirectory" />
<OutputDirectory Location="C:\Hydro_Code\Hydro_Test_Files\Inputs\V5HarborBrookWeek_Rain\Outputs" Name="Toronto_Output" />
</Config>

Newer file:

<?xml version="1.0" encoding="UTF-8" ?>
<Config>
  <Mode value="false" Id="Modeflag" />
  <Timestep OutputTimestep="Hourly" CalibrationTimestep="Hourly" />
  <InitialInput SubCatchmentNumber="1" ModelCalibration="true" SnowSimulation="false" VegSimulation="Method 1" CatchmentNumber="1" FractionalCatchmentArea="1" />
  <InputResource Name="All" Location="C:\AutoRun_Newest\AutoRun" Id="Directory" />
  <SimulationScheme SchemeForCatchmentNo="8" Infiltration="true" ChannelRouting="false" Saturation="true" TopographicIndex="true" KDecayWithSoilDepthExp="false" SoilTopoIndex="false" KDecayInPower="true" />
  <SnowInput InputCatchmentNumber="1" TempIndexMethod_Hourly="false" RadiationTempIndex_With_SnowInterception="true" EnergyBudgetMethod_With_SnowInterception="false" />
  <SnowInputResource Name="All" Location="C:\AutoRun_Newest\AutoRun" Id="SnowDirectory" />
  <OutputDirectory Location="C:\AutoRun_Newest\Inputs\Output_Timestamp_07012015215112" Name="Toronto_Output" />
</Config>

Adding or taking away the indentation (proper formatting by the XmlDocument class in C#) changes nothing about the behavior of Model.

These files are visually identical, and I can see no odd characters or spacing. What invisible objects/forces/characters or other settings could be causing this new bug?
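Since the differences are not visible in an editor, one way to look for them is to print the first few raw bytes of each file in hexadecimal and compare. The following is a minimal sketch; readPrefix and hexPrefix are hypothetical helper names, and the file paths would need to be supplied by the caller.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: formats up to `count` leading bytes as
// space-separated upper-case hex pairs, e.g. "3C 3F 78".
std::string hexPrefix(const std::string& data, std::size_t count) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < data.size() && i < count; ++i) {
        const unsigned char byte = static_cast<unsigned char>(data[i]);
        if (!out.empty()) out += ' ';
        out += digits[byte >> 4];   // high nibble
        out += digits[byte & 0x0F]; // low nibble
    }
    return out;
}

// Hypothetical helper: reads up to `count` bytes from the start of a
// file in binary mode. Returns an empty string if the file cannot be read.
std::string readPrefix(const std::string& path, std::size_t count) {
    std::ifstream in(path.c_str(), std::ios::binary);
    std::string data(count, '\0');
    in.read(&data[0], static_cast<std::streamsize>(count));
    data.resize(static_cast<std::size_t>(in.gcount()));
    return data;
}
```

Calling hexPrefix(readPrefix(path, 8), 8) on the older and the newer file should show whether they start with the same bytes. A file that begins directly with the XML declaration starts with 3C 3F ("<?").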

Is there some background encoding that the XmlDocument class enforces and that my home-made parser has never had to handle before?
