Converting IBM XML Extender to the Oracle XML port.

Recently, I was given the challenge of converting data from IBM Extensible Markup Language (“XML”) Extender to the Oracle XML port. The application had to convert 5 million records in 10 minutes without running out of memory. The client's existing program, written in Perl, took 5 hours to process 5 million records. The goal was to replace it with a more efficient application and improve the client's productivity.

I was given an offset file, which listed the number of characters to read for each record from an input data file (IBM XML). The application had to create an output file containing each offset value together with the corresponding input XML data, so that the data could be imported into Oracle.

For example, a sample offset file may look as follows:

0003888
0000582
0000073
0000074
0000721

The 3888 in the first row indicates that 3888 characters must be read from the input data file (IBM XML) and written to the output file along with the offset value. Next, the application reads the following 582 characters from the input file and writes that offset value and its data to the output file, and so on.
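
To make the format concrete, here is a minimal in-memory sketch of the pairing described above. It is only an illustration, not the production converter: the class name `OffsetFormatDemo`, the method `pairOffsetsWithData`, and the tiny sample data are all made up for this example, and it assumes each offset line holds a zero-padded decimal character count.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: the class, method, and sample data are hypothetical.
public class OffsetFormatDemo {

    // Cuts `data` into records whose lengths are listed one per line in `offsets`
    // and prefixes each record with its offset line.
    public static List<String> pairOffsetsWithData(String offsets, String data) {
        List<String> records = new ArrayList<>();
        int position = 0;
        for (String line : offsets.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            int len = Integer.parseInt(line.trim()); // "0000004" parses as 4
            int end = Math.min(position + len, data.length());
            records.add(line + data.substring(position, end));
            position = end;
        }
        return records;
    }

    public static void main(String[] args) {
        String offsets = "0000004\n0000004\n0000005";
        String data = "<a/><b/><cc/>";
        for (String record : pairOffsetsWithData(offsets, data)) {
            System.out.println(record);
        }
        // prints:
        // 0000004<a/>
        // 0000004<b/>
        // 0000005<cc/>
    }
}
```

The demo prints one record per line for readability; whether the real output file needs a separator between records depends on what the Oracle import expects, which the offset file alone does not tell us.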

The application was developed in Java. Not only were we able to convert 5 million records within 10 minutes, but we could also convert more than one file in parallel, as long as the input and output file names were unique. The following Java code was used to convert the data:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// The class name is illustrative; the file names are the ones used in the original program.
public class OffsetConverter {
    public static void main(String[] args) {
        // try-with-resources closes all three files, even if an exception is thrown
        try (BufferedReader offsetFile = new BufferedReader(new FileReader("Offset.txt"));
             BufferedReader dataFile = new BufferedReader(new FileReader("input_data_file.txt"));
             BufferedWriter outFile = new BufferedWriter(new FileWriter("Output_data_file.txt"))) {

            String s;
            // Each line of the offset file is the character count of the next record
            while ((s = offsetFile.readLine()) != null) {
                int len = Integer.parseInt(s.trim());
                // Write the offset value, then the record it describes
                outFile.write(s);
                if (len <= 0) {
                    continue; // nothing to read for this offset
                }
                char[] cbuf = new char[len];
                // read() may return fewer characters than requested,
                // so keep reading until the whole record is in the buffer
                int total = 0;
                while (total < len) {
                    int n = dataFile.read(cbuf, total, len - total);
                    if (n == -1) {
                        break; // input file ended before the record was complete
                    }
                    total += n;
                }
                outFile.write(cbuf, 0, total);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
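
The parallel conversion mentioned earlier could be sketched roughly as follows. This is a hedged illustration rather than the code that was actually deployed: the class `ParallelConversionSketch`, the `Job` holder, and the `convertAll` method are hypothetical names, and the sketch assumes the files are UTF-8 (the default for `Files.newBufferedReader`), whereas the single-file version uses the platform default charset. Each job gets its own offset, input, and output file, which is why the file names must be unique.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: all names here are illustrative.
public class ParallelConversionSketch {

    // One conversion job with its own offset, input, and output file.
    public static final class Job {
        final Path offsets;
        final Path input;
        final Path output;

        public Job(Path offsets, Path input, Path output) {
            this.offsets = offsets;
            this.input = input;
            this.output = output;
        }
    }

    // Same record-by-record copy as the single-file converter.
    static void convert(Job job) throws IOException {
        try (BufferedReader offsetFile = Files.newBufferedReader(job.offsets);
             BufferedReader dataFile = Files.newBufferedReader(job.input);
             BufferedWriter outFile = Files.newBufferedWriter(job.output)) {
            String s;
            while ((s = offsetFile.readLine()) != null) {
                int len = Integer.parseInt(s.trim());
                outFile.write(s);
                if (len <= 0) {
                    continue;
                }
                char[] cbuf = new char[len];
                int total = 0;
                while (total < len) {
                    int n = dataFile.read(cbuf, total, len - total);
                    if (n == -1) {
                        break; // input ended early
                    }
                    total += n;
                }
                outFile.write(cbuf, 0, total);
            }
        }
    }

    // Runs every job on a fixed-size thread pool and waits for all of them.
    public static void convertAll(List<Job> jobs, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (Job job : jobs) {
                Callable<Void> task = () -> {
                    convert(job);
                    return null;
                };
                pending.add(pool.submit(task));
            }
            for (Future<Void> f : pending) {
                f.get(); // rethrows any failure from the worker thread
            }
        } finally {
            pool.shutdown();
        }
    }

    // Usage: java ParallelConversionSketch off1 in1 out1 off2 in2 out2 ...
    public static void main(String[] args) throws Exception {
        List<Job> jobs = new ArrayList<>();
        for (int i = 0; i + 2 < args.length; i += 3) {
            jobs.add(new Job(Paths.get(args[i]), Paths.get(args[i + 1]), Paths.get(args[i + 2])));
        }
        convertAll(jobs, 4); // pool size of 4 is an arbitrary choice
    }
}
```

Because each job only ever holds one record in memory at a time, the memory footprint grows with the number of threads and the largest record, not with the size of the files.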

Vazi Okhandiar, PMP, MBA, MCT, MSCS
IT Trainer and Consultant
NR Computer Learning Center
www.nrclc.com
