This must be a dumb question, but I'm a bit stuck:
I have an XML file; here is a sample:
<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE tmx SYSTEM "56.dtd">
<body>
<tu changedate="20130625T175037Z">
<tuv xml:lang="pt-pt">
<prop type="x-context-pre"><seg>Some text.</seg></prop>
<prop type="x-context-post"><seg>Other text.</seg></prop>
<seg>The text I'm interested.</seg>
</tuv>
<tuv xml:lang="it">
<seg>And it's translation in italian.</seg>
</tuv>
</tu>
.... followed by other <tu>'s
</body>
Since it's a huge file, I'm using XML::Twig to parse it and extract the parts I'm interested in. I'm particularly interested in the text content of the seg nodes, as well as the changedate attribute of each tu node.
Here's the code I've got so far:
use 5.010;
use strict;
use warnings;
use XML::Twig;
my $filename = 'filename.tmx';
my $out_filename = 'out.xml';
open my $out, '>', $out_filename or die "Cannot open $out_filename: $!";
binmode $out;
my $original_twig = XML::Twig->new(pretty_print => 'nsgmls', twig_handlers => {tu => \&original_tu});
$original_twig->parsefile($filename);
sub original_tu {
    my ($twig, $original_tu) = @_;
    my $original_seg = $original_tu->first_child('./tuv/seg')->text;
}
Perl (or should I say XML::Twig) tells me that I've got: wrong navigation condition './tuv/seg' ()
Does anyone know how to access the seg node's text and, if you're not fed up with me already, how to access the changedate attribute of the tu node?
Thank you very much.
Dasen