I've been using HTML::SimpleLinkExtor to extract links from this page: http://cpc.cs.qub.ac.uk/authorIndex/AUTHOR_index.html It works great for everything except links that contain the character 'Ç'. In those links the character is changed to %C7, so when I use the link in the rest of my program I get a 404 error. Here's my code:
#!/usr/bin/perl
use strict;
use warnings;
use HTML::SimpleLinkExtor;
use Time::HiRes qw(sleep);
use Test::WWW::Selenium;
use Test::More "no_plan";    # alternatively: tests => 37
#use Test::Exception;
Test::More->builder->output ('result.txt');
Test::More->builder->failure_output ('errors.txt');
my $base = "http://cpc.cs.qub.ac.uk/authorIndex/AUTHOR_index.html";
my $sel = Test::WWW::Selenium->new(
    host        => "localhost",
    port        => 4444,
    browser     => "*firefox",
    browser_url => "http://cpc.cs.qub.ac.uk/",
);
################################################
my $extor = HTML::SimpleLinkExtor->new($base);
$extor->parse_url($base);
my @all_links = $extor->a;
################################################
$sel->start();
$sel->open_ok($base);
$sel->open_ok($_) foreach (@all_links);
$sel->stop();
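To illustrate the encoding I'm seeing, here is a minimal sketch. It assumes URI::Escape is installed (it is not part of my script above), and my guess that the character is being escaped as a single Latin-1 byte rather than as UTF-8 is only an assumption:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_escape_utf8 uri_unescape);

# 'Ç' is code point U+00C7; written as an escape so the encoding of
# this source file does not matter.
my $c_cedilla = "\x{C7}";

# Escaped as a single Latin-1 byte: the form that shows up in my links.
my $latin1 = uri_escape($c_cedilla);

# Escaped as UTF-8 bytes: a different encoding of the same character.
my $utf8 = uri_escape_utf8($c_cedilla);

# Unescaping the Latin-1 form gives back the single byte 0xC7.
my $decoded = uri_unescape($latin1);

print "Latin-1 escape: $latin1\n";
print "UTF-8 escape:   $utf8\n";
printf "Decoded byte:   0x%02X\n", ord($decoded);
```

So the same character can appear in a URL as either %C7 or %C3%87 depending on the encoding used, and I don't know which form the server expects.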
Also, does anyone have ideas on how I could use the click() function with the extracted links?
Thanks