See more: Perl HTML Writing CSV
The following Perl script currently reads in an HTML file and strips out what I don't need. It also opens a CSV document, which is blank.
My problem is that I want to import the stripped-down results into the CSV's three fields, using Name as field 1, Lives in as field 2 and Commented as field 3.
The results are displayed in the cmd prompt but not in the CSV.
 
use warnings; 
use strict;  
use DBI;
use HTML::TreeBuilder;  
use Text::CSV;
open (FILE, 'punter.htm'); 
#open (my $fh, ">punter.csv") || die "couldn't open the file!";
 
 my $csv = Text::CSV->new (); 
 
$csv->column_names('field1', 'field2', 'field3'); 
open my $fh, ">", "punter.csv" or die "new.csv $!"; 
while ( my $l = $csv->getline_hr(my $fh)) { 
    next if ($l->{'field1'} =~ /xxx/); 
    printf "Field1: %s Field2: %s Field3: %s\n", $l->{'field1'}, $l->{'field2'}, $1->{'field3'}; 
$csv->print (my $fh, [my $name, my $location, my $comment]);
} 
close my $fh1 or die "$!"; 
my $tree = HTML::TreeBuilder->new_from_content(     do { local $/; <FILE> } ); 
 
for ( $tree->look_down( 'class' => 'postbody' ) ) 
{     
my $location = $_->look_down( 'class' => 'posthilit' )->as_trimmed_text;     
my $comment  = $_->look_down( 'class' => 'content' )->as_trimmed_text;     my $name     = $_->look_down( '_tag'  => 'h3' )->as_text;     
$name =~ s/^Re:\s*//;     
$name =~ s/\s*$location\s*$//;      
print "Name: $name\nLives in: $location\nCommented: $comment\n"; } 
 
An example of the html is -
<div class="postbody"> <h3><a href "foo">Re: John Smith <span class="posthilit">England</span></a></h3> <div class="content">Is C# better than Visula Basic?</div> </div>
 
How can I get the results into a CSV?
Posted 7-Jul-11 5:40am

Solution 2

I believe your error is rooted in not really understanding the meaning of my. You are using it all over the place, but when you do that you are creating a new variable in the enclosing block. You should really go back and check every instance of my to see if that is what you really want to do.
 
Specifically in (but not limited to) the line:
$csv->print (my $fh, [my $name, my $location, my $comment]);
 
You are:
  • creating a new variable $fh (masking the $fh from your open) where print is expecting you to give it an IO handle
  • creating three new variables $name, $location, $comment where print is expecting to get an arrayref
and note that none of these new variables have values, so no wonder nothing is being printed. The only reason your close is not giving you a warning is that you mistyped $fh as $fh1. Just fix that and you should see the warning "my" variable $fh masks earlier declaration in same scope.
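
To make the masking effect concrete, here is a minimal sketch using only core Perl. The describe helper and the sample value are hypothetical, made up purely for illustration; the point is only that writing my inside an argument list declares a fresh, undefined variable instead of passing the one you already have.

```perl
use strict;
use warnings;

# Hypothetical helper: reports what it was actually handed.
sub describe {
    my ($value) = @_;
    return defined $value ? "got '$value'" : 'got undef';
}

my $name = 'John Smith';    # the value we mean to pass along

my ($reused, $redeclared);
{
    $reused     = describe($name);      # refers to the existing $name
    $redeclared = describe(my $name);   # declares a NEW, empty $name in this block
}

print "without my: $reused\n";
print "with my:    $redeclared\n";
```

The first call sees 'John Smith'; the second sees undef, which is what happens to $fh, $name, $location and $comment in the line quoted above.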
 
The CSV section would be better as something like this (untested):
my @names = qw(name location comment);
$csv->column_names(@names); 
open my $fh, ">", "punter.csv" or die "punter.csv: $!"; 
while ( my $l = $csv->getline_hr($fh)) { 
    next if ($l->{'name'} =~ /xxx/); 
    for(@names) { print "$_: ",$l->{$_} }
    $csv->print ($fh, [ @{$l}{@names} ]);   # print() wants an arrayref, so slice the hash
} 
close $fh or die "$!"; 
 
This takes advantage of the column naming feature also, and should cope with any number of fields. By the way, I'd suggest never using $l as a variable name in Perl as it looks too much like $1 in many fonts, which of course has a special regex meaning.
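
For the writing side on its own, a minimal sketch might look like the following. Note that getline_hr reads rows from a handle, so for output only print is needed. The @records array is a hypothetical stand-in for the values pulled out in the look_down loop, and the second record is invented sample data; in the real script the print call would sit inside that loop, with the open before it and the close after it.

```perl
use strict;
use warnings;
use Text::CSV;

# Stand-in for the values extracted in the look_down loop (hypothetical sample data).
my @records = (
    { name => 'John Smith', location => 'England',    comment => 'Is C# better than Visual Basic?' },
    { name => 'Jane Doe',   location => 'Dallas, TX', comment => 'It depends on the project.' },
);

my @names = qw(name location comment);

# binary => 1 allows non-ASCII text; eol => "\n" makes print() terminate each row.
my $csv = Text::CSV->new({ binary => 1, eol => "\n" })
    or die 'Cannot use Text::CSV: ' . Text::CSV->error_diag;

my $file = 'punter.csv';
open my $fh, '>', $file or die "$file: $!";    # declared once, reused below without "my"

$csv->print($fh, \@names);                     # header row
for my $rec (@records) {
    $csv->print($fh, [ @{$rec}{@names} ]);     # hash slice keeps the column order
}

close $fh or die "$file: $!";
```

Opening the handle once and passing the same $fh to every print call is the design choice that matters here; the column list in @names drives both the header and the field order.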
Comments
LamboLambo at 20-Jul-11 11:07am
   
Great stuff, works a treat. Thanks for the input about the usage of 'my'; I understand how it can conflict if used too often.

Solution 1

If you just want a plain implementation, use a single print statement to a file. A CSV file is nothing but fields separated by commas.
open (MYFILE, ">>$tempFile") or die "$tempFile: $!";
print MYFILE  "field1,field2,field3\n";
 

If you want to use Text::CSV, then I don't think you should use "my" in "my $fh" again here:
while ( my $l = $csv->getline_hr(my $fh)) {
Comments
Member 4749791 at 19-Jul-11 12:53pm
   
While I do sometimes use simple prints for CSVs too, it is a bad habit and will break easily. In this case, the second and third fields in particular could contain commas (e.g. location of "Dallas, TX"), and Text::CSV will quote that properly to avoid issues when reading it later.
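
A small sketch of the difference described in this comment, assuming Text::CSV is installed. The field values are invented sample data; the comparison simply counts how many fields come back after a plain comma join versus after Text::CSV's combine and parse.

```perl
use strict;
use warnings;
use Text::CSV;

my @fields = ('John Smith', 'Dallas, TX', 'Nice post');

# Naive approach: join with commas, no quoting.
my $naive      = join ',', @fields;
my @naive_back = split /,/, $naive;    # the location is cut in two

# Text::CSV approach: combine() quotes fields that need it.
my $csv = Text::CSV->new({ binary => 1 })
    or die 'Cannot use Text::CSV: ' . Text::CSV->error_diag;
$csv->combine(@fields) or die 'combine failed';
my $quoted = $csv->string;

$csv->parse($quoted) or die 'parse failed';
my @quoted_back = $csv->fields;        # the original three fields again

printf "naive:     %d fields after reading back\n", scalar @naive_back;
printf "Text::CSV: %d fields after reading back\n", scalar @quoted_back;
```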
harish85 at 20-Jul-11 19:42pm
   
Thanks. Yes, what I said was that if the OP (after seeing the redeclaration of variables with my everywhere) just requires a plain implementation, they can print to the CSV file directly. But I don't consider it a bad habit not to use "Text::CSV"; you can customize the writing with your own implementation too.
You have explained it very well in your post. Neat work, My 5!
Thanks,-Harish

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Last Updated 19 Jul 2011
Copyright © CodeProject, 1999-2014
All Rights Reserved.