Posted 5 Jul 2010

The Basics of Perl and Perl Programming

, 7 Jul 2010 CPOL
Programming Perl Basics - How It Works


Learning how to program can be difficult and is, perhaps, best done by reading well-written programs. Perl is a powerful programming language that is well known in its community for the development of security tools. Many programming languages are compiled ahead of time into an executable, but Perl, the Practical Extraction and Report Language, is run directly from source by the Perl interpreter. Perl has many distinctive characteristics. Anyone who has ever been involved with programming languages has dealt with variables. In many languages a variable must be declared with a data type before it is initialized (assigned a value); in Perl, a variable does not have to be declared before use unless the strict pragma is in effect.

Perl is a high-level language that supports object-oriented programming. A class is an abstraction that defines data members and the methods that work on them; an object is a concrete instance of a class. In Perl, a class is a package, and packages are usually distributed as reusable library files called modules. Perl is unusual in that, rather than declaring exactly what type of data we use (an integer, a character, a string of characters, a floating-point number), we just define the data as a scalar, an array, or an associative array (a hash). Perl has many built-in functions and mechanisms that empower it, and to use any of those (such as the built-in crypt function), you must first understand how these data types work. So we will go by example and see how much understanding we can gain. There is a plethora of downloadable Perl scripts at the script repository at

So How Do I Start?

The Perl interpreter can be downloaded at or, amongst others. Once it is installed, you can add it to the PATH environment variable by using this command (we assume a Win32, or Microsoft Windows, command prompt):

C:\Scripts\> set PATH=%PATH%;.;C:\Perl\bin 

Once the path is set, the command "perl", which is used to interpret a Perl script, can be used anywhere on the command line. Before beginning, however, we should define a few terms. Perl source can be written in 7-bit ASCII. ASCII defines 128 characters, each of which fits in 7 bits, although each character is normally stored in one 8-bit byte. Perl can also be written using Unicode, which covers far more characters than a single byte can represent. But what does that actually mean? Well, the Java language was designed around 16-bit characters at a time when the computing industry needed software that worked across many languages. Many characters, particularly those of Arabic and East Asian scripts, do not fit into a single byte. Unicode assigns every character a number, and an encoding such as UTF-8 stores that number in one to four bytes (plain ASCII characters still take one byte). In Perl, the utf8 pragma tells the interpreter that the script itself is written in UTF-8, and in most cases the conversion is automatic.
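To make the difference between characters and bytes concrete, here is a minimal sketch. It assumes the core Encode module is available, and it writes the accented letter as an escape (\x{e9}) so the example does not depend on how the script file is saved; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $word  = "caf\x{e9}";                      # "cafe" with an accented e: 4 characters
my $chars = length($word);                    # counts characters
my $bytes = length(encode('UTF-8', $word));   # counts bytes after UTF-8 encoding

binmode(STDOUT, ':encoding(UTF-8)');          # send output to the screen as UTF-8
print "$word has $chars characters and $bytes bytes in UTF-8\n";
```

The three ASCII letters take one byte each, while the accented letter takes two, which is why the byte count is larger than the character count.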

Perl ships with command-line tools such as perldoc and pod2html, which are used to read and convert POD (Plain Old Documentation) documents. For instance:

C:\> perldoc
will display the documentation embedded in a module, a file with the .pm extension.
pod2html will take a .pod document and convert it into an HTML web page.

A Perl Script Walkthrough

Here is an example of a Perl script. It will appear strange, because the techniques we are going to learn are not defined, but they will come into sharper focus:

print "What is your name? ";
chomp($name = <STDIN>); # Program waits for user input from keyboard
print "Hello, $name, do you want to learn Perl right now? ";
chomp($response = <STDIN>);
$response = lc($response); # response is converted to lowercase
if ($response eq "yes" or $response eq "y") {
    print "Good Decision! We'll start with this example.\n";
}
else {
    print "O.K. Try again later.\n";
}
$now = localtime; # Use a Perl function to get the date and time
print "$name, you ran this script on $now.\n";

We’ll name this script First, notice the .pl file extension. Now we run the script on the command line. We first notice that the script is asking for user input (making the script “interactive”):

What is your name? Joe
Hello, Joe, do you want to learn Perl right now? yes
Good Decision! We'll start with this example.
Joe, you ran this script on Mon Jul  5 19:44:09 2010.

Perl statements are terminated with a semicolon. The hash mark # begins a comment, which is ignored by the interpreter. We use the "print" function, but notice that the prompts contain no newline escape sequence. On a command line, a blinking cursor indicates where the next character will appear. If the cursor has reached the edge of the screen, or if the user has hit the Enter (Return) key, the cursor drops down one line and goes to the far left. This is called Carriage Return-Line Feed. As in C, the escape sequence "\n" inside a double-quoted string produces a newline:

print "Hello, what is your name?\n"; # prints "Hello, what is your name?" and moves the cursor to the next line

The prompts in the script leave out this escape sequence so that the user's answer appears on the same line as the question. Text read from the keyboard arrives with a newline at its end, which functions like chop() and chomp() remove.


Most programs start with three standard file handles: Standard Input, Standard Output, and Standard Error. A file handle is a name that a program uses to refer to an open file or device. Standard input (STDIN) is where the program reads its input, normally the keyboard. Standard output (STDOUT) is where the program writes its normal output, normally the console screen, although it can be redirected to a file, a printer, or another program. Standard error (STDERR) receives error messages and is beyond the scope of this topic.
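As a rough illustration of what a file handle is, the sketch below passes different handles to the same print statement. The helper name report is hypothetical, and the in-memory handle (opening a reference to a scalar) is used here only so the example needs no real file.

```perl
use strict;
use warnings;

# Hypothetical helper: write a message to whichever file handle is passed in.
sub report {
    my ($fh, $message) = @_;
    print {$fh} "$message\n";
}

report(\*STDOUT, "This line goes to standard output.");

# An in-memory file handle accepts output the same way a real file would.
my $buffer = '';
open(my $mem, '>', \$buffer) or die "Can't open in-memory handle: $!";
report($mem, "captured");
close($mem);

print "The buffer now holds: $buffer";
```

The point is that print does not care where the handle leads; the same statement writes to the screen or to a buffer depending on the handle it is given.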

Now consider this brief snippet:

print "Enter your name:";
$name = <>;
print "Enter your age: ";
$age = <>;
print "$name";
print "$age";

The output will be:

Enter your name:  Joe
Enter your age: 44
Joe
44

Now it is time to introduce scalars, one of the three data-types used in Perl.
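Before looking at scalars in detail, the three data types can be seen side by side in a minimal sketch. The values are borrowed from examples elsewhere in this article; the use strict line and the my declarations are additions that make Perl insist on declared variables, and are not required by the language.

```perl
use strict;
use warnings;

my $name   = "Joe";                               # scalar: a single value
my @grades = (90, 89, 78);                        # array: an ordered list of scalars
my %pet    = (Name => "Sneaky", Type => "Cat");   # hash: key/value pairs

print "Scalar: $name\n";
print "Array:  @grades (first element: $grades[0])\n";
print "Hash:   $pet{Name} is a $pet{Type}\n";
```

Notice that a single element of an array or hash is itself a scalar, which is why it is written with a $ sign.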


The snippet print "Hello, World\n"; will print Hello, World to the screen, or standard output. The word print calls the print function, which takes the text from within the quotes and displays it on the screen. As we have seen, \n stands for a newline. Now we introduce a scalar:

$myscalar = 'Hello, World\n';
print "$myscalar";  # prints the value of $myscalar

Scalars are declared by using the $ symbol followed by the variable name. The first line feeds the text within the quotes into the scalar, whose name is $myscalar. So why did I use single quotes in the first line and double quotes in the second line? Because Perl performs variable interpolation within double quotes—this means that it replaces the variable name with the value of the variable. This will become clear in the following examples:

$myscalar = 'Hello\n';
# variable $myscalar has the value Hello\n
print '$myscalar';
# because I used single quotes there is no variable interpolation.

The output will be:

$myscalar

The following is an example of variable interpolation:

$myscalar = 'Hello';
print "$myscalar";

The output will be:

Hello

This is the difference between single and double quotes. Getting user input is performed by using the diamond operator, <>. It basically grabs input from the user (standard input) to make a program more interactive. Consider the following example:

print 'Enter your name:';
$username = <>;  # the user will enter text, which will be fed into the scalar
print "Hi, $username";

The output will be:

Enter your name:  Dave
Hi, Dave

The program will print the text “Enter your name:” on the screen and will wait for user input. The text entered by the user will be fed into the scalar $username. Then the program will print Hi, followed by the text entered by the user. Now consider the following code:

print "Enter your name:";
$name = <>;
print "Enter your age:";
$age = <>;
print "$name";
print "$age";

The output will be:

Enter your name:  Joe
Enter your age: 44
Joe
44

But why did Perl print Joe and 44 on separate lines? There was no newline character (\n) in the program. What happened is that when the user is given an input prompt, Perl keeps accepting input until the user presses Enter (Return). At that point Perl stops reading and assigns everything the user typed to the specified variable, including the newline produced by the Enter key. This means that the value of the scalar $name is Joe followed by \n, so printing $name prints Joe and then moves to a new line. To avoid this problem, we use chop() or chomp(). The difference between these two functions will become clearer with the following example:

$variable = "Dave";
chop($variable);
print $variable;

The output will be:

Dav

In this example:

$variable = "Dave";
chomp($variable);
print $variable;

The output will be:

Dave
The difference between chop() and chomp() is that chop() will remove the last character of the string irrespective of what it is, whereas chomp() will remove the last character only if it is a newline character (\n). So if we put these facts together, we can write a Perl script that is a little more complicated:

 print "Hello there, and what is your name? ";
 $name = <STDIN>;
 print "$name is a very high class name.\n";
 chop($name); # Removes the last character no matter what it is.
 print "$name is a very high class name.\n\n";
 chop($name); # A second chop removes the last letter of the name.
 print "$name has been chopped a little too much.\n";
 print "What is your age? ";
 chomp($age = <STDIN>); # Removes the last character if
 # it is the newline.
 chomp($age); # The last character is not removed
 # unless a newline.
 print "For $age, you look so young!\n";

So far, we can see that Perl is a free-form language, meaning you can place statements anywhere on a line or even across lines. Whitespace refers to tabs, spaces, and newlines. The newline is represented as "\n" and must be enclosed in double quotes. Whitespace is used to delimit words. Any number of blank spaces is allowed between symbols and words. Whitespace enclosed in single or double quotes is preserved; otherwise, it is ignored. The following expressions are the same:

5+4*2 is the same as 5 + 4 * 2;

And both of the following statements are correct even though the output will show that the whitespace is preserved when quoted:

print "This is a Perl statement.";
print "This
a Perl";

The output will be:

This is a Perl statement. This
a Perl

Keep in mind that unlike a UNIX shell, a Win32 DOS prompt shell has its own way of parsing the command line. Since most of your programming will be done in script files, you will seldom need to worry about the shell's interaction, but when a script interfaces with the operating system, problems will occur unless you are aware of what commands you have and how it executes them on your behalf. For instance, assume we understand how to use perl modules--the downloadable modules and the built-in modules that come with a solid perl interpreter download. Consider this next script:

 use File::Find;
 use Win32::File;
# Works on both FAT and NTFS file systems.
 &File::Find::find(\&wanted, "C:\\program files", "C:\\windows");
 sub wanted {
     (Win32::File::GetAttributes($_, $attr)) &&
     ($attr & DIRECTORY) &&
     print "$File::Find::name\n";
 }

So how do we use the Win32 command line? This script interfaces with the file system and lists the directories beneath the starting directories in hierarchical order. To create the script file, we simply use Notepad or type C:\> type con >

At this point there is no prompt, just a blank line, so copy and paste the script onto the screen, press Control-Z and then Enter, and you'll be back at the command prompt. When the script is run, the output will be:

C:\program files
C:\program files/Adobe
C:\program files/Adobe/Acrobat_com
C:\program files/Adobe/Acrobat_com/locale
C:\program files/Adobe/Acrobat_com/locale/de_DE
C:\program files/Adobe/Acrobat_com/locale/de_DE/landingPages
C:\program files/Adobe/Acrobat_com/locale/de_DE/landingPages/images
C:\program files/Adobe/Acrobat_com/locale/en_GB
C:\program files/Adobe/Acrobat_com/locale/en_US
C:\program files/Adobe/Acrobat_com/META-INF
   and  so on ….

What happened here? The File::Find module is loaded from the standard Perl library. The first argument to find() is a reference to a subroutine called wanted, followed by the two directories to be searched. The wanted function checks whether each name is a directory and prints the full pathname of every directory found. Inside wanted, $_ holds the name of the current file or directory, and $File::Find::name holds its full path.
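For readers who are not on Windows, the same idea can be sketched without Win32::File by using Perl's -d file test in place of the attribute check. This is an illustrative variant, not the script above: it assumes the core File::Temp and File::Path modules, and the directory names docs, images, and src are made up for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a small throwaway directory tree so the example runs anywhere.
my $root = tempdir(CLEANUP => 1);
make_path("$root/docs/images", "$root/src");

my @dirs;
find(sub {
    # $_ is the current entry's name; $File::Find::name is its full path.
    push @dirs, $File::Find::name if -d $_;
}, $root);

print "$_\n" for sort @dirs;
```

Here the wanted subroutine is written inline as an anonymous sub, and the directories are collected in an array rather than printed immediately, so they can be sorted first.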

End Note: Conditional Statements

Like the C language, Perl will perform an action if some condition is true: IF some condition is true, THEN do this; if it is not, then do that. Also, take care to note that Perl uses different operators for numbers than for strings: numbers are compared with operators such as == and <, whereas strings are compared with eq and lt. Consider the following example:

print "Enter your Name:";
$name = <>;
chomp $name;
if ($name eq 'Harry') {
    print "Hi Harry";
}
else {
    print "You do not have permission to use this computer";
}

The output will be:

Enter your Name: John
You do not have permission to use this computer
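The remark that numbers and strings use different operators can be sketched with two values that are equal as numbers but not as strings. The variable names are illustrative only.

```perl
use strict;
use warnings;

my ($x, $y) = ("10", "10.0");

my $same_number = ($x == $y) ? "yes" : "no";   # numeric comparison: 10 equals 10
my $same_string = ($x eq $y) ? "yes" : "no";   # string comparison: "10" differs from "10.0"

print "Equal as numbers? $same_number\n";
print "Equal as strings? $same_string\n";

# The same split applies to ordering: <=> compares numbers, cmp compares strings.
my @numeric = sort { $a <=> $b } (10, 9, 100);
my @textual = sort { $a cmp $b } (10, 9, 100);
print "Numeric order: @numeric\n";
print "String order:  @textual\n";
```

Using == on names such as 'Harry' would compare them as numbers and treat most names as equal, which is why the example above uses eq.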

The while loop not only can be used to repeat a code snippet, but it also allows us to validate the user input and perform a predefined task according to the result of the validation. In this manner, the while loop repeats a block of code as long as the condition is true. The first example will show its basic use, and the second will exemplify an infinite loop (a continuing execution because the condition is true):

while ($count < 10) {
    $count += 2;
    print "count is now $count\n"; # Gives values 2 4 6 8 10
}

And the output is:

count is now 2
count is now 4
count is now 6
count is now 8
count is now 10

The next example is used to validate user input, but creates an infinite loop. Just press Control-C to get out of the loop:

print 'Username:';
$user = <>;
chomp $user;
while ( $user eq "admin" ) {
    print "System info goes here:";
}

And the output is:

System info goes here:System info goes here:System info goes here:System info goes here:
and so on, until Control-C is pressed
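A validation loop normally changes the tested value inside the loop body so that it can end. The sketch below is one way to do that, under the assumption that re-prompting until the name admin is entered is the intended behaviour. So that it runs unattended, it reads simulated keystrokes from a string; the string contents and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Simulated keyboard input so the sketch runs unattended;
# use "my $in = \*STDIN;" instead to prompt a real user.
my $typed = "guest\nroot\nadmin\n";
open(my $in, '<', \$typed) or die "Can't open in-memory input: $!";

my $attempts = 0;
my $user     = '';
while ($user ne "admin") {          # repeat as long as the input is not valid
    print "Username: ";
    my $line = <$in>;
    last unless defined $line;      # stop if the input runs out
    chomp($user = $line);           # the tested value changes on every pass
    $attempts++;
    print "\n";                     # simulated input does not echo a newline
}

print $user eq "admin"
    ? "Welcome, admin ($attempts attempts).\n"
    : "No valid username entered.\n";
```

Because $user is read again on each pass, the condition eventually becomes false and the loop stops on its own.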


This next program introduces arrays. The opendir function opens a directory, here C:\Windows, for reading and associates it with DIR, the directory file handle. The readdir function returns the entries in the directory, which are assigned to the array @parentfiles. The closedir function closes the directory. The file names are then printed to the screen in the order they are stored in the directory structure. This may not be the same order that the dir command in DOS uses:

opendir(DIR, "C:\\Windows") || die "Can't open: $!\n";
# Open the directory
@parentfiles = readdir(DIR);
# Gets a list of the directory contents
closedir(DIR); # Closes the filehandle
foreach $file ( @parentfiles )
# Prints each element of the array
{ print "$file\n"; }

The output is as expected:

Downloaded Program Files
and  so on ..
etc . . .

Why did we use the @ mark? Because we used the second of Perl's data types, the array. If scalars are singular, then lists and arrays are plural. A list is an ordered collection of scalars. An array is a variable that contains a list. In Perl, these two terms are sometimes used interchangeably. However, the list is the data, and the array is the variable. You can have a list value that isn't in an array, but every array holds a list. Each element of an array or list is a separate scalar variable with an independent scalar value. These values are ordered; that is, they have a particular sequence from the first to the last element. The elements of an array or list are indexed by small integers starting at zero and counting by ones, so the first element of any array or list is always element zero.

When you have a collection of similar data elements, it is easier to use an array than to create a separate variable for each of the elements. The array name allows you to associate a single variable name with a list of data elements. Each of the elements in the list is referenced by its name and a subscript (also called an index). Perl, unlike C-like languages and managed code, doesn’t check whether the elements of an array are of the same data type. They can be a mix of numbers and strings. To Perl, an array is a named list containing an ordered set of scalars. The name of the array starts with a @ sign. The subscript follows the array name and is enclosed in square brackets ([]). Subscripts are simply integers and start at zero. So let’s examine this script:

@grades = (90,89,78,100,87);
print "The original array is: @grades\n";
print "The number of the last index is $#grades\n";
$#grades = 3; # Shortens the array so that its last index is 3
print "The array is truncated to 4 elements: @grades\n";
@grades = (); # Assigning an empty list empties the array
print "The array is completely truncated: @grades\n";

The output will be:

The original array is: 90 89 78 100 87
The number of the last index is 4
The array is truncated to 4 elements: 90 89 78 100
The array is completely truncated:

The array @grades is assigned a list of five numbers. $#grades gives the subscript (index) of the last element in the array, which is 4. Assigning 3 to $#grades shortens the array so that its last subscript is 3, leaving the four elements with subscripts 0, 1, 2, and 3. Assigning an empty list to the array truncates it completely. Since array subscripts start at zero, the first element is element zero. This example should make this clearer. It uses the newline so that each element prints on its own line:

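@array1 = ('I am first', 'I am second', 'I am third', 'I am fourth');
$var1 = $array1[0]; # the first element has subscript zero
$var2 = $array1[1];
$var3 = $array1[2];
$var4 = $array1[3];
print "$var1\n";
print "$var2\n";
print "$var3\n";
print "$var4\n";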

And the output will be:

I am first
I am second
I am third
I am fourth

Let’s look at another example. Here the @names array is initialized with three strings: John, Joe, and Jake. The entire array is printed to standard output (STDOUT). The space between the elements is not printed. Each element of the array is printed, starting with an index number zero. The scalar variable $number is assigned the array @names. The value assigned is the number of elements in the array @names. The number of elements in the array @names is printed:

# Populating an array and printing its values
 @names=('John', 'Joe', 'Jake'); # @names=qw/John Joe Jake/;
 print @names, "\n"; # prints without the separator
 print "Hi $names[0], $names[1], and $names[2]!\n";
 $number=@names; # The scalar is assigned the number
 # of elements in the array
 print "There are $number elements in the \@names array.\n";
 print "The last element of the array is $names[$number - 1].\n";
 print "The last element of the array is $names[$#names].\n";
# Remember, the array index starts at zero!!
 @fruit = qw(apples pears peaches plums);
 print "The first element of the \@fruit array is $fruit[0];
the second element is $fruit[1].\n";
print "Starting at the end of the array; @fruit[-1, -3]\n";

The output will be:

JohnJoeJake
Hi John, Joe, and Jake!
There are 3 elements in the @names array.
The last element of the array is Jake.
The last element of the array is Jake.
The first element of the @fruit array is apples;
the second element is pears.
Starting at the end of the array; plums pears

A Brief Look at WMI

Windows Management Instrumentation (WMI) is organized as a hierarchy of namespaces and is used mainly as a tool to empower VBScripts. It provides information about the state of a system: its page file size, its memory utilization (both virtual and physical), CPU usage, IP settings, etc. The Perl script below returns the name and product code of binary information (such as bitmaps, icons, executable files, and so on) used by a Windows Installer application. It was fetched from TechNet and is available at ActiveState. Notice that the scalar $objWMIService is set to an object that connects to the root\CIMV2 namespace. The query result, $colItems, is a collection of objects (items), as is standard in Windows scripting. To run this program using an ActiveState interpreter, you must first install the Win32::OLE module:

c:\Perl\> ppm install Win32::OLE

Now the module will execute its routines within the program:

use Win32::OLE('in');
use constant wbemFlagReturnImmediately => 0x10;
use constant wbemFlagForwardOnly => 0x20;

$computer = ".";
$objWMIService = Win32::OLE->GetObject
    ("winmgmts:\\\\$computer\\root\\CIMV2") or die "WMI connection failed.\n";
$colItems = $objWMIService->ExecQuery
    ("SELECT * FROM Win32_Binary","WQL",wbemFlagReturnImmediately | wbemFlagForwardOnly);

foreach my $objItem (in $colItems) {
      print "Caption: $objItem->{Caption}\n";
      print "Data: $objItem->{Data}\n";
      print "Description: $objItem->{Description}\n";
      print "Name: $objItem->{Name}\n";
      print "Product Code: $objItem->{ProductCode}\n";
      print "Setting ID: $objItem->{SettingID}\n";
      print "\n";
}

And the output is:

Caption: AbortMsiCA.dll
Description: AbortMsiCA.dll
Name: AbortMsiCA.dll
Product Code: {90120000-00A1-0409-0000-0000000FF1CE}
Setting ID: 

Caption: BIN_File_46001
Description: BIN_File_46001
Name: BIN_File_46001
Product Code: {90120000-00A1-0409-0000-0000000FF1CE}
Setting ID: 

Caption: BIN_File_107602
Description: BIN_File_107602
Name: BIN_File_107602
Product Code: {90120000-00A1-0409-0000-0000000FF1CE}
Setting ID: 

Caption: BIN_File_46002
Description: BIN_File_46002
Name: BIN_File_46002
Product Code: {90120000-00A1-0409-0000-0000000FF1CE}
Setting ID: 

Caption: OCFXCA
Description: OCFXCA
Product Code: {90120000-00A1-0409-0000-0000000FF1CE}
Setting ID:
and so on . . . 

Arrays Continued

Multidimensional arrays are sometimes called tables, or matrices. They consist of rows and columns and can be represented with multiple subscripts. In a two-dimensional array, the first subscript represents the row, and the second subscript represents the column. In Perl, each row in a two-dimensional array is enclosed in square brackets. The row is an unnamed list. An unnamed list is called an anonymous array and contains its own elements. The arrow operator (->), also called the infix operator, is used to dereference array and hash references and so to get at the individual elements of an anonymous array. There is an implied -> between adjacent brackets, so $matrix[1]->[0] is another way to say $matrix[1][0].

Consider the following example. The array @matrix is assigned four unnamed, or anonymous, arrays, each of which has three values. Printing @matrix prints the addresses of the four anonymous arrays rather than their contents. To access the individual elements, double subscripts or the arrow operator must be used. $matrix[0][0], or $matrix[0]->[0], is the first element of the first row, since subscripts start at zero; it is printed first. Next, the first element of the second row, $matrix[1][0], is printed. Finally, nested for loops print every element in the matrix. The outer for loop iterates through the rows, starting at row zero, so the first index represents the row. For each row, the inner for loop runs through the columns, so the second index represents the column. Each element of a row is printed, and then control returns to the outer for loop.

@matrix=( [ 3 , 4, 10 ], # Each row is an unnamed list
[ 2, 7, 12 ],
[ 0, 3, 4 ],
[ 6, 5, 9 ],
) ;
 print "@matrix\n";
 print "Row 0, column 0 is $matrix[0][0].\n";
# can also be written - $matrix[0]->[0]
 print "Row 1, column 0 is $matrix[1][0].\n";
# can also be written - $matrix[1]->[0]
 for ($i=0; $i < 4; $i++) {
     for ($x=0; $x < 3; $x++) {
         print "$matrix[$i][$x] ";
     }
     print "\n";
 }

The output will be:

ARRAY(0x1b6f5c) ARRAY(0x1b6e3c) ARRAY(0x23b064c) ARRAY(0x23b07fc)
Row 0, column 0 is 3.
Row 1, column 0 is 2.
3 4 10
2 7 12
0 3 4
6 5 9
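The nested loops hard-code the matrix size (4 rows, 3 columns). As an alternative sketch, each row reference can be walked with foreach and dereferenced with @$row, so the sizes do not need to be known in advance. The variable names here are illustrative.

```perl
use strict;
use warnings;

my @matrix = (
    [ 3, 4, 10 ],
    [ 2, 7, 12 ],
    [ 0, 3, 4  ],
    [ 6, 5, 9  ],
);

my @lines;
for my $row (@matrix) {              # $row is a reference to one anonymous array
    push @lines, join(' ', @$row);   # @$row dereferences it into a list
}
print "$_\n" for @lines;

my $rows = scalar @matrix;           # number of rows
my $cols = scalar @{ $matrix[0] };   # number of columns in the first row
print "The matrix has $rows rows and $cols columns.\n";
```

This prints the same four rows as the nested for loops and keeps working if rows are added or removed.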


An associative array, more commonly called a hash, consists of one or more pairs of scalars (strings, numbers, or Booleans). The first scalar in each pair is called the key, and the second is called the value associated with that key. Whereas arrays are ordered lists with numeric indices starting at 0, hashes are unordered and are indexed by strings. Because a hash is unordered, its contents will not usually print in the order in which you typed them. Hashes, then, are defined as an unordered list of key/value pairs, similar to a table where the keys are on the left-hand side and the values associated with those keys are on the right-hand side. The name of the hash is preceded by the % sign. Notice that this hash is initialized with a list in parentheses:

%pet = ( "Name" => "Sneaky",
"Type" => "Cat",
"Owner" => "Carol",
"Color" => "yellow", );

Accessing the elements of a hash builds on accessing the elements of an array, so it helps to understand the functions used when working with arrays. Examine the following code. A hash slice is a list of hash keys whose corresponding values are assigned from, or copied to, another list. The slice is written as the hash name preceded by the @ symbol, with the list of hash keys enclosed in curly braces:

# Hash slices
 %officer = ("NAME" => "John Doe",
"SSN" => "510-22-3456",
"DOB" => "05/19/66",
);
 @info = qw(Marine Captain 50000);
@officer{'BRANCH', 'TITLE', 'SALARY'} = @info;
# This is a hash slice
@sliceinfo = @officer{'NAME', 'BRANCH', 'TITLE'};
# This is also a hash slice
 print "The new values from the hash slice are: @sliceinfo\n\n";
print "The hash now looks like this:\n";
 foreach $key ('NAME', 'SSN', 'DOB', 'BRANCH', 'TITLE', 'SALARY') {
     printf "Key: %-10sValue: %-15s\n", $key, $officer{$key};
 }

And the output is:

The new values from the hash slice are: John Doe Marine Captain

The hash now looks like this:
Key: NAME      Value: John Doe
Key: SSN       Value: 510-22-3456
Key: DOB       Value: 05/19/66
Key: BRANCH    Value: Marine
Key: TITLE     Value: Captain
Key: SALARY    Value: 50000
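Single hash elements, as opposed to slices, are read and written with the $ sign and curly braces. The sketch below reuses the %pet hash from earlier; the Age key and its value are invented for illustration, and keys and exists are standard built-in functions.

```perl
use strict;
use warnings;

my %pet = (
    "Name"  => "Sneaky",
    "Type"  => "Cat",
    "Owner" => "Carol",
    "Color" => "yellow",
);

print "$pet{Name} is a $pet{Color} $pet{Type}.\n";   # look up values by key

$pet{Age} = 3;                                       # add a new key/value pair
print "Has an owner\n" if exists $pet{Owner};        # test whether a key exists

# keys returns the keys in no particular order, so sort them for stable output.
my @sorted = sort keys %pet;
for my $key (@sorted) {
    print "$key => $pet{$key}\n";
}
```

Sorting the keys is a common way to get repeatable output from a hash, since the hash itself keeps no order.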

In conclusion, know that there are tools you can download from the internet to turn Perl scripts into Windows executables. Perl is also capable in graphics and in developing desktop applications. This article is solely meant to be a reference for anyone who wants to know a little of the Perl basics, with programming experience or not, and draws on Ankit Fadia's book "The Unofficial Guide to Ethical Hacking" and on "Programming Perl" by Larry Wall, Tom Christiansen & Jon Orwant.


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Pref. Trust
United States United States
I started electronics training at age 33. I began studying microprocessor technology in an RF communications oriented program. I am 43 years old now. I have studied C code, opcode (mainly x86 and AT+T) for around 3 years in order to learn how to recognize viral code and the use of procedural languages. I am currently learning C# and the other virtual runtime system languages. I guess I started with the egg rather than the chicken. My past work would indicate that my primary strength is in applied mathematics.

Last Updated 7 Jul 2010
Article Copyright 2010 by logicchild