Perl Notes

 

Data Types
Type
Identifier
Description
Example

scalar

$name

a single value (string, numeric, etc.)

$name=4.6;
$name2= "a_string";

list

()

ordered scalar data

@x= (1,2,"three",4.0);
@letters= ("A".."B"); #quote the endpoints; barewords fail under 'use strict'
@numbers=(0..9);

array

@name

a variable that holds a list

@name=(1,"two",3.14,$something);
$listlength= @name; #result is 4 (scalar context)

hash

%name

a set of key/value pairs

$name{key} = value;
@fred_list= %fred; #unwinds the hash into a list of key/value pairs
%smooth= ("aaa","bbb","123.4",567.8);
@score{"fred","barney","dino"}= (21,47,6); #hash slice
@league{keys %team1}=values %team1;
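
A minimal sketch of the data types above, reusing the table's own example values. The variable names come from the table; the 'my' declarations and the final print are additions for illustration.

```perl
use strict;
use warnings;

# Scalars hold a single value, string or number.
my $name  = 4.6;
my $name2 = "a_string";

# A list literal assigned to an array variable.
my @x       = (1, 2, "three", 4.0);
my @numbers = (0 .. 9);

# An array evaluated in scalar context yields its element count.
my $listlength = @x;

# A hash built from a flat list of key/value pairs.
my %smooth = ("aaa", "bbb", "123.4", 567.8);

# Hash slice: assign several keys in one statement.
my %score;
@score{"fred", "barney", "dino"} = (21, 47, 6);

print "length=$listlength, fred=$score{fred}, aaa=$smooth{aaa}\n";
```

Note the sigil follows what is being accessed: a single hash element is $score{fred}, while a slice of several elements is @score{...}.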


Operators

symbol

example(s)

action

=

$a=$b; $a=4;

assignment

=~

$a=~/regex/

binds a regular expression match or substitution to the variable on the left

. or .=

$str = $str . "b"; $str.="b"; #same thing

string concatenate
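
A short sketch of the operators above. The variable names ($str, $copy, $matched) are chosen for illustration only.

```perl
use strict;
use warnings;

my $str = "a";
$str = $str . "b";    # explicit concatenation
$str .= "c";          # same operation in shorthand form

# =~ applies a match to the named variable instead of $_.
my $matched = ($str =~ /^ab/) ? 1 : 0;

# =~ also binds a substitution; here it changes $copy and leaves $str alone.
(my $copy = $str) =~ s/b/B/;

print "$str $copy $matched\n";
```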


Pragmas

pragma

meaning

use strict;

generates a compile-time error if variables are not declared (for example with my())
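
A sketch of what 'use strict' catches. The snippet is compiled inside a string eval only so that this example can report the error and keep running; the names $count and $undeclared are hypothetical.

```perl
use strict;
use warnings;

my $count = 1;    # declared with my(), so strict accepts it

# Compile a snippet that assigns to an undeclared variable. The string
# eval traps the compile-time error instead of stopping this script.
my $ok    = eval q{ $undeclared = 2; 1 };
my $error = $@;

print "count=$count\n";
print "strict rejected the snippet: $error" unless $ok;
```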


Perl Functions

Name

Argument(s) Type

Result Type

Action

qw

whitespace-separated words

list

quotes each word as a single-quoted string (no interpolation, no commas needed)

push

array, list

appends the elements of 'list' to the end of 'array'

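pop

array

scalar

removes and returns the last element of the array

shift

array

scalar

removes and returns the first element of the array

unshift

array, list

prepends the elements of 'list' to the front of 'array'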

reverse

list

list

reverses order of list elements

sort

list

list

places elements in ascending ASCII (string) order by default

chomp

array or scalar variable

removes the trailing record separator (normally a newline) from the scalar or from each element of the array

<STDIN>

n/a

scalar or array, depending on context

as scalar: returns next line of standard input, including newline.
as array: returns all lines of input to EOF.

<> (diamond)

"

as <STDIN>, but reads from the files listed in @ARGV (or from standard input if @ARGV is empty)

rand

scalar

scalar

returns a random fractional number from zero up to, but not including, the argument (default 1). Use int(rand(n)) for an integer from 0 to n-1.

srand

scalar (optional seed)

scalar (the seed, in Perl 5.14 and later)

initializes the random number generator.

keys

hash

list

returns list consisting of keys in the hash. arbitrary order!

values

hash

list

returns list containing values in the hash, in the same arbitrary order that keys returns

each

hash

2 element list

returns the next successive key/value pair in the hash

delete

hash element

n/a

removes the referenced element from the hash.
example: delete $hashname{"a_key"};

defined

anything

logical

returns FALSE if argument is undef, TRUE otherwise. Not redundant with a plain truth test: 0 and the empty string are defined but false.

split

regex,string

list

Returns list consisting of the parts of string delimited by regex.
Default is to split $_ on whitespace (like /\s+/, but leading whitespace is skipped).

join

string,list

string

makes one big string from the elements of list, separated by string
Example: $bigstring= join($glue,@list);

die "message\n";

print 'message' to STDERR and terminate. The script name and line number are appended unless the message ends in \n.

warn "message";

as 'die' but don't terminate.
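
A sketch that exercises several of the list and hash functions in the table. The data and variable names are made up for illustration.

```perl
use strict;
use warnings;

my @list = qw(pear apple fig);      # qw splits on whitespace, no commas
push @list, "kiwi";                 # add to the end
unshift @list, "plum";              # add to the front
my $last  = pop @list;              # remove and return the last element
my $first = shift @list;            # remove and return the first element

my @sorted   = sort @list;          # default string order
my @reversed = reverse @sorted;

my @fields    = split /:/, "a:b:c";
my $bigstring = join "-", @fields;

my %age = (fred => 30, dino => 5);
delete $age{"dino"};                # remove one key/value pair
my @names = sort keys %age;         # keys come back unordered, so sort them

my $has_value = defined $age{"dino"} ? "yes" : "no";

print "@sorted | @reversed | $bigstring | @names | $has_value\n";
```

Sorting the result of keys is a common way to get a repeatable order, since the hash's own order is arbitrary.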


Subroutines

sub name {
my(list of local variables);
my($par1, $par2, $par3) = @_; #copies of the arguments
my $parn= $_[$#_]; #last argument
expressions;
return value;
}

my(list of variables );

variables entirely private to subroutine

local(list of variables );

global variables given a temporary value, visible to the subroutine and its descendants (any subroutines it calls)
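
A sketch contrasting my() and local(), assuming a package variable $level declared with 'our' so that the example runs under 'use strict'. All subroutine names are hypothetical.

```perl
use strict;
use warnings;

# Copy the arguments out of @_ into a private variable.
sub add_all {
    my (@numbers) = @_;
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum;
}

# $#_ is the last index of @_, so $_[$#_] is the last argument.
sub last_arg {
    return $_[$#_];
}

our $level = "global";
sub show_level { return $level }

sub with_local {
    local $level = "temporary";   # seen by subroutines called from here
    return show_level();
}

my $line = join " ", add_all(1, 2, 3), last_arg("a", "b", "c"),
                     with_local(), show_level();
print "$line\n";
```

The last two values differ because the local() value is restored as soon as with_local returns.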


Special Variables

Symbol

Meaning/Use

Example

$_

Default variable

default i/o target, foreach parameter, match string,...

@_

Subroutine argument list

reference arguments as $_[n] within the subroutine
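
A small sketch of $_ and @_ in use. The word list and the subroutine name 'second' are invented for the example.

```perl
use strict;
use warnings;

my @words = ("alpha", "beta", "gamma");

my @hits;
foreach (@words) {            # no loop variable, so each element is in $_
    push @hits, $_ if /^g/;   # the match tests $_ by default
}

sub second { return $_[1] }   # elements of @_ are addressed as $_[n]

my $picked = second(@words);
print "@hits $picked\n";
```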


I/O

open(HANDLE,"filename");

Opens a file for reading. Prefix '>' opens to write (truncating the file); '>>' opens to append.

close(HANDLE);

Closes file (happens automatically if re-opened or at termination of program)

die "message\n"; warn "mess";

output to STDERR. 'die' terminates the program. Script name and line number are appended if the message has no trailing '\n'.

STDIN, STDOUT, STDERR

default filehandles for I/O

print [FILEHANDLE] item1,item2,...

write to FILEHANDLE (STDOUT if omitted). Note there is no comma after FILEHANDLE.
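
A sketch of a write-then-read round trip. It assumes the three-argument open with a lexical filehandle, a common alternative to the two-argument bareword form shown above, and uses the core File::Temp module only to obtain a scratch file name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, created only for this example.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh);

open(my $out, ">", $filename) or die "cannot write $filename: $!\n";
print $out "first line\n", "second line\n";   # no comma after the filehandle
close($out);

open(my $in, "<", $filename) or die "cannot read $filename: $!\n";
my $first = <$in>;        # scalar context: one line, newline included
chomp($first);
my @rest = <$in>;         # list context: all remaining lines
close($in);

print "$first / ", scalar(@rest), " more line(s)\n";
```

Checking the return value of open with 'or die' reports the system error in $! when the file cannot be opened.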

File Tests

$readable= -r $filename;

$age= -M "/usr/bin/perl";

-r

File or directory is readable

-w

File or directory is writable

-x

File or directory is executable

-o

File or directory is owned by the effective user

-R

File or directory is readable by real user, not effective user

-W

File or directory is writable by real user, not effective user

-X

File or directory is executable by real user, not effective user

-O

File or directory is owned by real user, not effective user

-e

File or directory exists

-z

File exists and has zero size (directories are never empty)

-s

File or directory exists and has nonzero size (the returned value is the size in bytes)

-f

Entry is a plain file

-d

Entry is a directory

-l

Entry is a symlink

-S

Entry is a socket

-p

Entry is a named pipe (a 'fifo')

-b

Entry is a block-special file (like a mountable disk)

-c

Entry is a character-special file (like an I/O device)

-u

File or directory is setuid

-g

File or directory is setgid

-k

File or directory has the sticky bit set

-t

isatty() on the filehandle is true

-T

File is "text"

-B

File is "binary"

-M

Modification age in days

-A

Access age in days

-C

Inode-modification age in days
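
A sketch of a few file tests against a scratch file and directory. The core File::Temp module is used only to create something to test; the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

my ($fh, $filename) = tempfile(UNLINK => 1);
my $dir = tempdir(CLEANUP => 1);

my $empty = -z $filename ? 1 : 0;   # freshly created, so zero size
print $fh "12345";
close($fh);

my $size     = -s $filename;        # size in bytes once data is written
my $is_file  = -f $filename ? 1 : 0;
my $is_dir   = -d $dir ? 1 : 0;
my $missing  = -e "$dir/no_such_file" ? 0 : 1;
my $age_days = -M $filename;        # age in days, relative to script start

print "empty=$empty size=$size file=$is_file dir=$is_dir missing=$missing\n";
```

Because -M, -A and -C are measured from the time the script started, a file modified during the run can report a slightly negative age.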


Flow Control

Statement

Format

Comment

if / unless

elsif

else

if (test) {
statement_block;
} elsif (test) {
statement_block;
} else {
statement_block;
}
#or
exp1  if exp2;
exp1 unless exp2;

Note it's spelled elsif, not elseif!

while/until

until (test) {
statement_block;
}
#or
exp1 while exp2;
exp1 until exp2;

do {} while

do {} until

do {
statement_block;
} until expression;

for

for (initial_exp; test_exp; re-init_exp) {
statement_block;
}

foreach

foreach $i (@some_list) {
statement_block;
}

$i may be omitted in which case $_ is assumed.

last [LABEL]

last;

breaks out of the innermost enclosing loop block, or out of the loop labeled LABEL.

next [LABEL]

next;

skips the remainder of the innermost enclosing loop block and starts its next iteration without exiting the loop. If LABEL is present, starts the next iteration of the loop labeled LABEL.

redo

redo;

jumps to the beginning of the current loop block without re-evaluating the loop test
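
A sketch combining several of the statements above, including labeled next and last. The label OUTER and the loop bounds are arbitrary choices for the example.

```perl
use strict;
use warnings;

my $i = 0;
until ($i >= 3) {                  # repeats while the test is false
    $i++;
}

my @found;
OUTER: foreach my $row (1 .. 3) {
    foreach my $col (1 .. 3) {
        next OUTER if $col > $row;     # move on to the next $row
        last OUTER if $row == 3;       # leave both loops
        push @found, "$row$col";
    }
}

my $kind;
if    ($i < 3)  { $kind = "small" }
elsif ($i == 3) { $kind = "three" }
else            { $kind = "large" }

print "@found $kind\n" unless @found == 0;
```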


Regular Expressions

a

a single character - matches only itself

. (dot)

matches any single character except newline

/[aeiou]/, /[A-Z]/

character class. Matches any single mentioned character.

\n

matches newline

\000-\377

the character with the given octal code

/[^0-9]/

negated character class

\d \w \s

a digit, a word character (letter, digit, or underscore), or a whitespace character

\D \W \S

NOT a digit, word character, or whitespace character

* or {0,}

zero or more of the preceding item

+ or {1,}

one or more of the preceding item

? or {0,1}

zero or one of the preceding item

/a{m,n}/

"m" to "n" a's.

/a{m}/

exactly "m" a's

/a{m,}/

"m" or more a's

/a(.)b(.)c\2d\1/

items matched by parenthesized patterns may be used later in the same pattern as '\n', where 'n' refers to the nth parenthesized group. '$n' ($1, $2, ...) is a scalar containing the matched string, which can be used elsewhere in the script after a successful match.

Use (?:pattern) to group without remembering the match.

Precedence

Parentheses

( )  (?:)

Multipliers

? + * {m,n} ?? +? *? {m,n}?

Sequence & Anchoring

abc ^ $ \A \Z (?= )  (?! )

Alternation

|

a|b

match pattern 'a' or 'b'

\b (\B)

anchors to a word boundary (NOT a word boundary)

^a

requires 'a' as first character in string

a$

requires 'a' as last character in string (or just before a final newline)

\A \Z (?=...) (?!...)

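\A anchors to the start of the string; \Z anchors to the end of the string (or just before a final newline); (?=...) requires that the pattern follows, without consuming it (lookahead); (?!...) requires that the pattern does not follow.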

$a =~ regular exp

matches against $a (instead of $_)

m#pattern#

'm' indicates use '#' (or whatever follows) as delimiter instead of '/'.

$` / $& / $'

read-only variables containing the part of the string before/in/following the match

s/old/new/g

substitute 'new' for 'old'. If 'g' is present, do substitution everywhere.

suffixes

g

apply the match or substitution to every occurrence, not just the first

i

ignore case

\u, \U, (\l, \L)

\u (\l) uppercases (lowercases) the following letter; \U (\L) uppercases (lowercases) the following text up to \E or the end of the string.
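
A sketch pulling together captures, backreferences, substitution, and the suffixes described above. The sample strings and variable names are invented for illustration.

```perl
use strict;
use warnings;

my $text = "Hello World";

# Capturing groups fill $1 and $2 after a successful match.
my ($first, $second) = ("", "");
if ($text =~ /(\w+)\s+(\w+)/) {
    ($first, $second) = ($1, $2);
}

# \1 refers back to the first group inside the same pattern,
# so (.)\1 finds a doubled character.
my $doubled = ("bookkeeper" =~ /(.)\1/) ? $1 : "";

# s///g replaces every match; without g only the first is replaced.
(my $all = $text) =~ s/o/0/g;
(my $one = $text) =~ s/o/0/;

# i ignores case; m#...# swaps the delimiter; \U...\E uppercases.
my $ci = ($text =~ m#^hello#i) ? 1 : 0;
(my $shout = $text) =~ s/(world)/\U$1\E!/i;

print "$first $second $doubled | $all | $one | $ci | $shout\n";
```

Copying into a new variable before substituting, as in (my $all = $text) =~ s/.../.../, leaves the original string unchanged.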