Perl and PostgreSQL – Importing Data (CSV)

In order to import data into a PostgreSQL database, it must be in a suitable format. CSV files are one such format. A CSV file, or Comma Separated Values file, is a delimited file that uses commas to separate values. They store tabular data in plain text. The first row in a CSV file often contains headers for each column of data. Each row in the file thereafter is a record of related data. In the example contents of a CSV file below, each row contains information relating to a person, including their first name, last name, title and date of birth in the format YYYY-MM-DD (four digit year, two digit month and two digit day).

"firstname","lastname","title","dob"
"Zoe","Davis","Miss","1979-07-11"
"Tom","Ingram","Mr","1971-10-04"
"Karen","Thomas","Mrs","1969-03-08"
"Samantha","Yates","Miss","1995-08-27"

Below is an example of how Perl can be used to import data from a CSV file called ‘personimport.csv’, into a table called ‘person’, which was used in the examples for selecting, inserting, updating and deleting data.

Firstly, a connection to the database is established, the CSV file path is set and the file is opened. An error is output if the file cannot be located. If it does exist, a CSV object is created, and the rows of data are processed one by one. For the first row, the headers are checked to see if the file is in the right format. For all other rows, an ‘Insert‘ statement is constructed and then used to add the new record to the database. Finally, feedback is given as to the number of records added.

use strict;
use warnings;
use DateTime;
use DBI;
use Text::CSV;

# Database connection variable.
my $connect;

eval 
{

    # Connect to database.
    $connect = DBI->connect("DBI:Pg:dbname = demo; host = localhost; port = 5432", 
                            "DemoUN", "DemoPW", {RaiseError => 1});

} 
or do 
{

    # Message confirming unsuccessful database connection.
    print "Database connection unsuccessful.\n";

    # Stop program execution.
    exit(1);

};

# File path.
my $filename = "C:\\demo\\personimport.csv";

eval
{

    # Open the CSV file.
    open(my $csvFile, '<:encoding(utf8)', $filename) or die "Could not locate the CSV file";

    # Headers and record count.
    my $headers = "Yes";
    my $recordCount = 0;

    # Create a CSV object.
    my $csvObj = Text::CSV->new(
    {
        binary => 1,
        auto_diag => 1,
        sep_char => ','
    });

    # Read and parse each line of the CSV file.
    while (my $data = $csvObj->getline($csvFile))
    {

        # Check the file headers if it is the first line.
        if ($headers eq "Yes")
        {

            if (($data->[0] ne "firstname") or
                ($data->[1] ne "lastname") or
                ($data->[2] ne "title") or
                ($data->[3] ne "dob"))
            {

                # Message stating incorrect CSV file headers.
                print "Incorrect CSV file headers.";
                exit(1);

            }
            else
            {
                $headers = "No";
            }

        }
        else
        {

            # Query parameters.
            my $firstname = $data->[0];
            my $lastname = $data->[1];
            my $title = $data->[2];
            my $dob = DateTime->new(
                year => substr($data->[3], 0, 4),
                month => substr($data->[3], 5, 2),
                day => substr($data->[3], 8, 2),
                hour => 0,
                minute => 0,
                second => 0
            );

            # Query text.
            my $sqlText = " \
                INSERT INTO person \
                    (firstname, lastname, title, dob)  \
                VALUES (?, ?, ?, ?) \
            ";

            # Prepare the query.
            my $sqlInsert = $connect->prepare($sqlText);

            # Execute the query.
            $sqlInsert->execute($firstname, $lastname, $title, $dob);

            # Increment the record count.
            $recordCount += 1;

        }

    }

    # Check for the end of file.
    if (not $csvObj->eof)
    {
        $csvObj->error_diag();
    }

    # Provide feedback on the number of records added.
    if ($recordCount == 0)
    {
        print "No new person records added.\n";
    }
    elsif ($recordCount == 1)
    {
        print "$recordCount person record added.\n";
    }
    else
    {
        print "$recordCount person records added.\n";
    }

    # Close the CSV file.
    close $csvFile;

}
or do
{

    # Confirm error adding person information and exit.
    print "Error adding person information.\n";
    exit(1);

}

The contents of the ‘person’ table now looks as follows.

id firstname lastname title dob
1 Bob Smith Mr 1980-01-20
3 Fred Bloggs Mr 1975-05-07
4 Alan White Mr 1989-03-20
5 Fiona Bloggs Mrs 1985-05-19
6 Zoe Davis Miss 1979-07-11
7 Tom Ingram Mr 1971-10-04
8 Karen Thomas Mrs 1969-03-08
9 Samantha Yates Miss 1995-08-27

More