Downloading and Converting EWS BulkDownload TSV Files

April 28th, 2009

For the longest time, I avoided using the BulkDownload service of Yahoo’s Enterprise Web Services (EWS) for managing our daily pay per click (PPC) account. This wasn’t because I did not want to use it; I just thought it was going to be a severe pain to implement. The reason for my hesitation was that I was pretty ignorant of different encodings. Now that my knowledge of encoding types has grown, I felt ready to tackle the project of using Yahoo’s BulkDownload service in a complete daily account synchronization for a very large PPC account structure.

When using Yahoo’s EWS BulkDownload service, you can request that the file be returned in one of two formats:

  • EXCEL_XML – an Excel 2003 XML format
  • TSV (tab separated value)

Here is the tricky part: the EXCEL_XML files are UTF-8 encoded, while the TSV files are returned in UTF-16LE encoding. Normally, I like to use the comma separated value (CSV) file format with the LOAD DATA [LOCAL] INFILE syntax of MySQL. (You can just as easily use the TSV file format with the LOAD DATA [LOCAL] INFILE syntax. You just need to make sure you specify

FIELDS TERMINATED BY '\t'

instead of

FIELDS TERMINATED BY ','

in your SQL statement.) Because the TSV file comes in UTF-16LE encoded, simply grabbing the tab separated value file and loading it right into MySQL would not work. My solution just involved a couple of extra steps: once I retrieved the account structure file from Yahoo, I would loop back through the file and convert its data from 'utf-16' to 'ascii' using PHP's iconv() function. This can obviously be done in multiple ways, but here is one way to accomplish the task:

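// [file_name_in] and [file_name_out] are placeholders for your own paths.
$handleIn  = fopen('[file_name_in]', 'rb');
$handleOut = fopen('[file_name_out]', 'wb');
while (!feof($handleIn))
{
    // Read an even number of bytes so no 2-byte UTF-16 code unit is split across chunks.
    $content = iconv('utf-16', 'ascii', fread($handleIn, 8192));
    fwrite($handleOut, $content);
}
fclose($handleIn); fclose($handleOut);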
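A slightly more defensive variant of the same loop is sketched below. It names the byte order explicitly ('UTF-16LE'), because the interpretation of a chunk without a byte order mark under the generic 'utf-16' name may depend on the iconv implementation. It also strips the byte order mark itself and stops with an error if a chunk cannot be represented in ASCII. The function name convertUtf16leToAscii() and its parameters are my own, hypothetical names, and the sketch assumes the downloaded data contains only ASCII characters.

```php
<?php
// Hypothetical helper (not part of EWS or PHP itself): converts a UTF-16LE
// file to ASCII chunk by chunk. Assumes the data is ASCII-only.
function convertUtf16leToAscii(string $pathIn, string $pathOut, int $chunkSize = 8192): void
{
    // An even chunk size keeps every 2-byte UTF-16 code unit inside one chunk.
    if ($chunkSize < 2 || $chunkSize % 2 !== 0) {
        throw new InvalidArgumentException('Chunk size must be an even number of bytes.');
    }

    $handleIn  = fopen($pathIn, 'rb');
    $handleOut = fopen($pathOut, 'wb');
    if ($handleIn === false || $handleOut === false) {
        throw new RuntimeException('Could not open the input or output file.');
    }

    $firstChunk = true;
    while (!feof($handleIn)) {
        $chunk = fread($handleIn, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }

        if ($firstChunk) {
            // Drop the UTF-16LE byte order mark (FF FE) if the file starts with one.
            if (strncmp($chunk, "\xFF\xFE", 2) === 0) {
                $chunk = (string) substr($chunk, 2);
            }
            $firstChunk = false;
            if ($chunk === '') {
                continue;
            }
        }

        // iconv() returns false when a character has no ASCII equivalent.
        $converted = @iconv('UTF-16LE', 'ASCII', $chunk);
        if ($converted === false) {
            fclose($handleIn);
            fclose($handleOut);
            throw new RuntimeException('Chunk could not be converted to ASCII.');
        }
        fwrite($handleOut, $converted);
    }

    fclose($handleIn);
    fclose($handleOut);
}
```

Calling convertUtf16leToAscii('[file_name_in]', '[file_name_out]') would then replace the loop above. If the account contains non-ASCII keywords, converting to 'UTF-8' instead of 'ASCII' is probably the safer choice, provided the MySQL table and the LOAD DATA statement use a matching character set.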

Once the file has been successfully converted to ASCII, I can then run a LOAD DATA [LOCAL] INFILE statement to get the TSV file into a MySQL table:

LOAD DATA LOCAL INFILE '[file_name]'
INTO TABLE [table_name]
FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '\"'
LINES TERMINATED BY '\n'
IGNORE 2 LINES;

The reason for the IGNORE 2 LINES clause is that the first two lines of the TSV file are basically header rows, and I consider them junk rows.
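Before loading the converted file, it can help to preview which rows MySQL will actually receive. The sketch below mirrors the statement above in PHP: it skips the header lines and splits the remaining lines on tabs, with optional double-quote enclosures. The function name readTsvRows() and its parameters are hypothetical, and the sketch assumes the file has already been converted to ASCII and uses '\n' line endings.

```php
<?php
// Hypothetical helper (not from EWS or MySQL): returns the data rows of a
// converted TSV file, skipping the leading header lines.
function readTsvRows(string $path, int $headerLines = 2): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException('Could not open the TSV file.');
    }

    // Discard the header rows, like IGNORE 2 LINES in the SQL statement.
    for ($i = 0; $i < $headerLines; $i++) {
        if (fgets($handle) === false) {
            break;
        }
    }

    $rows = [];
    // Tab delimiter and optional double-quote enclosure, matching
    // FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"'.
    while (($fields = fgetcsv($handle, 0, "\t", '"', "\\")) !== false) {
        if ($fields === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        $rows[] = $fields;
    }

    fclose($handle);
    return $rows;
}
```

Printing the first few entries of readTsvRows('[file_name]') is a quick way to confirm that the column count matches the target table before running the import.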