Brulog

Words of occasional wisdom from Bruce Oakley

Parse Delimited Data


Purpose: Parse the records in a delimited text file and reformat them.
Operates on: A selection in the frontmost window of MediaSpan Media Software’s NewsEdit application.
How it works: Steps through paragraphs (records) of a delimited text file, capturing and reformatting data in the fields and writing a revised paragraph back to the file.
Potential enhancements:

  • Do calculations (math, string or date) on the data fields.
  • Test for and ignore empty fields (this version adds a comma or space even if a field is empty, so it can give double spaces or double commas).
  • Add typography specs or do search-and-replace on the revised paragraphs.
  • Work on selected text within a file, or on multiple files selected in a NewsEdit directory.
  • Write results to another file or application, instead of or in addition to the source.
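
The empty-field enhancement could be sketched as follows. This is a minimal illustration, not part of the script below: the handler name joinNonEmpty and its parameters are hypothetical. It keeps only the non-empty strings in a list and joins them with a separator, which avoids the double commas or double spaces.

```applescript
-- Hypothetical helper (not in the original script): join the non-empty
-- strings in fieldList, placing theSeparator between them.
on joinNonEmpty(fieldList, theSeparator)
	set keptFields to {}
	repeat with aField in fieldList
		-- "contents of" turns the loop reference into the actual string
		set fieldText to contents of aField
		if fieldText is not "" then set end of keptFields to fieldText
	end repeat
	-- Coercing a list to text inserts the current text item delimiters
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to theSeparator
	set joinedText to keptFields as text
	set AppleScript's text item delimiters to oldDelims
	return joinedText
end joinNonEmpty

-- Address1 and Address2 from sample record 2, where Address2 is empty
joinNonEmpty({"P.O. Box 456", ""}, ", ") -- result: "P.O. Box 456"
```

Inside a tell application block, a handler like this would have to be called as my joinNonEmpty(...), so that AppleScript sends the call to the script rather than to the application.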

Caution: The code is presented so that it can be copied and pasted into a Script Editor window, but as with any copy and paste, you should check for unintended line breaks or mistranslated special characters. Some script lines may be longer than the browser window; stretch the display to be sure.
Inline comments are indicated by two hyphens:
-- This is an inline comment
Block comments are wrapped in parentheses and asterisks:
(* This is a block comment.
*)

The code:

--Declare variable scope
global thisRecord, newRecord
global theDate, thisDate, thisMonth, thisDay, thisYear

(*
Demo script to parse a collection of records and reformat for text report
This sample assumes a set of records with at least 10 variable-length fields,
  delimited by colon ":"

The pertinent fields, by field name and field number in parentheses, are:
  City(9), State(10), Name(3-6), Address(7-8), Date(2), TransactionCode(1).
The Name field combines four fields:
  FirstName, Middle, LastName, Suffix (Sr./Jr./III, etc.)
Similarly, Address is Address1, Address2

Individual fields are parsed and combined into larger fields as appropriate. The demo
 uses all available fields, but this method could extract these 10 fields
 from a record with 35 fields (the actual script does just that)

The record starts as
Code:Date:FirstName:Middle:LastName:Suffix:Address1:Address2:City:State

The record is revised to
City, State, FirstName Middle LastName Suffix, Address1, Address2, Date. Code

Two sample records (Record 2 has an empty Address2, Field 8):
Sale C345:04/05/2005:John:w:Doe:Sr.:123 Primrose:Apt. 3:Anytown:Anystate:
Sale C678:04/05/2005:John:w:Smith:Sr.:P.O. Box 456::AnyCity:ST:

The actual full script, working on a 35-field record, parses dates in MM/DD/YYYY format
  into Month DD, YYYY format, and does search-and-replace on addresses to abbreviate
  St. and Ave.
*)

display dialog "Working ..." buttons {"•"} giving up after 1 default button 1

--Take control of the application
tell application "NewsEditPro IQue"
  --A "try ... on error ... end try" block captures errors
  try
  --Take control of the story in the frontmost window
  --Different applications will vary in the commands
  --for referring to window, story, text (word, paragraph, etc.)
  tell story 1
    repeat with i from 1 to count paragraphs
      if paragraph i is not "" then
        --create a blank new record
        set newRecord to ""
        --capture the text of each paragraph as the current record
        set thisRecord to paragraph i as text
        --preserve the default system delimiters
        set OldDelims to AppleScript's text item delimiters
        --set the system delimiters to match the field delimiters and fetch the items list
        set AppleScript's text item delimiters to ":"
        set theItems to text items of thisRecord
        --write particular items to the new record, with appropriate punctuation, formatting
        --(items come from theItems, so the record is split only once)
        --Write "City, ST, "
        set newRecord to newRecord & item 9 of theItems & ", " & item 10 of theItems & ", "
        --Add "FirstName Middle LastName Suffix, "
        set newRecord to newRecord & item 3 of theItems & " " & item 4 of theItems & " " & item 5 of theItems & " " & item 6 of theItems & ", "
        --Add "Address1, Address2, "
        set newRecord to newRecord & item 7 of theItems & ", " & item 8 of theItems & ", "
        --Add "Date. "
        set newRecord to newRecord & item 2 of theItems & ". "
        --Add "TransactionCode"
        set newRecord to newRecord & item 1 of theItems
        --Restore system delimiters
        set AppleScript's text item delimiters to OldDelims
        --Write new record back to paragraph (can write to new file instead)
        set paragraph i to newRecord & return
      end if
    end repeat
  end tell
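  on error
    --Restore system delimiters if the error interrupted a record
    --(the inner try guards against OldDelims not being set yet)
    try
      set AppleScript's text item delimiters to OldDelims
    end try
    display dialog "Paragraph " & i & " does not have enough data fields." & return with icon 2
  end try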
end tell

display dialog "Done" buttons {"•"} giving up after 1 default button 1
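
The block comment in the script mentions that the full version converts dates and abbreviates addresses. That code is not shown here, so what follows is only a sketch of how those two steps might look. The handler names formatDate and replaceText are hypothetical. It also assumes that the day loses its leading zero and that the abbreviation step means replacing words such as Street with St.

```applescript
-- Hypothetical helpers; names, month list and output style are assumptions.

-- Convert "MM/DD/YYYY" to "Month D, YYYY".
-- Returns the input unchanged if it cannot be parsed.
on formatDate(dateText)
	set monthNames to {"January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"}
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "/"
	set dateParts to text items of dateText
	set AppleScript's text item delimiters to oldDelims
	try
		set monthNumber to (item 1 of dateParts) as integer
		set dayNumber to (item 2 of dateParts) as integer
		return (item monthNumber of monthNames) & " " & dayNumber & ", " & (item 3 of dateParts)
	on error
		return dateText
	end try
end formatDate

-- Replace every occurrence of searchText in sourceText with replacementText
-- by splitting on one delimiter and rejoining with another.
on replaceText(sourceText, searchText, replacementText)
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to searchText
	set textParts to text items of sourceText
	set AppleScript's text item delimiters to replacementText
	set resultText to textParts as text
	set AppleScript's text item delimiters to oldDelims
	return resultText
end replaceText

set demoDate to formatDate("04/05/2005") -- "April 5, 2005"
set demoAddress to replaceText("123 Primrose Street", "Street", "St.") -- "123 Primrose St."
```

The snippet runs on its own in Script Editor. To use the handlers from inside the tell application block, call them as my formatDate(...) and my replaceText(...).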
