<?xml version="1.0" encoding="UTF-8"?>
<feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
  <title>PaulBarry.com - Importing Data with Rails</title>
  <subtitle type="html">My thoughts, ideas, questions and concerns on technology, sports, music and life</subtitle>
  <id>tag:paulbarry.com,2007:Paulbarry.com</id>
  <generator uri="http://www.paulbarry.com" version="3.0">PaulBarry.com</generator>
  <link href="http://paulbarry.com/xml/atom/article/4888/feed.xml" rel="self" type="application/atom+xml"/>
  <link href="http://paulbarry.com/articles/2008/04/19/importing-data-with-rails" rel="alternate" type="text/html"/>

  <updated>2008-11-06T08:47:01-05:00</updated>
  <entry>
    <author>
      <name>Paul Barry</name>
      <email>mail@paulbarry.com</email>
    </author>
    <id>urn:uuid:d75d6d6a-f6e2-4cea-b5e4-714499506770</id>

    <published>2008-04-19T08:54:22-04:00</published>
    <updated>2008-04-19T08:54:22-04:00</updated>
    <title type="html">Importing Data with Rails</title>
    <link href="http://paulbarry.com/articles/2008/04/19/importing-data-with-rails" rel="alternate" type="text/html"/>

    <category term="technology" scheme="http://paulbarry.com/articles/category/technology" label="Technology"/>
    <category term="Rails" scheme="http://paulbarry.com/articles/tag/rails"/>
    <category term="Ruby" scheme="http://paulbarry.com/articles/tag/ruby"/>
        <summary type="html">&lt;p&gt;Often when working with Rails applications, you need to import data from other sources.  A common source is an excel spreadsheet.  A simple import consists of reading each line in the spreadsheet and creating a record in the database for each line.  You could do this is a small Ruby script with SQL, you wouldn&apos;t need Rails.  But sometimes the import is more complicated.  For example, you may want to run your application validation logic on each record.  Also, maybe you need to create associated record for each row.&lt;/p&gt;

&lt;p&gt;To handle this kind of thing, it can be helpful to use your ActiveRecord data model.  To do that, you can simply create a Ruby script and add these few lines at the top:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;boot&quot;)
require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;environment&quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This will boot up the Rails environment when your script starts, and then you have full access to your Rails models.  You could write a procedural script to handle that, but I&apos;ve found that creating an object-oriented class gives you a little bit cleaner, more re-usable framework.  So let&apos;s just get right to the code.  Here is the code for a base class for your data import:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;class DataImport

  attr_reader :file, :fields, :row_map, :default_e

  #Create DSL methods for subclasses
  class &amp;lt;&amp;lt; self
    def default_environment(env)
      self.send(:define_method, :default_environment) do
        env
      end
    end
    def default_file(file_name)
      self.send(:define_method, :default_file) do
        file_name
      end
    end
  end

  def initialize(env, file)
    load_rails(env || respond_to?(:default_environment) ? 
      default_environment : &quot;development&quot;)
    @file = file || default_file
    raise &quot;You must specify a file&quot; unless @file
  end

  def self.run(env, file)
    new(file, env).run
  end

  def run
    open(file).each_with_index do |line, i|
      initialize_row!(line, i)
    end
  end

  def initialize_row!(line, i)
    tokenize_row!(line)
    if i &amp;lt; 1
      initialize_fields!
    else
      initialize_row_map!
      process_row
    end
  end

  def process_row
    puts row_map.inspect
  end

  private

    def tokenize_row!(line)
      @row = line.split(&apos;|&apos;)
    end

    def initialize_fields!
      @fields = @row.map{|e| e.chomp.to_sym}
    end

    def initialize_row_map!
      @row_map = {}
      @row.each_with_index do |c, i|
        @row_map[fields[i]] = c.blank? ? nil : c.strip
      end
    end

    def load_rails(env)
      ENV[&apos;RAILS_ENV&apos;] = env
      require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;boot&quot;)
      require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;environment&quot;)        
    end

end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Alright, that&apos;s a pretty big chunk of code, but this is the implementation of a base class that you will reuse.  Don&apos;t worry, your actual import class will be much shorter.  In other words, you can copy and paste this right into your app and use it as is, but if you are interested to find out how it works, read the next few paragraphs.&lt;/p&gt;

&lt;p&gt;So the first interesting thing you&apos;ll encounter in this code is the DSL-ish methods.  To understand how this works, you really need to read Why The Lucky Stiff&apos;s &lt;a href=&quot;http://whytheluckystiff.net/articles/seeingMetaclassesClearly.html&quot;&gt;Seeing Metaclasses Clearly&lt;/a&gt;.  The &lt;a href=&quot;http://paulbarry.com/articles/2008/04/17/the-rules-of-ruby-self&quot;&gt;talk Dave Thomas gave just the other day at the NovaRUG&lt;/a&gt; would help too.  But basically what it does is define 2 class methods that are intended to be used by subclasses during class definition.  When called, they will define methods that the base class can then use.  This the &lt;a href=&quot;http://paulbarry.com/articles/2008/04/17/calling-methods-during-class-definition&quot;&gt;concept I blogged about the other day in action&lt;/a&gt;.  They are conceptually the same thing as the definition of the &lt;code&gt;belongs_to&lt;/code&gt; and &lt;code&gt;has_many&lt;/code&gt; methods in ActiveRecord.  It will make more sense when you see an implementation.&lt;/p&gt;

&lt;p&gt;Next up is the constructor which handles setting the file instance variable for our data import class, as well as loading up rails with the right environment specified.  After that are class and instance methods both called &lt;code&gt;run&lt;/code&gt;.  The idea here is that we want to work with an instance of the data import class, but it will be convenient to just call &lt;code&gt;OurDataImport.run&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The work happens in the &lt;code&gt;run&lt;/code&gt; instance method.  This opens up the file and starts processing it line by line.  In this method I&apos;m trying to employ a technique, or more of a style I guess, that Marcel Molina spoke about at the &lt;a href=&quot;http://www.dcrug.org/2008/3/20/march-26-2008-meeting&quot;&gt;DC Ruby Users Group&lt;/a&gt;.  The idea is that you should strive to as much as possible have all of the code within a method be at the same level of abstraction.  If you look at this whole method:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def initialize_row!(line, i)
  tokenize_row!(line)
  if i &amp;lt; 1
    initialize_fields!
  else
    initialize_row_map!
    process_row
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It&apos;s easy to read it and understand what it is going to do.  First we are going to tokenize the row, then if it is the first row, we will initialize the fields, otherwise, we will initialize the row map and process the row.  For example, this method could be written like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def initialize_row!(line, i)
  tokenize_row!(line)
  if i &amp;lt; 1
    initialize_fields!
  else
    @row_map = {}
    @row.each_with_index do |c, i|
      @row_map[fields[i]] = c.blank? ? nil : c.strip
    end
    process_row
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But there is an abstraction-level switching that you have to go through mentally once you get to the first line after the else.  The rest of the method is composed of intent-revealing methods, but then we just have this lower-level chunk of code that deals with setting instance variables.  So don&apos;t do that, the other implementation is cleaner, leads to code that is composed well and is easier to test and extend.&lt;/p&gt;

&lt;p&gt;So the meat of what happens here is that the run method reads in the file row by row.  It assumes the data will be pipe-separated (that is, records separated with the &quot;|&quot; character), because I find that to be easiest to parse.  It&apos;s trival to convert an excel spreadsheet to a pipe-separated text file using OpenOffice.  If your data is not pipe-separated, you could override &lt;code&gt;tokenize_row&lt;/code&gt; to split up the row some other way.  It assumes the first row contains the field names that each column will map to, so if we are on the first row, it just stores away the field names.  Then, on each subsequent row it constructs a map (a.k.a hash) containing the column name and values.  Then it calls the &lt;code&gt;process_row&lt;/code&gt; method.  The implementation of the &lt;code&gt;process_row&lt;/code&gt; doesn&apos;t do anything interesting in this base class because the intent is for you to override that in your subclass.&lt;/p&gt;

&lt;p&gt;Ok, so now let&apos;s put this to use.  Create a rails app with a simple user model:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ rails myapp
$ cd myapp
$ script/generate model user name:string email:string
$ rake db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now copy the whole base DataImport class from above into &lt;code&gt;db/data/data_import.rb&lt;/code&gt;.  Then create a data file at &lt;code&gt;db/data/users.txt&lt;/code&gt; with something like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;name|email
Paul Barry|mail@paulbarry.com
Someone Else|someone_else@example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And then finally we&apos;ll create an implementation of our data import at &lt;code&gt;db/data/user_data_import.rb&lt;/code&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require &apos;data_import&apos;
class UserDataImport &amp;lt; DataImport

  default_file &quot;users.txt&quot;

  def process_row
    user = User.create!(row_map)
    puts &quot;Created =&amp;gt; #{user.inspect}&quot;
  end

end

UserDataImport.run(ARGV[0], ARGV[1])
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So now we have a pretty clear, concise file that explains what we are doing.  You can see the call to &lt;code&gt;default_file&lt;/code&gt; that allows us to set our default file name using a clean, DSL-ish syntax.  We could also call &lt;code&gt;default_environment&lt;/code&gt; there as well if we wanted to, but we don&apos;t have to.  This is a very simple import where we just create a user for each row.  The last line of the script runs the import, passing in the command line arguments.  If you pass no arguments, it will work, using &quot;development&quot; for the environment and &quot;users.txt&quot; for the file name.  A real data import is likely to do some more interesting work with the data, but at least this gets all the plumbing of processing the data file out of the way for you and allows you to focus on the logic of what you need to do with the data.  All that&apos;s left to do is simply run the &lt;code&gt;db/data/user_data_import.rb&lt;/code&gt; script.&lt;/p&gt;

&lt;p&gt;Sidenote: I&apos;ve found that if your want to run the script from textmate, you need to add this line top of your script, due to a conflict in the ruby libraries provides with TextMate.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$:.reject! { |e| e.include? &apos;TextMate&apos; }
&lt;/code&gt;&lt;/pre&gt;
</summary>
    <content type="html">&lt;p&gt;Often when working with Rails applications, you need to import data from other sources.  A common source is an excel spreadsheet.  A simple import consists of reading each line in the spreadsheet and creating a record in the database for each line.  You could do this is a small Ruby script with SQL, you wouldn&apos;t need Rails.  But sometimes the import is more complicated.  For example, you may want to run your application validation logic on each record.  Also, maybe you need to create associated record for each row.&lt;/p&gt;

&lt;p&gt;To handle this kind of thing, it can be helpful to use your ActiveRecord data model.  To do that, you can simply create a Ruby script and add these few lines at the top:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;boot&quot;)
require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;environment&quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This will boot up the Rails environment when your script starts, and then you have full access to your Rails models.  You could write a procedural script to handle the import, but I&apos;ve found that creating an object-oriented class gives you a slightly cleaner, more reusable framework.  So let&apos;s just get right to the code.  Here is the code for a base class for your data import:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;class DataImport

  attr_reader :file, :fields, :row_map, :default_e

  #Create DSL methods for subclasses
  class &amp;lt;&amp;lt; self
    def default_environment(env)
      self.send(:define_method, :default_environment) do
        env
      end
    end
    def default_file(file_name)
      self.send(:define_method, :default_file) do
        file_name
      end
    end
  end

  def initialize(env, file)
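    # An explicit env wins, then the subclass default, then &quot;development&quot;
    load_rails(env || (respond_to?(:default_environment) ?
      default_environment : &quot;development&quot;))
    # Fall back to the subclass default only if one was declared
    @file = file || (respond_to?(:default_file) ? default_file : nil)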
    raise &quot;You must specify a file&quot; unless @file
  end

  def self.run(env, file)
    # Pass the arguments in the order initialize expects
    new(env, file).run
  end

  def run
    # The block form closes the file once every line has been processed
    File.open(file) do |f|
      f.each_with_index do |line, i|
        initialize_row!(line, i)
      end
    end
  end

  def initialize_row!(line, i)
    tokenize_row!(line)
    if i &amp;lt; 1
      initialize_fields!
    else
      initialize_row_map!
      process_row
    end
  end

  def process_row
    puts row_map.inspect
  end

  private

    def tokenize_row!(line)
      @row = line.split(&apos;|&apos;)
    end

    def initialize_fields!
      @fields = @row.map{|e| e.chomp.to_sym}
    end

    def initialize_row_map!
      @row_map = {}
      @row.each_with_index do |c, i|
        @row_map[fields[i]] = c.blank? ? nil : c.strip
      end
    end

    def load_rails(env)
      ENV[&apos;RAILS_ENV&apos;] = env
      require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;boot&quot;)
      require File.join(File.dirname(__FILE__), &quot;..&quot;, &quot;..&quot;, &quot;config&quot;, &quot;environment&quot;)        
    end

end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Alright, that&apos;s a pretty big chunk of code, but this is the implementation of a base class that you will reuse.  Don&apos;t worry, your actual import class will be much shorter.  You can copy and paste this right into your app and use it as is, but if you are interested in how it works, read the next few paragraphs.&lt;/p&gt;

&lt;p&gt;So the first interesting thing you&apos;ll encounter in this code is the DSL-ish methods.  To understand how they work, you really need to read Why The Lucky Stiff&apos;s &lt;a href=&quot;http://whytheluckystiff.net/articles/seeingMetaclassesClearly.html&quot;&gt;Seeing Metaclasses Clearly&lt;/a&gt;.  The &lt;a href=&quot;http://paulbarry.com/articles/2008/04/17/the-rules-of-ruby-self&quot;&gt;talk Dave Thomas gave just the other day at the NovaRUG&lt;/a&gt; would help too.  But basically, the &lt;code&gt;class &amp;lt;&amp;lt; self&lt;/code&gt; block defines two class methods, &lt;code&gt;default_environment&lt;/code&gt; and &lt;code&gt;default_file&lt;/code&gt;, that are intended to be called by subclasses during class definition.  When called, they define instance methods of the same name that the base class can then use.  This is the &lt;a href=&quot;http://paulbarry.com/articles/2008/04/17/calling-methods-during-class-definition&quot;&gt;concept I blogged about the other day in action&lt;/a&gt;.  They are conceptually the same kind of thing as the &lt;code&gt;belongs_to&lt;/code&gt; and &lt;code&gt;has_many&lt;/code&gt; methods in ActiveRecord.  It will make more sense when you see an implementation.&lt;/p&gt;
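
&lt;p&gt;To make the idea concrete before getting to the real subclass, here is a minimal, Rails-free sketch of the same pattern.  The names &lt;code&gt;Base&lt;/code&gt;, &lt;code&gt;Report&lt;/code&gt; and &lt;code&gt;default_name&lt;/code&gt; are made up for illustration; they are not part of the import code above.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;class Base
  # Class-level &quot;macro&quot;: calling it inside a subclass body defines
  # an instance method of the same name on that subclass.
  class &amp;lt;&amp;lt; self
    def default_name(value)
      send(:define_method, :default_name) { value }
    end
  end

  # The base class only uses the method if a subclass declared it.
  def name
    respond_to?(:default_name) ? default_name : &quot;unnamed&quot;
  end
end

# Hypothetical subclass that declares its default during class definition
class Report &amp;lt; Base
  default_name &quot;weekly&quot;
end

puts Report.new.name  # weekly
puts Base.new.name    # unnamed
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;respond_to?&lt;/code&gt; check is what lets the macro stay optional: a subclass that never calls it simply gets the fallback.&lt;/p&gt;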

&lt;p&gt;Next up is the constructor, which sets the file instance variable for our data import class and loads Rails with the right environment.  After that are a class method and an instance method, both called &lt;code&gt;run&lt;/code&gt;.  The idea here is that we want to work with an instance of the data import class, but it is convenient to just call &lt;code&gt;OurDataImport.run&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The work happens in the &lt;code&gt;run&lt;/code&gt; instance method, which opens the file and hands each line to &lt;code&gt;initialize_row!&lt;/code&gt;.  In that method I&apos;m trying to employ a technique, or more of a style I guess, that Marcel Molina spoke about at the &lt;a href=&quot;http://www.dcrug.org/2008/3/20/march-26-2008-meeting&quot;&gt;DC Ruby Users Group&lt;/a&gt;.  The idea is that you should strive, as much as possible, to keep all of the code within a method at the same level of abstraction.  If you look at this whole method:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def initialize_row!(line, i)
  tokenize_row!(line)
  if i &amp;lt; 1
    initialize_fields!
  else
    initialize_row_map!
    process_row
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It&apos;s easy to read it and understand what it is going to do.  First we tokenize the row; then, if it is the first row, we initialize the fields; otherwise, we initialize the row map and process the row.  By contrast, this method could have been written like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def initialize_row!(line, i)
  tokenize_row!(line)
  if i &amp;lt; 1
    initialize_fields!
  else
    @row_map = {}
    @row.each_with_index do |c, i|
      @row_map[fields[i]] = c.blank? ? nil : c.strip
    end
    process_row
  end
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But there is an abstraction-level switch that you have to go through mentally once you get to the first line after the else.  The rest of the method is composed of intent-revealing methods, but then we just have this lower-level chunk of code that deals with setting instance variables.  So don&apos;t do that.  The other implementation is cleaner, and it leads to code that is well composed and easier to test and extend.&lt;/p&gt;

&lt;p&gt;So the meat of what happens here is that the run method reads in the file row by row.  It assumes the data will be pipe-separated (that is, fields separated with the &quot;|&quot; character), because I find that to be easiest to parse.  It&apos;s trivial to convert an Excel spreadsheet to a pipe-separated text file using OpenOffice.  If your data is not pipe-separated, you could override &lt;code&gt;tokenize_row!&lt;/code&gt; to split up the row some other way.  It assumes the first row contains the field names that each column will map to, so if we are on the first row, it just stores away the field names.  Then, on each subsequent row, it constructs a map (a.k.a. a hash) from field name to value, with blank values stored as nil, and calls the &lt;code&gt;process_row&lt;/code&gt; method.  The base-class implementation of &lt;code&gt;process_row&lt;/code&gt; just prints the row map, because the intent is for you to override it in your subclass.&lt;/p&gt;
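
&lt;p&gt;As a rough illustration of those two steps outside the class, here is a plain-Ruby sketch that turns a header line and a data line into the kind of hash &lt;code&gt;process_row&lt;/code&gt; sees.  The &lt;code&gt;row_to_map&lt;/code&gt; helper and its &lt;code&gt;separator&lt;/code&gt; argument are hypothetical, and &lt;code&gt;strip.empty?&lt;/code&gt; stands in for ActiveSupport&apos;s &lt;code&gt;blank?&lt;/code&gt; so that it runs without Rails.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Hypothetical helper, not part of DataImport: maps one data line onto
# the field names taken from the header line.
def row_to_map(header, line, separator = &quot;|&quot;)
  fields = header.chomp.split(separator).map { |f| f.strip.to_sym }
  # The -1 limit keeps trailing empty fields instead of dropping them
  values = line.chomp.split(separator, -1)
  map = {}
  fields.each_with_index do |field, i|
    value = values[i].to_s.strip
    map[field] = value.empty? ? nil : value
  end
  map
end

# Pipe-separated, as in the article
p row_to_map(&quot;name|email\n&quot;, &quot;Paul Barry|mail@paulbarry.com\n&quot;)
# Tab-separated, with an empty email column
p row_to_map(&quot;name\temail\n&quot;, &quot;Someone Else\t\n&quot;, &quot;\t&quot;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The second call shows both variations at once: a different separator, and an empty column that ends up as nil rather than an empty string.&lt;/p&gt;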

&lt;p&gt;Ok, so now let&apos;s put this to use.  Create a Rails app with a simple user model:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ rails myapp
$ cd myapp
$ script/generate model user name:string email:string
$ rake db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now copy the whole base DataImport class from above into &lt;code&gt;db/data/data_import.rb&lt;/code&gt;.  Then create a data file at &lt;code&gt;db/data/users.txt&lt;/code&gt; with something like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;name|email
Paul Barry|mail@paulbarry.com
Someone Else|someone_else@example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And then finally we&apos;ll create an implementation of our data import at &lt;code&gt;db/data/user_data_import.rb&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;require File.join(File.dirname(__FILE__), &quot;data_import&quot;)
class UserDataImport &amp;lt; DataImport

  default_file &quot;users.txt&quot;

  def process_row
    user = User.create!(row_map)
    puts &quot;Created =&amp;gt; #{user.inspect}&quot;
  end

end

UserDataImport.run(ARGV[0], ARGV[1])
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So now we have a pretty clear, concise file that explains what we are doing.  You can see the call to &lt;code&gt;default_file&lt;/code&gt; that allows us to set our default file name using a clean, DSL-ish syntax.  We could also call &lt;code&gt;default_environment&lt;/code&gt; there if we wanted to, but we don&apos;t have to.  This is a very simple import where we just create a user for each row.  The last line of the script runs the import, passing in the command line arguments (the environment first, then the file name).  If you pass no arguments, it will work, using &quot;development&quot; for the environment and &quot;users.txt&quot; for the file name.  A real data import is likely to do some more interesting work with the data, but at least this gets all the plumbing of processing the data file out of the way for you and allows you to focus on the logic of what you need to do with the data.  All that&apos;s left to do is run the &lt;code&gt;db/data/user_data_import.rb&lt;/code&gt; script.  Run it from the &lt;code&gt;db/data&lt;/code&gt; directory (for example, &lt;code&gt;ruby user_data_import.rb&lt;/code&gt;), because the default file name &quot;users.txt&quot; is resolved relative to the current directory.&lt;/p&gt;

&lt;p&gt;Sidenote: I&apos;ve found that if you want to run the script from TextMate, you need to add this line at the top of your script, due to a conflict with the Ruby libraries provided with TextMate.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$:.reject! { |e| e.include? &apos;TextMate&apos; }
&lt;/code&gt;&lt;/pre&gt;
</content>
  </entry>
  <entry>
    <author>
      <name>James Edward Gray II</name>
    </author>
    <id>urn:uuid:054caf2b-fd57-474d-8de4-47ee0d065017</id>
    <published>2008-05-15T09:09:59-04:00</published>
    <updated>2008-05-15T09:09:59-04:00</updated>
    <title type="html">Comment on "Importing Data with Rails" by James Edward Gray II</title>
    <link href="http://paulbarry.com/articles/2008/04/19/importing-data-with-rails#comment-5137" rel="alternate" type="text/html"/>
    <content type="html">I write import scripts a lot for Rails, but I do it very differently.  First, I define them as Rake tasks.&lt;br/&gt;&lt;br/&gt;I usually just create a data.rake file in lib/tasks.  Any task defined this way can load the whole Rails environment by adding a dependency on the environment task Rails ships with (just add =&amp;gt; :environment after your task name).  You can pass data into your tasks, which file to import for example, using environment variables.&lt;br/&gt;&lt;br/&gt;Finally, I would just use FasterCSV (or the standard CSV library) to handle the parsing.  It has all of the features demoed here and much more.&lt;br/&gt;&lt;br/&gt;In short, I feel this is just a little too much boilerplate code.  You are going through too much effort to recreate tools Rails ships with or that are easily downloaded.&lt;br/&gt;&lt;br/&gt;Also, the environment.rb file included with Rails loads boot.rb first thing, so that step is unneeded.  Have a peek at the code.</content>
  </entry>
  <entry>
    <author>
      <name>Paul Barry</name>
    </author>
    <id>urn:uuid:ff9b659c-f10c-41ed-9fc3-9e93e3c1f306</id>
    <published>2008-05-15T10:34:39-04:00</published>
    <updated>2008-05-15T10:34:39-04:00</updated>
    <title type="html">Comment on "Importing Data with Rails" by Paul Barry</title>
    <link href="http://paulbarry.com/articles/2008/04/19/importing-data-with-rails#comment-5140" rel="alternate" type="text/html"/>
    <content type="html">@James&lt;br/&gt;&lt;br/&gt;Thanks for the feedback.  Rake tasks might be a better way to go, I&apos;ll have to try that.  I like using the pipe-separated, because it doesn&apos;t require any library at all and I kind of just got used to doing in other languages, like Perl and Java.  It seems pragmatic to me because doing line.split(&amp;quot;|&amp;quot;) is so trivial and you can do that in any language without relying on a library that may or may not be installed.  There might be some cases where FasterCSV would be a good option, but using pipe-separated works well in simple cases for me.  &lt;br/&gt;&lt;br/&gt;But as far as the boilerplate code, you are right, less code is better code, so I&apos;ll look into ways of eliminating the unnecessary complication.</content>
  </entry>
  <entry>
    <author>
      <name>Alderete</name>
    </author>
    <id>urn:uuid:4b597c2d-531c-41d6-bcd8-1fdce145a230</id>
    <published>2008-05-25T23:52:42-04:00</published>
    <updated>2008-05-25T23:52:42-04:00</updated>
    <title type="html">Comment on "Importing Data with Rails" by Alderete</title>
    <link href="http://paulbarry.com/articles/2008/04/19/importing-data-with-rails#comment-5143" rel="alternate" type="text/html"/>
    <content type="html">@James: For some import routines, a CSV import is going to be fine. But there are lots of cases where a simple CSV import isn&apos;t going to work, and in those cases, having a more controllable method for doing imports, like this one, will be quite useful.&lt;br/&gt;&lt;br/&gt;For example, I am importing a data file provided by a vendor. One line per record is what they provide, but I need to break out the data into multiple associated records. E.g., each line in the vendor-provided file is a catalog item, but I need item, manufacturer, and individual sku (think variation or edition) records. Worse, sometimes there is more than one manufacturer, all in the same field, separated by &amp;quot; and &amp;quot; when there are two, and separated by &amp;quot;, &amp;quot; when there are more than two. &lt;br/&gt;&lt;br/&gt;In other words, sometimes you get clean data that can cleanly import. Other times, you gotta massage the data before you can create records for it. In that case, this method is going to work, and the CSV libraries are going to cry like babies.</content>
  </entry>
  </feed>