Lazy import of csv file in Clojure
I’ve been looking at Clojure now and then for a while… really wanna learn this stuff :) The other day I needed to import a very large CSV file… and it seemed like a good problem to try solve in Clojure.
I’m using Leiningen and started with the following project setup.
project.clj:
(defproject bbr "1.0.0"
:description "Import BBR data"
:dependencies [[org.clojure/clojure "1.3.0"]
[org.clojure/clojure-contrib "1.2.0"]
[org.clojure/java.jdbc "0.0.6"]
[mysql/mysql-connector-java "5.1.6"]
[clojure-csv "2.0.0-alpha2"]])
core.clj:
(ns bbr.core
(:require [clojure-csv.core :as csv]
[clojure.java.jdbc :as sql])
(:use [clojure.java.io :only [reader]]))
(def db {:classname "com.mysql.jdbc.Driver"
:subprotocol "mysql"
:subname "//localhost:3306/mydb"
:user "root"})
(defn parse-row [row]
(let [v (first (csv/parse-csv row :delimiter \;))]
(zipmap [:kvhx :road :no :floor :door :zip_code :city] v)))
(defn parse-file
[filename]
(with-open [file (reader filename :encoding "ISO-8859-1")]
(sql/with-connection db
(doseq [line (line-seq file)]
(let [record (parse-row line)]
(sql/update-or-insert-values "bbr" ["kvhx=?" (:kvhx record)] record))))))
The above should be able to import VERY large CSV files into the database because it reads one line from the file at the time.
Lazy import of csv file in Clojure
© 2012 by Jacob Emcken is licensed under CC BY-SA 4.0