Reading Excel using Ruby

There’re some libraries dealing with Excel, but they didn’t work in some conditions. So I’ve just written the code for this.

ExcelData uses win32ole module. It means this work only on Windows and Excel needs to be installed.
There’s only a public class method Load returns row array. Each row consists of hash with column name and value string pair.
Load method requires minimum three parameters. Excel filename(excelFilename), sheet name(worksheetName) and first header column name(firstHeaderColumnName). ExcelData finds first row with first header column name and read cell from there. If empty cell is found(both row and column), it stops reading the file.

require 'win32ole'

class ExcelData
public
  def	ExcelData.Load(excelFilename, worksheetName, firstHeaderColumnName, keySearcher = nil)
    xls       = nil
		ws        = nil
    excelData = []

    begin
      xls         = WIN32OLE.new('Excel.Application')
      xls.visible = false
      ws          = xls.Workbooks.Open(excelFilename).Worksheets(worksheetName)

      excelData   = _CollectData(ws, firstHeaderColumnName, keySearcher)
    rescue
      puts "Failed to open excel file : #{excelFilename}"
    ensure
      xls.quit if xls != nil
    end

    return excelData
  end

private
  def	ExcelData._FindFirstDataRowNum(worksheet, firstHeaderColumnName)
    rowNum  = 1

    while (true)
      row = worksheet.Range("a#{rowNum}")
      break if (row['Value'] != nil && row['Value'] == firstHeaderColumnName)
      rowNum  += 1
    end

    return rowNum + 1
  end
	
  def	ExcelData._FindLastDataRowNum(worksheet, firstDataRowNum)
    rowNum  = firstDataRowNum

    while (true)
      row = worksheet.Range("a#{rowNum}")
      break if (row['Value'] == nil || row['Value'] == "")
      rowNum  += 1
    end

    return rowNum - 1
  end

  def	ExcelData._FindLastDataColChar(worksheet, headerDataRowNum)
    colChar	= 'a'

    while (true)
      row = worksheet.Range("#{colChar}#{headerDataRowNum}")
      break if (colChar == 'z' || row['Value'] == nil || row['Value'] == "")
      colChar.succ!
    end

    return colChar
  end

  def ExcelData._CollectData(worksheet, firstHeaderColumnName, keySearcher)

    firstDataRowNum = _FindFirstDataRowNum(worksheet, firstHeaderColumnName)		
    lastDataRowNum  = _FindLastDataRowNum(worksheet, firstDataRowNum)
    lastDataColChar = _FindLastDataColChar(worksheet, firstDataRowNum - 1)

    puts "Collecting excel data : Total #{lastDataRowNum - firstDataRowNum + 1} rows ..."

    # Column Names
    colNames	= []
    row = worksheet.Range("a#{firstDataRowNum - 1}:#{lastDataColChar}#{firstDataRowNum - 1}")
    row.each do |cell| colNames << cell['Value'] end

    # Collect excel data
    excelData	= []
    for rowNum in firstDataRowNum..lastDataRowNum
      colIndex  = 0
      rowHash   = {}

      row = worksheet.Range("a#{rowNum}:#{lastDataColChar}#{rowNum}")

      row.each do |cell|
        if (!cell['Value'].to_s.empty? && cell['Value'].to_s =~ /^[0-9.]+/)
          rowHash.store(colNames[colIndex], cell['Value'].to_i.to_s)
        elsif
          rowHash.store(colNames[colIndex], cell['Value'].to_s)
        end

        colIndex	+= 1
      end

      excelData << rowHash

      keySearcher.store(rowHash[colNames[0]].to_s, excelData.length - 1) if keySearcher != nil

      print "." if excelData.length % 10 == 0
    end

    puts 

    return excelData
  end
end

You can simply use this code like bellow.

xls = ExcelData.Load('myfile.xls', 'sheet1', 'ITEMID')

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.