#7831: Reading rows from excel will skip blank leading cells

NielsGron reported 2012-10-20T18:33:44Z · last modified 2012-10-30T20:14:39Z

Reading rows from excel will skip blank leading cells

Dev	funfun
QA	NielsGron

Priority	Minor
Complexity	Unknown

Component	App - FluidShell - Command
Version	12.0

Reading rows from excel will skip blank leading cells.

2 attachments

Excel Test

2012-10-20T18:35:30Z

10 KB
Excel file containing blank rows

2012-10-20T21:50:50Z

8 KB

All Comments (6) Change History

NielsGron 2012-10-20T18:35:01Z · (edited)

I've checked in a fix as SVN #29858 for ExcelTransformer.writeWorksheetToCSV(). Fung just needs to code review.

NielsGron 2012-10-20T18:36:11Z

You can use attached file with this command :

$:1> excel2csv -c 'utf-8' -d '|' -l 'unix' -DQ 'square-bracket' -ED false -s Excel-Test.xls
"ID"|"First Name"|"Last Name"|"Country"|"Age"|"Salary"|"Hourly"|"Hire Date"
1|"Bob"|"送り仮名"|"USA"|20|1000|1.23|[20120101 00:00:00.000]
2|"John"|"L2"|"Germany"|21|2000|2.45|[20120501 00:00:00.000]
3|"Lance"|"L3"|"France"|25|3000|5.786|[20121202 00:00:00.000]
4|"Gangas"|"L4"|""|30|4000|5.44444|[20000120 00:00:00.000]
5|"Hilary"|"L5"|"Canada"|35|5000|3.21111|[19970315 00:00:00.000]
6|"Mark"|"L6"|"South Africa"|40|60000|100.11|[19910707 00:00:00.000]
7|"Derek"|"L7"|"Brazil"|45|100000|500|[20000405 00:00:00.000]
8|"Jo"|"L8"|"Chile"|50|500000|50.01|[19800808 00:00:00.000]
9|"Wane"|"送り仮名"|"Mexico"|55|1000000|1123.45|[20200209 00:00:00.000]
""|"Derek"|"L7"|"Brazil"|45|100000|500|""
""|""|"L7"|"Brazil"|45|100000|""|""
12|"Derek"|"L7"|"Brazil"|45|""|500|[20000405 00:00:00.000]
13|"Derek"|"L7"|""|45|100000|500|[20000405 00:00:00.000]

funfun 2012-10-20T22:06:36Z · (edited)

I did some test and looked into Aspose APIs. It seems to me Aspose's Iterator APIs always skip blank cells which implies if you have a blank row setting between 2 non-blank rows, then that blank row won't be included in the row iterator. If you agree the blank row described above should be included in the output then your change did not address this problem.

Attached Excel file, test-7832.xlsx, would demonstrate the 'blank-row' problem. If Case (2) below is the expected output, then we need to use other Aspose APIs other than Iterator to locate the cell that we are interested. I think we can replace the outer row loop with something like:
   int maxDataRow = wksheet.getCells().getMaxDataRow(); // maxDataRow is an index, zero based
   for( int i=0; i<=maxDataRow; i++ ) {
      // Row row = (Row)rowIterator.next();
      Row row = rows.get(i);
      ....
   }
This is how I generate Case (2) output below.

Case (1) - using Apsose Iterator APIs:
--------------------------------------
> excel2csv test-7832.xlsx -s -SQ none
ID,First Name,Last Name,Country,Age,Salary,Hourly,Hire Date
1,Bob,????,USA,20,1000,1.23,[20120101 00:00:00.000]
ID
First Name,,,,,,
,Last Name
,,Country
,,,Age
,,,,Salary
,,,,,Hourly
,,,,,,Hire Date
,,,,,,
ID,,,Age
First Name,,,,Salary
,Last Name,,,,Hourly
,,Country,,,,Hire Date
ID,Last Name,,Age
First Name,,Country,,Salary
,Last Name,,Age,,Hourly
,,Country,,Salary,,Hire Date

Case (2) - should this be the expected result?
----------------------------------------------
,,,,,,,
ID,First Name,Last Name,Country,Age,Salary,Hourly,Hire Date
1,Bob,????,USA,20,1000,1.23,[20120101 00:00:00.000]
,,,,,,,
ID,,,,,,,
,First Name,,,,,,
,,Last Name,,,,,
,,,Country,,,,
,,,,Age,,,
,,,,,Salary,,
,,,,,,Hourly,
,,,,,,,Hire Date
,,,,,,,
ID,,,,Age,,,
,First Name,,,,Salary,,
,,Last Name,,,,Hourly,
,,,Country,,,,Hire Date
,,,,,,,
,,,,,,,
ID,,Last Name,,Age,,,
,,,,,,,
,First Name,,Country,,Salary,,
,,,,,,,
,,Last Name,,Age,,Hourly,
,,,,,,,
,,,Country,,Salary,,Hire Date

funfun 2012-10-23T03:41:54Z

One thing I forgot to mention is SVN #29858 cannot handle the Excel file similar to the one I attached last time (test-7832.xlsx); SVN #29858 would generate rows with different number of columns which will cause csv2json and csv2xml throw exception.

We probably should replace Iterator and #29858 with other Aspose APIs:
wksheet.getCells().getMaxDataColumn();
Cell cell = Row.get(columnIndex);

funfun 2012-10-25T04:30:42Z

> SVN #29858 would generate rows with different number of columns which will cause csv2json and csv2xml throw exception.

Fixed, all of generated CSV rows should have the same number of columns, SVN r29915. Everything else remains unchanged.

tariqrahiman 2012-10-30T20:14:39Z

Verified in 12.0.3-7

Search Tips

Aqua Data Studio / nhilam

Reading rows from excel will skip blank leading cells

2 attachments

Issue #7831

Completion

Search Tips

Aqua Data Studio / nhilam

Title

Reading rows from excel will skip blank leading cells

2 attachments

Issue #7831

Completion