Reading and Writing CSV Files - Apache POI Tutorial

Apache POI is a popular Java library for working with Microsoft Office documents. It has no CSV parser or writer of its own, but its workbook model combines easily with standard Java I/O, so you can load CSV (Comma-Separated Values) data into a workbook and export workbook data back to CSV. CSV files are commonly used for storing tabular data in a plain text format. This tutorial will guide you through both directions.

Example Code

Let's consider an example that demonstrates how to read and write CSV files using Apache POI:


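import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.*;

import java.io.*;
import java.util.*;

public class CSVFileExample {
  public static void main(String[] args) {
    // try-with-resources closes both the reader and the workbook
    try (BufferedReader br = new BufferedReader(new FileReader("input.csv"));
         Workbook workbook = new XSSFWorkbook()) {
      // Reading CSV File: one sheet row per line, one cell per value
      Sheet sheet = workbook.createSheet("Sheet1");
      String line;

      int rowNumber = 0;
      while ((line = br.readLine()) != null) {
        Row row = sheet.createRow(rowNumber++);
        // Naive split: does not handle quoted fields; -1 keeps trailing empty values
        String[] data = line.split(",", -1);
        int columnNumber = 0;

        for (String value : data) {
          Cell cell = row.createCell(columnNumber++);
          cell.setCellValue(value);
        }
      }

      // Writing CSV File: workbook.write() would emit binary .xlsx data,
      // so format each cell as text and join the values with commas instead
      DataFormatter formatter = new DataFormatter();
      try (BufferedWriter bw = new BufferedWriter(new FileWriter("output.csv"))) {
        for (Row row : sheet) {
          List<String> values = new ArrayList<>();
          for (Cell cell : row) {
            values.add(formatter.formatCellValue(cell));
          }
          bw.write(String.join(",", values));
          bw.newLine();
        }
      }
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

In this example, we read data from an existing CSV file named "input.csv" into a new Excel workbook created with Apache POI. Each line of the CSV file becomes a row in the workbook, and the values are split on commas. We then write the workbook's contents to a new CSV file named "output.csv" by formatting each cell as text with DataFormatter and joining the values with commas. Calling workbook.write() on that file would not work: it produces binary .xlsx data regardless of the file name.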

Steps to Read and Write CSV Files

Follow these steps to read and write CSV files using Apache POI:

  1. Import the necessary Apache POI classes and packages for working with Excel files.
  2. Create an instance of the Workbook class, which represents the Excel workbook.
  3. Create a Sheet object within the workbook to store the data.
  4. Read the CSV file line by line.
  5. Split each line into an array of values using the appropriate delimiter.
  6. Create a new Row object within the sheet for each line of the CSV file.
  7. Create Cell objects within the row and set the cell values based on the values in the CSV file.
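  8. Write the workbook data to a new CSV file by iterating over the rows and cells, converting each cell to text, and joining the values with the delimiter. Do not use workbook.write() for this; it always produces Excel-format output.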
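
For step 8, each row's values have to be joined into one line of text, and any value that contains the delimiter, a double quote, or a line break has to be quoted so it can be read back correctly. The following is a minimal sketch of that joining step using only the Java standard library. The class name CsvRowWriter and its methods are hypothetical helpers for illustration, not part of Apache POI, and the quoting rules assumed here are the common convention of wrapping a value in double quotes and doubling any quotes inside it.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Apache POI.
final class CsvRowWriter {

  // Quotes a value only when it contains the delimiter, a quote, or a line break.
  static String escape(String value, char delimiter) {
    boolean needsQuotes = value.indexOf(delimiter) >= 0
        || value.indexOf('"') >= 0
        || value.indexOf('\n') >= 0
        || value.indexOf('\r') >= 0;
    if (!needsQuotes) {
      return value;
    }
    // Double any embedded quotes, then wrap the whole value in quotes
    return "\"" + value.replace("\"", "\"\"") + "\"";
  }

  // Joins the escaped values of one row into a single CSV line (no line terminator).
  static String toLine(List<String> values, char delimiter) {
    return values.stream()
        .map(v -> escape(v, delimiter))
        .collect(Collectors.joining(String.valueOf(delimiter)));
  }
}
```

In a POI-based program, the list passed to toLine would typically hold the text of each cell in a row, and the returned string would be written to the output file followed by a line break.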

Common Mistakes

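  • Using the wrong delimiter to split CSV values, leading to incorrect data mapping.
  • Splitting with a plain split(",") when values can contain commas, quotes, or line breaks. Quoted and escaped values need quote-aware parsing.
  • Calling workbook.write() and naming the result ".csv". The output is still an Excel file, not plain text.
  • Forgetting to close the input and output streams, causing resource leaks.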
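
To illustrate the quoting problem, here is a small sketch of a quote-aware splitter for a single line. It is an assumption-laden example rather than a complete CSV parser: the class name CsvLineParser is hypothetical (it is not part of Apache POI), it assumes double quotes as the quote character with "" as the escape for a literal quote, and it does not support quoted values that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Apache POI.
final class CsvLineParser {

  // Splits one line into fields, treating delimiters inside double quotes as data.
  static List<String> parse(String line, char delimiter) {
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;

    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (inQuotes) {
        if (c == '"') {
          if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
            current.append('"'); // doubled quote is a literal quote
            i++;
          } else {
            inQuotes = false; // closing quote
          }
        } else {
          current.append(c);
        }
      } else if (c == '"') {
        inQuotes = true; // opening quote
      } else if (c == delimiter) {
        fields.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    fields.add(current.toString()); // last field, possibly empty
    return fields;
  }
}
```

With this helper, the line a,"b,c" yields two fields (a and b,c), whereas a plain split on the comma would yield three.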

Frequently Asked Questions (FAQs)

  1. Can Apache POI read and write CSV files with different delimiters?

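    Yes, but the delimiter is handled by your own code rather than by Apache POI, which never sees the raw text. Split on the delimiter the input file uses when reading, and join with the delimiter you want when writing. Note that String.split takes a regular expression, so a delimiter such as | must be escaped, for example with Pattern.quote.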

  2. How can I handle CSV files with headers using Apache POI?

    To handle CSV files with headers, you can skip the first line while reading the file to avoid treating the header as data. When writing a CSV file, you can write the header row separately before writing the data rows.

  3. Are there any performance considerations when reading or writing large CSV files?

    Reading the input line by line keeps only one line of CSV text in memory at a time, but an XSSFWorkbook holds every row in memory. For large outputs, consider POI's streaming SXSSFWorkbook, which keeps only a window of recent rows in memory and flushes older rows to temporary files.

  4. Can Apache POI handle CSV files with complex structures, such as nested data?

    Apache POI is designed for spreadsheet data in Excel files, and CSV itself is a flat format with no notion of nesting. Simple delimited input can be loaded into a workbook with your own line-splitting code, but nested or otherwise complex structures require additional processing or a specialized library.

  5. What are some alternative libraries for reading and writing CSV files in Java?

    Some popular alternative libraries for working with CSV files in Java include OpenCSV, Super CSV, and Apache Commons CSV. These libraries take care of quoting, escaping, and configurable delimiters for you.
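
The delimiter and header questions above (FAQs 1 and 2) can be sketched together in plain Java. The example below treats the first line as a header and returns one map per data line, keyed by column name. The class name DelimitedReader is a hypothetical helper, not part of Apache POI, and the sketch assumes simple input without quoted values.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Apache POI.
final class DelimitedReader {

  // Reads delimited text whose first line is a header; returns one map per data line.
  static List<Map<String, String>> read(Reader source, String delimiter) {
    // Pattern.quote makes regex metacharacters such as "|" split literally
    String regex = Pattern.quote(delimiter);
    List<Map<String, String>> records = new ArrayList<>();
    try (BufferedReader br = new BufferedReader(source)) {
      String headerLine = br.readLine();
      if (headerLine == null) {
        return records; // empty input: no header, no data
      }
      String[] header = headerLine.split(regex, -1);
      String line;
      while ((line = br.readLine()) != null) {
        String[] values = line.split(regex, -1); // -1 keeps trailing empty values
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
          // Missing trailing columns are filled with empty strings
          record.put(header[i], i < values.length ? values[i] : "");
        }
        records.add(record);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return records;
  }
}
```

A caller might pass a FileReader for a pipe-delimited file together with "|" as the delimiter, then create one workbook row per returned map. Keeping the header separate from the data also makes it easy to write it as a distinct first row.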

Summary

Apache POI does not read or write CSV files by itself, but combined with standard Java I/O it lets you move tabular data between CSV text and Excel workbooks. By following the steps above, you can load data from existing CSV files into a workbook and export workbook data to new CSV files. Be aware of common mistakes such as incorrect delimiter usage, unhandled quoted values, and resource leaks. Consider the specific requirements of your CSV files, and use a dedicated CSV library if they go beyond simple delimited text.