Sunday, October 20, 2024

How to Read a Large Text File Line by Line Using Java (Sample Code)

Reading a large text file efficiently is a common task in day-to-day Java development. When a file is too large to load into memory at once, it is essential to read it line by line to avoid running out of memory. The topic also comes up in interviews, usually with a focus on the underlying implementation of the classes below. In this post, we will explore different approaches for reading a large file line by line using Java:

  1. Using BufferedReader (Pre-Java 8)
  2. Using the Files class with Java 8 and above
  3. Using java.util.Scanner
  4. Using Files.newBufferedReader()
  5. Using SeekableByteChannel
  6. Using FileUtils.lineIterator (Apache Commons IO)

We will provide sample code for reading large files using each of these methods, making it easy for you to choose the approach that best fits your project.

1. Reading a Large File Line by Line Using BufferedReader


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadLargeFileUsingBufferedReader {
    public static void main(String[] args) {
        String samplePath = "example/path/to-large-file.txt";

        try (BufferedReader br = new BufferedReader(new FileReader(samplePath))) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

2. Reading a Large File Line by Line Using Java 8 Files.lines()


import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ReadLargeFileUsingJava8 {
    public static void main(String[] args) {
        String samplePath = "example/path/to-large-file.txt";

        try (Stream<String> lines = Files.lines(Paths.get(samplePath))) {
            lines.forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

3. Reading a Large File Line by Line Using java.util.Scanner


import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadLargeFileUsingScanner {
    public static void main(String[] args) {
        String samplePath = "example/path/to-large-file.txt";

        try (Scanner scanner = new Scanner(new File(samplePath))) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                System.out.println(line);
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

4. Reading a Large File Line by Line Using Files.newBufferedReader()


import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadLargeFileUsingNewBufferedReader {
    public static void main(String[] args) {
        String samplePath = "example/path/to-large-file.txt";

        try (BufferedReader reader = Files.newBufferedReader(Paths.get(samplePath))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

5. Reading a Large File Line by Line Using SeekableByteChannel


import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadLargeFileUsingSeekableByteChannel {
    public static void main(String[] args) {
        Path samplePath = Paths.get("example/path/to-large-file.txt");

        try (SeekableByteChannel sbc = Files.newByteChannel(samplePath)) {
            // Read the file in 1 KB chunks. Note: this prints characters as the bytes
            // arrive rather than assembling whole lines, and the (char) cast below is
            // only correct for single-byte encodings such as ASCII.
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (sbc.read(buffer) > 0) {
                buffer.flip();
                while (buffer.hasRemaining()) {
                    System.out.print((char) buffer.get());
                }
                buffer.clear();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
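
The snippet above prints raw bytes rather than assembling lines. If you specifically want line-by-line reading while still opening the file through a channel, one option is to wrap the channel in a reader. The sketch below assumes a UTF-8 encoded file at the same placeholder path and uses the standard Channels.newReader bridge:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadLinesFromSeekableByteChannel {
    public static void main(String[] args) {
        try (SeekableByteChannel sbc = Files.newByteChannel(Paths.get("example/path/to-large-file.txt"));
             BufferedReader reader = new BufferedReader(Channels.newReader(sbc, "UTF-8"))) {
            // The channel supplies bytes; the reader decodes them and buffers complete lines.
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}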

6. Reading a Large File Line by Line Using FileUtils.lineIterator (Apache Commons IO)


import org.apache.commons.io.FileUtils;
import org.apache.commons.io.LineIterator;

import java.io.File;
import java.io.IOException;

public class ReadLargeFileUsingLineIterator {
    public static void main(String[] args) {
        String samplePath = "example/path/to-large-file.txt";

        try (LineIterator it = FileUtils.lineIterator(new File(samplePath), "UTF-8")) {
            while (it.hasNext()) {
                String line = it.nextLine();
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Key Differences Between These Methods

  • BufferedReader is one of the most memory-efficient options for reading large files.
  • Files.lines() and Files.newBufferedReader() provide modern, concise approaches; both stream data lazily, and memory only becomes a concern when a pipeline collects or sorts all lines at once (see the sketch after this list).
  • Scanner is flexible thanks to its token-based, regex-driven parsing, but it is slower for very large files.
  • SeekableByteChannel allows random access to file data, which is useful when working with specific sections of a file.
  • FileUtils.lineIterator() is well suited to extremely large files because it keeps memory usage low.
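
To make the memory point concrete, here is a small sketch contrasting an eager read with a lazy one. The file paths are placeholders, and Files.readAllLines() is used purely as the "load everything" counterexample:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Stream;

public class EagerVsLazyReadSketch {
    public static void main(String[] args) throws IOException {
        // Eager: every line is held in a List at once -- fine for small files,
        // but a multi-gigabyte file could trigger an OutOfMemoryError.
        List<String> allLines = Files.readAllLines(Paths.get("example/path/to-small-file.txt"));
        System.out.println("Lines held in memory: " + allLines.size());

        // Lazy: lines are streamed one at a time, so memory use stays flat.
        try (Stream<String> lines = Files.lines(Paths.get("example/path/to-large-file.txt"))) {
            System.out.println("Lines streamed: " + lines.count());
        }
    }
}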

Interview Questions: File Reading in Java

If you're preparing for a Java interview, understanding file handling in Java is crucial. Below are some common interview questions related to reading large files in Java:

1. What is the difference between BufferedReader and Files.lines()?

  • BufferedReader reads the file line by line without loading the entire file into memory, making it memory-efficient for large files.
  • Files.lines() returns a lazy Stream<String> backed by a buffered reader, so it also reads line by line and integrates naturally with the Streams API. Memory only becomes an issue when the stream pipeline needs every line at once, for example when collecting to a list or sorting (illustrated below).
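
A quick sketch of that difference in practice, assuming the same placeholder file path: a plain forEach keeps memory flat, while a stateful operation such as sorted() has to buffer every line before producing any output.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class FilesLinesLazinessSketch {
    public static void main(String[] args) {
        String samplePath = "example/path/to-large-file.txt";

        // Lazy: each line is read, printed, and discarded -- constant memory.
        try (Stream<String> lines = Files.lines(Paths.get(samplePath))) {
            lines.filter(line -> !line.isEmpty())
                 .forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Not lazy in effect: sorted() must buffer all lines before emitting any,
        // so memory usage grows with the file size.
        try (Stream<String> lines = Files.lines(Paths.get(samplePath))) {
            lines.sorted().limit(10).forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}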

2. When would you choose Scanner over BufferedReader?

  • Scanner is better suited for parsing input with custom delimiters and for reading typed tokens such as integers and doubles. For reading large files line by line, however, BufferedReader is usually the better choice because of its lower overhead and faster performance (see the sketch below).
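
As a rough illustration of Scanner's strength, the hypothetical sketch below sums whitespace-separated integers from a file, something BufferedReader would require manual splitting and parsing to do:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ScannerTypedParsingExample {
    public static void main(String[] args) {
        // Hypothetical file containing whitespace-separated integers
        String samplePath = "example/path/to-numbers.txt";
        long sum = 0;

        try (Scanner scanner = new Scanner(new File(samplePath))) {
            while (scanner.hasNextInt()) {
                sum += scanner.nextInt();   // typed parsing, no manual conversion needed
            }
            System.out.println("Sum: " + sum);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}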

3. What is the advantage of using SeekableByteChannel?

  • SeekableByteChannel allows random access to file data. You can jump to a specific position in the file to read or write, which is not possible with BufferedReader or Scanner.
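
A minimal sketch of that random access, assuming a placeholder file and an arbitrary byte offset; note that decoding a slice taken from the middle of a UTF-8 file can split a multi-byte character:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SeekableByteChannelRandomAccessExample {
    public static void main(String[] args) {
        try (SeekableByteChannel sbc = Files.newByteChannel(Paths.get("example/path/to-large-file.txt"))) {
            // Jump to byte offset 1024 (hypothetical) and read the next 256 bytes.
            sbc.position(1024);
            ByteBuffer buffer = ByteBuffer.allocate(256);
            int read = sbc.read(buffer);
            if (read > 0) {
                buffer.flip();
                System.out.println(StandardCharsets.UTF_8.decode(buffer));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}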

4. How does FileUtils.lineIterator() handle very large files?

  • FileUtils.lineIterator() is part of Apache Commons IO and allows reading large files with very low memory usage. It loads and processes the file line by line without consuming excessive memory.

5. What are the advantages of using Files.newBufferedReader() in Java 8?

  • Files.newBufferedReader() integrates with the Path API, defaults to UTF-8, and accepts an explicit Charset when needed, offering a concise and readable way to obtain a buffered reader. It is an efficient alternative to wrapping a FileReader by hand in Java 8 and above.
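
For example, the single-argument overload added in Java 8 assumes UTF-8, and an explicit Charset can be passed when the file uses another encoding (ISO-8859-1 here is just an illustration):

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class NewBufferedReaderCharsetExample {
    public static void main(String[] args) {
        try (BufferedReader reader = Files.newBufferedReader(
                Paths.get("example/path/to-large-file.txt"), StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}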

Learnings:

Java provides several ways to read a large text file line by line, and the right approach depends on your project's requirements, file size, and memory constraints. For most cases, BufferedReader or Files.newBufferedReader() offers the best balance of efficiency and simplicity. FileUtils.lineIterator() is a good fit when you need to keep memory usage minimal on extremely large files, and SeekableByteChannel is the right tool when you need random access within the file.

Related Keywords:

  • Sample code for reading a large file using Java
  • Example code for reading large file in Java
  • Java BufferedReader example
  • Java 8 Files.lines() example
  • Apache Commons IO lineIterator example
