Class IndexPDFFiles


  • public final class IndexPDFFiles
    extends java.lang.Object
    Indexes all PDF files under a directory.

    This is a command-line application demonstrating simple Lucene indexing. Run it with no command-line arguments for usage information.

    It is based on a demo provided by the Apache Lucene project.

    Important: The pom.xml references an outdated Lucene version. Replace it with the latest release to avoid known security vulnerabilities such as CVE-2024-45772.

    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      private IndexPDFFiles()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      (package private) static void indexDocs​(org.apache.lucene.index.IndexWriter writer, java.io.File file)
      Indexes the given file using the given writer, or if a directory is given, recurses over files and directories found under the given directory.
      static void main​(java.lang.String[] args)
      Indexes all PDF files under a directory.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • IndexPDFFiles

        private IndexPDFFiles()
    • Method Detail

      • main

        public static void main​(java.lang.String[] args)
        Indexes all PDF files under a directory.
        Parameters:
        args - command line arguments
      • indexDocs

        static void indexDocs​(org.apache.lucene.index.IndexWriter writer,
                              java.io.File file)
                       throws java.io.IOException
        Indexes the given file using the given writer, or if a directory is given, recurses over files and directories found under the given directory.

        NOTE: This method indexes one document per input file. This is slow. For good throughput, put multiple documents into your input file(s). An example of this is in the benchmark module, which can create "line doc" files, one document per line, using the WriteLineDocTask.
        Parameters:
        writer - Writer to the index where the given file/dir info will be stored
        file - The file to index, or the directory to recurse into to find files to index
        Throws:
        java.io.IOException - If there is a low-level I/O error
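
        The recursion that indexDocs describes can be illustrated with a minimal sketch. This is not the actual implementation. The class name RecursiveWalkSketch, the method walk, and the handler parameter are all hypothetical. The handler stands in for the IndexWriter call so that the example runs without Lucene on the classpath. The ".pdf" extension filter and the sorted traversal order are also assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Locale;
import java.util.function.Consumer;

// Hypothetical sketch: mirrors the "file or directory" recursion described
// for indexDocs, with a Consumer<File> in place of the Lucene IndexWriter.
public final class RecursiveWalkSketch {

    private RecursiveWalkSketch() {}

    /**
     * Passes the given file to the handler, or, if a directory is given,
     * recurses over the files and directories found under it.
     */
    public static void walk(File file, Consumer<File> handler) throws IOException {
        if (!file.canRead()) {
            return; // assumption: unreadable entries are silently skipped
        }
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children == null) {
                // listFiles() returns null on a low-level I/O error
                throw new IOException("Cannot list directory: " + file);
            }
            Arrays.sort(children); // assumption: sorted for a deterministic order
            for (File child : children) {
                walk(child, handler);
            }
        } else if (file.getName().toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            // assumption: PDF files are recognized by extension only
            handler.accept(file); // one call, and so one document, per input file
        }
    }

    public static void main(String[] args) throws IOException {
        File root = new File(args.length > 0 ? args[0] : ".");
        walk(root, f -> System.out.println("would index " + f.getPath()));
    }
}
```

        In a real indexer the handler would extract the PDF text and add one document to the writer. That is the one-document-per-file pattern the NOTE above describes as slow.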