Class ByteSequenceFinder


  • public class ByteSequenceFinder
    extends java.lang.Object
    Finds a byte sequence in a stream of bytes. Uses the Knuth–Morris–Pratt (KMP) search algorithm, which examines each input byte once and therefore runs in time linear in the input length plus the pattern length.

    The algorithm remembers how many bytes of the pattern have already matched, so a mismatch does not force the search to restart from the beginning of the pattern. It does this by precomputing a partial match table, called the failure function, from the pattern.


    • Field Summary

      Fields 
      Modifier and Type Field Description
      private int[] failure  
      private byte[] sequence  
    • Method Summary

      Modifier and Type Method Description
      byte[] getSequence()
      Retrieves the current sequence used by a search method.
      int search(byte[] bytes)
      Finds the first occurrence of the pattern in the specified byte array.
      int search(java.io.InputStream stream)
      Finds the first occurrence of the pattern in the specified stream.
      void setSequence(byte[] sequenceIn)
      Sets the sequence to be used by the next invocation of a search method.
      private static int[] updateFailureFunction(byte[] sequence)
      Computes the failure function.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • sequence

        private byte[] sequence
      • failure

        private int[] failure
    • Constructor Detail

      • ByteSequenceFinder

        public ByteSequenceFinder()
        Default constructor
      • ByteSequenceFinder

        public ByteSequenceFinder(byte[] sequenceIn)
        Creates a new instance that searches for the specified sequence.
        Parameters:
        sequenceIn - the sequence to find
    • Method Detail

      • search

        public int search(byte[] bytes)
                   throws java.io.IOException
        Finds the first occurrence of the pattern in the specified byte array.
        Parameters:
        bytes - the byte array where the sequence is searched
        Returns:
        the index where the sequence starts within the array, or -1 if the sequence was not found
        Throws:
        java.io.IOException - if an error occurs while processing the bytes
      • search

        public int search(java.io.InputStream stream)
                   throws java.io.IOException
        Finds the first occurrence of the pattern in the specified stream. The stream is consumed up to the end of the matched sequence. If no match is found, the stream is fully consumed.
        Parameters:
        stream - the input stream to search
        Returns:
        the position where the sequence starts within the stream, or -1 if the sequence was not found. After a match, the stream is positioned immediately after the end of the matched sequence.
        Throws:
        java.io.IOException - if an error occurs while reading the stream
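
        The consumption behavior described for this method can be illustrated with a minimal sketch. This is not the source of ByteSequenceFinder: the class KmpStreamSearchSketch and its methods buildFailure and indexOf are hypothetical names, and returning 0 for an empty pattern is an assumption. The sketch reads one byte at a time, so the stream stops right after the match, or at end of stream when there is none.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; hypothetical names, not the ByteSequenceFinder source.
final class KmpStreamSearchSketch {

    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] buildFailure(byte[] pattern) {
        int[] failure = new int[pattern.length];
        int k = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (k > 0 && pattern[k] != pattern[i]) {
                k = failure[k - 1];
            }
            if (pattern[k] == pattern[i]) {
                k++;
            }
            failure[i] = k;
        }
        return failure;
    }

    // Returns the offset where the first match starts, or -1 if the stream
    // ends without a match. Bytes are read one at a time and never re-read.
    static int indexOf(InputStream in, byte[] pattern) throws IOException {
        if (pattern.length == 0) {
            return 0; // assumption: an empty pattern matches immediately
        }
        int[] failure = buildFailure(pattern);
        int matched = 0;   // number of pattern bytes currently matched
        int position = 0;  // number of bytes consumed from the stream
        int b;
        while ((b = in.read()) != -1) {
            byte current = (byte) b;
            // On a mismatch, fall back to the longest prefix that still matches.
            while (matched > 0 && pattern[matched] != current) {
                matched = failure[matched - 1];
            }
            if (pattern[matched] == current) {
                matched++;
            }
            position++;
            if (matched == pattern.length) {
                return position - pattern.length;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "header--BOUNDARY--body".getBytes(StandardCharsets.US_ASCII);
        byte[] pattern = "BOUNDARY".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(data);
        System.out.println(indexOf(in, pattern)); // prints 8
        System.out.println((char) in.read());     // prints '-', the byte after the match
    }
}
```

        Because the stream is never rewound, the same approach works on sources that do not support mark and reset, such as sockets.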
      • getSequence

        public byte[] getSequence()
        Retrieves the current sequence used by a search method.
        Returns:
        the current sequence to be searched
      • setSequence

        public void setSequence(byte[] sequenceIn)
        Sets the sequence to be used by the next invocation of a search method.
        Parameters:
        sequenceIn - the new sequence to search
      • updateFailureFunction

        private static int[] updateFailureFunction(byte[] sequence)
        Computes the failure function. Evaluates the sequence and finds repeating prefixes.
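
        As a rough illustration of what a failure function contains, the sketch below builds the standard KMP partial match table for a small pattern. The class FailureFunctionSketch and its compute method are hypothetical names, and the private method documented here may differ in detail. Each entry holds the length of the longest proper prefix of the pattern that is also a suffix of the bytes up to that index.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; hypothetical names, not the ByteSequenceFinder source.
final class FailureFunctionSketch {

    // Standard KMP partial match table.
    static int[] compute(byte[] sequence) {
        int[] failure = new int[sequence.length];
        int k = 0; // length of the prefix currently being extended
        for (int i = 1; i < sequence.length; i++) {
            // Shrink the candidate prefix until it can be extended, or is empty.
            while (k > 0 && sequence[k] != sequence[i]) {
                k = failure[k - 1];
            }
            if (sequence[k] == sequence[i]) {
                k++;
            }
            failure[i] = k;
        }
        return failure;
    }

    public static void main(String[] args) {
        byte[] sequence = "abababca".getBytes(StandardCharsets.US_ASCII);
        // prints [0, 0, 1, 2, 3, 4, 0, 1]
        System.out.println(Arrays.toString(compute(sequence)));
    }
}
```

        In the printed table, the value 4 at index 5 means that after matching "ababab", a mismatch on the next byte lets the search continue as if "abab" were already matched, rather than starting over.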