Class WebcrawlerConnector.ProcessActivityLinkHandler
- java.lang.Object
-
- org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector.ProcessActivityLinkHandler
-
- All Implemented Interfaces:
IDiscoveredLinkHandler
- Direct Known Subclasses:
WebcrawlerConnector.ProcessActivityHTMLHandler, WebcrawlerConnector.ProcessActivityRedirectionHandler, WebcrawlerConnector.ProcessActivityXMLHandler
- Enclosing class:
- WebcrawlerConnector
protected class WebcrawlerConnector.ProcessActivityLinkHandler extends java.lang.Object implements IDiscoveredLinkHandler
This class is the handler for links that get added into an IProcessActivity object.
-
-
Field Summary
Fields
Modifier and Type, Field
protected org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities
protected java.lang.String baseDocumentIdentifier
protected java.lang.String contextDescription
protected java.lang.String documentIdentifier
protected WebcrawlerConnector.DocumentURLFilter filter
protected java.lang.String linkType
-
Constructor Summary
Constructors
ProcessActivityLinkHandler(java.lang.String documentIdentifier, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, WebcrawlerConnector.DocumentURLFilter filter, java.lang.String contextDescription, java.lang.String linkType)
Constructor.
-
Method Summary
Methods
void noteDiscoveredBase(java.lang.String rawURL)
Inform the world of a new base HREF.
void noteDiscoveredLink(java.lang.String rawURL)
Inform the world of a discovered link.
-
-
-
Field Detail
-
documentIdentifier
protected java.lang.String documentIdentifier
-
baseDocumentIdentifier
protected java.lang.String baseDocumentIdentifier
-
activities
protected org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities
-
filter
protected WebcrawlerConnector.DocumentURLFilter filter
-
contextDescription
protected java.lang.String contextDescription
-
linkType
protected java.lang.String linkType
-
-
Constructor Detail
-
ProcessActivityLinkHandler
public ProcessActivityLinkHandler(java.lang.String documentIdentifier, org.apache.manifoldcf.crawler.interfaces.IProcessActivity activities, WebcrawlerConnector.DocumentURLFilter filter, java.lang.String contextDescription, java.lang.String linkType)
Constructor.
-
-
Method Detail
-
noteDiscoveredBase
public void noteDiscoveredBase(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Description copied from interface: IDiscoveredLinkHandler
Inform the world of a new base HREF.
- Specified by:
noteDiscoveredBase in interface IDiscoveredLinkHandler
- Parameters:
rawURL - is the new base HREF, in raw form. This may be relative, malformed, etc.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
noteDiscoveredLink
public void noteDiscoveredLink(java.lang.String rawURL) throws org.apache.manifoldcf.core.interfaces.ManifoldCFException
Inform the world of a discovered link.
- Specified by:
noteDiscoveredLink in interface IDiscoveredLinkHandler
- Parameters:
rawURL - is the raw discovered URL. This may be relative, malformed, or otherwise unsuitable for use until final form is achieved.
- Throws:
org.apache.manifoldcf.core.interfaces.ManifoldCFException
-
-
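Example (illustrative sketch only)
The two methods above receive raw URLs that may be relative or malformed. The sketch below shows one way a handler with this shape could track a base HREF and resolve discovered links against it. It is not the ManifoldCF implementation: DiscoveredLinkHandler is a hypothetical stand-in for IDiscoveredLinkHandler (without the ManifoldCFException), CollectingLinkHandler and its resolve helper are invented for illustration, and the http/https check is an assumed substitute for whatever WebcrawlerConnector.DocumentURLFilter actually does. A real handler would pass accepted links to the IProcessActivity object rather than collecting them in a list.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

class LinkHandlerSketch {

    /** Hypothetical stand-in for IDiscoveredLinkHandler; the checked exception is omitted. */
    interface DiscoveredLinkHandler {
        void noteDiscoveredBase(String rawURL);
        void noteDiscoveredLink(String rawURL);
    }

    /** Hypothetical handler: resolves raw URLs and collects the ones it accepts. */
    static class CollectingLinkHandler implements DiscoveredLinkHandler {
        private final String documentIdentifier;
        // Starts as the document's own URL; replaced when a usable base HREF is seen.
        private String baseDocumentIdentifier;
        private final List<String> accepted = new ArrayList<>();

        CollectingLinkHandler(String documentIdentifier) {
            this.documentIdentifier = documentIdentifier;
            this.baseDocumentIdentifier = documentIdentifier;
        }

        @Override
        public void noteDiscoveredBase(String rawURL) {
            // The base itself may be relative, so resolve it against the document URL.
            String resolved = resolve(documentIdentifier, rawURL);
            if (resolved != null) {
                baseDocumentIdentifier = resolved;
            }
        }

        @Override
        public void noteDiscoveredLink(String rawURL) {
            // Links are resolved against the current base; unusable ones are dropped.
            String resolved = resolve(baseDocumentIdentifier, rawURL);
            if (resolved != null) {
                accepted.add(resolved);
            }
        }

        List<String> acceptedLinks() {
            return accepted;
        }

        /** Returns an absolute http(s) URL, or null if rawURL cannot be used. */
        static String resolve(String base, String rawURL) {
            if (rawURL == null) {
                return null;
            }
            try {
                URI uri = new URI(base).resolve(new URI(rawURL.trim())).normalize();
                String scheme = uri.getScheme();
                // Assumed filter rule: keep only http and https links.
                if (scheme == null
                        || !(scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))) {
                    return null;
                }
                return uri.toString();
            } catch (URISyntaxException e) {
                return null; // malformed input is silently ignored in this sketch
            }
        }
    }

    public static void main(String[] args) {
        CollectingLinkHandler handler =
                new CollectingLinkHandler("http://example.com/docs/index.html");
        handler.noteDiscoveredLink("page2.html");      // relative to the document
        handler.noteDiscoveredBase("/assets/");        // new base HREF
        handler.noteDiscoveredLink("img/logo.png");    // relative to the new base
        handler.noteDiscoveredLink("ht tp://bad url"); // malformed, dropped
        handler.noteDiscoveredLink("mailto:someone@example.com"); // non-http, dropped
        for (String link : handler.acceptedLinks()) {
            System.out.println(link);
        }
    }
}
```

Resolving the base against documentIdentifier and links against baseDocumentIdentifier mirrors the two fields listed in Field Detail; whether the real class uses them in exactly this way is an assumption.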