Package it.unimi.dsi.parser.callback
Class LinkExtractor
java.lang.Object
  it.unimi.dsi.parser.callback.DefaultCallback
    it.unimi.dsi.parser.callback.LinkExtractor

All Implemented Interfaces:
Callback
@Deprecated public class LinkExtractor extends DefaultCallback
Deprecated. This class is obsolete and kept around for backward compatibility only.
A callback extracting links.
This callback extracts the links found in a web page. The links are then accessible in urls (a set of Strings). The iteration order of the set is guaranteed to be exactly the order in which the links were encountered (although duplicates appear just once).
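
The ordering guarantee above (encounter order, duplicates collapsed onto their first occurrence) can be illustrated with a plain java.util.LinkedHashSet. This is only a sketch of the semantics: the page does not say which Set implementation backs urls, and OrderedLinks is a hypothetical helper, not part of this package.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of it.unimi.dsi.parser.callback.
final class OrderedLinks {
    // Collects links in encounter order, keeping only the first copy of each.
    // LinkedHashSet is an assumption chosen because it has exactly this behaviour.
    static Set<String> collect(List<String> linksInDocumentOrder) {
        return new LinkedHashSet<>(linksInDocumentOrder);
    }

    public static void main(String[] args) {
        List<String> seen = Arrays.asList("a.html", "img/logo.png", "a.html", "b.html");
        // prints [a.html, img/logo.png, b.html]
        System.out.println(collect(seen));
    }
}
```

Code that iterates over urls can therefore rely on document order without sorting or deduplicating again.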
Field Summary
Fields
Modifier and Type                  Field   Description
java.util.Set<java.lang.String>    urls    Deprecated. The URLs resulting from the parsing process.
Fields inherited from interface it.unimi.dsi.parser.callback.Callback
EMPTY_CALLBACK_ARRAY
Constructor Summary
Constructors
Constructor        Description
LinkExtractor()    Deprecated.
Method Summary
Modifier and Type    Method             Description
java.lang.String     base()             Deprecated. Returns the URL specified by the BASE element.
void                 configure(BulletParser parser)
                                        Deprecated. Configure the parser to parse elements and certain attributes.
java.lang.String     metaLocation()     Deprecated. Returns the URL specified by META HTTP-EQUIV elements of location type.
java.lang.String     metaRefresh()      Deprecated. Returns the URL specified by META HTTP-EQUIV elements of refresh type.
void                 startDocument()    Deprecated. Receive notification of the beginning of the document.
boolean              startElement(Element element, java.util.Map<Attribute,MutableString> attrMap)
                                        Deprecated. Receive notification of the start of an element.
Methods inherited from class it.unimi.dsi.parser.callback.DefaultCallback
cdata, characters, endDocument, endElement, getInstance
Method Detail
configure
public void configure(BulletParser parser)
Deprecated.
Configure the parser to parse elements and certain attributes.
The required attributes are SRC, HREF, HTTP-EQUIV, and CONTENT.
Specified by:
configure in interface Callback
Overrides:
configure in class DefaultCallback
startDocument
public void startDocument()
Deprecated.
Description copied from interface: Callback
Receive notification of the beginning of the document.
The callback must use this method to reset its internal state so that it can be reused. It must be safe to invoke this method several times.
Specified by:
startDocument in interface Callback
Overrides:
startDocument in class DefaultCallback
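
The reset contract described above can be sketched with a small stand-alone class. FirstUrlHolder is hypothetical and does not implement Callback; it only shows a startDocument() that clears per-document state, is safe to call repeatedly, and leaves the object ready for the next document. It also follows the "first one wins" rule that the accessor methods of this class document.

```java
// Hypothetical class illustrating the reset contract; it does not implement Callback.
final class FirstUrlHolder {
    private String first; // null until a URL is recorded for the current document

    // Plays the role of startDocument(): clears per-document state.
    // Calling it several times in a row has no further effect.
    void startDocument() {
        first = null;
    }

    // Records a URL only if none was recorded yet, so the first one is kept.
    void offer(String url) {
        if (first == null) first = url;
    }

    // Returns the first URL offered since the last reset, or null.
    String first() {
        return first;
    }
}
```

Without the reset, a holder reused for a second document would keep returning the URL found in the first one.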
startElement
public boolean startElement(Element element, java.util.Map<Attribute,MutableString> attrMap)
Deprecated.
Description copied from interface: Callback
Receive notification of the start of an element.
For simple elements, this is the only notification that the callback will ever receive.
Specified by:
startElement in interface Callback
Overrides:
startElement in class DefaultCallback
Parameters:
element - the element whose opening tag was found.
attrMap - a map from Attributes to MutableStrings.
Returns:
true to keep the parser parsing, false to stop it.
metaLocation
public java.lang.String metaLocation()
Deprecated.
Returns the URL specified by META HTTP-EQUIV elements of location type. More precisely, this method returns a non-null result iff there is at least one META HTTP-EQUIV element specifying a location URL (if there is more than one, we keep the first one).
Returns:
the first URL specified by a META HTTP-EQUIV element of location type, or null.
base
public java.lang.String base()
Deprecated.
Returns the URL specified by the BASE element. More precisely, this method returns a non-null result iff there is at least one BASE element specifying a derelativisation URL (if there is more than one, we keep the first one).
Returns:
the first URL specified by a BASE element, or null.
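
As an illustration of what a derelativisation URL is used for, the sketch below resolves a possibly relative link against the BASE URL when one is available, and against the document's own URL otherwise. It relies only on java.net.URI. The Derelativise class and its parameter names are hypothetical, and the page does not say whether the extracted links are already resolved, so treat this as one way a caller might combine base() with the collected links.

```java
import java.net.URI;

// Hypothetical helper, not part of it.unimi.dsi.parser.callback.
final class Derelativise {
    // Resolves link against baseOrNull if present, else against documentUrl.
    // URI.create throws IllegalArgumentException on syntactically invalid input,
    // which real-world pages may well contain.
    static String resolve(String documentUrl, String baseOrNull, String link) {
        URI context = URI.create(baseOrNull != null ? baseOrNull : documentUrl);
        return context.resolve(link).toString();
    }

    public static void main(String[] args) {
        String doc = "http://example.com/docs/index.html";
        // prints http://example.com/docs/img/logo.png
        System.out.println(resolve(doc, null, "img/logo.png"));
        // prints http://cdn.example.org/assets/img/logo.png
        System.out.println(resolve(doc, "http://cdn.example.org/assets/", "img/logo.png"));
    }
}
```

Links that are already absolute pass through resolve unchanged.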
metaRefresh
public java.lang.String metaRefresh()
Deprecated.
Returns the URL specified by META HTTP-EQUIV elements of refresh type. More precisely, this method returns a non-null result iff there is at least one META HTTP-EQUIV element specifying a refresh URL (if there is more than one, we keep the first one).
Returns:
the first URL specified by a META HTTP-EQUIV element of refresh type, or null.
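
For context, a refresh directive usually carries its target inside the CONTENT attribute, in a form such as "5; URL=http://example.com/new". The sketch below shows one lenient way to pull the URL out of such a value. The RefreshContent class and its pattern are assumptions made for illustration. The page does not describe how LinkExtractor itself parses CONTENT.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of it.unimi.dsi.parser.callback.
final class RefreshContent {
    // Optional delay, a ';' or ',' separator, "url=" in any case,
    // then the target, optionally wrapped in single or double quotes.
    private static final Pattern REFRESH = Pattern.compile(
        "^\\s*[0-9.]*\\s*[;,]\\s*url\\s*=\\s*['\"]?([^'\"]*?)['\"]?\\s*$",
        Pattern.CASE_INSENSITIVE);

    // Returns the refresh target, or null if the value names no URL
    // (for example a bare delay such as "30").
    static String url(String content) {
        Matcher m = REFRESH.matcher(content);
        return m.matches() && !m.group(1).isEmpty() ? m.group(1) : null;
    }
}
```

Returning null when no URL is present mirrors the contract of metaRefresh(), which yields null when no refresh URL was found.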