Package org.apache.uima.cas.impl
Class CasSerializerSupport.CasDocSerializer
- java.lang.Object
-
- org.apache.uima.cas.impl.CasSerializerSupport.CasDocSerializer
-
- Enclosing class:
- CasSerializerSupport
public class CasSerializerSupport.CasDocSerializer extends java.lang.Object
Use an inner class to hold the data for serializing a CAS. Each call to serialize() creates its own instance. Package private to allow a test case to access it; not static, so that it shares the logger and the initializing values (this could be changed).
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- CASImpl cas
- private CasSerializerSupport.CasSerializerSupportSerialize csss
- private java.util.Set<TOP> enqueued_multiRef_arrays_or_lists: Set of array or list FSs referenced from features marked as multipleReferencesAllowed, which have previously been serialized "inline" and which now need to be serialized as separate items. Set during enqueue scanning, to handle the case where the "visited_not_yet_written" set may have already recorded that this FS is already processed for enqueueing, but it is an array or list item which was being put "in-line" and no element is being written.
- private org.xml.sax.ErrorHandler errorHandler2
- TypeSystemImpl filterTypeSystem_inner
- java.util.List<TOP>[] indexedFSs: Array of Lists of all FS that are indexed in some view (other than sofas).
- boolean isDelta: Whether the serializer needs to serialize only the deltas, that is, new FSs created after mark represented by Marker object and preexisting FSs and Views that have been modified.
- boolean isDynamicMultiRef: Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists.
- boolean isFiltering: Whether the serializer needs to check for filtered-out types/features.
- boolean isFormattedOutput_inner
- MarkerImpl marker: Used to tell if a FS was created before or after mark.
- java.util.List<TOP> modifiedEmbeddedValueFSs
- java.util.Set<TOP> multiRefFSs: Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation. This is for JSON which is computing the multi-refs, not depending on the setting in a feature.
- boolean needNameSpaces
- java.util.Set<java.lang.String> nsPrefixesUsed: The set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in.
- java.util.Map<java.lang.String,java.lang.String> nsUriToPrefixMap: Map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string.
- java.util.List<TOP> previouslySerializedFSs
- private java.util.Deque<TOP> queue: FSs not in an index, but only being serialized because they're referenced.
- XmiSerializationSharedData sharedData: For Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization.
- private TypeImpl[] sortedUsedTypes
- java.util.Comparator<TOP> sortFssByType: Called for JSON serialization. Sort a view, by type and then by begin/end asc/des for subtypes of Annotation, then by id.
- TypeSystemImpl tsi
- XmlElementName[] typeCode2namespaceNames
- private java.util.BitSet typeUsed
- private java.util.Map<java.lang.String,java.lang.String> uniqueStrings
- java.util.Set<TOP> visited_not_yet_written: Set of FSs that have been visited and enqueued to be serialized. Exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized.
-
Constructor Summary
Constructors (Constructor, Description):
- CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
- CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
Method Summary
All Methods / Instance Methods / Concrete Methods (Modifier and Type, Method, Description):
- void encodeFS(TOP fs): Encode an individual FS.
- private void encodeFSs(java.util.List<TOP> fss)
- void encodeIndexed()
- void encodeQueued()
- (package private) int enqueueCommon(TOP fs)
- private int enqueueCommon(TOP fs, boolean doDeltaAndFilteringCheck)
- (package private) int enqueueCommonWithoutDeltaAndFilteringCheck(TOP fs)
- private void enqueueFeatures(TOP fs): Enqueue all FSs reachable from features of the given FS.
- private void enqueueFeaturesOfFSs(java.util.List<TOP> fss)
- private void enqueueFeaturesOfIndexed(): Enqueue everything reachable from features of indexed FSs.
- private void enqueueFsAndMaybeFeatures(TOP fs): Enqueue an FS, and everything reachable from it.
- private void enqueueFSArrayElements(FSArray fsArray): Enqueues all FS reachable from an FSArray.
- private void enqueueFSListElements(FSList<TOP> node): Enqueues all Head values of FSList reachable from an FSList.
- private void enqueueIncoming(): Enqueues all FS that are stored in the sharedData's id map.
- private void enqueueIndexed(): Add the indexed FSs onto the indexedFSs by view.
- (package private) void enqueueIndexedFs_only_not_features(int viewNumber, TOP fs)
- private void enqueueNonsharedMultivaluedFS(): When serializing Delta CAS, enqueue encompassing FS of nonshared multivalued FS that have been modified.
- (package private) int getElementCountForSharedData()
- java.lang.String getNameSpacePrefix(java.lang.String uimaTypeName, java.lang.String nsUri, int lastDotIndex)
- Sofa getSofa(int sofaNum)
- TypeImpl[] getSortedUsedTypes()
- java.lang.String getTypeNameFromXmlElementName(XmlElementName xe)
- java.lang.String getUniqueString(java.lang.String s)
- private java.lang.Iterable<TypeImpl> getUsedTypesIterable()
- java.lang.String getXmiId(TOP fs): Get the XMI ID to use for an FS.
- int getXmiIdAsInt(TOP fs)
- private boolean isListElementsMultiplyReferenced(TOP listNode): For lists, see if this is a plain list (no loops, no other refs to list elements from outside the list); if so, return false. Add all the elements of the list to visited_not_yet_written, noting if they've already been added; this indicates either a loop or another ref from outside, and in either case, return true.
- private boolean isMultiRef_enqueue(FeatureImpl fi, TOP featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat): Ordinary FSs referenced as features are not checked by this routine; this is only called for FSLists of various kinds, and FS arrays of various kinds. Not all featValues should be enqueued: list or array features which are marked **NOT** multiple-refs-allowed are serialized in-line; for JSON, when using dynamicMultiRef (the default), list / array FSs are serialized by ref (not in-line) if there are multiple refs to them; for XMI and JSON, any FS ref marked as multiple-refs-allowed forces the item onto the ref "queue".
- boolean isStaticMultiRef(FeatureImpl fi)
- private void reportMultiRefWarning(FeatureImpl fi)
- void serialize(): Starts serialization.
- void writeViewsCommons()
-
-
-
Field Detail
-
cas
public final CASImpl cas
-
tsi
public final TypeSystemImpl tsi
-
visited_not_yet_written
public final java.util.Set<TOP> visited_not_yet_written
Set of FSs that have been visited and enqueued to be serialized. Exception: arrays and lists which are "inline" are put into this set, but are not enqueued to be serialized. FSs are added to this set during the "enqueue" phase, prior to encoding.
Uses:
- for Arrays and Lists, to detect multi-refs
- for Lists, to detect loops
- during the enqueuing phase, to prevent multiple enqueuings
- during the encoding phase, to prevent multiple encodings
Public for use by JsonCasSerializer.
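The "prevent multiple enqueuings" use can be illustrated with a small stand-alone sketch. It does not use any UIMA classes: `Node`, `EnqueueSketch` and `enqueueOnce` are hypothetical names, and the identity-based set is an assumption made for the sketch, not a statement about how this field is constructed.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for a feature structure; not a UIMA class.
final class Node {
  final String name;

  Node(String name) {
    this.name = name;
  }
}

final class EnqueueSketch {
  // Plays the role of a "visited" set: records every node offered for enqueueing.
  final Set<Node> visited =
      Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
  // Plays the role of a by-reference queue.
  final Deque<Node> queue = new ArrayDeque<>();

  // Returns true only the first time a node is offered; later calls change nothing.
  boolean enqueueOnce(Node n) {
    if (!visited.add(n)) {
      return false; // already visited: do not enqueue a second time
    }
    queue.addLast(n);
    return true;
  }
}
```

Relying on the boolean returned by `Set.add` keeps the membership check and the insertion in one step, so a node cannot be queued twice.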
-
enqueued_multiRef_arrays_or_lists
private final java.util.Set<TOP> enqueued_multiRef_arrays_or_lists
Set of array or list FSs referenced from features marked as multipleReferencesAllowed,
- which have previously been serialized "inline"
- which now need to be serialized as separate items
Set during enqueue scanning, to handle the case where the "visited_not_yet_written" set may have already recorded that this FS is already processed for enqueueing, but it is an array or list item which was being put "in-line" and no element is being written. It has array or list elements where the item needs to be enqueued onto the "queue" list.
Use: ensure the item is put onto the "queue" list only once.
-
multiRefFSs
public final java.util.Set<TOP> multiRefFSs
Set of FSs that have multiple references. Has an entry for each FS (not just array or list FSs) which is (from some point on) being serialized as a multi-ref, that is, is **not** being serialized (any more) using the special notation for arrays and lists or, for JSON, **not** being serialized using the embedded notation. This is for JSON which is computing the multi-refs, not depending on the setting in a feature. This is also for xmi, to enable adding to "queue" (once) for each FS of this kind.
Used to:
- limit the number of times this is put onto the queue to 1
- skip encoding of items on "queue" if not in this Set (maybe not needed? 8/2017 mis)
- serialize if not in indexed set, dynamic ref == true, and in this set (otherwise serialize only from ref)
-
isDynamicMultiRef
public final boolean isDynamicMultiRef
Set to true for JSON configuration of using dynamic multi-ref detection for arrays and lists
-
previouslySerializedFSs
public java.util.List<TOP> previouslySerializedFSs
-
modifiedEmbeddedValueFSs
public java.util.List<TOP> modifiedEmbeddedValueFSs
-
indexedFSs
public final java.util.List<TOP>[] indexedFSs
Array of Lists of all FS that are indexed in some view (other than sofas). Array indexed by view.
-
queue
private final java.util.Deque<TOP> queue
FSs not in an index, but only being serialized because they're referenced. Exception: the sofas are here.
-
typeCode2namespaceNames
public XmlElementName[] typeCode2namespaceNames
-
typeUsed
private final java.util.BitSet typeUsed
-
needNameSpaces
public boolean needNameSpaces
-
nsUriToPrefixMap
public final java.util.Map<java.lang.String,java.lang.String> nsUriToPrefixMap
Map from a namespace expanded form to the namespace prefix, to identify potential collisions when generating a namespace string.
-
nsPrefixesUsed
public final java.util.Set<java.lang.String> nsPrefixesUsed
The set of all namespace prefixes used, to disallow some if they are in use already in set-aside data (xmi serialization) being merged back in.
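How a URI-to-prefix map and a set of used prefixes can work together to avoid collisions is sketched below. This is an illustration only: `PrefixSketch`, `prefixFor`, `reserve` and the numeric-suffix rule are hypothetical and are not taken from this class or from getNameSpacePrefix.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class PrefixSketch {
  // Namespace URI -> prefix already assigned to it.
  final Map<String, String> uriToPrefix = new HashMap<>();
  // Every prefix that must not be handed out again.
  final Set<String> prefixesUsed = new HashSet<>();

  // Marks a prefix as taken, for example one found in data being merged back in.
  void reserve(String prefix) {
    prefixesUsed.add(prefix);
  }

  // Returns the prefix for nsUri, choosing an unused one the first time the URI is seen.
  String prefixFor(String nsUri, String preferred) {
    String existing = uriToPrefix.get(nsUri);
    if (existing != null) {
      return existing;
    }
    String candidate = preferred;
    int suffix = 2;
    // Assumed collision rule for this sketch: append 2, 3, ... until the prefix is free.
    while (!prefixesUsed.add(candidate)) {
      candidate = preferred + suffix++;
    }
    uriToPrefix.put(nsUri, candidate);
    return candidate;
  }
}
```

Keeping the used-prefix set separate from the map lets prefixes be reserved before any URI is mapped to them.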
-
marker
public final MarkerImpl marker
Used to tell if a FS was created before or after mark.
-
sharedData
public final XmiSerializationSharedData sharedData
For Delta serialization, holds the info gathered from deserialization needed for delta serialization and for handling out-of-type-system data for both plain and delta serialization.
-
isDelta
public final boolean isDelta
Whether the serializer needs to serialize only the deltas, that is, new FSs created after the mark represented by the Marker object, and preexisting FSs and Views that have been modified. Set to true if the Marker object is not null and the CASImpl object of this serializer matches the CASImpl in the Marker object.
-
isFiltering
public final boolean isFiltering
Whether the serializer needs to check for filtered-out types/features. Set to true if type system of CAS does not match type system that was passed to constructor of serializer.
-
sortedUsedTypes
private TypeImpl[] sortedUsedTypes
-
errorHandler2
private final org.xml.sax.ErrorHandler errorHandler2
-
filterTypeSystem_inner
public TypeSystemImpl filterTypeSystem_inner
-
uniqueStrings
private final java.util.Map<java.lang.String,java.lang.String> uniqueStrings
-
isFormattedOutput_inner
public final boolean isFormattedOutput_inner
-
csss
private final CasSerializerSupport.CasSerializerSupportSerialize csss
-
sortFssByType
public final java.util.Comparator<TOP> sortFssByType
Called for JSON serialization. Sorts a view by type, then by begin/end asc/des for subtypes of Annotation, then by id.
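A comparator with this kind of ordering can be sketched with `java.util.Comparator` chaining. The sketch below is stand-alone and makes several assumptions: `Item` and `SortSketch` are hypothetical, types are compared by an integer code, every item is treated as annotation-like, and "begin/end asc/des" is read as begin ascending, end descending.

```java
import java.util.Comparator;

// Hypothetical stand-in for an annotation-like FS; not a UIMA class.
final class Item {
  final int typeCode;
  final int begin;
  final int end;
  final int id;

  Item(int typeCode, int begin, int end, int id) {
    this.typeCode = typeCode;
    this.begin = begin;
    this.end = end;
    this.id = id;
  }
}

final class SortSketch {
  // Order: type code, then begin ascending, then end descending, then id.
  static final Comparator<Item> BY_TYPE_THEN_SPAN_THEN_ID =
      Comparator.comparingInt((Item i) -> i.typeCode)
          .thenComparingInt(i -> i.begin)
          .thenComparing(Comparator.comparingInt((Item i) -> i.end).reversed())
          .thenComparingInt(i -> i.id);
}
```

With end descending, an item that spans a larger range sorts before a shorter item starting at the same offset.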
-
-
Constructor Detail
-
CasDocSerializer
public CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss)
- Parameters:
ch -
cas -
sharedData -
marker -
csss -
-
CasDocSerializer
public CasDocSerializer(org.xml.sax.ContentHandler ch, CASImpl cas, XmiSerializationSharedData sharedData, MarkerImpl marker, CasSerializerSupport.CasSerializerSupportSerialize csss, boolean trackMultiRefs)
-
-
Method Detail
-
reportMultiRefWarning
private void reportMultiRefWarning(FeatureImpl fi) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
serialize
public void serialize() throws java.lang.Exception
Starts serialization.
- Throws:
java.lang.Exception
-
getSofa
public Sofa getSofa(int sofaNum)
- Parameters:
sofaNum - starts at 1
- Returns:
the sofa FS, or null
-
writeViewsCommons
public void writeViewsCommons() throws java.lang.Exception
- Throws:
java.lang.Exception
-
getSortedUsedTypes
public TypeImpl[] getSortedUsedTypes()
-
getUsedTypesIterable
private java.lang.Iterable<TypeImpl> getUsedTypesIterable()
-
enqueueIncoming
private void enqueueIncoming()
Enqueues all FS that are stored in the sharedData's id map. This map is populated during the previous deserialization. This method is used to make sure that all incoming FS are echoed in the next serialization. It is required if there are out-of-type FSs that are being merged back into the serialized form; those might reference some of these.
-
enqueueIndexed
private void enqueueIndexed()
Adds the indexed FSs onto indexedFSs, by view. Adds the SofaFSs onto the by-ref queue.
-
enqueueNonsharedMultivaluedFS
private void enqueueNonsharedMultivaluedFS()
When serializing Delta CAS, enqueue encompassing FS of nonshared multivalued FS that have been modified. The embedded nonshared-multivalued item could be a list or an array
-
enqueueFeaturesOfIndexed
private void enqueueFeaturesOfIndexed() throws org.xml.sax.SAXException
Enqueue everything reachable from features of indexed FSs.
- Throws:
org.xml.sax.SAXException
-
enqueueFeaturesOfFSs
private void enqueueFeaturesOfFSs(java.util.List<TOP> fss) throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
enqueueCommon
int enqueueCommon(TOP fs)
-
enqueueCommonWithoutDeltaAndFilteringCheck
int enqueueCommonWithoutDeltaAndFilteringCheck(TOP fs)
-
enqueueCommon
private int enqueueCommon(TOP fs, boolean doDeltaAndFilteringCheck)
- Parameters:
fs -
doDeltaAndFilteringCheck -
- Returns:
true to have enqueue put onto "queue" and enqueue features
-
enqueueIndexedFs_only_not_features
void enqueueIndexedFs_only_not_features(int viewNumber, TOP fs)
-
enqueueFsAndMaybeFeatures
private void enqueueFsAndMaybeFeatures(TOP fs) throws org.xml.sax.SAXException
Enqueue an FS, and everything reachable from it. This call is recursive with enqueueFeatures, and an arbitrarily long chain can cause a stack overflow error. Probably should fix this someday. See https://issues.apache.org/jira/browse/UIMA-106
- Parameters:
addr - The FS address.
- Throws:
org.xml.sax.SAXException
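The stack overflow risk comes from recursing once per link in a reference chain. A common general-purpose alternative is an explicit worklist, sketched below outside of UIMA. This is not how this class is implemented, nor a fix proposed in UIMA-106; `Fs`, `refs`, `WorklistSketch` and `reachableFrom` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a feature structure with FS-valued features.
final class Fs {
  final List<Fs> refs = new ArrayList<>();
}

final class WorklistSketch {
  // Collects everything reachable from root without recursion, so the length
  // of a reference chain does not consume call stack.
  static List<Fs> reachableFrom(Fs root) {
    Set<Fs> seen = Collections.newSetFromMap(new IdentityHashMap<Fs, Boolean>());
    Deque<Fs> pending = new ArrayDeque<>();
    List<Fs> order = new ArrayList<>();
    seen.add(root);
    pending.push(root);
    while (!pending.isEmpty()) {
      Fs current = pending.pop();
      order.add(current);
      for (Fs next : current.refs) {
        // seen.add is false for nodes already scheduled, which also handles cycles.
        if (next != null && seen.add(next)) {
          pending.push(next);
        }
      }
    }
    return order;
  }
}
```

The pending work lives on the heap in the deque, so a chain of several hundred thousand nodes is limited by memory rather than by thread stack size.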
-
isListElementsMultiplyReferenced
private boolean isListElementsMultiplyReferenced(TOP listNode)
For lists, see if this is a plain list:
- no loops
- no other refs to list elements from outside the list
If so, return false. Add all the elements of the list to visited_not_yet_written, noting if they've already been added; this indicates either a loop or another ref from outside, and in either case, return true.
- Parameters:
curNode -
featCode -
- Returns:
false if no list element is multiply-referenced, true if there is a loop or another ref from outside the list, for one or more list element nodes
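The check described here can be sketched on a plain linked list. The code below is a stand-alone illustration under stated assumptions: `ListNode`, `ListRefSketch` and `hasMultiplyReferencedNode` are hypothetical, the shared set stands in for visited_not_yet_written, and the extra per-walk set is a choice made in this sketch to guarantee termination on a loop.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical singly linked list node; not UIMA's FSList.
final class ListNode {
  ListNode tail; // null marks the end of the list
}

final class ListRefSketch {
  // Walks the list from head, adding each node to visited.
  // Returns true if a node is met twice in this walk (a loop) or was already
  // in visited before this walk (a reference from outside the list).
  static boolean hasMultiplyReferencedNode(ListNode head, Set<ListNode> visited) {
    Set<ListNode> thisWalk =
        Collections.newSetFromMap(new IdentityHashMap<ListNode, Boolean>());
    boolean found = false;
    for (ListNode n = head; n != null; n = n.tail) {
      if (!thisWalk.add(n)) {
        return true; // loop: stop here, every node on the loop is already recorded
      }
      if (!visited.add(n)) {
        found = true; // reached earlier from somewhere else; keep walking
      }
    }
    return found;
  }
}
```

A second walk that starts in the middle of an already visited list reports true, because its nodes are found in the shared set.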
-
isMultiRef_enqueue
private boolean isMultiRef_enqueue(FeatureImpl fi, TOP featVal, boolean alreadyVisited, boolean isListNode, boolean isListFeat) throws org.xml.sax.SAXException
Ordinary FSs referenced as features are not checked by this routine; this is only called for FSLists of various kinds, and FS arrays of various kinds. Not all featValues should be enqueued:
- list or array features which are marked **NOT** multiple-refs-allowed are serialized in-line
- for JSON, when using dynamicMultiRef (the default), list / array FSs are serialized by ref (not in-line) if there are multiple refs to them
- for XMI and JSON, any FS ref marked as multiple-refs-allowed forces the item onto the ref "queue"
(Not handled here: ordinary FSs are serialized in-line in JSON with isDynamicMultiRef.)
- Parameters:
fi - the feature, to look up the multiRefAllowed flag
featVal - the List or array element
alreadyVisited - true if visited_not_yet_written contains the featVal
isListNode -
isListFeat -
- Returns:
false if should skip enqueue because this array or list is being serialized inline
- Throws:
org.xml.sax.SAXException
-
enqueueFeatures
private void enqueueFeatures(TOP fs) throws org.xml.sax.SAXException
Enqueue all FSs reachable from features of the given FS.
- Parameters:
addr - address of an FS
typeCode - type of the FS
insideListNode - true iff the enclosing FS (addr) is a list type
- Throws:
org.xml.sax.SAXException
-
enqueueFSArrayElements
private void enqueueFSArrayElements(FSArray fsArray) throws org.xml.sax.SAXException
Enqueues all FS reachable from an FSArray.
- Parameters:
addr - Address of an FSArray
- Throws:
org.xml.sax.SAXException
-
enqueueFSListElements
private void enqueueFSListElements(FSList<TOP> node) throws org.xml.sax.SAXException
Enqueues all Head values of FSList reachable from an FSList. This does NOT include the list nodes themselves.
- Parameters:
addr - Address of an FSList
- Throws:
org.xml.sax.SAXException
-
encodeIndexed
public void encodeIndexed() throws java.lang.Exception
- Throws:
java.lang.Exception
-
encodeFSs
private void encodeFSs(java.util.List<TOP> fss) throws java.lang.Exception
- Throws:
java.lang.Exception
-
encodeQueued
public void encodeQueued() throws java.lang.Exception
- Throws:
java.lang.Exception
-
encodeFS
public void encodeFS(TOP fs) throws java.lang.Exception
Encode an individual FS. JSON has 2 encodings.
For type:
"typeName" : [ { "@id" : 123, feat : value .... }, { "@id" : 456, feat : value .... }, ... ], ...
For id:
"nnnn" : {"@type" : typeName ; feat : value ...}
For cases where the top level type is an array or list, there is a generated feature name, "@collection", whose value is the list or array of values associated with that type.
- Parameters:
fs - the FS to be encoded.
- Throws:
org.xml.sax.SAXException - passthru
java.lang.Exception
-
getElementCountForSharedData
int getElementCountForSharedData()
-
getXmiId
public java.lang.String getXmiId(TOP fs)
Get the XMI ID to use for an FS.
- Parameters:
fs - the FS
- Returns:
XMI ID or null
-
getXmiIdAsInt
public int getXmiIdAsInt(TOP fs)
-
getNameSpacePrefix
public java.lang.String getNameSpacePrefix(java.lang.String uimaTypeName, java.lang.String nsUri, int lastDotIndex)
-
getUniqueString
public java.lang.String getUniqueString(java.lang.String s)
-
getTypeNameFromXmlElementName
public java.lang.String getTypeNameFromXmlElementName(XmlElementName xe)
-
isStaticMultiRef
public boolean isStaticMultiRef(FeatureImpl fi)
-
-