FP Messages
From ETaxonomy
General Concepts
Informal description of atomic messages: One message = one purpose!!!
What is the "message originator" (person? client?) Perhaps both; thus the Header element <xsd:element name="Originator"/> would be a complex type consisting of a GUID for the node that originates the message and a string that represents an (untrusted) assertion by the client software of the name of the person who is logged in and using the client to generate the message.
Multiple clients at same node? Probably easier than not
Atomicity of Message filtering? All or none for now.
Message arguments
- Network object ID (or list of same[?])
Tradeoff - send message multiple times, or send large complex messages. Implementation level decision?
- Operation ID
What kind of message is this (FP_TAXON_COMMENT). Extensible and schema based? Portal to community is another edge where messages apply - privileged on one side, not privileged on the other (relevance of xml access controls).
- Originator of Message
- machine
- person
- Message Signature
- Signature by the client code that generated the message, Public key available in the network to validate signature and source of message as known client.
- Destination of message ??
Do messages get filtered on origination (symmetrical to filter on reception?) Are there messages where the destination is other than broadcast on network? Message destination other than access point may be irrelevant as this level of message handling is handled by the messaging system.
- Message content - still needs elucidation.
- See message types below
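The argument list above can be pictured as a single message envelope. The following is only a sketch for discussion: the class and field names (FPMessage, Originator, object_ids, and so on) are hypothetical, and treating a missing destination as "broadcast" is an assumption, since the notes leave destination handling open.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import uuid


@dataclass(frozen=True)
class Originator:
    node_guid: str        # GUID of the node that originates the message
    asserted_person: str  # untrusted name asserted by the client software


@dataclass
class FPMessage:
    operation_id: str                  # what kind of message, e.g. "FP_ANNOTATION"
    originator: Originator             # machine and person
    object_ids: List[str]              # one network object ID, or a list of them
    signature: Optional[bytes] = None  # would be produced by the client code
    destination: Optional[str] = None  # assumption: None is read as broadcast
    content: dict = field(default_factory=dict)  # still needs elucidation

    def is_broadcast(self) -> bool:
        return self.destination is None


msg = FPMessage(
    operation_id="FP_ANNOTATION",
    originator=Originator(node_guid=str(uuid.uuid4()), asserted_person="J. Macklin"),
    object_ids=["set-0001"],
)
```

Carrying a list in object_ids is one side of the tradeoff noted above; the other side would be sending one such message per object.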
Sets
Is a Set immutable? One application is material that seems to be a duplicate but isn't, or that seems not to be a duplicate until something later determines that it is.
Are sets non-intersecting?
What are relations between Sets.
How is a Set defined? Hopefully some mix of automatic and people-originated. Replace "Set" with "Cataloged (Virtual)Container" ?
Other General Comments
Sites as collections
Primitive is cataloged collection object?
Composite objects: what investigates passing messages down to composite pieces?
"Cataloged" = "GUIDable"?
Value to an FP network of "I want to know if these investigators ever begin working on these network objects (or objects defined by these properties)". Also: "What are the things for which other people say the above is met? Subscribe me to those."
Need authentication of agents (people or software), whether or not through a node, perhaps only through a portal.
Potential Client to Network Messages Listed in Use Cases
Use Case Find Duplicates
- findDuplicates
- makeQuery
- findSets
- makeAssertion FP_ANNOTATION
- addToSet
Use Case AnnotateSpecimen
- makeAnnotation FP_ANNOTATION
- makeCorrection FP_ANNOTATION + rules ApplePieRules
- addNewInformation FP_ANNOTATION + rules ApplePieRules
- makeNewDetermination FP_ANNOTATION + rules for new determination ApplePieRules.
- makeQuery
- inventory
- findSets
- makeAssertion FP_ANNOTATION
- acceptAnnotation FP_ANNOTATION
- rejectAnnotation FP_ANNOTATION
- createFilter
- expressInterest FP_SUBSCRIBE
Use Case Quality Control New Record
- makeQuery
- makeAnnotation FP_ANNOTATION
- injectWorkflow
- invokeWorkflow
- discoverWorkflows
- (also) annotateWorkflow FP_ANNOTATION
- analyzeGroups
- addRule
- applyRule
Use_Cases_from_Web_Client_Scenarios
- queryForParticularDuplicate
- makeAnnotation FP_ANNOTATION
- addToSet FP_ANNOTATION
- listInterests
- makeQuery See: Potential Query Scenarios
- makeAssertion FP_ANNOTATION
- acceptAnnotation FP_ANNOTATION
- rejectAnnotation FP_ANNOTATION
- reportAccessPointStatus FP_PING
- reportNetworkStatus
Use Case Researcher_Seeks_DwC_Metadata
- queryForData (DarwinCore Metadata)
- queryForAnnotations
- subscribe FP_SUBSCRIBE
Potential Network to Client Messages listed in Use Cases
Use Case Annotate Specimen, Use Case Ingest Annotation
- recieveNotification
- listenForEvents
The Network_Monitoring_Use_Cases could also call for event notification of administrative clients.
- reportSuspiciousActivity
- reportNetworkProblems
Messages
FP_PING
FP Client wishes to know if a FP Access point is listening.
FP_SUBSCRIBE
Semantics: originator registers interest in <something>
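One way to picture "registers interest" is a registry mapping each topic to the originators interested in it. This is a minimal sketch, not a design: the class InterestRegistry, its method names, and the use of plain strings for originators and topics are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class InterestRegistry:
    """Hypothetical in-memory store of FP_SUBSCRIBE registrations."""

    def __init__(self) -> None:
        self._interests: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, originator: str, topic: str) -> None:
        # Registering the same interest twice has no further effect.
        self._interests[topic].add(originator)

    def subscribers(self, topic: str) -> List[str]:
        return sorted(self._interests.get(topic, set()))

    def list_interests(self, originator: str) -> List[str]:
        return sorted(t for t, subs in self._interests.items() if originator in subs)


registry = InterestRegistry()
registry.subscribe("node-A", "set-0001")
registry.subscribe("node-B", "set-0001")
registry.subscribe("node-A", "genus:Rubus")
```

Here list_interests corresponds loosely to the listInterests message named in the web client use cases.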
FP_ANNOTATION
Wraps an AO/AOD annotation.
The semantics of what is being annotated is delegated to the annotation.
Annotation typing is delegated to rules: See ApplePieRules for typing of annotations.
Potential Messages
Note: These retain substantial baggage from the prototype and need to be reworked.
General
FP_NOTIFICATION
E.g.
- An asynchronous message has a reply waiting for you (network notifies client)
- one of your subscriptions has a new publication
- I am a data provider with new data available to the network (client notifies network)
- Network has a new subscription that people might be interested in (network broadcasts?)(what about authorization?)
FP_DATAHASCHANGED
Semantics: A data provider is indicating that data they have available for query or harvest has changed.
FP_ASSERT (deprecated)
Generalization is FP_ANNOTATION
args: (true, false, accept, not-accept). Semantics: originator is asserting that something is true (and thus accepting it), false (and thus rejecting it), or accepting (or rejecting) it without agreeing or disagreeing with its validity. The fourth case, not-accept, emerged in discussions at TDWG 2008, including with Mark Mayfield, who indicated a desire to not accept some subset of new determinations that might be true but which reflected a new combination that their institution might not want to store in their database or record as an annotation on the specimen. The value not-accept is essentially a formal mechanism for ignoring the message (possibly distinguishing institutions that review incoming annotations from those that ignore them). Examples: James accepts all annotations made by Tony. James says that this determination made by Anne is correct. James says that this determination made by Henry is incorrect.
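The four argument values separate two questions: does the originator take the thing on, and does the originator express a view on its validity? A small sketch of that reading follows; the enum name Assertion and the two property names are hypothetical.

```python
from enum import Enum


class Assertion(Enum):
    TRUE = "true"              # asserted true, and thus accepted
    FALSE = "false"            # asserted false, and thus rejected
    ACCEPT = "accept"          # accepted without a view on validity
    NOT_ACCEPT = "not-accept"  # not accepted, without a view on validity

    @property
    def accepts(self) -> bool:
        return self in (Assertion.TRUE, Assertion.ACCEPT)

    @property
    def judges_validity(self) -> bool:
        return self in (Assertion.TRUE, Assertion.FALSE)
```

Under this reading, "James accepts all annotations made by Tony" would be ACCEPT, while "this determination made by Anne is correct" would be TRUE.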
Queries
FP_QUERY
General, or specific subtypes (inventory, find sets, get data)?
FP_INVENTORY
Semantics: How many sets do you know about with property X?
FP_FIND_SETS
Semantics: Which sets do you know about with property X?
FP_GET_DATA
Given a set, retrieve all associated data.
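The three query subtypes differ mainly in what they return: a count, a list of set identifiers, or the data itself. The sketch below assumes a hypothetical in-memory store keyed by SetID, with made-up set and record identifiers; the function names simply mirror the message names.

```python
from typing import Dict, List

# Hypothetical store: SetID -> {"properties": {...}, "data": [...]}
store: Dict[str, dict] = {
    "set-1": {"properties": {"genus": "Rubus"}, "data": ["rec-a", "rec-b"]},
    "set-2": {"properties": {"genus": "Rubus"}, "data": ["rec-c"]},
    "set-3": {"properties": {"genus": "Quercus"}, "data": ["rec-d"]},
}


def fp_find_sets(store: Dict[str, dict], prop: str, value: str) -> List[str]:
    """Which sets do you know about with property X?"""
    return sorted(sid for sid, s in store.items() if s["properties"].get(prop) == value)


def fp_inventory(store: Dict[str, dict], prop: str, value: str) -> int:
    """How many sets do you know about with property X?"""
    return len(fp_find_sets(store, prop, value))


def fp_get_data(store: Dict[str, dict], set_id: str) -> List[str]:
    """Given a set, retrieve all associated data."""
    return list(store[set_id]["data"])
```

Written this way, FP_INVENTORY is just a count over FP_FIND_SETS, which may bear on whether a general FP_QUERY with subtypes is enough.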
Annotations
Note: Typed annotation messages are leftovers from FP Prototype design.
FP_CORRECTION_ASSERTION (deprecated)
arguments: SetID, list of one or more {schema, key, value} Semantics: assertion that something needs correction, e.g. "Darwin:Collector should be J.Macklin". Message originator is offering an annotation consisting of an arbitrary set of concepts (and values for those concepts) to be applied to a particular set, where there are existing values for those concepts present, and those existing values should be replaced with the correction. Contains schema, concept in schema, assertion of value.
Possible semantics follow (expressed largely in terms of ABCD concepts, but suggesting semantic groupings a FP network might require, along with other concepts not described in ABCD, also with some references to the TDWG LSID ontology). Note that these are general descriptions of possible semantics for discussion.
<appliesTo> Required, can't be empty.
<GUID> Required if applicable
Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID, or FP: SetID) Required (two elements?)
Value: Required
</GUID>
<darwincoretriplet> Required if applicable
ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID Required
ABCD: Datasets/Dataset/Units/Unit/SourceID Required
ABCD: Datasets/Dataset/Units/Unit/UnitID Required
</darwincoretriplet>
</appliesTo>
<corrections> Required, list of one or more correction.
<correction> Required
Schema Required
Key Required
Value Required
</correction>
</corrections>
<correctionBy> Required.
ABCD: Datasets/Dataset/Units/Unit/Identifications/Identification/Identifiers/IdentifiersText Required
Alternately, TDWGOntology/Base/Person http://rs.tdwg.org/ontology/Core#Person
ABCD: Datasets/Dataset/Units/Unit/Identifications/Identification/Date Required
</correctionBy>
<discussion> Optional
ABCD: Datasets/Dataset/Units/Unit/Notes
</discussion>
The use cases seem to require that a correction come from a person, rather than a software agent. Determinations that something is incorrect by a software agent (including analysis by the network) should be quality control assertions rather than corrections. The distinction is probably not so much human agent or software agent, but whether the source of the correction is something entirely outside of the realm of the network (such as knowledge inside a person's head), or whether it is an inference based on examination of data in the network. Software agents able to read handwriting and correlate data in scanned field notes, publications, specimen ledgers, and specimen label data will blur this distinction, but they should probably get a new class of messages, and the correction (and new data) assertions should be required to originate from people.
May require launching another message defining a new set. For example if new collector, then need a message moving this from one message set to another.
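The replace semantics of a correction can be sketched as follows. This is an illustration under assumptions: the record is modeled as a dictionary keyed by (schema, key), the names Correction and apply_corrections are hypothetical, and rejecting a correction that has no existing value to replace is one possible reading of the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Correction:
    schema: str
    key: str
    value: str


def apply_corrections(
    record: Dict[Tuple[str, str], str], corrections: List[Correction]
) -> Dict[Tuple[str, str], str]:
    """Return a copy of record with existing values replaced by the corrections."""
    if not corrections:
        raise ValueError("<corrections> requires one or more correction")
    updated = dict(record)
    for c in corrections:
        if (c.schema, c.key) not in updated:
            # Assumption: with no existing value this would be new data, not a correction.
            raise KeyError(f"no existing value for {c.schema}:{c.key}")
        updated[(c.schema, c.key)] = c.value
    return updated


original = {("Darwin", "Collector"): "J.Maclin"}
corrected = apply_corrections(original, [Correction("Darwin", "Collector", "J.Macklin")])
```

Returning a copy leaves the original record untouched, which fits a network where a correction is an annotation that a data provider may or may not accept.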
FP_NEW_DATA_ASSERTION (deprecated)
arguments: SetID, list of one or more {schema, key, value} Semantics: This provides new data not in the set, e.g. "Here is a locality assertion", e.g. "Here is a Concept Schema and an instance or part of it", e.g. adding georeferences not already in the metadata. Message originator is offering an annotation consisting of an arbitrary set of concepts (and values for those concepts) to be applied to a particular set, where there may or may not be existing values for those concepts present. Should existing values be present, the message originator is offering the new values as additional values. Contains schema, concept in schema, assertion of value.
May require that acceptors launch a message to notify that this should be part of duplicate set message.
Possible semantics:
<appliesTo> Required, can't be empty.
<GUID> Required if applicable
Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID, or FP: SetID) Required (two elements?)
Value: Required
</GUID>
<darwincoretriplet> Required if applicable
ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID Required
ABCD: Datasets/Dataset/Units/Unit/SourceID Required
ABCD: Datasets/Dataset/Units/Unit/UnitID Required
</darwincoretriplet>
</appliesTo>
<newData> Required, list of one or more newDatum
<newDatum> Required
Schema Required
Key Required
Value Required
</newDatum>
</newData>
<discussion> Optional
ABCD: Datasets/Dataset/Units/Unit/Notes
</discussion>
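In contrast to a correction, new data is offered as additional values and never replaces what is already there. A minimal sketch of that difference, assuming a hypothetical record model in which each (schema, key) holds a list of values; the function name add_new_data and the sample values are illustrative only.

```python
from typing import Dict, List, Tuple

Record = Dict[Tuple[str, str], List[str]]


def add_new_data(record: Record, new_data: List[Tuple[str, str, str]]) -> Record:
    """Return a copy of record with each {schema, key, value} added as an extra value."""
    if not new_data:
        raise ValueError("<newData> requires one or more newDatum")
    updated = {k: list(v) for k, v in record.items()}
    for schema, key, value in new_data:
        # Existing values are kept; the new value is appended alongside them.
        updated.setdefault((schema, key), []).append(value)
    return updated


before: Record = {("Darwin", "Locality"): ["near river"]}
after = add_new_data(
    before,
    [("Darwin", "Locality", "north bank"), ("Darwin", "DecimalLatitude", "42.0")],
)
```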
FP_NEW_DETERMINATION (deprecated)
Message: FP_TAXON_COMMENT args: SetID, Taxon_Name (or TaxonConcept) Semantics: message originator is offering opinion of name that should be attached to specimen; SetID represents network ID of a set of putative duplicates; Special case of FP_NEW_DATA_ASSERTION with a set of required elements.
<PJM:Question>Are these generalizable to a single annotation assertion that has a type as an argument, where different types of assertions can have different required elements and can activate different behaviors?</PJM>
Possible semantics:
For an example new determination see IPT Demo
<appliesTo> Required, can't be empty.
<GUID> Required if applicable
Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID or FP: SetID) Required (two elements?)
Value: Required
</GUID>
<darwincoretriplet> Required if applicable
ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID Required
ABCD: Datasets/Dataset/Units/Unit/SourceID Required
ABCD: Datasets/Dataset/Units/Unit/UnitID Required
</darwincoretriplet>
</appliesTo>
<determination> Required.
ABCD: Datasets/Dataset/Units/Unit/Identifications/Identification/TaxonIdentified/ScientificName/FullScientificNameString Optional
ABCD: Datasets/Dataset/Units/Unit/Identifications/Identification/TaxonIdentified/ScientificName/NameAtomised Required
ABCD: Datasets/Dataset/Units/Unit/Identifications/Identification/TaxonIdentified/IdentificationQualifier Optional
ABCD: Datasets/Dataset/Units/Unit/Sex Optional
DarwinCore1.4: LifeStage or http://rs.tdwg.org/ontology/voc/Specimen#lifeStage Optional
ABCD: Datasets/Dataset/Units/Unit/Identifications/Identification/Identifiers/IdentifiersText Required
ABCD: Datasets/Dataset/Units/Unit/Identifications/Identification/Date Required
</determination>
<discussion> Optional
ABCD: Datasets/Dataset/Units/Unit/Notes
</discussion>
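The question raised above, whether these messages generalize to a single annotation assertion with a type argument, can be explored with a table of required elements per type. The sketch below is one hypothetical shape for that idea; the type names and the function missing_elements are assumptions, while the element names come from the "possible semantics" outlines.

```python
from typing import Dict, Set

# Required top-level elements per annotation type, taken from the outlines in these notes.
REQUIRED_ELEMENTS: Dict[str, Set[str]] = {
    "correction": {"appliesTo", "corrections", "correctionBy"},
    "new_data": {"appliesTo", "newData"},
    "new_determination": {"appliesTo", "determination"},
}


def missing_elements(annotation_type: str, body: dict) -> Set[str]:
    """Return the required elements that are absent or empty in body."""
    required = REQUIRED_ELEMENTS[annotation_type]
    return {name for name in required if not body.get(name)}


incomplete = {"appliesTo": {"GUID": "set-0001"}}
complete = {"appliesTo": {"GUID": "set-0001"}, "determination": {"NameAtomised": "Rubus"}}
```

Different behaviors per type could then hang off the same table, rather than off separate message kinds.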
FP_QUALITY_ISSUE_ASSERTION (deprecated)
Generated by network analysis of data, indicating that one or more elements of a data set appear to be outliers. arguments: SetID, list of one or more {schema, key, value} Semantics: assertion that something has been identified as potentially problematic in a quality control review and may need correction, e.g. "Darwin:Collector:J.Macklin; collected before collector's birth; Darwin:YearCollected:1830; collected before collector's birth".
Contains problematic schema, concept in schema, assertion of value, identification of the problem, and optionally a correction.
Might reference another FP_MESSAGE.
Possible semantics:
<appliesTo> Required, can't be empty.
<GUID> Required if applicable
Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID, or FP: SetID) Required (two elements?)
Value: Required
</GUID>
<darwincoretriplet> Required if applicable
ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID Required
ABCD: Datasets/Dataset/Units/Unit/SourceID Required
ABCD: Datasets/Dataset/Units/Unit/UnitID Required
</darwincoretriplet>
</appliesTo>
<problems> Required, list of one or more problem.
<problem> Required
Schema Required
Key Required
Value Required
WhyThisIsAProblem Required.
StatisticalResults Optional
<correction> Optional
Schema Required
Key Required
Value Required
</correction>
</problem>
</problems>
<QCBy> Required.
TDWGOntology/Base/Actor Required. http://rs.tdwg.org/ontology/Base#BaseActor (includes both people and software agents).
</QCBy>
<discussion> Required
ABCD: Datasets/Dataset/Units/Unit/Notes
</discussion>
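The "collected before collector's birth" example can be sketched as a simple check that emits a problem entry. Everything here is illustrative: the function name, the flat record layout, and especially the birth-year lookup, whose collector name and year are made-up placeholders rather than real data.

```python
from typing import Dict, Optional

# Placeholder lookup for illustration only; not real biographical data.
BIRTH_YEARS: Dict[str, int] = {"Example Collector": 1900}


def check_collected_before_birth(
    record: Dict[str, str], birth_years: Dict[str, int]
) -> Optional[Dict[str, str]]:
    """Return a <problem>-like dict if the collecting year precedes the collector's birth."""
    collector = record.get("Darwin:Collector")
    year = record.get("Darwin:YearCollected")
    if collector is None or year is None or collector not in birth_years:
        return None  # not enough information to judge
    if int(year) < birth_years[collector]:
        return {
            "Schema": "Darwin",
            "Key": "YearCollected",
            "Value": year,
            "WhyThisIsAProblem": "collected before collector's birth",
        }
    return None


problem = check_collected_before_birth(
    {"Darwin:Collector": "Example Collector", "Darwin:YearCollected": "1830"},
    BIRTH_YEARS,
)
```

The check does not decide whether the year or the collector is wrong, which matches the outline above where a correction is optional.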
FP_ASSOCIATION_ASSERTION (deprecated)
An annotation asserting that one data object has a particular relationship to another data object, for example: this species eats this other species; this insect specimen was found on this plant specimen. Suggested by descriptions of classes of annotations under consideration by the Atlas of Living Australia in meeting at TDWG 2008 on 2008Oct19. Can likely use controlled vocabulary of annotations in TDWG Ontology.
Set Operators
These look like they can be generalized, and may be two sorts of operation, with some (add/remove) being expressed as annotations, and others (build sets/add generation rule) being analysis instructions.
FP_ADD_SHEET
Message: FP_ADD_SHEET args: SetID; SpecimenID Semantics: message originator is asserting a specimen belongs in the given Set (Is this an annotation, or should there be a way to enforce it?)
FP_REMOVE_SHEET
Semantics: reverse of FP_ADD_SHEET
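As plain set operations, the add and remove pair might look like the sketch below. It treats the message as directly changing membership, which is only one of the two options raised above (the other being an annotation that merely asserts membership); the function names and the dictionary of sets are hypothetical.

```python
from typing import Dict, Set


def fp_add_sheet(sets: Dict[str, Set[str]], set_id: str, specimen_id: str) -> None:
    """Place the specimen in the given Set, creating the Set if needed."""
    sets.setdefault(set_id, set()).add(specimen_id)


def fp_remove_sheet(sets: Dict[str, Set[str]], set_id: str, specimen_id: str) -> None:
    """Reverse of fp_add_sheet; removing an absent specimen has no effect."""
    sets.get(set_id, set()).discard(specimen_id)


duplicate_sets: Dict[str, Set[str]] = {}
fp_add_sheet(duplicate_sets, "set-0001", "specimen-17")
fp_add_sheet(duplicate_sets, "set-0001", "specimen-42")
fp_remove_sheet(duplicate_sets, "set-0001", "specimen-17")
```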
FP_ADD_NEW_SET
Semantics: ???
FP_ADD_SET_GENERATION_RULE
Semantics: message originator is describing a novel set of rules for creating sets and determining set membership (e.g. a new rule for building sets of collection objects that have determinations within the same taxonomic concept).
FP_BUILD_SETS
Semantics: given a rule and optionally limiting criteria, build sets with that rule (find all sets of duplicate specimens of Rubus, find all sets of duplicate specimens known to the network).
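Building sets from a rule plus optional limiting criteria can be pictured as grouping records by a key the rule computes. In this sketch the rule (same collector and collector number), the record fields, and the function name build_sets are all assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Optional

Rec = Dict[str, str]


def build_sets(
    records: List[Rec],
    rule: Callable[[Rec], Hashable],
    criteria: Optional[Callable[[Rec], bool]] = None,
) -> Dict[Hashable, List[str]]:
    """Group record ids by the rule's key, keeping only groups with more than one member."""
    groups: Dict[Hashable, List[str]] = defaultdict(list)
    for rec in records:
        if criteria is None or criteria(rec):
            groups[rule(rec)].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


records = [
    {"id": "a", "genus": "Rubus", "collector": "X", "number": "101"},
    {"id": "b", "genus": "Rubus", "collector": "X", "number": "101"},
    {"id": "c", "genus": "Quercus", "collector": "Y", "number": "7"},
    {"id": "d", "genus": "Quercus", "collector": "Y", "number": "7"},
]
same_gathering = lambda r: (r["collector"], r["number"])  # hypothetical duplicate rule
all_sets = build_sets(records, same_gathering)
rubus_sets = build_sets(records, same_gathering, criteria=lambda r: r["genus"] == "Rubus")
```

The two calls correspond to "all sets of duplicate specimens known to the network" and "all sets of duplicate specimens of Rubus". An FP_ADD_SET_GENERATION_RULE message would, in this picture, supply a new rule function.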
Community Messages
Need further discussion and elucidation.
FP_WIP
Semantics: Notification that work is in progress on a network-identifiable object. Up to client to interpret what to do with the object, e.g. if it is decomposable at the client side. Does this entail producing new, transient(?) identifiable objects?
Might include:
- Mark as Work In Progress
- Release Work In Progress
- Query for Work In Progress
- Inventory Work In Progress
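The four operations listed above could sit on a small registry of claims. This is a sketch for discussion only: the class WorkInProgress, its method names, and the choice that an object has at most one claimant at a time are assumptions the notes do not settle.

```python
from typing import Dict, Optional


class WorkInProgress:
    """Hypothetical registry of which agent is working on which network object."""

    def __init__(self) -> None:
        self._claims: Dict[str, str] = {}

    def mark(self, object_id: str, agent: str) -> bool:
        """Mark as Work In Progress; fails if the object is already claimed."""
        if object_id in self._claims:
            return False
        self._claims[object_id] = agent
        return True

    def release(self, object_id: str, agent: str) -> bool:
        """Release Work In Progress; only the claiming agent may release."""
        if self._claims.get(object_id) != agent:
            return False
        del self._claims[object_id]
        return True

    def query(self, object_id: str) -> Optional[str]:
        """Query for Work In Progress: who, if anyone, is working on this object?"""
        return self._claims.get(object_id)

    def inventory(self) -> int:
        """Inventory Work In Progress: how many objects are currently claimed?"""
        return len(self._claims)


wip = WorkInProgress()
first = wip.mark("set-0001", "node-A")
second = wip.mark("set-0001", "node-B")
```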
Next to do
Spell out specifics of these messages. Link to use cases.