FP Messages

From ETaxonomy


General Concepts

Informal description of atomic messages: One message = one purpose!!!

What is the "message originator": a person or a client? Perhaps both. The Header element <xsd:element name="Originator"/> could then be a complex type consisting of a GUID for the node that originates the message and a string representing an (untrusted) assertion by the client software of the name of the person who is logged in and using the client to generate the message.
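As a rough illustration of the two-part originator described above, here is a minimal Python sketch. The class name, field names, and the `make_originator` helper are hypothetical and not part of any FP schema; the only ideas taken from the notes are the node GUID and the untrusted user-name string.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Originator:
    """Hypothetical model of the proposed Originator header element."""
    node_guid: uuid.UUID   # GUID of the node that originates the message
    asserted_user: str     # untrusted: the client's claim about who is logged in


def make_originator(node_guid: str, asserted_user: str) -> Originator:
    # uuid.UUID raises ValueError on a malformed GUID string
    return Originator(uuid.UUID(node_guid), asserted_user.strip())
```

Keeping the user name as a plain string, separate from the GUID, reflects that it is only an assertion by the client and carries no authentication by itself.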

Multiple clients at same node? Probably easier than not

Atomicity of Message filtering? All or none for now.

Message arguments

  • Network object ID (or list of same[?])

Tradeoff - send message multiple times, or send large complex messages. Implementation level decision?

  • Operation ID

What kind of message is this (e.g. FP_TAXON_COMMENT)? Extensible and schema based? The portal to the community is another edge where messages apply: privileged on one side, not privileged on the other (relevance of XML access controls).

  • Originator of Message
    • machine
    • person
  • Message Signature
    • Signature by the client code that generated the message. A public key is available in the network to validate the signature and to confirm the source of the message as a known client.
  • Destination of message ??

Do messages get filtered on origination (symmetrical to filtering on reception)? Are there messages where the destination is other than broadcast on the network? A message destination other than the access point may be irrelevant, as this level of message handling is done by the messaging system.

  • Message content - still needs elucidation.
    • See message types below
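The argument list above could be gathered into a single envelope along the following lines. This is only a sketch for discussion: the class name `FPMessage`, all field names, and the convention that a missing destination means broadcast are assumptions, since the notes leave destination and content open.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FPMessage:
    """Hypothetical envelope mirroring the argument list above."""
    operation_id: str                      # e.g. "FP_PING", "FP_ANNOTATION"
    object_ids: List[str]                  # one or more network object IDs
    originator_node: str                   # machine (node GUID)
    originator_person: str                 # person (untrusted assertion)
    signature: Optional[bytes] = None      # client-code signature, if signed
    destination: Optional[str] = None      # assumption: None means broadcast
    content: dict = field(default_factory=dict)

    def is_broadcast(self) -> bool:
        return self.destination is None
```

Allowing a list of object IDs in one message is one side of the tradeoff noted above; sending one message per object would use a single-element list.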

Sets

Is a Set immutable? One application is material that seems to be a duplicate but isn't, or that seems not to be a duplicate but is later determined to be one.

Are sets non-intersecting?

What are the relations between Sets?


How is a Set defined? Hopefully some mix of automatic and people-originated. Replace "Set" with "Cataloged (Virtual)Container" ?

Other General Comments

Sites as collections

Primitive is cataloged collection object?

Composite objects: what investigates passing messages down to composite pieces?

"Cataloged" = "GUIDable"?

Value to an FP network of requests such as "I want to know if these investigators ever begin working on these network objects (or objects defined by these properties)." Also, "What do other people say about the things for which the above is met? Subscribe me to those."

Need authentication of agents (people or software), whether or not through a node, perhaps only through a portal.




Potential Client to Network Messages Listed in Use Cases

Use Case Find Duplicates

  • findDuplicates
    • makeQuery
    • findSets
  • makeAssertion FP_ANNOTATION
  • addToSet

Use Case AnnotateSpecimen

Use Case Quality Control New Record

Use Case Overview

  • analyzeGroups
  • addRule
  • applyRule

Use_Cases_from_Web_Client_Scenarios

Network_Monitoring_Use_Cases

  • reportAccessPointStatus FP_PING
  • reportNetworkStatus

Use Case Researcher_Seeks_DwC_Metadata

  • queryForData (DarwinCore Metadata)
  • queryForAnnotations
  • subscribe FP_SUBSCRIBE

Potential Network to Client Messages listed in Use Cases

Use Case Annotate Specimen, Use Case Ingest Annotation

  • receiveNotification

Use_cases#Overview

  • listenForEvents

The Network_Monitoring_Use_Cases could also call for event notification of administrative clients.

  • reportSuspiciousActivity
  • reportNetworkProblems

Messages

 

FP_PING

An FP client wishes to know whether an FP access point is listening.

FP_SUBSCRIBE

Semantics: originator registers interest in <something>

FP_ANNOTATION

Wraps an AO/AOD annotation.

The semantics of what is being annotated is delegated to the annotation.

Annotation typing is delegated to rules: See ApplePieRules for typing of annotations.

Potential Messages

Note: These retain substantial baggage from the prototype and need to be reworked.

General

 

FP_NOTIFICATION

E.g.

  • An asynchronous message has a reply waiting for you (network notifies client)
    • one of your subscriptions has a new publication
  • I am a data provider with new data available to the network (client notifies network)
  • Network has a new subscription that people might be interested in (network broadcasts?)(what about authorization?)
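The subscription case in the list above, together with FP_SUBSCRIBE, could be sketched as a small in-memory registry. Everything here (class name, method names, the `outbox` list standing in for asynchronous delivery) is hypothetical and ignores authorization, which the notes flag as an open question.

```python
from collections import defaultdict
from typing import Dict, List, Set


class SubscriptionRegistry:
    """Hypothetical in-memory stand-in for the network's subscription store."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Set[str]] = defaultdict(set)
        self.outbox: List[tuple] = []   # (recipient, topic, payload)

    def subscribe(self, originator: str, topic: str) -> None:
        # FP_SUBSCRIBE: originator registers interest in a topic
        self._subscribers[topic].add(originator)

    def publish(self, topic: str, payload: str) -> int:
        # Notification: queue one entry per subscriber, return how many
        recipients = sorted(self._subscribers.get(topic, ()))
        for recipient in recipients:
            self.outbox.append((recipient, topic, payload))
        return len(recipients)
```

Queuing notifications in an outbox rather than delivering them directly matches the "a reply is waiting for you" style of the first example in the list.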

FP_DATAHASCHANGED

Semantics: A data provider is indicating that data they have available for query or harvest has changed.

FP_ASSERT (deprecated)

Generalization is FP_Messages#FP_ANNOTATION

args: (true, false, accept, not-accept). Semantics: originator is asserting that something is true (and thus accepting it), false (and thus rejecting it), or accepting (or not accepting) it without agreeing or disagreeing with its validity. The fourth case, not-accept, emerged in discussions at TDWG 2008, including with Mark Mayfield, who indicated a desire to not accept some subset of new determinations that might be true but which reflected a new combination that their institution might not want to store in their database or record as an annotation on the specimen. The value not-accept is essentially a formal mechanism for ignoring the message (possibly distinguishing institutions that review incoming annotations from those that ignore them). Examples: James accepts all annotations made by Tony. James says that this determination made by Anne is correct. James says that this determination made by Henry is incorrect.
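The four argument values separate two questions: does the originator take the annotation on board, and does the originator express an opinion on its validity? A minimal sketch of that reading follows; the enum and function names are hypothetical, and only the four string values come from the notes.

```python
from enum import Enum
from typing import Optional


class Assertion(Enum):
    TRUE = "true"
    FALSE = "false"
    ACCEPT = "accept"
    NOT_ACCEPT = "not-accept"


def accepted(a: Assertion) -> bool:
    """Whether the originator takes the annotation on board."""
    return a in (Assertion.TRUE, Assertion.ACCEPT)


def validity_opinion(a: Assertion) -> Optional[bool]:
    """True/False if the originator judges validity, None if they abstain."""
    return {Assertion.TRUE: True, Assertion.FALSE: False}.get(a)
```

Under this reading, not-accept is the only value that both declines the annotation and says nothing about whether it is correct.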

Queries

 

FP_QUERY

General, or specific subtypes (inventory, find sets, get data)?

FP_INVENTORY

Semantics: How many sets do you know about with property X?

FP_FIND_SETS

Semantics: Which sets do you know about with property X?

FP_GET_DATA

Given a set, retrieve all associated data.
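To make the difference between the three query subtypes concrete, here is a small sketch over an in-memory store. The store layout, the predicate type, and the function names are assumptions for illustration only; "property X" is modeled as an arbitrary predicate over a set's records.

```python
from typing import Callable, Dict, List

# Hypothetical store: set ID -> list of records (each a dict of key/value data)
SetStore = Dict[str, List[dict]]
Predicate = Callable[[List[dict]], bool]


def fp_find_sets(store: SetStore, has_property: Predicate) -> List[str]:
    """Which sets do you know about with property X?"""
    return sorted(sid for sid, records in store.items() if has_property(records))


def fp_inventory(store: SetStore, has_property: Predicate) -> int:
    """How many sets do you know about with property X?"""
    return len(fp_find_sets(store, has_property))


def fp_get_data(store: SetStore, set_id: str) -> List[dict]:
    """Given a set, retrieve all associated data."""
    return list(store.get(set_id, []))
```

In this sketch FP_INVENTORY is just a count over FP_FIND_SETS, which is one argument for treating them as subtypes of a general FP_QUERY.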

Annotations

Note: Typed annotation messages are leftovers from FP Prototype design.

FP_CORRECTION_ASSERTION (deprecated)

arguments: SetID, list of one or more {schema, key, value} Semantics: assertion that something needs correction, e.g. "Darwin:Collector should be J.Macklin". Message originator is offering an annotation consisting of an arbitrary set of concepts (and values for those concepts) to be applied to a particular set, where there are existing values for those concepts present, and those existing values should be replaced with the correction. Contains schema, concept in schema, assertion of value.

Possible semantics follow (expressed largely in terms of ABCD concepts, but suggesting semantic groupings a FP network might require, along with other concepts not described in ABCD, also with some references to the TDWG LSID ontology). Note that these are general descriptions of possible semantics for discussion.


<appliesTo>  Required, can't be empty.
   <GUID> Required if applicable
      Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID, or FP: SetID)  Required (two elements?) 
      Value:  Required
   </GUID>
   <darwincoretriplet>  Required if applicable
      ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID  Required
      ABCD: Datasets/Dataset/Units/Unit/SourceID  Required
      ABCD: Datasets/Dataset/Units/Unit/UnitID    Required
   </darwincoretriplet>
</appliesTo>
<corrections>  Required, list of one or more correction.
  <correction> Required
     Schema  Required
     Key     Required
     Value   Required
  </correction>
</corrections>
<correctionBy> Required.
   ABCD:  Datasets/Dataset/Units/Unit/Identifications/Identification/Identifiers/IdentifiersText  Required
   Alternately,  TDWGOntology/Base/Person http://rs.tdwg.org/ontology/Core#Person
   ABCD:  Datasets/Dataset/Units/Unit/Identifications/Identification/Date  Required
</correctionBy>
<discussion>  Optional
   ABCD:  Datasets/Dataset/Units/Unit/Notes  
</discussion>
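One way to turn the outline above into an actual XML body is sketched below with Python's standard `xml.etree.ElementTree`. The wrapper element `correctionAssertion`, the `type` attribute on `GUID`, and the short child names under `correctionBy` are assumptions made for the example; the outline itself does not fix them.

```python
import xml.etree.ElementTree as ET


def build_correction(set_id, corrections, corrected_by, date, discussion=None):
    """Build a correction message body; raises if no corrections are given."""
    if not corrections:
        raise ValueError("at least one correction is required")
    root = ET.Element("correctionAssertion")       # wrapper name is an assumption
    applies = ET.SubElement(root, "appliesTo")
    guid = ET.SubElement(applies, "GUID", {"type": "FP:SetID"})  # attribute is an assumption
    guid.text = set_id
    items = ET.SubElement(root, "corrections")
    for schema, key, value in corrections:
        c = ET.SubElement(items, "correction")
        ET.SubElement(c, "Schema").text = schema
        ET.SubElement(c, "Key").text = key
        ET.SubElement(c, "Value").text = value
    by = ET.SubElement(root, "correctionBy")
    ET.SubElement(by, "IdentifiersText").text = corrected_by
    ET.SubElement(by, "Date").text = date
    if discussion is not None:                     # discussion is optional
        ET.SubElement(root, "discussion").text = discussion
    return root
```

Calling `ET.tostring(build_correction("set-42", [("Darwin", "Collector", "J.Macklin")], "James", "2008-10-19"))` would serialize the "Darwin:Collector should be J.Macklin" example; the set ID, name, and date in that call are placeholders.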


The use cases seem to require that a correction come from a person, rather than a software agent. Determinations that something is incorrect by a software agent (including analysis by the network) should be quality control assertions rather than corrections. The distinction is probably not so much human agent or software agent, but whether the source of the correction is something entirely outside of the realm of the network (such as knowledge inside a person's head), or whether it is an inference based on examination of data in the network. Software agents able to read handwriting and correlate data in scanned field notes, publications, specimen ledgers, and specimen label data will blur this distinction, but they should probably get a new class of messages, and the correction (and new data) assertions should be required to originate from people.

May require launching another message defining a new set. For example if new collector, then need a message moving this from one message set to another.

FP_NEW_DATA_ASSERTION (deprecated)

arguments: SetID, list of one or more {schema, key, value} Semantics: This provides new data not in the set e.g. "Here is a locality assertion", e.g. "Here is a Concept Schema and an instance or part of it." e.g. adding georeferences not already in the metadata. Message originator is offering an annotation consisting of an arbitrary set of concepts (and values for those concepts) to be applied to a particular set, where there may or may not be existing values for those concepts present. Should existing values be present, the message originator is offering the new values as additional values. Contains schema, concept in schema, assertion of value.

May require that acceptors launch a message to notify that this should be part of duplicate set message.

Possible semantics:

<appliesTo>  Required, can't be empty.
   <GUID> Required if applicable
      Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID, or FP: SetID)  Required (two elements?)
      Value:  Required
   </GUID>
   <darwincoretriplet>  Required if applicable
      ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID  Required
      ABCD: Datasets/Dataset/Units/Unit/SourceID  Required
      ABCD: Datasets/Dataset/Units/Unit/UnitID    Required
   </darwincoretriplet>
</appliesTo>
<newData>  Required, list of one or more newDatum
  <newDatum> Required
     Schema  Required
     Key     Required
     Value   Required
  </newDatum>
</newData>
<discussion>  Optional
   ABCD:  Datasets/Dataset/Units/Unit/Notes  
</discussion>
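The practical difference between a correction and a new data assertion is whether existing values are replaced or kept alongside the new ones. The sketch below shows that contrast on a toy record; the record layout and function names are hypothetical, and `apply_correction` does not enforce the condition that a value must already exist.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[Tuple[str, str], List[str]]   # (schema, key) -> values
Triple = Tuple[str, str, str]               # (schema, key, value)


def apply_correction(record: Record, triples: Iterable[Triple]) -> Record:
    """Correction: existing values for the concept are replaced."""
    out = {k: list(v) for k, v in record.items()}
    for schema, key, value in triples:
        out[(schema, key)] = [value]
    return out


def apply_new_data(record: Record, triples: Iterable[Triple]) -> Record:
    """New data: values are added alongside any that already exist."""
    out = {k: list(v) for k, v in record.items()}
    for schema, key, value in triples:
        out.setdefault((schema, key), []).append(value)
    return out
```

Both functions return a copy and leave the input untouched, so an acceptor could compare the before and after states before deciding whether to accept the annotation.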

FP_NEW_DETERMINATION (deprecated)

Message: FP_TAXON_COMMENT args: SetID, Taxon_Name (or TaxonConcept) Semantics: message originator is offering opinion of name that should be attached to specimen; SetID represents network ID of a set of putative duplicates; Special case of FP_NEW_DATA_ASSERTION with a set of required elements.

<PJM:Question>Are these generalizable to a single annotation assertion that has a type as an argument, where different types of assertions can have different required elements and can activate different behaviors?</PJM>

Possible semantics:

For an example new determination see IPT Demo

<appliesTo>  Required, can't be empty.
   <GUID> Required if applicable
      Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID or FP: SetID)  Required (two elements?)
      Value:  Required
   </GUID>
   <darwincoretriplet>  Required if applicable
      ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID  Required
      ABCD: Datasets/Dataset/Units/Unit/SourceID  Required
      ABCD: Datasets/Dataset/Units/Unit/UnitID    Required
   </darwincoretriplet>
</appliesTo>
<determination> Required.
   ABCD:  Datasets/Dataset/Units/Unit/Identifications/Identification/TaxonIdentified/ScientificName/FullScientificNameString  Optional
   ABCD:  Datasets/Dataset/Units/Unit/Identifications/Identification/TaxonIdentified/ScientificName/NameAtomised  Required
   ABCD:  Datasets/Dataset/Units/Unit/Identifications/Identification/TaxonIdentified/IdentificationQualifier      Optional
   ABCD:  Datasets/Dataset/Units/Unit/Sex  Optional
   DarwinCore1.4: LifeStage  or http://rs.tdwg.org/ontology/voc/Specimen#lifeStage Optional 
   ABCD:  Datasets/Dataset/Units/Unit/Identifications/Identification/Identifiers/IdentifiersText  Required
   ABCD:  Datasets/Dataset/Units/Unit/Identifications/Identification/Date  Required
</determination>
<discussion>  Optional
   ABCD:  Datasets/Dataset/Units/Unit/Notes  
</discussion>
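The question raised above, whether typed annotations can be generalized to one assertion with a type argument and per-type required elements, could be explored with a lookup table like the one below. The table and function are hypothetical; the determination entries follow the "Required" markers in the listing, shortened to the last path segment.

```python
from typing import Dict, List, Set

# Hypothetical required-element table, keyed by annotation type.
REQUIRED: Dict[str, Set[str]] = {
    "FP_NEW_DETERMINATION": {"NameAtomised", "IdentifiersText", "Date"},
    "FP_NEW_DATA_ASSERTION": set(),   # generic case: no extra required elements
}


def missing_elements(annotation_type: str, elements: Dict[str, str]) -> List[str]:
    """Return required element names that are absent or empty for this type."""
    required = REQUIRED[annotation_type]   # KeyError for unknown types
    return sorted(name for name in required if not elements.get(name))
```

With this shape, adding a new annotation type means adding a table entry rather than a new message, which is the generalization the question points toward.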

FP_QUALITY_ISSUE_ASSERTION (deprecated)

Generated by network analysis of data, indicating that one or more elements of a data set appear to be outliers. arguments: SetID, list of one or more {schema, key, value} Semantics: assertion that something has been identified as potentially problematic in a quality control review and may need correction, e.g. "Darwin:Collector:J.Macklin; collected before collector's birth; Darwin:YearCollected:1830; collected before collector's birth".

Contains problematic schema, concept in schema, assertion of value, identification of the problem, and optionally a correction.

Might reference another FP_MESSAGE.

Possible semantics:


<appliesTo>  Required, can't be empty.
   <GUID> Required if applicable
      Which GUID: (e.g. ABCD: Datasets/Dataset/Units/Unit/UnitGUID, or FP: SetID)  Required (two elements?) 
      Value:  Required
   </GUID>
   <darwincoretriplet>  Required if applicable
      ABCD: Datasets/Dataset/Units/Unit/SourceInstitutionID  Required
      ABCD: Datasets/Dataset/Units/Unit/SourceID  Required
      ABCD: Datasets/Dataset/Units/Unit/UnitID    Required
   </darwincoretriplet>
</appliesTo>
<problems>  Required, list of one or more problem.
 <problem> Required
     Schema  Required
     Key     Required
     Value   Required
     WhyThisIsAProblem Required.
     StatisticalResults Optional
     <correction> Optional
        Schema  Required
        Key     Required
        Value   Required
     </correction>
  </problem>
</problems>
<QCBy> Required.
   TDWGOntology/Base/Actor   Required.  http://rs.tdwg.org/ontology/Base#BaseActor (includes both people and software agents).
</QCBy>
<discussion> Required
   ABCD:  Datasets/Dataset/Units/Unit/Notes  
</discussion>
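A quality control rule of the "collected before collector's birth" kind could be sketched as follows. The `Problem` class, the function name, and the idea that the birth year is supplied by the caller are all assumptions; the notes do not say where the network would obtain such reference data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    schema: str
    key: str
    value: str
    why: str          # corresponds to WhyThisIsAProblem


def check_collected_before_birth(year_collected: int, collector: str,
                                 birth_year: int) -> Optional[Problem]:
    """Flag a collecting year that precedes the collector's birth year."""
    if year_collected < birth_year:
        return Problem("Darwin", "YearCollected", str(year_collected),
                       f"collected before birth of {collector} ({birth_year})")
    return None
```

Returning `None` when nothing is wrong keeps the rule composable: a network agent could run many such checks and wrap only the non-empty results in a `<problems>` list.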


FP_ASSOCIATION_ASSERTION (deprecated)

An annotation asserting that one data object has a particular relationship to another data object. For example: this species eats this other species; this insect specimen was found on this plant specimen. Suggested by descriptions of classes of annotations under consideration by the Atlas of Living Australia in a meeting at TDWG 2008 on 2008Oct19. Can likely use a controlled vocabulary of annotations in the TDWG Ontology.

Set Operators

These look like they can be generalized, and may be two sorts of operation with some (add/remove) being expressed as annotations, and others (build sets/add generation rule) be analysis instructions.

FP_ADD_SHEET

Message: FP_ADD_SHEET args: SetID; SpecimenID Semantics: message originator is asserting a specimen belongs in the given Set (Is this an annotation, or should there be a way to enforce it?)

FP_REMOVE_SHEET

Semantics: reverse of FP_ADD_SHEET

FP_ADD_NEW_SET

Semantics: ???

FP_ADD_SET_GENERATION_RULE

Semantics: message originator is describing a novel set of rules for creating sets and determining set membership (e.g. a new rule for building sets of collection objects that have determinations within the same taxonomic concept).

FP_BUILD_SETS

Semantics: given a rule and optionally limiting criteria, build sets with that rule (find all sets of duplicate specimens of Rubus, find all sets of duplicate specimens known to the network).
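As a sketch of "a rule plus optional limiting criteria", the function below treats a rule as a function from a record to a grouping key and a limit as a record filter. The types, names, and the choice to keep only groups of two or more records (suited to duplicate finding) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Optional

Record = Dict[str, str]
Rule = Callable[[Record], Hashable]          # maps a record to its set key


def fp_build_sets(records: Iterable[Record], rule: Rule,
                  limit: Optional[Callable[[Record], bool]] = None
                  ) -> Dict[Hashable, List[Record]]:
    """Group records sharing a rule key; keep only groups with 2+ members."""
    groups: Dict[Hashable, List[Record]] = defaultdict(list)
    for rec in records:
        if limit is None or limit(rec):
            groups[rule(rec)].append(rec)
    return {k: v for k, v in groups.items() if len(v) > 1}
```

With a rule keyed on collector and collector number, passing a limit that tests for genus Rubus corresponds to "find all sets of duplicate specimens of Rubus", and omitting the limit corresponds to the network-wide case.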

Community Messages

Need further discussion and elucidation.

FP_WIP

Semantics: Notification that work is in progress on a network-identifiable object. Up to client to interpret what to do with the object, e.g. if it is decomposable at the client side. Does this entail producing new, transient(?) identifiable objects?

Might include:

  1. Mark as Work In Progress
  2. Release Work In Progress
  3. Query for Work In Progress
  4. Inventory Work In Progress
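The four operations listed above could be prototyped with a small registry like this one. It is a sketch only: the class and method names are hypothetical, and it deliberately allows several workers on one object, since the notes describe FP_WIP as a notification and leave interpretation to the client rather than describing a lock.

```python
from typing import Dict, List, Set


class WorkInProgress:
    """Hypothetical registry: network object ID -> workers who announced WIP."""

    def __init__(self) -> None:
        self._workers: Dict[str, Set[str]] = {}

    def mark(self, object_id: str, worker: str) -> None:
        self._workers.setdefault(object_id, set()).add(worker)

    def release(self, object_id: str, worker: str) -> None:
        workers = self._workers.get(object_id, set())
        workers.discard(worker)
        if not workers:
            # Drop objects nobody is working on any more
            self._workers.pop(object_id, None)

    def query(self, object_id: str) -> List[str]:
        return sorted(self._workers.get(object_id, ()))

    def inventory(self) -> int:
        return len(self._workers)
```

Whether marking an object should create a new, transient identifiable object, as the notes ask, is left out of this sketch.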

Next to do

Spell out specifics of these messages. Link to use cases.

XML Schema

File:Message.xsd File:MessageReturned.xsd
