All the documents and tutorials present in this repository are protected by the Creative Commons license CC BY-ND 4.0:
For the license detail, please visit: http://creativecommons.org/licenses/by-nd/4.0/
You are free to:
Under the following terms:
The collection of VAMDC documents and software is published on the VAMDC website (http://www.vamdc.org) under the supervision of the VAMDC Document/Software Coordinator (hereafter referred to as the VDSC).
The VAMDC Document/Software Coordinator is currently L. Nenadovic (CNRS).
Version numbers of software and schemas can be independent of the version numbers of the related documents describing that software and those schemas.
Nevertheless auto-generated documentation from schema or software is directly linked to the product they document and their version number is the same as the documented product.
The version number for schemas is given by [#.#] starting at [0.1] and increasing by 0.1 for each new update of the schemas.
The various software packages depend upon several standard procedures and schemas that are released within VAMDC at regular intervals. Whether or not the software is updated at each release, the software version adopts the Release number, given by [Year.Months] (Year being the last two digits of the running year).
Between the official VAMDC releases, the software version number is constructed from the previous Release number [Year.Months] and a revision number, giving [Year.MonthsR#].
Auto-generated documentation from schemas and from software has the version number of the corresponding schemas and software.
All other documents have a version number given by their time of release. Whether the documents are updated or not at each release, the document version adopts the Release number, given by [Year.Months] (Year being the last two digits of the running year).
Between the official VAMDC releases, the document version numbering would be constructed with the previous Release number [Year.Months] and a revision number giving [Year.MonthsR#]
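The numbering scheme above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the zero-padded two-digit month is an assumption based on the release labels 12.07 and 11.09 that appear elsewhere in this document.

```python
def release_number(year: int, month: int) -> str:
    """Format a Release number as [Year.Months], e.g. July 2012 -> '12.07'."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    # Year is reduced to its last two digits; the month padding is an assumption.
    return f"{year % 100:02d}.{month:02d}"


def revision_number(release: str, revision: int) -> str:
    """Format an interim version as [Year.MonthsR#], e.g. '12.07R1'."""
    if revision < 1:
        raise ValueError("revision must be a positive integer")
    return f"{release}R{revision}"
```

For instance, the second revision made after the July 2012 release would be labelled by combining the two helpers.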
Authors are strongly encouraged to start from one of the VAMDC document templates (put a link to the place where templates can be found), available in either Word or TeX. These help to ensure a common style and enable the VDSC to perform a lossless conversion to other common formats like PDF with minimum effort.
The VAMDC WP6 working group has adopted the use of SPHINX (put a link) in order to create easily either PDF or HTML pages, and authors may want to use the same software in order to produce homogeneous documentation for VAMDC.
Documents are entered into the VAMDC document collection by the VDSC in response to a request from the Work Packages Leader or the person primarily responsible for editing a particular document.
Documents are sent as zipped directories to the VDSC and publication occurs within 5 working days.
VAMDC official documents begin as Working Drafts. Working Drafts are under the responsibility of a Work Package working group. Working Drafts may undergo numerous revisions during their development.
Working Drafts will not be included in the formal VAMDC document collection, but rather will be maintained by the responsible working group on the VAMDC Document repository (GitHub).
Such works-in-progress should carry the version number of the previous recommendation with the revision number [#.#r1]; the status is set to “Working Draft” and the date is compulsory.
Working Drafts should be developed, and utilize the above version numbering scheme, within the standard VAMDC document templates.
Once a Work Package working group has reached internal agreement on a document, the document becomes a Proposed Recommendation keeping its last labelling [#.#r#], having a status set to “Proposed Recommendation” and including the date. The chair of the working group submits the Proposed Recommendation to the Document Coordinator for publication in the VAMDC Document Collection. If the Proposed Recommendation is amended, the version does not change; only the revision number and the date are updated. During this amendment period, the different revisions are carried out on GitHub.
Once the Proposed Recommendation is accepted, the document becomes a Recommendation document with the new version number, increased by 0.1 compared to previous version, no revision number, a new date. Its status is “Recommended”.
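The labelling rules for drafts and recommendations can be illustrated as follows. This is a minimal sketch with hypothetical function names; it uses decimal arithmetic so that the 0.1 increment does not suffer from binary floating-point rounding.

```python
from decimal import Decimal


def draft_label(previous_version: str, revision: int) -> str:
    """Label of a Working Draft or Proposed Recommendation: [#.#r#]."""
    return f"{previous_version}r{revision}"


def recommendation_version(previous_version: str) -> str:
    """Version of the new Recommendation: previous version increased by 0.1."""
    return str(Decimal(previous_version) + Decimal("0.1"))
```

A draft based on Recommendation 1.2 would thus be labelled 1.2r1, 1.2r2 and so on while it is being amended, and would become 1.3 once accepted.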
The Recommended Documents are published at the time of Release in HTML and PDF versions.
The VAMDC document/software collection is the primary source for VAMDC documents. VAMDC users, especially from outside the core collaboration, should always be directed to the document/software collection rather than be sent private copies of documents or software.
The VAMDC document/software collection keeps track of the different Releases and Recommended Documents. The Proposed Recommendations are maintained on the VAMDC document/software collection prior to recommendation, and are then removed when the document/software is approved. All other revisions are kept on GitHub.
Many data-sets in VAMDC include information that can be rendered in the VAMDC-XSAMS data model. Data in that common model could be transformed to and from a table model which uses the same columns for all data-sets. If all the data-sets had this table model as part of the schemata of their databases, then a SQL query to that model would work on all data-sets, and the results could be written in a common format.
VAMDC-TAP is a protocol for data-access services that provide the common table model matching VAMDC-XSAMS and which can return the results of queries in VAMDC-XSAMS. VAMDC-TAP services accept queries in a restricted form of SQL (VSS2: VAMDC SQL Sub-set #2) and return results in VAMDC-XSAMS or in certain tabular formats. Implementations of VAMDC-TAP map queries from the common table-model to the actual schemata of their databases.
VAMDC-TAP provides “virtual data”. That is, it associates data selection criteria, defined by a query text, with an archived data-set, defined by the address to which the query is sent, the two combined in one URL. Each such URL represents the results of the query as if they had been pre-computed and stored on a web server. The data URLs are semi-permanent; they can be copied between applications, bookmarked, emailed to colleagues, etc.
VAMDC-TAP is based on IVOA’s Table Access Protocol (TAP). TAP already provides virtual data and allows us to plug in our query language VSS2 and our data model VAMDC-XSAMS.
VAMDC-TAP is defined as a web-service protocol. That means that VAMDC-TAP services are driven by GET and POST requests to HTTP (or HTTPS) URIs. Low-level details of the protocol are defined by the HTTP RFCs. Further, the service can be implemented in any language and on any database engine without breaking interoperability.
Section 1.2 lists the features required in a conforming VAMDC-TAP service. Sections 1.3 to 1.9 inclusive define details of these features. Section 1.10 is not part of the specification but instead explains how to form and execute a VAMDC-TAP query.
A VAMDC-TAP implementation must be organized as a TAP service. This means that the implementation must be a RESTful web-service and must provide web resources in the pattern mandated by TAP.
In the TAP standard, some features are mandatory and some are optional. VAMDC-TAP uses a sub-set of the mandatory features and some of the optional ones.
A VAMDC-TAP service must support at least the following features:
If the service provides all of these features then it may be registered as VAMDC-TAP.
A VAMDC-TAP service should support all the mandatory features of TAP. If it does so, it may be registered as both VAMDC-TAP and as TAP. (See section below for an explanation of the distinction.)
If a VAMDC-TAP service is to be registered as both TAP and VAMDC-TAP, then it should provide the VOSI output that specifies the details of tables and columns for use in queries. Without these metadata TAP is very hard to use.
A VAMDC-TAP service may support output formats other than XSAMS. The most-useful formats are tabular: VOTable for virtual-observatory integration and CSV for use in spreadsheets. If the service is to be registered as a TAP service, the TAP standard says that the service must support VOTable.
A VAMDC-TAP service may support query languages other than VSS2. If the service is to be registered as a TAP service, the TAP standard says that the service must support Astronomical Data Query Language (ADQL).
A VAMDC-TAP service must support VAMDC SQL sub-set #2 or VSS2 as its primary query-language. This language is specified in a query by setting the HTTP parameter LANG to VSS2. The service must not be sensitive to the case of the parameter value, but clients should use upper case for this value.
A VAMDC-TAP service must support XSAMS as its primary output-format. This format is specified in a query by setting the HTTP parameter FORMAT to XSAMS. The service must not be sensitive to the case of the parameter value, but clients should use upper case for this value.
When the format is XSAMS, the service must return the results with the MIME type application/x-xsams+xml. The service must not use text/xml for these results.
TAP allows the value of the FORMAT parameter to be either the MIME type of the results or a keyword denoting the format. The MIME type for XSAMS is ambiguous and can easily be confused with VOTable or other XML formats. Clients must not use the MIME type when specifying XSAMS output.
A VAMDC-TAP service must provide a relational view of its database that can be queried using the SQL-sub-set language VSS2. VSS2 has no FROM clause, so VSS2 queries implicitly address the database as a single table. A VAMDC-TAP service must be arranged to support this.
When implementing a VAMDC-TAP service for a particular database, the implementor must define the restrictable quantities: these are the columns of the standard view on which constraints may be placed in the WHERE clause of a query. The restrictables must be taken from the set defined in VAMDC’s standard dictionary, in which the names, data types, units and semantics are specified. A given implementation need not support every item in the dictionary. The restrictables for a service must be noted in the service’s registration.
The implementor of a service must also define the returnables: these are the terms from the VAMDC dictionary which will appear in the results of a query. The returnables must also be noted in the registration.
If a service supports both TAP and VAMDC-TAP, the tables available for a TAP query need not be related to the standard view. One of the main reasons for supporting TAP is to give access to a wider range of tables.
Notes on service availability, current and planned, are provided in an XML document. The availability metadata help in monitoring the VAMDC system and in managing downtime. A service installation may use the availability document to announce a number of conditions, including planned down-time for maintenance and unavailability of the database when the web service itself is available.
A VAMDC-TAP service must provide the availability document, in the form defined by IVOA’s Virtual Observatory Support Interfaces (VOSI) standard, at the location mandated by the TAP standard.
A service “capability” is an XML element describing the use of one aspect of the service. The capability states the URL for accessing that aspect and may add other metadata. A VAMDC-TAP installation will have a sequence of capabilities for different aspects, including a primary capability for the query protocol itself; the capabilities are distinguished by their standardID attributes. This sequence of capabilities is combined into the capabilities (XML) document and that document is copied from a URL on the VAMDC-TAP service into the VAMDC registry to form the machine-readable part of the registration.
A VAMDC-TAP service must provide the capabilities document as defined by Virtual Observatory Support Interfaces (VOSI) standard, at the location mandated by the TAP standard.
A VAMDC-TAP service must include the following capabilities in its capabilities document. (The notation {x}y for an XML type indicates the type x in the namespace y.)
In the capabilities document, structural types must be stated using the xsi:type attribute, except where the default type, {http://www.ivoa.net/xml/VOResource/v1.0}Capability, is used.
The XML schemata defining the parts of the registration are available on-line.
The capabilities document should refer to these schemata using the xsi:schemaLocation attribute on the document element. This makes it easier to validate the document. However, the registration process will still work in the absence of xsi:schemaLocation.
The following information must be included in the VAMDC-TAP capability registration block:
Service access URLs, including mirrors addresses, in interface/accessURL elements
Implemented version of standards in versionOfStandards element, 12.07 for current standards.
Node software implementation used, including version, in versionOfSoftware element
Collection of sample queries in sampleQuery elements, that may be used for node monitoring task or to give an overview of contained data and query strategies for the node.
Sample queries must satisfy the following criteria:
- must be executed quickly (within seconds)
- must result in valid XSAMS documents provided as the response
- must together cover the specific database content, i.e. every element that may be returned in a document for any query should be returned in at least one document returned as a response to a sample query.
A set of returnable keywords, indicating the major elements filled in XSAMS. If node software is not using returnable keywords internally, only a brief set of most important keywords that are specific to this database may be returned.
A set of restrictable keywords that may be used to query the node. See the Dictionary document for supported restrictable and returnable element values.
A VAMDC-TAP service must be registered. The registration document must be of type CatalogService (v1.0) or CatalogService (v1.1) as defined by IVOA (i.e. it must use the VODataService standard in either of two versions).
The registration must include the capability elements copied from the capabilities document described above.
A VAMDC-TAP service should provide information/statistics about the amount of data that will be returned for a specific query in the HTTP headers of the reply to the query. This allows a client (e.g. the portal) to use the HEAD method (instead of GET) on the same query URL to gather information before the query is actually executed and the data transferred.
The names of the headers to be used are
Their values should be the count of the corresponding blocks in the XSAMS schema, e.g. the number of radiative transitions that will be returned for this query. With a reasonable database layout the nodes should easily be able to gather these numbers by running COUNT queries on their corresponding tables.
A VAMDC-TAP service can limit the amount of data it returns via the synchronous interface, for example to prevent the fetching of the whole database or for performance reasons. The service must then fill the HTTP-header of the response with the field VAMDC-TRUNCATED that contains the percentage that the returned data represent with respect to the total amount available for that query. It is up to each service to decide both where and if to put the limit and how to implement it, for example the number of states or processes. Response documents for the queries that lead to volume limitation must remain valid XSAMS, including all references between elements.
The VAMDC-APPROX-SIZE HTTP header is intended to provide an estimate of the size of the response document. Its value should be an integer representing the estimated uncompressed document size in megabytes.
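A client might read the two volume-related headers named above along the following lines. This is a sketch only: the function names are hypothetical, the headers are assumed to arrive as a plain name-to-value mapping, and the tolerance for a trailing percent sign in VAMDC-TRUNCATED is an assumption based on the value "2.9 %" shown in the example later in this document.

```python
from typing import Mapping, Optional


def _lowered(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive, so normalise before the lookup.
    return {name.lower(): value for name, value in headers.items()}


def truncation_percent(headers: Mapping[str, str]) -> Optional[float]:
    """Return the VAMDC-TRUNCATED percentage, or None if the header is absent."""
    raw = _lowered(headers).get("vamdc-truncated")
    if raw is None:
        return None
    # Assumption: a number optionally followed by a percent sign.
    return float(raw.replace("%", "").strip())


def approx_size_mb(headers: Mapping[str, str]) -> Optional[int]:
    """Return the VAMDC-APPROX-SIZE estimate in megabytes, or None if absent."""
    raw = _lowered(headers).get("vamdc-approx-size")
    return None if raw is None else int(raw.strip())
```

Both helpers return None when the header is missing, so a client can distinguish "no limit applied" from a reported value.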
A VAMDC-TAP service may add the Last-Modified header to a response. This header has the syntax and general semantics specified for HTTP 1.1, but also has special meaning within VAMDC-TAP.
If this header is used, it must refer to the modification time of the database supplying the data extract. The value of the header should be the time of the last change to the data actually used in forming the response. If the service installation cannot supply this specific time it may use instead the time of last modification to any relevant part of the database.
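One possible client-side use of this header is to decide whether a cached extract is stale. The sketch below assumes the standard HTTP date format and a hypothetical function name; it relies only on the Python standard library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def data_changed_since(last_modified: str, cached_at: datetime) -> bool:
    """True if the Last-Modified value is later than the time of the cached copy.

    cached_at must be timezone-aware, e.g. datetime.now(timezone.utc).
    """
    return parsedate_to_datetime(last_modified) > cached_at
```

Because the header refers to the database rather than to the response document, a False result suggests that re-running the same query would return the same data.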
The following HTTP 1.1 status codes must be implemented by the node software for the SYNC TAP endpoint:
HTTP Code | meaning | Content type | Response body |
---|---|---|---|
200 | Request processed normally, data is present. | application/x-xsams+xml | XSAMS instance document |
204 | Request processed, but no matching data found | none | none |
400 | bad request with malformed query string or missing restrictable | unspecified, may be application/x-votable+xml | unspecified, may be a votable with error message |
404 | not used, will be encountered if the endpoint is wrong | unspecified, may be application/x-votable+xml | unspecified, may be a votable with error message |
500 | internal crash | unspecified, may be application/x-votable+xml | unspecified, may be a votable with error message |
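A client can branch on these status codes as sketched below. The outcome labels and the function name are hypothetical and chosen only for illustration; only the codes themselves come from the table above.

```python
# Outcome labels are illustrative; the status codes are those of the table.
_OUTCOMES = {
    200: "data",            # body is an XSAMS instance document
    204: "no-data",         # request processed, nothing matched, empty body
    400: "bad-request",     # malformed query or missing restrictable
    404: "wrong-endpoint",  # not used by the service itself
    500: "server-error",    # internal crash
}


def interpret_sync_status(status: int) -> str:
    """Map an HTTP status code from the sync endpoint to an outcome label."""
    return _OUTCOMES.get(status, "unexpected")
```

Only the 200 outcome carries an XSAMS body; for 400, 404 and 500 the body is unspecified, so a client should not assume it can be parsed.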
The base URL for a TAP service can be discovered from the registry. From this, the access URL for the query can be derived: add /sync [1] to the base URL and then add parameters to define the specific query.
Parameter name | Meaning | Supported values in VAMDC-TAP |
---|---|---|
REQUEST | Requested operation | doQuery |
LANG | Name of query language | VSS2 |
FORMAT | Format for results of query | XSAMS, VOTABLE, application/xml [2] |
QUERY | Text of query | As per query language |
Parameter names are insensitive to case: service implementations must accept any mix of case.
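On the service side, case-insensitive handling of parameter names, and of the LANG and FORMAT values, could look like the following sketch. The function name is hypothetical and the input is assumed to be an already URL-decoded mapping of parameters.

```python
def normalise_params(params: dict) -> dict:
    """Upper-case all parameter names and the values of LANG and FORMAT."""
    normalised = {}
    for name, value in params.items():
        key = name.upper()
        # Only LANG and FORMAT values are compared without regard to case;
        # the query text itself is passed through unchanged.
        if key in ("LANG", "FORMAT"):
            value = value.upper()
        normalised[key] = value
    return normalised
```

After normalisation the service can compare names and the LANG and FORMAT values against upper-case constants such as VSS2 and XSAMS.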
The parameter values are URL-escaped to replace illegal characters with hexadecimal codes (e.g. each space is replaced by %20). In practice, only the QUERY parameter needs to be escaped. Clients of the service must escape the parameters before sending the request.
This is a plausible example of a query URL, fully decorated with parameters:
http://some.server/some/path/sync?REQUEST=doQuery&LANG=VSS2&FORMAT=XSAMS&QUERY=select%20*
Here, the base URL, found in the registry, is http://some.server/some/path. The query is SELECT * .
The query is initiated by an HTTP-GET request to the access URL. The HTTP response carries the results of the query in the specified format.
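The construction of the access URL described above can be sketched with the Python standard library. The function name is hypothetical, and the base URL in the usage note is the placeholder from the example above, not a real service.

```python
from urllib.parse import quote, urlencode


def build_sync_url(base_url: str, query: str, fmt: str = "XSAMS") -> str:
    """Build the synchronous query URL from a base URL found in the registry."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "VSS2",
        "FORMAT": fmt,
        "QUERY": query,
    }
    # quote_via=quote encodes a space as %20 instead of '+'.
    return base_url.rstrip("/") + "/sync?" + urlencode(params, quote_via=quote)
```

Calling build_sync_url("http://some.server/some/path", "select *") yields the URL of the example above, except that the asterisk is also escaped (as %2A), which is equally valid.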
As an example, volume limitation is implemented in the Django-based prototypes and activated for the VALD node, which now returns at most 1000 transitions (plus the corresponding states and sources). Similar limits can be added to the other nodes in a few lines of code. In addition to the HTTP header, the VAMDC-XSAMS generator also puts a comment at the beginning of the XML document to notify the client of the truncation.
For example, a query like this:
wget -S -O bla.xml "http://vamdc.fysast.uu.se:8888/node/vald//tap/sync/?REQUEST=doQuery&LANG=VSS2&FORMAT=XSAMS&QUERY=SELECT+*+WHERE+RadTransWavelengthExperimentalValue+%3E%3D+4000.0+AND+RadTransWavelengthExperimentalValue+%3C%3D+5002.0"
will show the HTTP-header as:
VAMDC-TRUNCATED: 2.9 %
In the Django node software implementation you will also find the following section at the top of the returned XML:
<!--
ATTENTION: The amount of data returned has been truncated by the node.
The data below represent 2.9 percent of all available data at this node that
matched the query.
-->
Each VAMDC node implementing the VAMDC-TAP protocol must pass the requirements of the following checklist to be fully interoperable within the VAMDC infrastructure and not interfere with other nodes. This list is not exhaustive, but if a node does not comply with any of the checklist items, it cannot be included in the 12.07 public system.
Footnotes
[1] | This access-URL identifies the web-resource for synchronous queries. Asynchronous queries are sent to a separate web-resource. |
[2] | Implies VOTable. |
The details of VOSI capabilities and registration were clarified. The XML schema with namespace http://www.vamdc.org/xml/VAMDC-TAP/v1.0 is now explicitly in force for the VAMDC-TAP registration. This formalizes the approach used in the 11.09 system.
VAMDC SQL Subset 1 (VSS1) is a query language designed for the VAMDC-TAP web-services. A query in VSS1 defines an extract of an archive that a data service can return in a VAMDC-XSAMS document or in a tabular format.
VSS1 is based syntactically on ISO SQL92 but discards almost all the features of that language. The only features remaining are those relevant to selecting data to build a VAMDC-XSAMS document; VSS1 contains no feature for modifying a database. VSS1 assumes VAMDC’s standard view of the database as a single table. This latter point means that there is only one visible table (which therefore does not need to be named in a query) and the column names are terms taken from the VAMDC dictionary. This is an example of a VSS1 query:
SELECT ALL WHERE AtomNuclearCharge = 25 AND AtomIonCharge < 2
In the following definitions, VSS1 queries are assumed to be submitted by a client application to a web-service. That service contains a query processor that converts the VSS1 query into a SQL query or set of queries, suitable for the relational database managed by the service. A request to the service contains the VSS1 text of the query and other parameters such as the desired formats of the returned results.
VSS1 is intended to be interoperable across all VAMDC databases. The same query can be accepted at all databases, even though different data are returned from each.
Making VSS1 fully interoperable and capable of easy implementation would exclude all the higher functions of SQL (mainly features of the WHERE clause, such as sub-queries). Where these features can easily be added to a particular query-processor it would be beneficial to have them. Therefore, certain parts of SQL92 are noted as extensions to VSS1. A query containing extensions is not strictly valid VSS1 and is therefore not interoperable across all databases; it may work on particular databases. Users and applications should normally use only valid VSS1, but may use extensions in queries written specially for individual databases.
To be valid in VSS1, a query must satisfy the syntactic rules of the SQL92 language.
VSS1 queries never alter the database to which they are applied.
VSS1 queries must not contain the JOIN keyword.
Column names used as operands in a VSS1 query must be terms taken from the VAMDC Restrictables dictionary. These columns names must not be qualified by the name of a schema or database.
All the terms in the dictionary are valid as column names on all databases with a VSS1 processor. The query processor must implement the translation of the dictionary terms to names of real columns in the underlying database.
VSS1 processors may accept only a sub-set of the dictionary keywords, corresponding to the content of the underlying database. This sub-set naturally varies between databases and the set of restrictables for a given database is normally made available to the clients of the database. Where a query includes restrictables not supported by a given VSS1 processor, the processor must reject the query; it must not process the query while ignoring the unsupported constraints.
When the results of a query are to be returned in VAMDC-XSAMS format, VSS1 queries should begin with SELECT ALL ...; the set of columns from which data are returned is implicitly chosen by the choice of VAMDC-XSAMS format. If such a query does specify a set of columns (e.g. SELECT AtomIonCharge WHERE ...), then the query processor should ignore that set and proceed as if the query were SELECT ALL. However, where the results of a query are to be returned in a tabular format, the query processor must respect the query’s selection of columns. In the latter case, if the query specifies a returnable not supported by the particular database then the processor should reject the query.
When processing a query that contains valid VSS1 plus extensions, the behaviour is defined by the implementation of the query processor. The processor may reject the query, or it may ignore the extensions that it does not support.
The following parts of SQL92 constitute VSS1 extensions: EXISTS, GROUP BY, HAVING, UNION, INTERSECT, EXCEPT, MINUS, ORDER BY, LIMIT, DECLARE, FETCH, CLOSE.
The SQL92 standard [SQL92] should be consulted for the normative rules of syntax. These notes are for easy reference. VSS1 excludes so much of SQL that only the low-level aspects of the syntax are relevant.
SQL queries are written as text strings containing keywords, operators and operands separated by white space. Operands are names of tables and columns, sometimes called SQL identifiers or literal values. Identifiers and literals are sensitive to case; keywords and operators are not. There is a convention of writing keywords in upper case.
Queries can contain any Unicode character, but the keywords can be written using only ASCII characters. In VSS1, the valid identifiers also use only ASCII characters.
White space is required between keywords and operands but not between operators and operands. A typical (simple) VSS1 query looks like this:
SELECT ALL WHERE AtomIonCharge>6
This query would be equally valid:
SELECT * WHERE AtomIonCharge > 6
Here, data are selected from the columns AtomIonCharge and AtomNuclearCharge (note the use of a comma-separated list to specify the columns) of the table States according to a criterion on the electronic charge of the ions. String literals are delimited by single quotes (the ASCII apostrophe character) thus:
... WHERE AtomSymbol='Fe' ...
To include an apostrophe in a string, write two consecutive apostrophe-characters. If an identifier contains ‘special characters’ (typically white space), it must be protected with double quotes thus:
SELECT * WHERE "silly column name" > 0...
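The rule for string literals can be applied mechanically when a client assembles a query. The helper below is a minimal sketch with a hypothetical name; it only covers string literals, not quoted identifiers.

```python
def sql_string_literal(text: str) -> str:
    """Wrap text in single quotes, doubling any apostrophe it contains."""
    return "'" + text.replace("'", "''") + "'"
```

For example, the constraint on AtomSymbol shown above can be built as "AtomSymbol=" + sql_string_literal("Fe").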
VAMDC SQL Subset 2 (VSS2) is a query language designed for the VAMDC-TAP web-services. A query in VSS2 defines an extract of an archive that a data service can return in a VAMDC-XSAMS document or in a tabular format.
VSS2 is based syntactically on ISO SQL92 but discards almost all the features of that language. The only features remaining are those relevant to selecting data to build a VAMDC-XSAMS document; VSS2 contains no feature for modifying a database. VSS2 assumes VAMDC’s standard view of the database as a single table. This latter point means that there is only one visible table (which therefore does not need to be named in a query) and the column names are terms taken from the VAMDC dictionary. This is an example of a VSS2 query:
SELECT Species WHERE AtomNuclearCharge = 25 AND AtomIonCharge < 2
In the following definitions, VSS2 queries are assumed to be submitted by a client application to a web-service. That service contains a query processor that converts the VSS2 query into a SQL query or set of queries, suitable for the relational database managed by the service. A request to the service contains the VSS2 text of the query and other parameters such as the desired formats of the returned results.
Keywords, defining the desired branches, may be specified in SELECT part of the query, like:
SELECT molecules where MoleculeStoichiometricFormula = "C10H20"
This query should return only a list of molecules, including references and excluding all the state-related or process-related information.
If a query specifies a states or quantum numbers branch, parent species information should also be provided. Conversely, the selection of species should not cause the output of states or quantum numbers.
For the full list of possible branch selectors, their behaviour and relations see the Requestables section of the VAMDC dictionary
Another addition is the prefixes for restrictables keywords, used for processes selection refinement.
When selecting transition information, prefixes on the state-related restrictables group them by upper or lower state. Examples:
Select * where lower.StateEnergy = 0 and upper.StateEnergy > 1000
select * where StateEnergy = 0 and upper.StateEnergy > 1000
Note: query
select * where StateEnergy < 100 and lower.StateEnergy>100
will (and must) return no results.
If the prefix is omitted in the query, constraints apply to both states of a transition.
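The effect of the upper and lower prefixes can be illustrated with an in-memory filter. This is only a sketch of the semantics, not of any node implementation: the representation of a transition as a dictionary with the energies of its two states, and of a constraint as a (prefix, operator, value) tuple, is hypothetical.

```python
import operator

_OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
        "<=": operator.le, ">=": operator.ge}


def matches(transition: dict, constraints: list) -> bool:
    """Check StateEnergy constraints against one transition.

    transition: {"upper": energy, "lower": energy}
    constraints: list of (prefix, op, value); prefix is "upper", "lower" or None.
    """
    for prefix, op, value in constraints:
        # An unprefixed constraint must hold for both states of the transition.
        states = [prefix] if prefix else ["upper", "lower"]
        if not all(_OPS[op](transition[state], value) for state in states):
            return False
    return True
```

With this reading, the constraints StateEnergy < 100 and lower.StateEnergy > 100 can never both hold for the lower state, which is why the query in the note above must return no results.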
When selecting collision information, the prefixes are reactantX and productX, where X is a case-insensitive alphanumeric symbol [0-9A-Z], defining a group of keywords that apply to a single reactant or product. For the special case of two-particle collisions, in which it is necessary to distinguish the incident particle, the prefixes target and collider may be used instead.
Example:
Select collisions,states where reactantA.AtomSymbol = "O" and reactantA.AtomIonCharge = 1 and reactant2.MoleculeStoichiometricFormula in ("HN","HC","C2")
would return all reaction data between singly ionized atomic oxygen and NH, CH or C2 molecules.
Another example:
Select collisions,states where reactantA.AtomSymbol = "O" and ( reactantA.AtomIonCharge = 1 OR reactantA.AtomIonCharge = 2)
should return all reaction data for singly or doubly ionized oxygen atoms.
VSS2 is intended to be interoperable across all VAMDC databases. The same query can be accepted at all databases, even though different data are returned from each.
Making VSS2 fully interoperable and capable of easy implementation would exclude all the higher functions of SQL (mainly features of the WHERE clause, such as sub-queries). Where these features can easily be added to a particular query-processor it would be beneficial to have them. Therefore, certain parts of SQL92 are noted as extensions to VSS2. A query containing extensions is not strictly valid VSS2 and is therefore not interoperable across all databases; it may work on particular databases. Users and applications should normally use only valid VSS2, but may use extensions in queries written specially for individual databases.
To be valid in VSS2, a query must satisfy the syntactic rules of the SQL92 language.
VSS2 queries never alter the database to which they are applied.
VSS2 queries must not contain the JOIN keyword.
Column names used as operands in a VSS2 query must be terms taken from the VAMDC dictionary. Column names may be written in any mix of upper and lower case and query processors must treat all variations of case as equivalent.
Column names appearing in the WHERE clause must be taken from the Restrictables dictionary. These names may be qualified by a context, e.g. to distinguish between the upper and lower states of an electronic transition. The qualified name is written with the context name as a prefix to the restrictable name, separated by a full stop, e.g. upper.StateEnergy. The following contexts are defined in VSS2.
Context prefixes may be written in any mix of upper and lower case and query processors must treat all variations of case as equivalent. This includes the final character in the reactantx and productx prefixes: reactantA must be treated as equivalent to reactanta.
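A query processor could separate the context from the restrictable name as sketched below. The function name is hypothetical; the set of contexts in the pattern (upper, lower, target, collider, reactantX, productX) is the one described earlier in this section.

```python
import re

# Contexts as described in the text; matching ignores case.
_CONTEXT = re.compile(
    r"^(upper|lower|target|collider|reactant[0-9a-z]|product[0-9a-z])\.(.+)$",
    re.IGNORECASE,
)


def split_context(name: str):
    """Return (context, restrictable); context is lower-cased or None."""
    match = _CONTEXT.match(name)
    if match is None:
        return None, name
    return match.group(1).lower(), match.group(2)
```

Lower-casing the context means that reactantA and reactanta end up as the same key when constraints are grouped by reactant.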
The list of column names following the SELECT keyword, which specify the columns from which data are to be returned, must be taken from the Requestables dictionary, or must contain only the single keyword ALL (that keyword having its normal meaning in SQL92). Note that the ‘columns’ in this dictionary are composites. In a tabular representation of the results a requestable ‘column’ may produce multiple output-columns. In an XSAMS representation, a requestable ‘column’ may produce an XML fragment with significant sub-structure. Column names may be written in any mix of upper and lower case and query processors must treat all variations of case as equivalent.
The query processor must implement the translation of the dictionary terms to names of real columns in the underlying database.
VSS2 processors may accept only a sub-set of the dictionary keywords, corresponding to the content of the underlying database. This sub-set naturally varies between databases and the set of restrictables and requestables for a given database is normally made available to the clients of the database. Where a query includes restrictables or requestables not supported by a given VSS2 processor, the processor must reject the query; it must not process the query while ignoring the unsupported items.
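As an illustration of the rejection rule, the following sketch checks the keywords of a query against the set a node supports. The names check_supported and UnsupportedKeyword are assumptions made for this example; a real processor would first extract the keywords from the parsed query.

```python
class UnsupportedKeyword(Exception):
    """Raised when a query uses a keyword the node does not support."""


def check_supported(requested, supported):
    # Dictionary terms are compared without regard to case.
    known = {s.lower() for s in supported}
    unknown = sorted(k for k in requested if k.lower() not in known)
    if unknown:
        # Reject the whole query; never drop the unknown terms silently.
        raise UnsupportedKeyword(", ".join(unknown))


# Passes silently: the keyword is supported (in a different case).
check_supported(["atomioncharge"], ["AtomIonCharge", "AtomSymbol"])
```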
A VSS2 processor should accept only the (possibly empty) sub-set of the context prefixes that apply to its database. E.g. processors that have no data on reactions should reject the reactantx prefix.
When processing a query that contains valid VSS2 plus extensions, the behaviour is defined by the implementation of the query processor. The processor may reject the query, or it may ignore the extensions that it does not support.
The following parts of SQL92 constitute VSS2 extensions: EXISTS, GROUP BY, HAVING, UNION, INTERSECT, EXCEPT, MINUS, ORDER BY, LIMIT, DECLARE, FETCH, CLOSE.
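A processor or client may want to know whether a query stays within strict VSS2. The sketch below is one rough way to spot the extension keywords listed above; it is not a full SQL parser, and the function name find_extensions is hypothetical. Quoted strings and quoted identifiers are blanked out first so that their content is not mistaken for a keyword.

```python
import re

EXTENSIONS = ["GROUP BY", "ORDER BY", "EXISTS", "HAVING", "UNION",
              "INTERSECT", "EXCEPT", "MINUS", "LIMIT", "DECLARE",
              "FETCH", "CLOSE"]

# String literals ('' is an escaped apostrophe) and "quoted identifiers".
_QUOTED = re.compile(r"'(?:[^']|'')*'|\"[^\"]*\"")


def find_extensions(query):
    """Return the extension keywords that occur in the query text."""
    bare = _QUOTED.sub(" ", query).upper()  # keywords ignore case
    found = []
    for keyword in EXTENSIONS:
        # Allow any amount of white space inside two-word keywords.
        pattern = r"\b" + r"\s+".join(keyword.split()) + r"\b"
        if re.search(pattern, bare):
            found.append(keyword)
    return found


print(find_extensions("SELECT ALL WHERE AtomIonCharge > 6"))
```

An empty result means that none of the listed extensions were found; it does not prove that the query is otherwise valid VSS2.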
The SQL92 standard [SQL92] should be consulted for the normative rules of syntax. These notes are for easy reference. VSS2 excludes so much of SQL that only the low-level aspects of the syntax are relevant.
SQL queries are written as text strings containing keywords, operators and operands separated by white space. Operands are either names of tables and columns (sometimes called SQL identifiers) or literal values. Identifiers and literals are sensitive to case; keywords and operators are not. There is a convention of writing keywords in upper case.
Queries can contain any Unicode character, but the keywords can be written using only ASCII characters. In VSS2, the valid identifiers also use only ASCII characters.
White space is required between keywords and operands but not between operators and operands. A typical (simple) VSS2 query looks like this:
SELECT ALL WHERE AtomIonCharge>6
This query would be equally valid:
SELECT ALL WHERE AtomIonCharge > 6
Here, all available data are selected according to a criterion on the electronic charge of the ions. String literals are delimited by single quotes (the ASCII apostrophe character) thus:
... WHERE AtomSymbol='Fe' ...
To include an apostrophe in a string, write two consecutive apostrophe-characters. If an identifier contains ‘special characters’ (typically white space), it must be protected with double quotes thus:
SELECT "silly column name" WHERE...
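The quoting rules above can be sketched as two small helpers. The function names are hypothetical; the doubling of an embedded double quote inside a quoted identifier follows the general SQL92 rule for delimited identifiers.

```python
import re


def quote_literal(text):
    """Delimit a string literal, doubling any embedded apostrophes."""
    return "'" + text.replace("'", "''") + "'"


def quote_identifier(name):
    """Protect an identifier with double quotes only when needed."""
    if re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        return name
    return '"' + name.replace('"', '""') + '"'


query = "SELECT ALL WHERE AtomSymbol=" + quote_literal("Fe")
print(query)
```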
[SQL92] Information Technology - Database Language SQL (Proposed revised text of DIS 9075) July 1992 http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
The query-language “VAMDC SQL Sub-set 1” (VSS1) was defined. This language was the only one formally supported by VAMDC data-services in the 11.05 release of the system.
No new version of the standard was issued for this system release, but informal support for a new query language, VSS2, was implemented in some nodes.
VSS2 is VAMDC SQL Sub-set 2 and is a super-set of VSS1. Nodes that can process VSS2 queries can process VSS1 queries using the same code.
The features in VSS2 but not in VSS1 are:
VSS2 was formally defined. Its specification document is independent of that for VSS1, even though the two languages are related.
The prefixes on restrictables were changed. “Upper” and “lower” replaced “initial” and “final”.
In v11.12 of the system, VAMDC nodes are expected to accept either VSS1 or VSS2.
The VSS1 query language becomes obsolete; all nodes must now support VSS2 queries.
The VSS2 specification stays unchanged. More keywords for restrictables and requestables are defined in the VAMDC dictionary document.
The Restrictables, Requestables and Returnables continue to evolve to reflect the changes to the XML Schema (XSAMS) and the query language. Listed below are the additions and deletions for each category. Note that a renaming will be represented as deletion of the old and addition of the new keyword.
Added MoleculeBasisStates keyword, renamed RadiativeCrossections into RadiativeCrossSections
Additions:
Returnables evolve a lot with each XSAMS and Python Node Software release, so no precise changelog is given.
AtomStateTotalAngMom, Inchi, IonCharge, MethodCategory, MoleculeQNJ, MoleculeQNK, MoleculeQNKa, MoleculeQNKc, MoleculeQNv, MoleculeQNv1, MoleculeQNv2, MoleculeQNv3, MoleculeStateNuclearSpinIsomer, ParticleName, RadTransBroadeningDoppler, RadTransBroadeningInstrument, RadTransBroadeningNatural, RadTransBroadeningPressure, StateEnergy, StateLifeTime, StateStatisticalWeight
AtomInchi, AtomInchiKey, AtomIonCharge, AtomStateEnergy, AtomStateID, AtomStateLifeTime, AtomStateStatisticalWeight, CollisionThreshold, MoleculeInchi, MoleculeInchiKey, MoleculeNormalModeIntensity, MoleculeStateCharacLifeTime, MoleculeStateCharacNuclearSpinSymmetry, MoleculeStateEnergy, MoleculeStateID
In VAMDC, different pieces of software need to communicate with each other. Apart from protocols and schema, a common vocabulary is needed. By this we mean a list of “global keywords” that should consist of reasonably short, human-readable keywords which uniquely define a certain type of information or data. In the following we describe how the keywords were created and how they are used in different parts of VAMDC software. The common gain in the various aspects is that the vocabulary makes it possible to separate the tasks that are common to all data sets from the database-specific information and routines. Thereby it becomes possible to implement software that can be re-used by multiple datasets, reducing the deployment on a new data set to implementing the parts that are truly specific to it.
In order to compile a list of well-defined names for all kinds of information that VAMDC datasets can contain, we started from the XSAMS schema for atomic and molecular data, which is used as the main data model within the project.
Flattened and stripped, the XSAMS-derived keywords took forms like AtomStateLandeFactor, SourceAuthorName, MolecularSpeciesIonCharge.
Keywords representing desired branches of XSAMS, such as Species, Processes, RadiativeTransitions and Collisions, were added; these would find use in the future VSS2 query language.
The VAMDC keywords form three overlapping subsets:
- Restrictables, used in registries and in the VSS query language. Client software and the VAMDC user portal must use them to be able to request data from VAMDC.
- Returnables, currently used in registries and internally in the Django TAP-VAMDC service implementation. They define placeholders in the XSAMS tree for the output of user data.
- Requestables, which are due to be added to the VSS2 version of the query language. They describe the branches of the XSAMS schema that the client wants to see in the output document produced by the service.
The two aforementioned dictionaries RETURNABLES and RESTRICTABLES contain the most important information about each data set in the form of global keywords: what kind of data is contained in the database and which of these make sense to restrict in the query. By using only the keys in these key-value pairs we can compile this information in a format (XML-template) that the registry understands. Once this extension to the registry is specified, the portal will be able to decide from the information in the registry which databases might have a sensible answer to a particular query and only send it to these.
In its data model, VAMDC does not enforce the use of a certain unit for a certain physical quantity. However, in order to make queries understood by all nodes, the keywords that are used as RESTRICTABLE have a default unit, which is the one used in the query. This means that each node must be aware of the default unit and convert the query values to its internal unit before executing the query. For returned data the node is free to use any applicable unit from the XSAMS UnitsType.
Requestables, a future part of the VSS2 query language, define user-selectable branches of the XSAMS schema for output. For example, a client could request only species information, without any process data.
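The conversion from default units to a node's internal units might look like the sketch below. It assumes a hypothetical node that stores wavelengths in nm and state energies in eV, while queries arrive in the default units A and 1/cm. The keyword name RadTransWavelength, the table TO_INTERNAL and the function to_internal are assumptions made for illustration.

```python
# Approximate wavenumber equivalent of 1 eV, in 1/cm.
CM1_PER_EV = 8065.544

# Hypothetical per-keyword converters: default query unit -> internal unit.
TO_INTERNAL = {
    "radtranswavelength": lambda angstrom: angstrom / 10.0,  # A -> nm
    "atomstateenergy": lambda per_cm: per_cm / CM1_PER_EV,   # 1/cm -> eV
}


def to_internal(keyword, value):
    """Convert a restriction value; pass it through if no rule applies."""
    convert = TO_INTERNAL.get(keyword.lower())
    return convert(value) if convert else value


print(to_internal("RadTransWavelength", 5000.0))
```

Keeping the converters in one table keyed by the lower-cased keyword keeps the unit handling separate from the rest of the query translation.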
Requesting information about atoms, including the states information.
Requesting information about atoms, without their states.
collisional process data
method information
The basis states for a set of molecular states expressed as a linear combination on some basis
Request the full molecule information, including states and quantum numbers.
request molecules, including their states but excluding the quantum numbers
Request molecules, without information about their states.
non-radiative transitions data
request particle information only
data for all available processes
Absorption (or emission?) cross sections as a function of wavelength or frequency-equivalent
radiative transitions data
Restrict the search to databases containing information about solids.
source reference information
only brief species information, without states
complete states information
The following keywords may be used as restrictables in TAP-VAMDC queries using the VSS1 language; they are also added to the registry for each new node.
Note that each node supports only a small subset of the keywords. The list of supported keywords may be retrieved through the VOSI Capabilities service endpoint. See the TAP-VAMDC documentation for further details.
Return data excluding any additions or improvements that were made after the given date (YYYY-MM-DD). This allows for reproducing an earlier query. Note that probably not all nodes support this.
Type: string
Constraints:
The atomic mass is the mass of an atom expressed in unified atomic mass unit u. It is defined as 1/12 of the rest mass of an unbound carbon-12 atom in its nuclear and electronic ground state. 1 u = 1.660538782(83)E-27 kg.
Units: u
Type: floating-point number
Constraints: >1
Atomic mass number (A), also called mass number or nucleon number, is the total number of protons and neutrons (together known as nucleons) in an atomic nucleus. Because protons and neutrons are both baryons, the mass number A is identical with the baryon number B of the nucleus as well as of the whole atom or ion. The mass number is different for each different isotope of a chemical element.
Type: integer number
Constraints: >0
The total angular momentum of a nucleus, usually represented as I. For electrons, spin and orbital angular momentum are treated separately, but particles in a nucleus generally behave as a single entity with intrinsic angular momentum I. Associated with each nuclear spin is a nuclear magnetic moment which produces magnetic interactions with its environment.
Type: floating-point number
Constraints:
Coupling scheme used to describe the state. Currently five coupling schemes are supported: LS, jj, J1J2, jK and LK. For a detailed description of these and other schemes see, e.g., Atomic Spectroscopy at http://physics.nist.gov/Pubs/AtSpec/index.html
Type: string
Constraints:
Ionization energy.
Units: 1/cm
Type: floating-point number
Constraints: >0
Magnetic quantum number of a state, can be integer or half-integer, positive and negative.
Type: floating-point number
Constraints:
State parity. Can have values: “even”, “odd” or “undefined”
Type: string
Constraints:
The quantum defect is a correction applied to the potential to account for the fact that the inner electrons do not entirely screen the corresponding charge of the nucleus. It is particularly important for atoms with a single electron in the outer shell.
Type: floating-point number
Constraints:
The concentration of a species contributing to an Environment
Type: floating-point number
Constraints:
The mole fraction of a species contributing to an Environment
Type: floating-point number
Constraints:
The partial pressure of a species contributing to an Environment
Type: floating-point number
Constraints:
Environment temperature
Units: K
Type: floating-point number
Constraints: >0
The total number density of particles comprising an Environment
Units: 1/cm3
Type: floating-point number
Constraints:
Environment total pressure
Units: Pa
Type: floating-point number
Constraints: >=0
The IUPAC International Chemical Identifier (InChI) is a textual identifier for chemical substances, designed to provide a standard and human-readable way to encode atomic and molecular information and facilitate the search and exchange of such information in databases and on the web.
Type: string
Constraints:
The InChIKey is a hashed, fixed-length (currently 27-character) form of the International Chemical Identifier (InChI) string describing a given atom/ion/isotope. InChIKeys consist of 14 characters resulting from a hash of the connectivity information of the InChI, followed by a hyphen, followed by 9 characters resulting from a hash of the remaining layers of the InChI, followed by a single character indicating the version of InChI used, another hyphen, and a single checksum character. More information about InChI and InChIKey can be found at http://www.iupac.org/inchi/
Type: string
Constraints:
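The layout described above (14 characters, hyphen, 9 plus 1 characters, hyphen, 1 character) can be checked with a simple pattern. This is a minimal sketch that assumes all characters are upper-case ASCII letters; it tests only the shape of the key and does not verify the hash or the checksum. The function name is hypothetical.

```python
import re

# 14 letters, '-', 9 letters + 1 version letter, '-', 1 checksum letter.
_INCHIKEY = re.compile(r"[A-Z]{14}-[A-Z]{9}[A-Z]-[A-Z]")


def looks_like_inchikey(key):
    """Return True if the string has the 27-character InChIKey layout."""
    return len(key) == 27 and _INCHIKEY.fullmatch(key) is not None


# InChIKey of water, used here only as a well-known sample value.
print(looks_like_inchikey("XLYOFNOQVPJJNP-UHFFFAOYSA-N"))
```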
Method category. Allowed values are: experiment, theory, ritz, recommended, evaluated, empirical, scalingLaw, semiempirical, compilation, derived
Type: string
Constraints:
Conventional molecule name, e.g. CO2, NH3, FeH (may not be unique)
Type: string
Constraints:
The harmonic frequency of a normal mode.
Units: MHz
Type: floating-point number
Constraints:
The molecular J quantum number for total angular momentum excluding nuclear spin
Type: floating-point number
Constraints:
K is the quantum number associated with the projection of the total angular momentum excluding nuclear spin, J, onto the molecular symmetry axis.
Type: integer number
Constraints:
Ka is the rotational quantum label of an asymmetric top molecule, correlating to K in the prolate symmetric top limit.
Type: integer number
Constraints:
Kc is the rotational quantum label of an asymmetric top molecule, correlating to K in the oblate symmetric top limit.
Type: integer number
Constraints:
For diatomic molecules, the vibrational quantum number, v
Type: integer number
Constraints:
Nuclear spin isomer (symmetry) of a molecular state. Can take values like ‘ortho’, ‘para’, ‘A’, ‘E’, ‘meta’, etc.
Type: string
Constraints:
Particle name, one of photon, electron, muon, positron, neutron, alpha, cosmic
Type: string
Constraints:
Can only be restricted as not NULL, to select data for which broadening information exists.
Type: string
Constraints:
Can only be restricted as not NULL, to select data for which broadening information exists.
Type: string
Constraints:
Can only be restricted as not NULL, to select data for which broadening information exists.
Type: string
Constraints:
Can only be restricted as not NULL, to select data for which broadening information exists.
Type: string
Constraints:
Effective Lande factor for a given transition
Type: floating-point number
Constraints:
Radiative transition frequency.
Units: MHz
Type: floating-point number
Constraints:
The Einstein coefficient for spontaneous radiative de-excitation (emission) A.
Units: 1/s
Type: floating-point number
Constraints: >= 0
Line profile-integrated absorption for a transition between two energy levels. Line strength K = (hν / 4π) (n_1 B_12 - n_2 B_21)
Units: 1/cm
Type: floating-point number
Constraints: >0
Radiative transition vacuum wavelength
Units: A
Type: floating-point number
Constraints:
Type of publication, e.g. journal, book etc.
Type: string
Constraints: Journal | Book | Proceedings | On-line
Node-specific species identifier, a last resort for uniquely identifying a species if all other identifiers collide
Type: string
Constraints:
Life time of an atomic state in s.
Units: s
Type: floating-point number
Constraints: >0
Internal VAMDC species identifier, inchikey plus suffix, used in case inchikeys collide for molecules.
Type: string
Constraints:
The following keywords are used as Returnables in the Django implementation of the TAP-VAMDC node software. Returnables are an internal concept of the Django implementation, defining the names of the placeholders in the schema where a data producer may put their data. There is no requirement for other implementations of VAMDC-TAP to support them. Some of the keywords accept additional suffixes that allow them to be expanded into an XSAMS DataType object. For further information see the Django TAP-VAMDC documentation.
Another use case of Returnables is the possibility of determining whether it makes sense to look for a certain piece of data in the output documents of a node. But even if the node declares that it has that kind of data in its output, there is no guarantee that it will be present in the response to a particular query.
For the sake of not exploding the list below, keywords of a certain type are omitted. These are the ones that belong to a DataType in the XSAMS schema. A DataType has a value (the physical quantity itself) and can have units, comments, a method, references and an accuracy in different formats. Therefore, if a keyword SomeKeyword is marked as a DataType, the following words can also be used as Returnables, even though they are not listed below.
CollisionTabulatedDataYAccuracyMethod
Type: string
Constraints:
The IUPAC International Chemical Identifier (InChI) is a textual identifier for chemical substances, designed to provide a standard and human-readable way to encode atomic and molecular information and facilitate the search and exchange of such information in databases and on the web.
Type: string
Constraints:
The InChIKey is a hashed, fixed-length (currently 27-character) form of the International Chemical Identifier (InChI) string describing a given atom/ion/isotope. InChIKeys consist of 14 characters resulting from a hash of the connectivity information of the InChI, followed by a hyphen, followed by 9 characters resulting from a hash of the remaining layers of the InChI, followed by a single character indicating the version of InChI used, another hyphen, and a single checksum character. More information about InChI and InChIKey can be found at http://www.iupac.org/inchi/
Type: string
Constraints:
The atomic mass is the mass of an atom expressed in unified atomic mass unit u. It is defined as 1/12 of the rest mass of an unbound carbon-12 atom in its nuclear and electronic ground state. 1 u = 1.660538782(83)E-27 kg.
Units: u
Type: floating-point number
Has DataType suffixes support
Constraints: >1
Atomic mass number (A), also called mass number or nucleon number, is the total number of protons and neutrons (together known as nucleons) in an atomic nucleus. Because protons and neutrons are both baryons, the mass number A is identical with the baryon number B of the nucleus as well as of the whole atom or ion. The mass number is different for each different isotope of a chemical element.
Type: integer number
Constraints: >0
The total angular momentum of a nucleus, usually represented as I. For electrons, spin and orbital angular momentum are treated separately, but particles in a nucleus generally behave as a single entity with intrinsic angular momentum I. Associated with each nuclear spin is a nuclear magnetic moment which produces magnetic interactions with its environment.
Type: floating-point number
Constraints:
Reference key generated by the node software that connects processes and states to specific species. Each such key points at a single Species block in the XSAMS structure.
Type: string
Constraints:
This attribute should be set to true if and only if a state was added to be referenced as energyOrigin of StateEnergy or lowestEnergyStateRef of Nuclear spin isomer and does not actually match the conditions of a query that produced the document.
Type: string
Constraints:
A state description involves a particular basis in which the wavefunction can be described by a number of components and corresponding quantum numbers. In this case a comment can be added to each component.
Type: string
Constraints:
An atomic state is described in a particular framework, resulting in a specific representation of the wavefunction. This comment is supposed to clarify the basis used for representing the specific state.
Type: string
Constraints:
String representing configuration in a condensed form. For instance, one may prefer to make use of a short configuration label 2s2.2p instead of providing details of shell populations etc.
Type: string
Constraints:
J1 or J2 quantum number for atomic core described in J1J2 coupling.
Type: integer number
Constraints:
j quantum number for the jj coupling view of an atomic core.
Type: integer number
Constraints:
J quantum number for the JK coupling view of an atomic core. J can be integer or half-integer.
Type: floating-point number
Constraints:
K quantum number for the JK coupling view of an atomic core. K can be integer or half-integer.
Type: floating-point number
Constraints:
S2 quantum number for the JK coupling view of an atomic core. S2 is the spin of the “external” term that couples with K to produce J. S2 is usually half-integer.
Type: floating-point number
Constraints:
K quantum number for the LK coupling view of an atomic core. K is the angular momentum of the “final” term produced by the coupling of the total angular momentum L with the spin of the core S1. K is usually half-integer.
Type: floating-point number
Constraints: >0
L quantum number for the LK coupling view of an atomic core. L is the total angular momentum. L is integer.
Type: integer number
Constraints:
Core angular momentum symbol???
For example, “p”.
Type: integer number
Constraints:
S2 quantum number for the LK coupling view of an atomic core. S2 is the spin of the “external” term. S2 is usually half-integer.
Type: floating-point number
Constraints: >0
L quantum number for the LS coupling view of an atomic core. L is the total orbital angular momentum of the core which couples to the total spin S to produce J. L is integer.
Type: integer number
Constraints: >=0
Multiplicity of the core. Multiplicity is 2*S+1, where S is the total spin of the core.
Type: integer number
Constraints: >0
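The relation between spin and multiplicity given above can be checked with a short sketch. The function name is hypothetical; Fraction is used so that half-integer spins such as 1/2 are handled exactly.

```python
from fractions import Fraction


def multiplicity(total_spin):
    """Return 2*S+1 for an integer or half-integer total spin S."""
    m = 2 * Fraction(total_spin) + 1
    if m.denominator != 1 or m < 1:
        raise ValueError(f"not a valid total spin: {total_spin!r}")
    return int(m)


# A term label such as 3P has multiplicity 3, i.e. S = 1.
print(multiplicity(1), multiplicity("1/2"))
```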
S quantum number for the LS coupling view of an atomic core. S is the total spin which couples with the orbital angular momentum of the core L to produce J. S is integer or half-integer.
Type: floating-point number
Constraints: >=0
This string element is used to represent an atomic term in a condensed form, if necessary. For instance, one may prefer to make use of a term label 3P instead of separately indicating the term S and L values.
Type: string
Constraints:
Coupling scheme used to describe the state. Currently five coupling schemes are supported: LS, jj, J1J2, jK and LK. For a detailed description of these and other schemes see, e.g., Atomic Spectroscopy at http://physics.nist.gov/Pubs/AtSpec/index.html
Type: string
Constraints:
Optional AtomicCore element (type AtomicCoreType), that is used to compactly represent the atomic core. For instance, one may prefer to use notation [Ne]3d to describe the excited configuration in a Na-like ion. In this case, it would be sufficient to only indicate the ElementCore element set to “Ne”.
Type: string
Constraints:
Energy of the level
Units: 1/cm
Type: floating-point number
Has DataType suffixes support
Constraints: >=0
Hyperfine splitting due to magnetic dipole interaction
Type: floating-point number
Has DataType suffixes support
Constraints:
Hyperfine splitting due to electric quadrupole interaction
Type: floating-point number
Has DataType suffixes support
Constraints:
ID for an atomic state, e.g. for linking a process to the state
Type: string
Constraints:
Ionization energy.
Units: 1/cm
Type: floating-point number
Has DataType suffixes support
Constraints: >0
Lande factor
Type: floating-point number
Has DataType suffixes support
Constraints:
Life time of an atomic state in s.
Units: s
Type: floating-point number
Has DataType suffixes support
Constraints: >0
Magnetic quantum number of a state, can be integer or half-integer, positive and negative.
Type: floating-point number
Constraints:
The mixing coefficient is the coefficient in the expansion of a wave function on a specific basis. It can be either squared (non-negative) or signed. The mandatory attribute mixingClass indicates the nature of the mixing coefficient and the specifics of the expansion.
Type: floating-point number
Constraints:
Mandatory attribute of the mixing coefficient with one of the two values: “squared” or “signed”
Type: string
Constraints:
State parity. Can have values: “even”, “odd” or “undefined”
Type: string
Constraints:
State polarizability.
Type: floating-point number
Has DataType suffixes support
Constraints:
The quantum defect is a correction applied to the potential to account for the fact that the inner electrons do not entirely screen the corresponding charge of the nucleus. It is particularly important for atoms with a single electron in the outer shell.
Type: floating-point number
Has DataType suffixes support
Constraints:
The bibliographical references for a particular atomic state.
Type: string
Constraints:
Number of electrons in a specific shell.
Type: integer number
Constraints: >0
ID for a pair of shells for mixed states assigned by a database.
Type: string
Constraints:
ID for shell1 in a pair of shells assigned by a database.
Type: string
Constraints:
Relativistic correction for shell 1 in a pair.
Type: floating-point number
Constraints:
Number of electrons in shell 1 in a pair.
Type: integer number
Constraints: >0
Orbital angular momentum of shell 1 in a pair.
Type: integer number
Constraints: >=0
Orbital angular momentum symbol for shell 1 in a pair.
Type: string
Constraints:
Principal quantum number of shell 1 in a pair.
Type: integer number
Constraints: >0
J1 or J2 in J1J2 coupling for shell 1 in a pair. Can be integer or half-integer.
Type: floating-point number
Constraints: >0
j in jj coupling for shell 1 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
j in jK coupling for shell 1 in pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
K in jK coupling for shell 1 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
S2 (spin of external electrons) in jK coupling for shell 1 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
K in LK coupling for shell 1 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
L in LK coupling for shell 1 in a pair. Could be integer or 0.
Type: integer number
Constraints: >=0
Orbital angular momentum symbol in LK coupling for shell 1 in a pair.
Type: string
Constraints:
S2 (spin of external electrons) in LK coupling for shell 1 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
L in LS coupling for shell 1 in a pair. Could be integer or 0.
Type: integer number
Constraints: >=0
Orbital angular momentum symbol in LS coupling for shell 1 in a pair.
Type: string
Constraints:
Multiplicity (2s+1) for shell 1 in a pair in LS coupling. Positive integer.
Type: integer number
Constraints: >0
Spin for shell 1 in a pair in LS coupling. Non-negative integer or half-integer.
Type: floating-point number
Constraints: >=0
Seniority for shell 1 in a pair in LS coupling. Non-negative integer.
Type: integer number
Constraints: >=0
Total angular momentum J for shell 1 in a pair. Could be non-negative integer or half-integer.
Type: floating-point number
Constraints: >=0
ID for shell2 in a pair of shells assigned by a database.
Type: string
Constraints:
Relativistic correction for shell 2 in a pair.
Type: floating-point number
Constraints:
Number of electrons in shell 2 in a pair.
Type: integer number
Constraints: >0
Orbital angular momentum of shell 2 in a pair.
Type: integer number
Constraints: >=0
Orbital angular momentum symbol for shell 2 in a pair.
Type: string
Constraints:
Principal quantum number of shell 2 in a pair.
Type: integer number
Constraints: >0
J1 or J2 in J1J2 coupling for shell 2 in a pair. Can be integer or half-integer.
Type: floating-point number
Constraints: >0
j in jj coupling for shell 2 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
j in jK coupling for shell 2 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
K in jK coupling for shell 2 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
S2 (spin of external electrons) in jK coupling for shell 2 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
K in LK coupling for shell 2 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
L in LK coupling for shell 2 in a pair. Could be integer or 0.
Type: integer number
Constraints: >=0
Orbital angular momentum symbol in LK coupling for shell 2 in a pair.
Type: string
Constraints:
S2 (spin of external electrons) in LK coupling for shell 2 in a pair. Could be integer or half-integer.
Type: floating-point number
Constraints: >0
L in LS coupling for shell 2 in a pair. Could be integer or 0.
Type: integer number
Constraints: >=0
Orbital angular momentum symbol in LS coupling for shell 2 in a pair.
Type: string
Constraints:
Multiplicity (2s+1) for shell 2 in a pair in LS coupling. Positive integer.
Type: integer number
Constraints: >0
Spin for shell 2 in a pair in LS coupling. Non-negative integer or half-integer.
Type: floating-point number
Constraints: >=0
Seniority for shell 2 in a pair in LS coupling. Non-negative integer.
Type: integer number
Constraints: >=0
Total angular momentum J for shell 2 in a pair. Could be non-negative integer or half-integer.
Type: floating-point number
Constraints: >=0
J1 or J2 quantum number for atomic core described in J1J2 coupling.
Type: integer number
Constraints:
Collision branching ratio
Type: floating-point number
Has DataType suffixes support
Constraints:
Number of elements in Linear Sequence
Type: integer number
Constraints:
The centre wavenumber, wavelength, etc. of a feature in a tabulated cross section
Type: floating-point number
Has DataType suffixes support
Constraints:
ID of a normal mode when referenced in the assignment of a band in an assigned cross section
Type: string
Constraints:
A string, optionally identifying a band in an assigned cross section, e.g. “asymmetric stretch”
Type: string
Constraints:
The width of an assigned feature in a tabulated cross section (in units of wavenumber, wavelength, etc.)
Type: floating-point number
Has DataType suffixes support
Constraints:
A string describing the cross section being given in a CrossSection element, e.g. ‘IR absorption cross section’
Type: string
Constraints:
Reference to an Environment ID describing the environment applicable to this cross section
Type: string
Constraints:
A reference to the ID of a species contributing to this cross section
Type: string
Constraints:
A list of whitespace-delimited values of the independent variable (e.g. wavelength) against which the cross section is given
Type: string
Constraints:
An error (accuracy) applying to each and every data point in the Cross section independent variable data series
Type: floating-point number
Constraints:
A list of errors (accuracy values), separated by whitespace, one for each of the data points listed in the cross section independent variable data series (e.g. wavenumber)
Type: string
Constraints:
The length of the linear series X_i = initial + increment * i giving the independent variable against which the cross section is given when this data series is an evenly-spaced series of values.
Type: integer number
Constraints:
The increment step in the linear series X_i = initial + increment * i giving the independent variable against which the cross section is given when this data series is an evenly-spaced series of values.
Type: floating-point number
Constraints:
The initial value in the linear series X_i = initial + increment * i giving the independent variable against which the cross section is given when this data series is an evenly-spaced series of values.
Type: floating-point number
Constraints:
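The three entries above (length, increment and initial value) fully determine an evenly-spaced data series. As a sketch of how a client might expand such a series, assuming the index i starts at 0 so that the first point equals the initial value (the function name and the example numbers are hypothetical):

```python
def expand_linear_sequence(initial, increment, count):
    """Return the values X_i = initial + increment * i for i = 0 .. count - 1."""
    return [initial + increment * i for i in range(count)]


# Hypothetical example: five wavenumbers starting at 1000 in steps of 0.5.
wavenumbers = expand_linear_sequence(1000.0, 0.5, 5)
```

Storing only three numbers instead of the full list keeps documents small when the grid is regular.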
The name of the independent variable against which the cross section is measured (e.g. wavenumber)
Type: string
Constraints:
The units of the independent variable against which the cross section is measured (e.g. 1/cm)
Type: string
Constraints:
A whitespace-delimited list of data points comprising the cross section
Type: string
Constraints:
A single error (accuracy) value applying to each and every data point of the cross section
Type: floating-point number
Constraints:
A white-space delimited list of error (accuracy) values for each data point given for the cross section
Type: string
Constraints:
The length of the linear series Y_i = initial + increment * i, giving the cross section values when this data series is an evenly-spaced series of values
Type: integer number
Constraints:
The increment in the linear series Y_i = initial + increment * i, giving the cross section values when this data series is an evenly-spaced series of values
Type: floating-point number
Constraints:
The initial value of the linear series Y_i = initial + increment * i, giving the cross section values when this data series is an evenly-spaced series of values
Type: floating-point number
Constraints:
Name of the Cross Section parameter given (e.g. ‘sigma’)
Type: string
Constraints:
Units of the cross section (e.g. ‘Mb’, ‘arbitrary’, ‘km/mol’)
Type: string
Constraints:
A reference to the ID, of the form ‘Exxx’, identifying the environment referenced here
Type: string
Constraints:
The concentration of a species contributing to an Environment
Type: floating-point number
Has DataType suffixes support
Constraints:
The mole fraction of a species contributing to an Environment
Type: floating-point number
Has DataType suffixes support
Constraints:
The name of a species contributing to an Environment
Type: string
Constraints:
The partial pressure of a species contributing to an Environment
Type: floating-point number
Has DataType suffixes support
Constraints:
The reference to an ID of a species contributing to an Environment
Type: string
Constraints:
Environment temperature
Units: K
Type: floating-point number
Has DataType suffixes support
Constraints: >0
The total number density of particles comprising an Environment
Units: 1/cm3
Type: floating-point number
Has DataType suffixes support
Constraints:
Environment total pressure
Units: Pa
Type: floating-point number
Has DataType suffixes support
Constraints: >=0
The lower limit of validity for this argument to the fit or model function
Type: floating-point number
Constraints:
The name of this argument to the fit or model function
Type: string
Constraints:
The units of this argument to the fit or model function
Type: string
Constraints:
The upper limit of validity for this argument to the fit or model function
Type: floating-point number
Constraints:
A description of this parameter to the fit or model function
Type: string
Constraints:
The name of this parameter to the fit or model function
Type: string
Constraints:
The units of this parameter to the fit or model function
Type: string
Constraints:
Method category. Allowed values are: experiment, theory, ritz, recommended, evaluated, empirical, scalingLaw, semiempirical, compilation, derived
Type: string
Constraints:
A single basis state in the description of a molecular state as an expansion in some basis
Type: string
Constraints:
The basis states for a set of molecular states expressed as a linear combination on some basis
Type: string
Constraints:
A Comment relating to this set of Basis states
Type: string
Constraints:
One or more source references relating to this set of Basis states
Type: string
Constraints:
Conventional molecule name, e.g. CO2, NH3, FeH (may not be unique)
Type: string
Constraints:
Units: u
Type: floating-point number
Has DataType suffixes support
Constraints:
Comments concerning this normal mode’s displacement vectors
Type: string
Constraints:
A reference to the atom in the molecule’s structure to which this displacement vector applies
Type: string
Constraints:
The x-component of this atom’s displacement vector
Type: floating-point number
Constraints:
The y-component of this atom’s displacement vector
Type: floating-point number
Constraints:
The z-component of this atom’s displacement vector
Type: floating-point number
Constraints:
A reference to the electronic state within which this normal mode applies
Type: string
Constraints:
The harmonic frequency of a normal mode.
Units: MHz
Type: floating-point number
Has DataType suffixes support
Constraints:
Normal mode intensity
Type: floating-point number
Has DataType suffixes support
Constraints:
The symmetry species of this normal mode within the point group of the molecule in the specified electronic state
Type: string
Constraints:
The ordinary structural formula, as it is usually written, for the molecule
Type: string
Constraints:
List of temperatures for which the partition functions are specified.
Type: floating-point number
Constraints:
Reference to the lowest rovibronic state of the nuclear spin isomer for which the partition functions are specified
Type: string
Constraints:
Symmetry of the lowest rovibronic state of the nuclear spin isomer.
Type: string
Constraints:
Name of the nuclear spin isomer for which the partition functions are specified
Type: string
Constraints:
Symmetry group in which the symmetry of lowest rovibronic state of the nuclear spin isomer is specified
Type: string
Constraints:
Unit(s) in which the temperatures for partition functions are given
Type: string
Constraints:
A label identifying the molecule’s electronic state, e.g. ‘X’, ‘A’, ‘b’
Type: string
Constraints:
The molecular state quantum number for total angular momentum including nuclear spin
Type: floating-point number
Constraints:
The molecular state quantum number for angular momentum including hyperfine coupling with one nuclear spin, F1 = J + I1
Type: floating-point number
Constraints:
Identifier for the nucleus coupling its spin to give F1: F1 = J + I1
Type: string
Constraints:
The molecular state quantum number for angular momentum including hyperfine coupling with the second of two nuclear spins: F2 = F1 + I2
Type: floating-point number
Constraints:
Identifier for the second nucleus coupling its spin to give F2: F2 = F1 + I2
Type: string
Constraints:
The Fj quantum number, for some intermediate nuclear spin coupling: Fj = Fj-1 + Ij (j>1), or Fj = J + Ij (j=1)
Type: floating-point number
Constraints:
The integer j, identifying the order of this nuclear spin coupling where several nuclear spins couple: Fj = Fj-1 + Ij (j>1)
Type: integer number
Constraints:
ID of the nuclear spin coupling to give quantum number Fj
Type: string
Constraints:
ID of the nuclear spin coupling to give quantum number F, the total angular momentum (including nuclear spin).
Type: string
Constraints:
The total nuclear spin quantum number for a coupled set of identical nuclear spins, I = I1 + I2 + ...
Type: floating-point number
Constraints:
The molecular J quantum number for total angular momentum excluding nuclear spin
Type: floating-point number
Constraints:
K is the quantum number associated with the projection of the total angular momentum excluding nuclear spin, J, onto the molecular symmetry axis.
Type: integer number
Constraints:
Ka is the rotational quantum label of an asymmetric top molecule, correlating to K in the prolate symmetric top limit.
Type: integer number
Constraints:
Kc is the rotational quantum label of an asymmetric top molecule, correlating to K in the oblate symmetric top limit.
Type: integer number
Constraints:
Lambda is the quantum number associated with the magnitude of the projection of the total electronic orbital angular momentum, L, onto the molecular axis.
Type: integer number
Constraints:
N is the quantum number associated with the total angular momentum excluding electronic and nuclear spin, N: J = N + S.
Type: integer number
Constraints:
Omega is the quantum number associated with the projection of the total angular momentum (excluding nuclear spin), J, onto the molecular axis.
Type: floating-point number
Constraints:
S is the quantum number associated with the total electronic spin angular momentum.
Type: floating-point number
Constraints:
Sigma is the quantum number associated with the magnitude of the projection of S onto the molecular axis.
Type: floating-point number
Constraints:
SpinComponentLabel is the positive integer identifying the spin-component label, Fx, where x=1,2,3,... in order of increasing energy for a given value of J - see Herzberg, Spectra of Diatomic Molecules, Van Nostrand, Princeton, N.J., 1950.
Type: string
Constraints:
a/s-symmetry: the symmetry of the rovibronic wavefunction, ‘a’ or ‘s’ such that the total wavefunction including nuclear spin is symmetric or antisymmetric with respect to permutation of identical nuclei
Type: string
Constraints:
elecInv is the parity of the electronic wavefunction with respect to inversion through the molecular centre of mass in the molecular coordinate system (‘g’ or ‘u’)
Type: string
Constraints:
The parity of the electronic wavefunction with respect to reflection in a plane containing the molecular symmetry axis in the molecular coordinate system (equivalent to inversion through the molecular centre of mass in the laboratory coordinate system), ‘+’ or ‘-’
Type: string
Constraints:
kronigParity is the ‘rotationless’ parity: the parity of the total molecular wavefunction excluding nuclear spin and rotation with respect to inversion through the molecular centre of mass of all particles’ coordinates in the laboratory coordinate system, ‘e’ or ‘f’
Type: string
Constraints:
For linear triatomic molecules, the vibrational angular momentum quantum number associated with the nu2 bending vibration: l2 = v2, v2-2, ..., 1 or 0
Type: integer number
Constraints:
The vibrational angular momentum quantum number, l_i, associated with a degenerate vibrational mode, nu_i: li = vi, vi-2, ..., 1 or 0
Type: integer number
Constraints:
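The rule li = vi, vi-2, ..., 1 or 0 can be enumerated directly. A small sketch, assuming v is a non-negative integer number of quanta in the degenerate mode (the function name allowed_l is illustrative only):

```python
def allowed_l(v):
    """Allowed l values for v quanta in a degenerate mode: v, v-2, ..., 1 or 0."""
    # Step down by two from v; the series ends at 1 for odd v and 0 for even v.
    return list(range(v, -1, -2))
```

For v = 3 this gives 3 and 1, and for v = 4 it gives 4, 2 and 0.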
An integer identifying the degenerate vibrational mode to which the li quantum number belongs
Type: integer number
Constraints:
Total parity: the parity of the total molecular wavefunction (excluding nuclear spin) with respect to inversion through the molecular centre of mass of all particles’ coordinates in the laboratory coordinate system, the E* operation, ‘+’ or ‘-’
Type: string
Constraints:
r is a named, positive integer label identifying the state if no other good quantum numbers or symmetries are known.
Type: integer number
Constraints:
rotSym is the symmetry species of the rotational wavefunction, in some appropriate symmetry group.
Type: string
Constraints:
The symmetry group used in giving the rotational symmetry species label
Type: string
Constraints:
For diatomic molecules, the vibrational quantum number, v
Type: integer number
Constraints:
The vi vibrational quantum number for the ith normal mode
Type: integer number
Constraints:
An integer identifying the vibrational normal mode for the vi quantum number
Type: integer number
Constraints:
vibInv is the parity of the vibrational wavefunction with respect to inversion through the molecular centre of mass in the molecular coordinate system. Only really necessary for molecules with a low barrier to such an inversion (for example, NH3), ‘s’ or ‘a’.
Type: string
Constraints:
vibRefl is the parity of the vibrational wavefunction with respect to reflection in a plane containing the molecular symmetry axis in the molecular coordinate system, ‘+’ or ‘-’.
Type: string
Constraints:
The symmetry group used to specify the vibrational wavefunction symmetry species
Type: string
Constraints:
Case name for the case-by-case molecular state description
Type: string
Constraints:
Molecular properties such as molecular weight
Type: string
Constraints:
This attribute should be set to true if and only if a state was added to be referenced as energyOrigin of StateEnergy or lowestEnergyStateRef of Nuclear spin isomer and does not actually match the conditions of a query that produced the document.
Type: string
Constraints:
The energy of a molecular state
Units: 1/cm
Type: floating-point number
Has DataType suffixes support
Constraints:
A string identifying where the origin is taken for the energy of this molecular state
Type: string
Constraints:
One or more source references - these entries should match the sourceID attributes of the Sources.
Type: string
Constraints:
A boolean value, asserting that the state is fully assigned (true) or not (false)
Type: string
Constraints:
A string, of the form ‘Sxxx’ identifying this molecular state
Type: string
Constraints:
Molecular state lifetime in seconds
Units: s
Type: floating-point number
Has DataType suffixes support
Constraints: >0
Reference to the state of this spin isomer having the lowest energy
Type: string
Constraints:
The symmetry species of the rovibronic wavefunction of the lowest state of the nuclear spin isomer
Type: string
Constraints:
Spin isomer conventional name, like ‘ortho’, ‘para’, ‘meta’, ‘A’, ‘E’.
Type: string
Constraints:
The symmetry group used by MoleculeStateNSILowRoVibSym
Type: string
Constraints:
Nuclear statistical weight for a given molecular energy level
Type: integer number
Constraints: >0
A space-separated list of values for the matrix. For an arbitrary matrix, it has nrows*ncols entries. For a diagonal matrix there are nrows=ncols entries. For a symmetric matrix there are nrows(nrows+1)/2 entries etc.
Type: string
Constraints:
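The entry counts stated above can be checked mechanically before a value list is used. The sketch below assumes the matrix form is given as one of the strings "arbitrary", "diagonal" or "symmetric"; the function names are hypothetical, and other forms are deliberately left unsupported.

```python
def expected_entries(form, nrows, ncols):
    """Number of values expected in the space-separated list for a matrix form."""
    if form == "arbitrary":
        return nrows * ncols
    if form not in ("diagonal", "symmetric"):
        raise ValueError("form not covered by this sketch: " + form)
    if nrows != ncols:
        raise ValueError("diagonal and symmetric matrices must be square")
    if form == "diagonal":
        return nrows
    # Symmetric: only one triangle, including the diagonal, is listed.
    return nrows * (nrows + 1) // 2


def parse_matrix_values(text, form, nrows, ncols):
    """Split the value list and check its length against the stated form."""
    values = [float(token) for token in text.split()]
    if len(values) != expected_entries(form, nrows, ncols):
        raise ValueError("wrong number of matrix entries")
    return values
```

A symmetric 3x3 matrix therefore needs six values, whereas an arbitrary 3x3 matrix needs nine.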
This is a space-separated list of column names for the parameter matrix, as many as there are columns.
Type: string
Constraints:
Molecular State parameter in matrix form; the form of the matrix, such as “symmetric”, “diagonal” etc.
Type: string
Constraints:
Molecular State parameters in matrix form; number of matrix columns
Type: integer number
Constraints:
Molecular State parameter data in matrix form; number of rows in the matrix
Type: integer number
Constraints:
This is a space-separated list of row names for the parameter matrix, as many as there are rows.
Type: string
Constraints:
Molecular State parameters in matrix form; units of the matrix data
Type: string
Constraints:
Molecular State parameter in matrix form; type of matrix values: “real”, “imaginary” or “complex”.
Type: string
Constraints:
State parameter with a specific value
Type: floating-point number
Has DataType suffixes support
Constraints:
Molecular State parameter reference string giving context.
Type: string
Constraints:
Molecular State parameter vector coordinate X
Type: floating-point number
Constraints:
Molecular State parameter vector coordinate Y
Type: floating-point number
Constraints:
Molecular State parameter vector coordinate Z
Type: floating-point number
Constraints:
Total statistical weight (degeneracy) for a given molecular energy level
Type: integer number
Constraints: >0
A unique string for each VAMDC node, used for example for XSAMS-internal referencing. This MUST be filled.
Type: string
Constraints:
Particle name, one of photon, electron, muon, positron, neutron, alpha, cosmic
Type: string
Constraints:
Comments relating to this Doppler broadening process
Type: string
Constraints:
A reference to an Environment ID, describing the environment (in particular, temperature) for this Doppler broadening process
Type: string
Constraints:
The name of the lineshape resulting from this Doppler broadening process (‘gaussian’, most likely).
Type: string
Constraints:
A parameter to the Doppler lineshape
Type: floating-point number
Has DataType suffixes support
Constraints:
The name of a parameter for the Doppler lineshape.
Type: string
Constraints:
A reference to the method by which this Doppler broadening process is determined.
Type: string
Constraints:
A source reference for Doppler broadening process.
Type: string
Constraints:
Comments relating to instrumental line broadening
Type: string
Constraints:
The ID of an Environment element, describing the environment of the instrumental broadening process
Type: string
Constraints:
Instrument broadening lineshape name
Type: string
Constraints:
An instrument broadening lineshape parameter
Type: floating-point number
Has DataType suffixes support
Constraints:
The name of a parameter used in the description of an instrument-broadening lineshape.
Type: string
Constraints:
A reference to the Method by which the instrument-broadening process is determined.
Type: string
Constraints:
A Source reference for the instrument-broadening process.
Type: string
Constraints:
Comments relating to this natural (radiative) broadening process
Type: string
Constraints:
The ID of an Environment element, describing the environment of this natural broadening process
Type: string
Constraints:
The name of the line shape used to describe this natural line broadening
Type: string
Constraints:
A broadening parameter for natural broadening.
Type: floating-point number
Has DataType suffixes support
Constraints:
The name of a natural broadening parameter.
Type: string
Constraints:
A reference to the Method by which this natural broadening line shape was determined
Type: string
Constraints:
A Source reference for this natural broadening line shape
Type: string
Constraints:
Comments relating to this pressure broadening process
Type: string
Constraints:
A reference to the Environment element describing the environment (temperature, pressure, composition) of this pressure broadening process
Type: string
Constraints:
The name of the line shape used to describe the line broadening by pressure-broadening.
Type: string
Constraints:
A parameter to the pressure-broadened line shape.
Type: floating-point number
Has DataType suffixes support
Constraints:
Type: floating-point number
Constraints:
Type: floating-point number
Constraints:
Type: floating-point number
Constraints:
The name of this parameter to the pressure-broadened line shape.
Type: string
Constraints:
A reference to the Method by which this pressure-broadened line shape was determined.
Type: string
Constraints:
A Source reference for this pressure-broadened line shape.
Type: string
Constraints:
Effective Lande factor for a given transition
Type: floating-point number
Has DataType suffixes support
Constraints:
The energy of a radiative transition
Type: floating-point number
Has DataType suffixes support
Constraints:
Radiative transition frequency.
Units: MHz
Type: floating-point number
Has DataType suffixes support
Constraints:
Reference to the lower State of this radiative transition.
Type: string
Constraints:
The Einstein coefficient for spontaneous radiative de-excitation (emission) A.
Units: 1/s
Type: floating-point number
Has DataType suffixes support
Constraints: >= 0
Type: floating-point number
Has DataType suffixes support
Constraints:
Line profile-integrated absorption for transition between two energy levels. Line strength K = hν / 4π (n_1 B_12 - n_2 B_21)
Units: 1/cm
Type: floating-point number
Has DataType suffixes support
Constraints: >0
Type: floating-point number
Has DataType suffixes support
Constraints:
Type: floating-point number
Has DataType suffixes support
Constraints:
Type: floating-point number
Has DataType suffixes support
Constraints:
The pressure-shifting process for a radiative transition.
Type: string
Constraints:
Comments relating to this pressure-shifting process.
Type: string
Constraints:
A reference to an Environment element giving the environment (pressure, temperature, composition) in which this pressure-shifting process occurs.
Type: string
Constraints:
Reference to the Method by which this pressure-shifting process was determined.
Type: string
Constraints:
Shifting parameter value
Type: floating-point number
Has DataType suffixes support
Constraints:
Reference to a Source for this pressure-shifting process.
Type: string
Constraints:
A string, ‘excitation’ or ‘deexcitation’, determining whether a radiative transition is given in absorption or emission respectively
Type: string
Constraints:
Radiative transition vacuum wavelength
Units: A
Type: floating-point number
Has DataType suffixes support
Constraints:
The factor to convert air wavelength to vacuum
Type: floating-point number
Has DataType suffixes support
Constraints:
A reference to the environment in which the wavelength was measured
Type: string
Constraints:
Boolean indicating whether the wavelength is in vacuum (default) or not.
Type: string
Constraints:
Radiative transition wavenumber.
Type: floating-point number
Has DataType suffixes support
Constraints:
Type of publication, e.g. journal, book etc.
Type: string
Constraints: Journal | Book | Proceedings | On-line
“TAP-VAMDC” is the working title for the emerging data-access services that return data in XSAMS format. To make statistics about the response document easily accessible, several custom HTTP headers are defined. They are reported for both HTTP HEAD and HTTP GET queries to the TAP-VAMDC sync endpoint.
The following headers represent document statistics; all values should be integers.
Total count of the atomic Ion and Molecule records with distinct SpecieID attribute.
Count of the atomic Ion records with distinct SpecieID attribute.
Count of the Molecule records with distinct SpecieID attribute.
Count of distinct Source records
Count of distinct State records, both AtomicState and MolecularState combined
Count of the CollisionalTransition elements of the Processes branch of XSAMS.
Count of the RadiativeTransition elements of the Processes branch of XSAMS.
Count of the NonRadiativeTransition elements of the Processes branch of XSAMS.
With a reasonable database layout the nodes should easily be able to gather these numbers by running COUNT queries on their corresponding tables.
A TAP-XSAMS service can limit the amount of data it returns via the synchronous interface, for example to prevent the fetching of the whole database or for performance reasons. The service may then fill the HTTP header of the response with the field VAMDC-TRUNCATED, which indicates the percentage of the total available data that the response represents, for example:
VAMDC-TRUNCATED: 2.9 %
The VAMDC-APPROX-SIZE HTTP header provides an estimate of the size of the response document. It should return an integer value representing the estimated uncompressed document size in megabytes.
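A client can read these two headers from the response before deciding whether to download or refine a query. The following is a minimal sketch, assuming the headers have already been collected into a dictionary of strings by whatever HTTP library is in use; the function name and the result keys are hypothetical.

```python
def parse_vamdc_headers(headers):
    """Extract the truncation percentage and approximate size (MB), if present."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    result = {}
    truncated = lowered.get("vamdc-truncated")
    if truncated is not None:
        # The value looks like "2.9 %"; drop the percent sign and whitespace.
        result["truncated_percent"] = float(truncated.replace("%", "").strip())
    approx_size = lowered.get("vamdc-approx-size")
    if approx_size is not None:
        result["approx_size_mb"] = int(approx_size.strip())
    return result
```

Both headers are treated as optional here, so a response carrying neither simply yields an empty dictionary.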
The VAMDC e-science infrastructure uses a data model that is serialized in an XML schema called the VAMDC-XSAMS schema. The links point to the latest release of the VAMDC-XSAMS schema (though the previous releases are left below for history).
The International Virtual Observatory Alliance (IVOA) registry allows astronomers to search for, obtain details of, and use resources located anywhere in the IVO space, that is, in any Virtual Observatory. The IVOA defines the protocols and standards by which different registry services interoperate to realise this goal, including the interfaces for querying and sharing resources. Software written to conform to these standard interfaces helps scientific utilities to access a particular resource. A resource in this context is represented in XML form and is stored in the registry. A resource may describe an observatory, a particular instrument, another registry, or a service such as a catalogue or table service or a cone search. Extensions can be made if necessary, and this functionality is made available for VAMDC.
Resources conform to a standard schema and every XML request is validated against the schema before it can be submitted to the registry for querying. More information on IVOA schemas can be found here: http://www.ivoa.net/xml/index.html
XML resources derive from a common top layer schema titled ‘VOResource’. The VOResource may also be referred to as ‘Core’ or ‘Dublin Core’ as it contains the complete set of the necessary core data. More information on VOResource documentation can be found here: http://www.ivoa.net/Documents/REC/ReR/VOResource-20080222.html
Every resource in the registry must have an identifier (similar to a primary key), which is URI based. A sample: ivo://vamdc/chianti/chianti_catalogue_service
Identifier must be in the following form:
ivo://{authorityid}/{resourcekey}
The registry manages authorityIDs. An authorityID is owned by exactly one registry and cannot be duplicated by any other registry. For the purposes of VAMDC, only one authorityID, vamdc, is managed at present. The ResourceKey is a localised name that is unique within its authorityID.
Though an identifier can be of any form, it is widely accepted that authorityID is a domain name or a subsection of an institute, such as mssl.ucl.ac.uk or climatephysics.mssl.ucl.ac.uk. Currently the assumption has been made that VAMDC only needs one main registry and will use the authority ID vamdc. A ResourceKey is typically a name with reference to the registered resource.
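The identifier form ivo://{authorityid}/{resourcekey} is easy to take apart in client code. The sketch below assumes that the authority ID ends at the first slash after the scheme and that everything after it is the resource key, which matches the sample identifier given earlier; the function name is hypothetical.

```python
def split_ivorn(identifier):
    """Split 'ivo://{authorityid}/{resourcekey}' into (authority ID, resource key)."""
    prefix = "ivo://"
    if not identifier.startswith(prefix):
        raise ValueError("not an ivo:// identifier: " + identifier)
    # The authority ID ends at the first slash; the resource key is the
    # remainder and may itself contain slashes.
    authority, _, resource_key = identifier[len(prefix):].partition("/")
    if not authority or not resource_key:
        raise ValueError("authority ID and resource key are both required")
    return authority, resource_key
```

Applied to ivo://vamdc/chianti/chianti_catalogue_service, this yields the authority ID vamdc and the resource key chianti/chianti_catalogue_service.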
The Support interface is required by all IVOA compliant services and defines common interfaces for its services. The registry uses common support interfaces to help populate resources in the registry.
End users cannot access the registry via this Web interface. Only scientists and other technical users of VAMDC can use it to add or update resources in the VAMDC registry. End users query the resources in the registry through other client programs, such as the AstroGrid VODesktop.
Querying requires software to conform to the IVOA registry interface specification: http://www.ivoa.net/Documents/RegistryInterface/ . There are four main query methods:
See Querying the Registry for VAMDC Resources for the available libraries and for help with querying the registry.
The recommended way to look for things in the registry is to send in queries in the XQuery language. The registry responds with XML documents carrying the information matching the query.
For a given XQuery and for a given programming language, the details of the query can be encapsulated in a client library; the library phrases the query based on simple parameters to a method call. This has been done for typical VAMDC queries from Java, and the library is described below. Often this is all you need, but sometimes it is easier or more efficient to make the query directly from your application code.
If you do not understand the basics of XQuery you will not understand the details of this section. Either skip ahead to the descriptions of the client library or have a look at an XQuery tutorial.
This is an example of a registry XQuery. It finds the formal names of all the VAMDC-TAP services.
declare namespace ri='http://www.ivoa.net/xml/RegistryInterface/v1.0';
for $x in //ri:Resource
where $x/capability[@standardID='ivo://vamdc/std/VAMDC-TAP']
and $x/@status='active'
return $x/identifier
The query could be translated as “Find all the registration documents containing a capability with the VAMDC-TAP identifier, taking only those for active services; give me the IVORNs and throw the rest away”. The XPath construct //ri:Resource means “all the registration documents”. Because this searches for a type of element, and because types have namespaces, we have to map the namespace to a prefix (the first line) and use that prefix in specifying the type (the ri: in ri:Resource).
The registry’s response will be a document containing identifier elements as immediate children of the document element.
Most queries will be in this general form. It is important to restrict the search to active resources because the registry contains some that are “inactive” (resting, pending refurbishment) or “deleted” (gone for good, but not actually removed from the registry database).
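Since a response of this kind carries the identifier elements as immediate children of the document element, extracting them needs very little code. The following sketch uses Python's standard xml.etree.ElementTree module. The sample response is hypothetical: the wrapper element name and the second identifier are made up for illustration, and a real registry response may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical response; the wrapper element name is an assumption.
SAMPLE_RESPONSE = """<results>
  <identifier>ivo://vamdc/chianti/chianti_catalogue_service</identifier>
  <identifier>ivo://vamdc/example/another_service</identifier>
</results>"""


def extract_identifiers(xml_text):
    """Return the text of each identifier element directly under the root."""
    root = ET.fromstring(xml_text)
    # Iterating over an element visits only its immediate children.
    return [child.text.strip() for child in root
            if child.tag == "identifier" and child.text]
```

Looking only at direct children mirrors the structure described above and avoids picking up identifier elements nested deeper in fuller registration documents.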
This is a possible rearrangement of the query above.
declare namespace ri='http://www.ivoa.net/xml/RegistryInterface/v1.0';
//ri:Resource[capability[@standardID='ivo://vamdc/std/VAMDC-TAP'] and @status='active']/identifier
The constraints have been moved inside the square brackets of a single path expression, so the for, where and return clauses disappear. Both queries should return the same results; you can use whichever form is easiest for you.
Here is a different query, searching for TAP services.
declare namespace ri='http://www.ivoa.net/xml/RegistryInterface/v1.0';
for $x in //ri:Resource
where $x/capability[@standardID='ivo://ivoa.net/std/TAP']
and $x/@status='active'
return $x
Here, the identifier for the capability is different - IVOA TAP instead of VAMDC-TAP - and, more importantly, the query returns all the parts of the registration documents, not just the identifiers.
As a final example, here is a query to give the access URLs (the URLs to which you would send the data query) for VAMDC-TAP services that can return data on measured wavelengths of radiative transitions.
declare namespace ri='http://www.ivoa.net/xml/RegistryInterface/v1.0';
for $x in //ri:Resource
where $x/capability[@standardID='ivo://vamdc/std/VAMDC-TAP'
and returnable='RadTransWavelengthExperimentalValue']
and $x/@status='active'
return $x/capability[@standardID='ivo://vamdc/std/VAMDC-TAP']/interface/accessURL
The new trick here is to have a constraint - the part in square brackets - in both the where clause and the return clause. The constraint in the where clause finds the right registrations and the one in the return clause makes sure that we get the URLs only from the VAMDC-TAP interfaces and not from any other interfaces those services might have.
The search term RadTransWavelengthExperimentalValue comes from the VAMDC dictionary. It appears in the query because VAMDC-TAP services register their returnables using that dictionary. The term is not inherent to XQuery or to the registry.
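The queries in this section follow one template and differ only in the capability identifier and the optional returnable. As an illustration of how a client might phrase such a query from simple parameters, here is a sketch of a helper that assembles the access-URL query text. The function name is hypothetical, and it assumes the arguments contain no quote characters, since no escaping is attempted.

```python
RI_NAMESPACE = "http://www.ivoa.net/xml/RegistryInterface/v1.0"


def build_access_url_query(standard_id, returnable=None):
    """Phrase an XQuery for the access URLs of active services with a capability."""
    # Constraint applied to the capability element in the where clause.
    constraint = "@standardID='%s'" % standard_id
    if returnable is not None:
        constraint += " and returnable='%s'" % returnable
    return "\n".join([
        "declare namespace ri='%s';" % RI_NAMESPACE,
        "for $x in //ri:Resource",
        "where $x/capability[%s]" % constraint,
        "and $x/@status='active'",
        # Repeat the standardID constraint so that only URLs from the
        # matching capability are returned.
        "return $x/capability[@standardID='%s']/interface/accessURL" % standard_id,
    ])
```

Calling build_access_url_query('ivo://vamdc/std/VAMDC-TAP', 'RadTransWavelengthExperimentalValue') produces text equivalent to the final example above, apart from line breaks.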
There are, broadly, four ways to put a query to the registry from Java. In increasing order of abstraction and preference they are:
- Call the registry (SOAP) web-service directly.
- Use the AstroGrid client library.
- Use the AstroGrid Astro-Runtime API.
- Use the VAMDC client library.
The AstroGrid client library is worth considering. If you have a simple query (e.g. if you already know the identifier for the service of choice and just want to extract the access URL) then the library is quite good. If you have a more-general query, particularly one that will return results from more than one registration, then the library has to be forced into a non-standard configuration to work properly.
The Astro Runtime is a better abstraction for the registry and is actually intended for applications programmers (the AstroGrid client-library above is aimed at system engineers). It can return results as Java objects rather than as XML, which is sometimes easier to deal with. However, you have to write your own query text, typically in XQuery. There is a VAMDC client-library (see below), which tries to abstract common queries so you don’t need to write any XQuery text. This library knows about (some of) the service types important in VAMDC. Support for forming queries is good. Support for parsing the results is limited; you either get a DOM or simple values in strings, depending on the kind of query.
A small (single-class) library is available for VAMDC work. Version 3.0 of this library as well as a zip file containing all the third-party, supporting jars are available for download from the links below. (The AstroGrid client-library for the registry is one of the third-party jars if you want to use it directly.)
Some usage notes follow. For the full range of function, see the Javadoc. Other technical descriptions of the software are available, but the main documentation is this page and the Javadoc.
To use the library, instantiate the single class eu.vamdc.registry.Registry. Each method call makes one registry query (internally, some of them make a sequence of queries but you receive a single set of results). You can reuse the object for multiple, successive queries, but it is not safe to share it between threads. The no-argument constructor makes a client for the release registry. To use the development registry, pass the constant Registry.DEVELOPMENT_REGISTRY_ENDPOINT to the constructor (that is a string literal stating the endpoint for the registry of choice). The ability to select the development registry was added in v2.0 of the client.
The library lets you query for three kinds of information: whole registration documents, IVORNs and access URLs. The latter two types are delivered as lists or sets of strings and the registration documents as org.w3c.dom.Document instances. In the documents, the document element is an uninteresting wrapper and the query results are its first-level children.
Here is an example of finding all the TAP services (this matches one of the XQuery examples in the section above):
import eu.vamdc.registry.Registry;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
...
Registry reggie = new Registry();
Document results = reggie.findTap();
NodeList nl = results.getDocumentElement().getElementsByTagName("ri:Resource");
for (int i = 0; i < nl.getLength(); i++) {
// Do something with this registration document...
}
You could also dismantle the results document using XSLT or XPATH. This might be better than using the DOM API.
If you want all the information for all the VAMDC-TAP services in the registry, there is a convenience method:
import eu.vamdc.registry.Registry;
import org.w3c.dom.Document;
...
Registry reggie = new Registry();
Document results = reggie.findVamdcTap();
Sometimes you just want the access URLs for a class of services. Here is how:
import eu.vamdc.registry.Registry;
import java.net.URL;
import java.util.Set;
...
Registry reggie = new Registry();
Set<String> results = reggie.findAccessUrlsByCapability(Registry.VAMDC_TAP_ID);
for (String s : results) {
URL u = new URL(s);
// Use this service...
}
Note the use of a string constant to set the standard-identifier for VAMDC-TAP. You could also write the literal identifier: ivo://vamdc/std/VAMDC-TAP.
If you want to select resources by special criteria, then you have to supply your own XQuery. Adapting the last example from the XQuery section above, this code looks for the access URLs of VAMDC-TAP services that list AtomSymbol among their restrictables, i.e. services that can be queried by atom symbol.
import eu.vamdc.registry.Registry;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
...
Registry reggie = new Registry();
// Each fragment ends with a space so that the concatenated query stays valid.
String query =
    "declare namespace ri='http://www.ivoa.net/xml/RegistryInterface/v1.0'; " +
    "for $x in //ri:Resource " +
    "where $x/capability[@standardID='ivo://vamdc/std/VAMDC-TAP' " +
    "and restrictable='AtomSymbol'] " +
    "and $x/@status='active' " +
    "return $x/capability[@standardID='ivo://vamdc/std/VAMDC-TAP']/interface/accessURL";
Document results = reggie.executeXquery(query);
// The query returns accessURL elements, so look for those rather than for ri:Resource.
NodeList nl = results.getDocumentElement().getElementsByTagName("accessURL");
for (int i = 0; i < nl.getLength(); i++) {
    // Print the access URL held in the element's text node.
    System.out.println(nl.item(i).getFirstChild().getNodeValue());
}
Note the spaces at the end of each fragment of the query: these are necessary to make the overall query correct.
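One way to avoid forgetting a separating space is to let the JDK join the fragments. The following is a minimal sketch of that idea, not part of the VAMDC client library: the class name QueryText and its method are hypothetical, and the values are inserted into the query without any escaping, so they should only come from trusted constants.

```java
final class QueryText {
    /**
     * Builds the access-URL query used in the example above.
     * String.join puts exactly one space between fragments, so no fragment
     * needs a trailing space of its own.
     */
    static String accessUrlQuery(String standardId, String restrictable) {
        return String.join(" ",
            "declare namespace ri='http://www.ivoa.net/xml/RegistryInterface/v1.0';",
            "for $x in //ri:Resource",
            "where $x/capability[@standardID='" + standardId + "'",
            "and restrictable='" + restrictable + "']",
            "and $x/@status='active'",
            "return $x/capability[@standardID='" + standardId + "']/interface/accessURL");
    }

    public static void main(String[] args) {
        System.out.println(accessUrlQuery("ivo://vamdc/std/VAMDC-TAP", "AtomSymbol"));
    }
}
```

The resulting string can be passed to executeXquery in the same way as the hand-concatenated query.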
Some sample query routines are demonstrated in this eclipse project: registry-query-sample-project.tar.gz
Routines are:
Another implementation of the VAMDC registry client library exists; it is used by the VAMDC Portal and the TAPValidator tool.
This client has some specific features:
It will be available for download from VAMDC official libraries section later on.
If you are only interested in getting data from VAMDC-TAP data services, you can access them through the VAMDC web portal, which handles registry lookups by itself.
This user guide only shows how to point to the VAMDC registry with Astrogrid VODesktop and the main query screen for the registry.
When VODesktop is launched, the first screen is normally VOExplorer. You can also open VOExplorer by selecting Window -> New VOExplorer in the menu. VOExplorer lets you search the registry for resources. Once you select a resource you can view its contents and perform the actions that VODesktop knows for it, such as querying a Catalogue Service or running a particular Application.
Figure: Search registry window
Clicking the ‘New Smart List’ button opens a window for searching the registry. As you fill in the text boxes, VODesktop asks the registry for a count of the resources that would be returned, so you can decide whether to run the query or add further constraints.
Figure: Resource list window
If you cannot find any VAMDC resources, VODesktop may be pointing at the wrong registry. Clicking VODesktop -> Preferences opens a window in which you can switch to a different registry. Ensure that the correct VAMDC registry is selected:
Production registry: http://registry.vamdc.eu/vamdc_registry/services/RegistryQueryv1_0
Development registry: http://casx019-zone1.ast.cam.ac.uk/registry/services/RegistryQueryv1_0
Figure: VO Preferences
If you are a new user you must first get a password to register your service in the registry. For passwords contact registry @ vamdc.eu .
The registration process is as follows:
1. Install the VAMDC-TAP service on a web-visible server.
2. Go to the registry’s web user interface and register the core data of your service (as explained in Web Administration).
3. Ask the registry to load VOSI data from your service. You invoke this from the registry’s web user interface (see Web Administration) and the registry reads the capability data described below. You have to provide the URL for the VOSI data. In a typical VAMDC-TAP service this will be a URL ending in tap/capabilities, e.g. http://vald.astro.uu.se/tap/capabilities
The registry contains an XML registration-document for each known service (and for a few things that are not strictly services, but we ignore them here).
Inside each service registration-document there are capability elements defining the kinds of service provided. Inside each capability is an interface (sometimes more than one) and inside each interface is an accessURL element the value of which defines where to find that part of the service.
A typical registry query from code looks only at the capabilities and ignores the rest of the registration.
Here’s an example of a capability:
<capability standardID="ivo://ivoa.net/std/TAP">
<interface xsi:type="vs:ParamHTTP">
<accessURL use="base">http://vald.astro.uu.se/tap/</accessURL>
</interface>
</capability>
This refers to a web service on the Uppsala mirror of VALD.
Note the standardID attribute: the value of this identifies this capability as IVOA’s Table Access Protocol (TAP) web-service. Capabilities for standard protocols always have a standardID attribute to tell them apart.
The interface element has xsi:type="vs:ParamHTTP", meaning that basic HTTP GET or POST work on this interface, and that HTTP parameters are involved in the protocol. Other types are vs:WebService, meaning a SOAP endpoint, and vr:WebBrowser, meaning a web site for interactive viewing.
The accessURL element identifies a web-resource on a server in Uppsala. The use="base" attribute means that the client must add a suffix to the given URL to get a working URL for a query. The nature of the suffix is defined by the protocol identified in the standardID attribute of the capability. Here, because it is TAP, we know to add /sync? and then the HTTP parameters defining a query.
Here is another example, from the same registration.
<capability standardID="ivo://vamdc/std/VAMDC-TAP"
xmlns:tx="http://www.vamdc.org/xml/VAMDC-TAP/v1.0" xsi:type="tx:VamdcTap">
<interface xsi:type="vs:ParamHTTP">
<accessURL use="base">http://vald.astro.uu.se/tap/</accessURL>
</interface>
<returnable>AtomStateLandeFactorRef</returnable>
<returnable>AtomNuclearCharge</returnable>
<returnable>SourceCategory</returnable>
...
<restrictable>AtomStateEnergy</restrictable>
<restrictable>AtomNuclearCharge</restrictable>
<restrictable>RadTransLogGF</restrictable>
<restrictable>AtomSymbol</restrictable>
<restrictable>RadTransWavelengthExperimentalValue</restrictable>
<restrictable>AtomIonCharge</restrictable>
</capability>
This is the VAMDC-TAP capability for the same VALD mirror. It is very similar to the TAP example (VAMDC-TAP being a specialization of TAP); in fact the access URL is the same.
The main difference is the returnable and restrictable elements following the interface element. The returnables tell you which quantities can be obtained from this service. The restrictables tell you which terms can be used as constraints in the query (i.e. which column names can appear in the WHERE clause of a query).
The standardID value identifies this capability as VAMDC-TAP. The xsi:type attribute identifies the structural type as one for which the XML schema allows the returnable and restrictable children.
In both these examples, and in all capabilities you are likely to see, the elements are in the default namespace. This means that they are written without a namespace prefix, and you do not state a namespace when searching for elements by their names. However, some of the types have specific namespaces; if you search for elements by type you will have to deal with those.
Please note the exact namespace used for the VAMDC-TAP example, above: http://www.vamdc.org/xml/VAMDC-TAP/v1.0. Earlier examples used the namespace http://www.vamdc.eu/xml/TAPXSAMS/v1.0 which is no longer valid.
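As an illustration of reading a capability from code, the sketch below parses an abridged copy of the VAMDC-TAP capability with the JDK's DOM parser, lists the access URL and the restrictables, and appends the sync suffix to the base URL. It is only a sketch under stated assumptions: the class CapabilityReader is hypothetical, the xmlns:xsi declaration is added so that the fragment parses on its own, and the parameter names and values REQUEST=doQuery, LANG=VSS2 and FORMAT=XSAMS are assumptions that should be checked against the data access protocol document.

```java
import java.io.ByteArrayInputStream;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class CapabilityReader {
    /** Abridged copy of the capability shown above (assumption: xmlns:xsi added here). */
    static final String SAMPLE =
        "<capability standardID=\"ivo://vamdc/std/VAMDC-TAP\""
        + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
        + " xmlns:tx=\"http://www.vamdc.org/xml/VAMDC-TAP/v1.0\" xsi:type=\"tx:VamdcTap\">"
        + "<interface xsi:type=\"vs:ParamHTTP\">"
        + "<accessURL use=\"base\">http://vald.astro.uu.se/tap/</accessURL>"
        + "</interface>"
        + "<returnable>AtomNuclearCharge</returnable>"
        + "<restrictable>AtomSymbol</restrictable>"
        + "<restrictable>RadTransWavelengthExperimentalValue</restrictable>"
        + "</capability>";

    /** Parses the XML text and returns its document element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("capability is not well-formed XML", e);
        }
    }

    /** Text of every descendant element with the given (unprefixed) name. */
    static List<String> childTexts(Element capability, String tagName) {
        NodeList nodes = capability.getElementsByTagName(tagName);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent().trim());
        }
        return values;
    }

    /** Appends "sync?" and URL-encoded parameters to a use="base" access URL. */
    static String syncUrl(String baseUrl, String lang, String format, String query) {
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        return base + "sync?REQUEST=doQuery"
            + "&LANG=" + URLEncoder.encode(lang, StandardCharsets.UTF_8)
            + "&FORMAT=" + URLEncoder.encode(format, StandardCharsets.UTF_8)
            + "&QUERY=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Element capability = parse(SAMPLE);
        String base = childTexts(capability, "accessURL").get(0);
        System.out.println("restrictables: " + childTexts(capability, "restrictable"));
        System.out.println(syncUrl(base, "VSS2", "XSAMS", "SELECT * WHERE AtomSymbol = 'Fe'"));
    }
}
```

Because the elements are in the default namespace, the sketch searches by plain element names and leaves the parser in its default, non-namespace-aware mode.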
The standardID value identifying the capability for VAMDC nodes has changed from ivo://vamdc/std/TAP-XSAMS to ivo://vamdc/std/VAMDC-TAP. The examples of querying have been updated accordingly.
The namespace of the schema ruling the VAMDC-TAP capability has changed from http://www.vamdc.eu/xml/TAPXSAMS/v1.0 to http://www.vamdc.org/xml/VAMDC-TAP/v1.0. The section on registering services has been updated accordingly.
The recommended version of the VAMDC client-library for the registry is changed from 2.0 to 3.0.
The sample project associated with the guide used some out-of-date constants. These have been updated to the current standard.
from\to | Å | nm | microns | mm | cm | m | 1/cm | Hz | kHz | MHz | GHz | THz |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Å | x1 | x0.1 | x10^-4 | x10^-7 | x10^-8 | x10^-10 | 10^8/ | cx10^13/ | cx10^10/ | cx10^7/ | cx10^4/ | cx10/ |
nm | x10 | x1 | x10^-3 | x10^-6 | x10^-7 | x10^-9 | 10^7/ | cx10^12/ | cx10^9/ | cx10^6/ | cx10^3/ | c/ |
microns | x10^4 | x10^3 | x1 | x10^-3 | x10^-4 | x10^-6 | 10^4/ | cx10^9/ | cx10^6/ | cx10^3/ | c/ | cx10^-3/ |
mm | x10^7 | x10^6 | x10^3 | x1 | x0.1 | x10^-3 | 10/ | cx10^6/ | cx10^3/ | c/ | cx10^-3/ | cx10^-6/ |
cm | x10^8 | x10^7 | x10^4 | x10 | x1 | x0.01 | 1/ | cx10^5/ | cx10^2/ | cx0.1/ | cx10^-4/ | cx10^-7/ |
m | x10^10 | x10^9 | x10^6 | x10^3 | x100 | x1 | 0.01/ | cx10^3/ | c/ | cx10^-3/ | cx10^-6/ | cx10^-9/ |
1/cm | 10^8/ | 10^7/ | 10^4/ | 10/ | 1/ | 0.01/ | x1 | x(cx10^5) | x(cx10^2) | x(cx0.1) | x(cx10^-4) | x(cx10^-7) |
Hz | cx10^13/ | cx10^12/ | cx10^9/ | cx10^6/ | cx10^5/ | cx10^3/ | /(cx10^5) | x1 | x10^-3 | x10^-6 | x10^-9 | x10^-12 |
kHz | cx10^10/ | cx10^9/ | cx10^6/ | cx10^3/ | cx10^2/ | c/ | /(cx10^2) | x10^3 | x1 | x10^-3 | x10^-6 | x10^-9 |
MHz | cx10^7/ | cx10^6/ | cx10^3/ | c/ | cx0.1/ | cx10^-3/ | /(cx0.1) | x10^6 | x10^3 | x1 | x10^-3 | x10^-6 |
GHz | cx10^4/ | cx10^3/ | c/ | cx10^-3/ | cx10^-4/ | cx10^-6/ | /(cx10^-4) | x10^9 | x10^6 | x10^3 | x1 | x10^-3 |
THz | cx10/ | c/ | cx10^-3/ | cx10^-6/ | cx10^-7/ | cx10^-9/ | /(cx10^-7) | x10^12 | x10^9 | x10^6 | x10^3 | x1 |
from\to | J | erg | eV | 1/cm | Hz |
---|---|---|---|---|---|
J | x1 | x10^7 | x6.24150934x10^18 | x5.03411701x10^22 | x1.509190311x10^33 |
erg | x10^-7 | x1 | x6.24150934x10^11 | x5.03411701x10^15 | x1.509190311x10^26 |
eV | x1.602176565x10^-19 | x1.602176565x10^-12 | x1 | x8.06554429x10^3 | x2.417989349x10^14 |
1/cm | x1.986445684x10^-23 | x1.986445684x10^-16 | x1.239841930x10^-4 | x1 | x(cx10^5) |
Hz | x6.626069574x10^-34 | x6.626069574x10^-27 | x4.135667517x10^-15 | /(cx10^5) | x1 |
Conversion coefficients are based on constant values from NIST: http://physics.nist.gov/cuu/Constants/index.html. In these tables, c denotes the speed of light expressed in km/s, as implied by the powers of ten in the wavelength-to-frequency entries.
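A few of these conversions can be sketched in Java as follows. This is a minimal illustration, not a reference implementation: the class and method names are hypothetical, c is taken as 299792.458 km/s (the unit that makes the powers of ten in the wavelength rows come out right), and a wavenumber in 1/cm is converted to Hz by multiplying by c x 10^5.

```java
final class UnitConversion {
    /** Speed of light in km/s (assumption: the unit implied by the table's factors). */
    static final double C_KM_PER_S = 299792.458;

    /** Wavelength in Angstrom to wavenumber in 1/cm: 10^8 / value. */
    static double angstromToWavenumber(double angstrom) {
        return 1e8 / angstrom;
    }

    /** Wavelength in Angstrom to frequency in Hz: c x 10^13 / value. */
    static double angstromToHz(double angstrom) {
        return C_KM_PER_S * 1e13 / angstrom;
    }

    /** Wavenumber in 1/cm to frequency in Hz: value x (c x 10^5). */
    static double wavenumberToHz(double perCm) {
        return perCm * C_KM_PER_S * 1e5;
    }

    /** Frequency in Hz to wavenumber in 1/cm: value / (c x 10^5). */
    static double hzToWavenumber(double hz) {
        return hz / (C_KM_PER_S * 1e5);
    }

    /** Energy in eV to wavenumber in 1/cm, using the coefficient from the energy table. */
    static double evToWavenumber(double ev) {
        return ev * 8.06554429e3;
    }

    public static void main(String[] args) {
        double lambda = 5000.0; // Angstrom
        System.out.println(angstromToWavenumber(lambda) + " 1/cm");
        System.out.println(angstromToHz(lambda) + " Hz");
        System.out.println(evToWavenumber(1.0) + " 1/cm per eV");
    }
}
```

Converting a wavelength to a frequency directly, or via the wavenumber, gives the same value, which is a useful consistency check on any entry in the table.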
To identify common species uniquely across participating VAMDC databases, the Standard IUPAC International Chemical Identifier (InChI), and in particular a hash of this identifier based on SHA-256 (the Standard InChIKey), must be generated for each species (i.e. atom or molecule) within each participating VAMDC node.
This is a brief overview of InChI and InChIKey. For further information see the documentation at IUPAC.
To ensure compatibility with external databases, and to give VAMDC members the widest choice of tools:
InChI is defined as “a series of characters derived by applying a set of rules to a chemical structure to provide a unique digital ‘signature’ for a compound.”
Included in the scope of InChI:
Excluded from the scope of InChI:
The InChI has a layered structure and up to 6 layers can be specified:
{InChI version}
1. Main Layer (M):
/{formula}
/c{connections}
/h{H_atoms}
2. Charge Layer
/q{charge}
/p{protons}
3. Stereo Layer
/b{stereo:dbond}
/t{stereo:sp3}
/m{stereo:sp3:inverted}
/s{stereo:type (1=abs, 2=rel, 3=rac)}
4. Isotopic Layer (MI):
/i{isotopic:atoms}*
/h{isotopic:exchangeable_H}
/b{isotopic:stereo:dbond}
/t{isotopic:stereo:sp3}
/m{isotopic:stereo:sp3:inverted}
/s{isotopic:stereo:type (1=abs, 2=rel, 3=rac)}
5. Fixed H Layer (F):
/f{fixed_H:formula}*
/h{fixed_H:H_fixed}
/q{fixed_H:charge}
/b{fixed_H:stereo:dbond}
/t{fixed_H:stereo:sp3}
/m{fixed_H:stereo:sp3:inverted}
/s{fixed_H:stereo:type (1=abs, 2=rel, 3=rac)}
(6.) Fixed/Isotopic Combination (FI)
/i{fixed_H:isotopic:atoms}*
/b{fixed_H:isotopic:stereo:dbond}
/t{fixed_H:isotopic:stereo:sp3}
/m{fixed_H:isotopic:stereo:sp3:inverted}
/s{fixed_H:isotopic:stereo:type (1=abs, 2=rel, 3=rac)}
/o{transposition}
See the InChI Technical Manual for more details.
Standard InChI was defined to ensure interoperability/compatibility between large databases/web searching & information exchange. It is a subset of InChI.
Standard InChI distinguishes between chemical substances at the level of “connectivity”, “stereochemistry”, and “isotopic composition”, where:
Standard InChI prefix: InChI=1S/...........
Non-standard InChI prefix: InChI=1/...........
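The prefix and the layered structure can be inspected with simple string handling. The sketch below is only illustrative and uses hypothetical names (InchiLayers, isStandard, layers): it checks for the Standard InChI prefix and splits an identifier into its version, its formula (if present) and its prefixed sub-layers. It does not validate the content of any layer and must not be used to construct identifiers.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class InchiLayers {
    /** True if the identifier carries the Standard InChI prefix "InChI=1S/". */
    static boolean isStandard(String inchi) {
        return inchi.startsWith("InChI=1S/");
    }

    /**
     * Splits an InChI into (name, value) pairs in order of appearance.
     * The same prefix letter may occur more than once (e.g. /h in the main
     * and isotopic layers), so a list is returned rather than a map.
     */
    static List<Map.Entry<String, String>> layers(String inchi) {
        if (!inchi.startsWith("InChI=")) {
            throw new IllegalArgumentException("not an InChI: " + inchi);
        }
        String[] parts = inchi.substring("InChI=".length()).split("/");
        List<Map.Entry<String, String>> result = new ArrayList<>();
        result.add(new SimpleEntry<>("version", parts[0]));
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i];
            if (part.isEmpty()) {
                continue;
            }
            if (Character.isLowerCase(part.charAt(0))) {
                // Prefixed sub-layer such as h, q, p or i.
                result.add(new SimpleEntry<>(part.substring(0, 1), part.substring(1)));
            } else if (i == 1) {
                // The formula is the only sub-layer without a prefix letter.
                result.add(new SimpleEntry<>("formula", part));
            } else {
                throw new IllegalArgumentException("unexpected layer: " + part);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(layers("InChI=1S/CH4/h1H4/i1+1"));
    }
}
```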
The Standard InChI organometallic representation does not include bonds to metal for the time being. This has important implications for some species - e.g. metal cyanides and isocyanides are currently indistinguishable with Standard InChI. (Depending on how many molecules this affects, we may need to make some exceptions to the Standard InChI rule.)
The process of generating an InChI takes the following structure normalization steps:
Step 1. Alter the structure drawing
Step 2. Disconnect “salts”
Step 3. Disconnect metals
Step 4. Eliminate radicals if possible
Step 5. Process variable protonation (charges and mobile H)
Step 5.1. Remove protons from charged heteroatoms
Step 5.2. Remove protons from neutral heteroatoms
Step 5.3. Add protons to reduce negative charge
Step 6. Process charges and mobile H
Step 6, procedure 1: Simple tautomerism detection
Step 6, procedure 2. Moveable positive charge detection
Step 6, procedure 3. Additional normalization
See the InChI Technical Manual for more details.
The InChIKey is a fixed length SHA-256 hash of InChI (27 characters, including two hyphens). Its fixed length makes it easy to index and it is thus designed for databases and web searching.
The InChIKey also serves as a checksum for verifying an InChI, for example, after transmission over a network.
The structure of the InChIKey is illustrated thus:
AAAAAAAAAAAAAA-BBBBBBBBFV-P
It consists of:
14 character hash of basic InChI layer - encodes molecular skeleton (should be the same for all isotopologues)
8 character hash of remaining layers (except protonation)
F = S or N (standard or non-standard)
V = A (InChI version 1)
P = (de) protonation indicator = N for neutral, M for -1, O for +1 proton, etc
Standard:
InChI=1S/...........
AAAAAAAAAAAAAA-BBBBBBBBSA-P
Non-standard:
InChI=1/...........
AAAAAAAAAAAAAA-BBBBBBBBNA-P
As with Standard InChI, Standard InChIKeys do not account for tautomerism and indicate only absolute stereo (or ignore stereo completely). They also do not account for the original structure’s bonds to metal.
In all cases, within VAMDC, the Standard InChI(Key) must be generated.
The species must be written in a chemoinformatic form which specifies its structure. The core version 1.04 InChI Tools only support the .mol and .sdf formats. CML was supported by InChI version 1.03, but this was withdrawn in version 1.04 (though OpenBabel supports this and many other input formats - e.g. SMILES).
Use the InChI Trust Software
Input must be in the form of .MOL or .SDFile. Version 1.03 accepts CML format as well.
Use an online converter:
InChI Trust Experimental Converter
(experimental converter powered by OASA/BKChem)
(experimental converter powered by Openbabel)
Use conversion tools:
E.g. Openbabel. Openbabel facilitates conversions from many different formats (e.g. .mol, .sdf, SMILES, CML)
Use a chemical drawing package:
E.g. Chemsketch
Web Based Lookup:
The example below is for Methane:
SMILES:
C
or (explicitly specifying hydrogen):
[C]([H])([H])([H])[H]
CML:
<molecule id="CH4-1">
<atomArray>
<atom id="C1" elementType="C"/>
<atom id="H1" elementType="H"/>
<atom id="H2" elementType="H"/>
<atom id="H3" elementType="H"/>
<atom id="H4" elementType="H"/>
</atomArray>
<bondArray>
<bond atomRefs2="C1 H1" id="C1_H1" order="S"/>
<bond atomRefs2="C1 H2" id="C1_H2" order="S"/>
<bond atomRefs2="C1 H3" id="C1_H3" order="S"/>
<bond atomRefs2="C1 H4" id="C1_H4" order="S"/>
</bondArray>
</molecule>
Both inputs will result in the following InChI and InChIKey:
InChI=1S/CH4/h1H4
VNWKTOKETHGBQD-UHFFFAOYSA-N
Some, but not all, isomerism is supported in Standard InChI(Key).
Structural isomers (same molecular formula, different connectivity) always yield different Standard InChI(Key)s.
Some stereoisomers (same molecular formula, different spatial orientation), such as cis- and trans- versions of a species can also yield distinct Standard InChI(Key)s. Note, however, that this is not always true. Two examples are cis- and trans-hydroxymethylene and cis- and trans-difluoroethene. The former yields only one distinct InChI(Key). The latter yields two distinct InChI(Key)s.
Different isotopologues (same molecule, same structure, different constituent isotopes) also yield different Standard InChI(Key)s. Note that in the case of isotopologues, ONLY the elements in the species that differ from the most abundant isotopes should have their isotopes explicitly specified. (See also the last section of this document.)
The example below is for C-13 Methane:
SMILES:
[13CH4]
or (explicitly specifying hydrogen):
[13C]([H])([H])([H])[H]
CML:
<molecule id="CH4-2">
<atomArray>
<atom id="C1" elementType="C" isotopeNumber="13"/>
<atom id="H1" elementType="H"/>
<atom id="H2" elementType="H"/>
<atom id="H3" elementType="H"/>
<atom id="H4" elementType="H"/>
</atomArray>
<bondArray>
<bond atomRefs2="C1 H1" id="C1_H1" order="S"/>
<bond atomRefs2="C1 H2" id="C1_H2" order="S"/>
<bond atomRefs2="C1 H3" id="C1_H3" order="S"/>
<bond atomRefs2="C1 H4" id="C1_H4" order="S"/>
</bondArray>
</molecule>
Both inputs will result in the following InChI and InChIKey:
InChI=1S/CH4/h1H4/i1+1
VNWKTOKETHGBQD-OUBTZVSYSA-N
Note that the first 14 characters of the InChIKey are identical to the one generated above for C-12 methane.
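This property makes it easy to group isotopologues in code. The following sketch, with the hypothetical class name InchiKeyParts, splits an InChIKey into the blocks described earlier and compares the first block of two keys. The regular expression only checks the overall shape of the key (upper-case letters and hyphens in the 14-10-1 layout); it does not verify that the key is a genuine hash of any InChI.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InchiKeyParts {
    // 14 letters, hyphen, 8 letters + standard flag + version letter, hyphen, protonation letter.
    private static final Pattern KEY =
        Pattern.compile("([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])");

    final String skeleton;      // hash of the basic InChI layer
    final String remaining;     // hash of the remaining layers
    final boolean standard;     // flag S (standard) or N (non-standard)
    final char version;         // A for InChI version 1
    final char protonation;     // N for neutral, M for -1, O for +1, ...

    private InchiKeyParts(Matcher m) {
        skeleton = m.group(1);
        remaining = m.group(2);
        standard = m.group(3).equals("S");
        version = m.group(4).charAt(0);
        protonation = m.group(5).charAt(0);
    }

    /** Splits a 27-character InChIKey into its blocks. */
    static InchiKeyParts parse(String key) {
        Matcher m = KEY.matcher(key);
        if (!m.matches()) {
            throw new IllegalArgumentException("not shaped like an InChIKey: " + key);
        }
        return new InchiKeyParts(m);
    }

    /** True if both keys share the first block, as isotopologues are expected to. */
    static boolean sameSkeleton(String keyA, String keyB) {
        return parse(keyA).skeleton.equals(parse(keyB).skeleton);
    }

    public static void main(String[] args) {
        String methane = "VNWKTOKETHGBQD-UHFFFAOYSA-N";
        String methane13C = "VNWKTOKETHGBQD-OUBTZVSYSA-N";
        System.out.println(sameSkeleton(methane, methane13C));
    }
}
```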
In principle, simple InChIs can be hand-produced (e.g. for elements) and the InChI Trust Software API used to generate the InChIKey. However, use of this mechanism to generate InChI(Key)s is unwise. A good illustration of the problem is the generation of an InChI for the Hydrogen Ion (i.e. the proton):
INCORRECT:
InChI=1S/H/q+1
ASSFXGJQJOXDAB-UHFFFAOYSA-N
CORRECT:
InChI=1S/p+1
GPRLSGONYQIRFK-UHFFFAOYSA-N
InChI uses a defined algorithm (see earlier) to generate IDs for complex structures. These must not be hand-generated or guessed.
InChI assumes the average (terrestrial) abundance when the isotope is not specified in the originating format.
This affects the 31 elements in the table below.
Species that contain the most abundant elements should NOT specify the isotope. This ensures compatibility of InChI(Key)s with external databases (e.g. NIST).
If specificity is required in any of the 31 exceptions, the affected element (and only that element) should have its isotope specified when generating the InChI and InChIKey.
Table of InChI Assumed Isotope Masses when isotope not explicitly specified
Element | Symbol | Most Abundant Isotope Mass | InChI Assumed Mass |
---|---|---|---|
Nickel | Ni | 58 | 59 |
Copper | Cu | 63 | 64 |
Zinc | Zn | 64 | 65 |
Gallium | Ga | 69 | 70 |
Germanium | Ge | 74 | 73 |
Selenium | Se | 80 | 79 |
Bromine | Br | 79 | 80 |
Zirconium | Zr | 90 | 91 |
Molybdenum | Mo | 98 | 96 |
Ruthenium | Ru | 102 | 101 |
Silver | Ag | 107 | 108 |
Cadmium | Cd | 114 | 112 |
Tin | Sn | 120 | 119 |
Antimony | Sb | 121 | 122 |
Tellurium | Te | 130 | 128 |
Xenon | Xe | 132 | 131 |
Barium | Ba | 138 | 137 |
Neodymium | Nd | 142 | 144 |
Samarium | Sm | 152 | 150 |
Europium | Eu | 153 | 152 |
Gadolinium | Gd | 158 | 157 |
Dysprosium | Dy | 164 | 163 |
Erbium | Er | 166 | 167 |
Ytterbium | Yb | 174 | 173 |
Hafnium | Hf | 180 | 178 |
Rhenium | Re | 187 | 186 |
Osmium | Os | 192 | 190 |
Iridium | Ir | 193 | 192 |
Mercury | Hg | 202 | 201 |
Thallium | Tl | 205 | 204 |
Lead | Pb | 208 | 207 |
Applications to process data in XSAMS format may be made available as web sites, which makes them accessible for interactive use, or as web services, making them accessible to scripts and other software. This standard prescribes a form for these web applications that includes both the interactive web-site and scriptable web-service.
Web applications conforming to this standard can be registered in the VAMDC registry. Registration makes the applications available to generic UIs such as the VAMDC portal.
Conforming web applications can read data either from a URL (e.g. the portal passes a data-extract URL leading to a VAMDC database) or from an uploaded file (e.g. a user loads data from a file on his computer).
The web application consists of a set of web resources (where a “web resource” is anything that has its own URL) arranged in a tree structure. This standard specifies the web resources that must be provided, including their names, semantics and positions in the tree. A web application may also provide other web resources.
All the required web-resources must be available via HTTP v1.1.
None of the required web-resources use the Simple Object Access Protocol (SOAP). All these resources follow the paradigm Representational State Transfer (REST).
The core of the application is a web resource called, in this standard, its primary result. This resource represents the result of the application’s processing of one or more XSAMS input documents. The application must have exactly one primary result, implying that all the inputs are combined.
The primary result may be machine-readable (e.g. a transformation of the XSAMS input into HITRAN’s legacy format) or human readable (i.e. a web page). If human readable, it may well link to further pages showing related results (e.g. the primary result might be a web page showing a table of atomic states, with links to the details of each state). A machine-readable result may also contain links, but this standard does not define how the links are accessed. Hence, a generic client, such as the VAMDC portal, would not know how to follow the links.
An HTTP request to the primary result specifies the input data, either by passing URLs for those or by including them in the request. Therefore, a client application or script can call the primary result directly as a web service.
To make the application accessible from web browsers, the application must also supply a web page that leads the user to the primary result, allowing him to specify the input data. This resource is typically an HTML form and is referred to as “the form” in the rest of this standard.
Applications following this standard are expected to be registered, so they must provide a “VOSI capabilities” resource to convey registration details to the VAMDC registry.
Some of the registration details are repeated, in human-readable format, on the form.
The application should also provide a “VOSI availability” resource to allow checking of the system.
The root resource of the application must be available via HTTP GET and must contain a browser-accessible HTML form and a short textual description of the XSAMS processor service.
The form must allow the user to specify the input data either by giving URLs (which the user might copy and paste from some other UI) or by uploading files from the desktop. Both modes must be available.
The appearance and behaviour of the form should be fixed. It should not vary according to parameters in the URL that displays it, or in response to information cached in the user’s browser session. If the form is made adaptable via parameters or responsive to session history, then it must be useable in the absence of this information.
The form must give a general description of the processing that will be applied to incoming XSAMS documents. This description may be given either in form of a link to another page or in form of a paragraph on the root page.
The author of the application may choose freely the content of the form provided that it meets the requirements above.
The primary result is at the URL /service relative to the root URL.
In a given application, the primary result may have any MIME-type, but it must always have the same MIME-type in that application.
The primary result must accept all of the following ways to specify the input data.
The application is only required to process XSAMS inputs and should typically reject other types.
The application may be written to process a single XSAMS document, a fixed number of documents, any number of documents up to a fixed limit, or any number without limit. If the request contains the wrong number of documents, then the application must reject the request.
To allow the XSAMS Processor web service to be used both interactively and from scripts, the following response scenario should be followed, whatever the type of the incoming request:
Other status codes can be returned both for requests to primary result and cached result URLs:
An XSAMS Processor necessarily caches the XSAMS documents, the intermediate or final results of the transformation, or both. If it caches the XSAMS documents, the final processing must be re-applied on every request to the result URL. If the result is a static page, the service may cache only the result of the processing and destroy the incoming XSAMS documents as soon as processing completes.
If processing is done in a streaming manner, only the result of processing may be cached.
The cache lifetime is defined by the developer or maintainer of the XSAMS Processor. It should be long enough for users to follow the link from the portal to the processing result, but not unlimited, since the disk capacity of the server running the XSAMS Processor service is finite.
The VOSI capabilities are a single XML-document at the URL /capabilities relative to the root resource. A “capability” is an XML fragment describing a particular aspect of an application. The general rules for VOSI capabilities are defined by IVOA’s VOSI standard.
For applications conforming to the current standard, there must be a capability following the schema http://www.vamdc.org/xml/XSAMS-consumer/v1.0. Such a capability provides two access URLs, one for the form (of type WebBrowser) and one for the primary result (of type ParamHTTP).
Capabilities must contain at least the following information:
The following code shows a sample capabilities-document, with the namespaces and locations of schema filled in:
<?xml version="1.0" encoding="UTF-8"?>
<cap:capabilities
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:cap="http://www.ivoa.net/xml/VOSICapabilities/v1.0"
xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.0"
xmlns:vr="http://www.ivoa.net/xml/VOResource/v1.0"
xmlns:xc="http://www.vamdc.org/xml/XSAMS-consumer/v1.0"
xsi:schemaLocation="
http://www.ivoa.net/xml/VOSICapabilities/v1.0 http://www.vamdc.org/downloads/xml/VOSI-capabilities-1.0.xsd
http://www.vamdc.org/xml/XSAMS-consumer/v1.0 http://www.ivoa.net/xml/XSAMS-consumer/v1.0
http://www.ivoa.net/xml/VOResource/v1.0 http://www.ivoa.net/xml/VOResource/v1.0
http://www.ivoa.net/xml/VODataService/v1.0 http://www.ivoa.net/xml/VODataService/v1.0">
<capability standardID="ivo://vamdc/std/XSAMS-consumer" xsi:type="xc:XsamsConsumer">
<interface xsi:type="vr:WebBrowser">
<accessURL>http://some.server/some/app</accessURL>
</interface>
<interface xsi:type="vs:ParamHTTP">
<accessURL>http://some.server/some/app/service</accessURL>
<resultType>text/html</resultType>
</interface>
<versionOfStandards>12.07</versionOfStandards>
<versionOfSoftware>whatever</versionOfSoftware>
<numberOfInputs>1-100</numberOfInputs>
</capability>
</cap:capabilities>
The VOSI availability is a single XML-document at the URL /availability relative to the root resource.
The general rules for VOSI availability are defined by IVOA’s VOSI standard.
The application should be registered in the VAMDC registry. This makes it visible to generic UIs such as the VAMDC portal.
If registered, the registration-document type must be {http://www.ivoa.net/xml/VOResource/v1.0}Service as defined in the IVOA standard for registration. The registration must include the capability data taken from the VOSI-capabilities resource of the application, as detailed above.
Generic UIs will typically present users with a list of XSAMS processor web services. The title element of the application’s registration-document should be suitable to distinguish the application in such a list: it should state explicitly but tersely what the application does. More detailed description may be provided within the Description element of Contents block. This description may be presented to the end user before he submits XSAMS documents to the processor service.
To test the operation of processor service, or to integrate an existing service into some software as a data source, the following script may be used:
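#!/bin/bash
# XSAMS Processor service URL, ending with /service
PROCESSOR=$1
# URL of an XSAMS document: either a VAMDC node output or a file saved anywhere on the internet
XSAMSURL=$2
# Submit the document URL. The service answers with a redirect;
# curl reports the redirect target through %{redirect_url}.
LOCATION=$(curl --silent --get --data-urlencode "url=${XSAMSURL}" \
  --output /dev/null --write-out '%{redirect_url}' "${PROCESSOR}")
# Poll while the service answers 202 (download or processing still in progress).
# Comparing the status code itself avoids matching "202" elsewhere in the headers.
while [ "$(curl --head --silent --output /dev/null --write-out '%{http_code}' "${LOCATION}")" = "202" ]
do
  echo "waiting for result from ${LOCATION}" 1>&2
  sleep 1
done
# Download the finished result to standard output
curl --get --silent "${LOCATION}"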
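The script accepts two parameters: the URL of the XSAMS Processor service (ending with /service) and the URL of the XSAMS document to process, which may be a VAMDC node output or a document saved anywhere on the internet.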
The downloaded processing result is written to standard output. The script can therefore serve as an input stage for a scientific tool, provided an online Processor service exists that converts XSAMS into the format of that tool.
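The script above polls indefinitely, which may be inconvenient when it is embedded in another tool. A minimal sketch of a bounded variant follows, assuming the same behaviour of answering 202 while processing is in progress. The function names `fetch_status` and `wait_for_result`, as well as the default limits, are illustrative and not part of the standard.

```shell
#!/bin/bash
# Hypothetical helper: print the HTTP status code returned for a URL.
fetch_status() {
  curl --head --silent --output /dev/null --write-out '%{http_code}' "$1"
}

# Poll a result URL until the service stops answering 202.
# $1: result URL
# $2: maximum number of attempts (assumed default: 60)
# $3: delay between attempts in seconds (assumed default: 1)
# Prints the final status code; returns 1 if the limit is reached first.
wait_for_result() {
  local url=$1 max=${2:-60} delay=${3:-1} status attempt
  for ((attempt = 1; attempt <= max; attempt++)); do
    status=$(fetch_status "$url")
    if [ "$status" != "202" ]; then
      echo "$status"
      return 0
    fi
    echo "waiting for result from ${url} (attempt ${attempt})" 1>&2
    sleep "$delay"
  done
  return 1
}
```

A caller could then write `wait_for_result "${LOCATION}" 120 && curl --get --silent "${LOCATION}"` and treat a non-zero exit status as a timeout.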
XML schema registration name stays unchanged
Omit the requirement of /info resource
Specify precisely the request scenario that the service application must implement:
- Service must respond with 302 redirect to processing requests
- Service must cache documents or results
- Service must employ 202 result code for the page indicating the progress of download and processing of incoming documents
Exclude mention of employing URL parameters to distinguish closely-related applications
Provide example client shell script
Below are links to documents available for download.
document | release 11.05 | release 11.12 | release 12.07 |
---|---|---|---|
data access protocol | v11.05 | v11.12 | v12.07 |
XSAMS Processor protocol | | v11.12 | v12.07 |
query language | v11.05 | v11.12 | v12.07 |
dictionary | v11.05 | v11.12 | v12.07 |
VAMDC-XSAMS schema | v0.2 | v0.3 | v1.0 |
VAMDC-XSAMS schema doc | v0.2 | v0.3 | v1.0 |
VAMDC-XSAMS reference guide | v0.2 | v11.12 | v12.07 |
VAMDC-XSAMS change log | v0.2 | included in the XSAMS ref. guide | |
Case-By-Case schema doc | v0.2 | v0.3 | v1.0 |
registry guide | v11.05 | v11.12 | v12.07 |
Species identification (InChI) | | v11.12 | |
2011-05-27: First release of standards. Version 11.05
2011-12-21: Second release of standards. Version 11.12