Basic concepts

Warning

This section is non-normative.

This section describes how to design and implement a simple CID process, such as the first examples in the Discover CID document.

What is a manifest?

A manifest is an XML file which defines the processes implemented by a server. It must be accessible through an unauthenticated HTTP GET request. The validity of an XML manifest can be checked against the XML schema of these specifications.

A manifest is composed of two main parts:

  • A list of processes
  • A list of transports

A process defines document transactions in a transport-neutral way: it lists steps without any technical information about the transport. The transport part can define several transport alternatives, either using the single transport family defined in these specifications (web transport) or using a dedicated transport family defined as an extension of these specifications.

Defining a basic process

Defining the steps

Consider a simple process in which a client uploads a file to a server. This process can be defined in three steps:

  • a pre-upload exchange to check the authentication before the main request;
  • the file upload;
  • a post-upload interaction between the client and the user.

The minimal valid definition of this process is:

<cid:process>
    <cid:exchange url="http://example.com/checkAuth" required="false"/>
    <cid:upload url="http://example.com/upload" required="true"/>
    <cid:interact url="http://example.com/interact" required="true"/>
</cid:process>

Each step must define its URL and a required attribute. The client must process a required step, whereas it may skip a non-required one.
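As a rough illustration of how a client might read these steps, the Python sketch below parses the three-step process and separates the steps the client must run from those it may skip. A namespace declaration is added to the fragment so that it parses on its own; the function name split_steps and the constant names are hypothetical and not part of these specifications.

```python
import xml.etree.ElementTree as ET

# Namespace URI as declared in the complete manifest of this section.
CID_NS = "http://www.cid-protocol.org/schema/v1/core"
STEP_KINDS = ("exchange", "upload", "interact")

PROCESS_XML = """
<cid:process xmlns:cid="http://www.cid-protocol.org/schema/v1/core">
    <cid:exchange url="http://example.com/checkAuth" required="false"/>
    <cid:upload url="http://example.com/upload" required="true"/>
    <cid:interact url="http://example.com/interact" required="true"/>
</cid:process>
"""


def split_steps(process_xml):
    """Return (required, optional) lists of (kind, url), in document order."""
    root = ET.fromstring(process_xml)
    required, optional = [], []
    for child in root:
        # ElementTree exposes qualified tags as "{namespace}localname".
        if not child.tag.startswith("{" + CID_NS + "}"):
            continue
        kind = child.tag.split("}", 1)[1]
        if kind not in STEP_KINDS:
            continue
        step = (kind, child.get("url"))
        if child.get("required") == "true":
            required.append(step)
        else:
            optional.append(step)
    return required, optional


required, optional = split_steps(PROCESS_XML)
```

With the process above, the upload and interact steps end up in the required list, and the exchange step in the optional one.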

Defining the metadata

This process could use specific metadata such as the file name or the content type. It could also return other metadata such as the id of the uploaded file or its public URL. All the metadata used in the process must be defined at the beginning of the process definition.

These metadata can then be referenced by the steps through three different attributes:

  • useMetas: which means the metadata may be sent by the client;
  • needMetas: which means the metadata must be sent by the client;
  • returnMetas: which means the metadata must be returned by the server at the end of the step.

In this case, the definition could look like the following.

<?xml version="1.0" encoding="UTF-8"?>
<cid:manifest xmlns:cid="http://www.cid-protocol.org/schema/v1/core">
    <cid:process>
        <cid:meta name="file-name"/>
        <cid:meta name="content-type"/>
        <cid:meta name="internal-id"/>
        <cid:meta name="public-url"/>
        <cid:exchange url="http://example.com/checkAuth" needMetas="content-type" required="false"/>
        <cid:upload url="http://example.com/upload" needMetas="content-type" useMetas="file-name" required="true" returnMetas="internal-id"/>
        <cid:interact url="http://example.com/interact" needMetas="internal-id" returnMetas="public-url" required="true"/>
    </cid:process>
</cid:manifest>

In this example, the server creates its own file name if the client does not provide one. At the end of the upload step, the server returns the internal id, which the client must send back so that the server can build the interaction GUI (in a conventional stateless way).
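The relation between declared metadata and the three step attributes can be checked mechanically. The following Python sketch is one possible approach, not a normative algorithm: it verifies that every meta referenced by a step is declared in the process, and reports which needed metas a client is still missing before a step. It assumes that the attributes hold space-separated names; the helper names undeclared_metas and missing_needed are hypothetical.

```python
import xml.etree.ElementTree as ET

CID = "{http://www.cid-protocol.org/schema/v1/core}"

MANIFEST_XML = """
<cid:manifest xmlns:cid="http://www.cid-protocol.org/schema/v1/core">
    <cid:process>
        <cid:meta name="file-name"/>
        <cid:meta name="content-type"/>
        <cid:meta name="internal-id"/>
        <cid:meta name="public-url"/>
        <cid:exchange url="http://example.com/checkAuth" needMetas="content-type" required="false"/>
        <cid:upload url="http://example.com/upload" needMetas="content-type" useMetas="file-name" required="true" returnMetas="internal-id"/>
        <cid:interact url="http://example.com/interact" needMetas="internal-id" returnMetas="public-url" required="true"/>
    </cid:process>
</cid:manifest>
"""


def undeclared_metas(process):
    """Return the meta names referenced by steps but never declared."""
    declared = {meta.get("name") for meta in process.findall(CID + "meta")}
    referenced = set()
    for element in process:
        for attribute in ("useMetas", "needMetas", "returnMetas"):
            # Assumption: several names are separated by whitespace.
            referenced.update(element.get(attribute, "").split())
    return referenced - declared


def missing_needed(step, available):
    """Return the needMetas of a step that the client cannot provide yet."""
    return [name for name in step.get("needMetas", "").split()
            if name not in available]


process = ET.fromstring(MANIFEST_XML).find(CID + "process")
upload = process.find(CID + "upload")
```

For instance, a client that only knows the file name is told that content-type is still missing before the upload step, while a missing file-name is never reported because that meta is only listed in useMetas.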

Documenting the process for the client software

There is no reserved name for metadata. The manifest could provide its own name for any meta. For example, the content type could be called contentType, just type, or anything else.

To allow the client to understand these names, the manifest could describe the metadata by an IRI which formally qualifies the content. Such an IRI is written in an attribute called is. This attribute could be used on the process, meta and step elements.

The main IRIs needed can be found on schema.org for generic actions and concepts, or on purl.org/dc (Dublin Core) for the definition of specific metadata.

Enriched with the appropriate IRIs, the process definition should look like the following.

<cid:process is="http://schema.org/SendAction">
    <cid:meta name="file-name" is="http://purl.org/dc/elements/1.1/title"/>
    <cid:meta name="content-type" is="http://www.w3.org/TR/html4/sgml/dtd.html#ContentType"/>
    <cid:meta name="internal-id" is="http://schema.org/productID"/>
    <cid:meta name="public-url" is="http://schema.org/URL"/>
    <cid:exchange url="http://example.com/checkAuth" needMetas="content-type" required="false" is="http://schema.org/AuthorizeAction"/>
    <cid:upload url="http://example.com/upload" needMetas="content-type" useMetas="file-name" required="true" returnMetas="internal-id"/>
    <cid:interact url="http://example.com/interact" needMetas="internal-id" returnMetas="public-url" required="true"/>
</cid:process>

The client uses the is attribute to determine how to fill a meta or to decide whether an optional step needs to be executed. Consequently, required steps do not need IRIs.
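To make the role of the is attribute more concrete, here is a minimal Python sketch of a client that fills metas from the IRIs it recognises. The function fill_metas and the known_values mapping are hypothetical; how a real client obtains the value for each IRI (from the file system, from its own document model, and so on) is outside the scope of this example.

```python
import xml.etree.ElementTree as ET

CID = "{http://www.cid-protocol.org/schema/v1/core}"

PROCESS_XML = """
<cid:process xmlns:cid="http://www.cid-protocol.org/schema/v1/core"
             is="http://schema.org/SendAction">
    <cid:meta name="file-name" is="http://purl.org/dc/elements/1.1/title"/>
    <cid:meta name="content-type" is="http://www.w3.org/TR/html4/sgml/dtd.html#ContentType"/>
    <cid:meta name="internal-id" is="http://schema.org/productID"/>
    <cid:meta name="public-url" is="http://schema.org/URL"/>
</cid:process>
"""


def fill_metas(process, known_values):
    """Map manifest-specific meta names to values, using recognised IRIs.

    known_values maps an IRI to the value the client holds for that concept.
    Metas whose IRI is unknown to the client are left out of the result.
    """
    filled = {}
    for meta in process.findall(CID + "meta"):
        iri = meta.get("is")
        if iri in known_values:
            filled[meta.get("name")] = known_values[iri]
    return filled


process = ET.fromstring(PROCESS_XML)
filled = fill_metas(process, {
    "http://purl.org/dc/elements/1.1/title": "report.pdf",
    "http://www.w3.org/TR/html4/sgml/dtd.html#ContentType": "application/pdf",
})
```

The client never needs to know that this particular manifest calls the content type content-type: it only matches on the IRI, and the manifest-specific name is read from the name attribute.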

Documenting the process for humans

A client which implements IRI recognition could automate the filling of the needed metadata. However, a client which does not implement this feature, or which does not know the specific IRIs provided by the manifest, could not start this process alone.

To help the client build a GUI for its users, the manifest could also contain human-oriented documentation.

<cid:label xml:lang="en">CID manifest of the ECM service of my Example Company</cid:label>
<cid:process is="http://schema.org/SendAction">
    <cid:label xml:lang="en">Send a new file</cid:label>
    <cid:doc xml:lang="en">Send a new file into the ECM platform. This action allows you to send directly your files without using a classical webupload.</cid:doc>
    <cid:meta name="file-name" is="http://purl.org/dc/elements/1.1/title">
        <cid:label xml:lang="en">File name</cid:label>
    </cid:meta>
    <cid:meta name="content-type" is="http://www.w3.org/TR/html4/sgml/dtd.html#ContentType">
        <cid:label xml:lang="en">Content Type</cid:label>
        <cid:doc xml:lang="en">The type of the uploaded file written following the MIME standard (RFC 2045). For example application/xml</cid:doc>
    </cid:meta>
    <cid:meta name="internal-id" is="http://schema.org/productID">
        <cid:label xml:lang="en">Internal identifier</cid:label>
    </cid:meta>
    <cid:meta name="public-url" is="http://schema.org/URL">
        <cid:label xml:lang="en">Public URL</cid:label>
        <cid:doc xml:lang="en">The URL where the uploaded content could be retrieved.</cid:doc>
    </cid:meta>
    <cid:exchange url="http://example.com/checkAuth" needMetas="content-type" required="false" is="http://schema.org/AuthorizeAction"/>
    <cid:upload url="http://example.com/upload" needMetas="content-type" useMetas="file-name" required="true" returnMetas="internal-id"/>
    <cid:interact url="http://example.com/interact" needMetas="internal-id" returnMetas="public-url" required="true"/>
</cid:process>
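A client building a form for the user could pick the label that matches the user's language. The Python sketch below shows one way to do so. It assumes that an element may carry several labels distinguished by xml:lang; the French label is added purely for illustration and does not appear in the manifest above, and the helper name pick_label is hypothetical.

```python
import xml.etree.ElementTree as ET

CID = "{http://www.cid-protocol.org/schema/v1/core}"
# The xml: prefix is always bound to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

META_XML = """
<cid:meta xmlns:cid="http://www.cid-protocol.org/schema/v1/core" name="content-type">
    <cid:label xml:lang="en">Content Type</cid:label>
    <cid:label xml:lang="fr">Type de contenu</cid:label>
</cid:meta>
"""


def pick_label(element, lang, fallback="en"):
    """Return the label in the requested language, else a fallback."""
    labels = {label.get(XML_LANG): label.text
              for label in element.findall(CID + "label")}
    # Last resort: show the technical name of the element.
    return labels.get(lang) or labels.get(fallback) or element.get("name")


meta = ET.fromstring(META_XML)
```

Falling back to the name attribute keeps the form usable even when the manifest provides no human-oriented documentation at all.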

Defining the transport

The second part of the manifest is dedicated to the transport. Theoretically, it is possible to define any application-layer protocol of the OSI model (HTTP, FTP, SMTP, etc.). However, these specifications define only a generic web transport, based on HTTP requests. The transport definition also includes the authentication scheme of the process. In this example, the server accepts requests using HTTP basic authentication (see RFC 2617 for more details).

A manifest which defines a kind of step (interact, exchange, upload) must also define the transport modalities of this step. It is possible to define several possibilities for each kind of step. A transport-generic server should include:

<cid:transports>
    <cid:webTransport>
        <cid:authentications>
            <cid:basicHttp/>
        </cid:authentications>

        <cid:webExchange>
            <request method="GET" properties="header queryString"/>
            <request method="POST;application/x-www-form-urlencoded" properties="queryString header post"/>
            <request method="POST;multipart/form-data" properties="header queryString post"/>
        </cid:webExchange>

        <cid:webInteract>
            <request method="GET" properties="header queryString"/>
            <request method="POST;application/x-www-form-urlencoded" properties="queryString header post"/>
            <request method="POST;multipart/form-data" properties="header queryString post"/>
        </cid:webInteract>

        <cid:webUpload>
            <request method="PUT" properties="header queryString"/>
            <request method="POST" properties="header queryString"/>
            <request method="POST;multipart/form-data" properties="header queryString post"/>
        </cid:webUpload>
    </cid:webTransport>
</cid:transports>

When a form is used, the HTTP method POST must be followed by the content type of the form. The properties attribute lists the places where metadata can be stored:

  • header means that the server accepts the metadata in the header of the HTTP request.

    A "text/plain" value of the "content-type" meta should be inserted as "content-type":"text/plain" inside the HTTP header.

  • queryString means that the server accepts the metadata in the URL as query string (see RFC 3986 section 3.4 for more details).

    A "text/plain" value of the "content-type" meta should be inserted to the URL as "(?|&)content-type=text/plain".

  • post means that the server accepts the metadata in the form of a POST request.

    The value is stored in a field which shares its name with the meta ("content-type" in this example).
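The three storage locations above can be sketched in a few lines of Python. This is an illustrative helper under stated assumptions, not a reference implementation: the function names parse_method and place_metas are hypothetical, and the sketch only prepares the URL, headers and form fields without sending any request.

```python
from urllib.parse import urlencode


def parse_method(method_attr):
    """Split a method attribute such as "POST;multipart/form-data".

    Returns (http_method, form_content_type); the second item is None
    when no form content type follows the method.
    """
    method, _, form_type = method_attr.partition(";")
    return method, form_type or None


def place_metas(url, metas, where):
    """Return (url, headers, form_fields) with metas stored in one location.

    where is one of the values of the properties attribute:
    "header", "queryString" or "post".
    """
    headers, fields = {}, {}
    if where == "header":
        headers.update(metas)
    elif where == "queryString":
        if metas:
            # "?" starts the query string, "&" extends an existing one.
            separator = "&" if "?" in url else "?"
            # "/" may appear unescaped in a query (RFC 3986 section 3.4).
            url = url + separator + urlencode(metas, safe="/")
    elif where == "post":
        fields.update(metas)
    else:
        raise ValueError("unknown property: " + where)
    return url, headers, fields


metas = {"content-type": "text/plain"}
url, headers, fields = place_metas("http://example.com/upload", metas, "queryString")
```

Keeping the placement separate from the actual HTTP call lets a client choose, for each request element of the manifest, whichever location it supports best.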

Implementation details

The complete manifest is shown below. It must be exposed by the server and accessible through a single HTTP GET request without any authentication.

<?xml version="1.0" encoding="UTF-8"?>
<cid:manifest xmlns:cid="http://www.cid-protocol.org/schema/v1/core">
    <cid:label xml:lang="en">CID manifest of the ECM service of my Example Company</cid:label>
    <cid:process is="http://schema.org/SendAction">
        <cid:label xml:lang="en">Send a new file</cid:label>
        <cid:doc xml:lang="en">Send a new file into the ECM platform. This action allows you to send directly your files
            without using a classical webupload.
        </cid:doc>
        <cid:meta name="file-name" is="http://purl.org/dc/elements/1.1/title">
            <cid:label xml:lang="en">File name</cid:label>
        </cid:meta>
        <cid:meta name="content-type" is="http://www.w3.org/TR/html4/sgml/dtd.html#ContentType">
            <cid:label xml:lang="en">Content Type</cid:label>
            <cid:doc xml:lang="en">The type of the uploaded file written following the MIME standard (RFC 2045). For
                example application/xml
            </cid:doc>
        </cid:meta>
        <cid:meta name="internal-id" is="http://schema.org/productID">
            <cid:label xml:lang="en">Internal identifier</cid:label>
        </cid:meta>
        <cid:meta name="public-url" is="http://schema.org/URL">
            <cid:label xml:lang="en">Public URL</cid:label>
            <cid:doc xml:lang="en">The URL where the uploaded content could be retrieved.</cid:doc>
        </cid:meta>
        <cid:exchange url="http://example.com/checkAuth" needMetas="content-type" required="false"
                      is="http://schema.org/AuthorizeAction"/>
        <cid:upload url="http://example.com/upload" needMetas="content-type" useMetas="file-name" required="true"
                    returnMetas="internal-id"/>
        <cid:interact url="http://example.com/interact" needMetas="internal-id" returnMetas="public-url"
                      required="true"/>
    </cid:process>

    <cid:transports>
        <cid:webTransport>
            <cid:authentications>
                <cid:basicHttp/>
            </cid:authentications>

            <cid:webExchange>
                <request method="GET" properties="header queryString"/>
                <request method="POST;application/x-www-form-urlencoded" properties="queryString header post"/>
                <request method="POST;multipart/form-data" properties="header queryString post"/>
            </cid:webExchange>

            <cid:webInteract>
                <request method="GET" properties="header queryString"/>
                <request method="POST;application/x-www-form-urlencoded" properties="queryString header post"/>
                <request method="POST;multipart/form-data" properties="header queryString post"/>
            </cid:webInteract>

            <cid:webUpload>
                <request method="PUT" properties="header queryString"/>
                <request method="POST" properties="header queryString"/>
                <request method="POST;multipart/form-data" properties="header queryString post"/>
            </cid:webUpload>
        </cid:webTransport>
    </cid:transports>
</cid:manifest>

The client must download the manifest to analyze and execute the process.

  1. The client could begin the transaction with an authenticated exchange request containing the content-type meta. It could use one of the defined transport possibilities:

    • An HTTP GET method with metadata stored in the header of the request or in the URL as query string.

    • An HTTP POST method containing a URL-encoded form. The metadata could be stored in the header, in the URL as query string or in the form.

    • An HTTP POST method containing a multipart form. The metadata could be stored in the header, in the URL as query string or in the form.

    The server must support every configuration declared in the manifest.

  2. The client must then upload the document in a single authenticated request containing the content-type meta and optionally the file name. It could use one of the defined transport possibilities:

    • An HTTP PUT method with the file in the body and the meta in the header of the request or in the URL as query string.

    • An HTTP POST method with the file in the body and the meta in the header of the request or in the URL as query string.

    • An HTTP POST method with the file in a cidContent field and the meta in the form, in the header of the request or in the URL as query string.

    Note that it is not possible to send binary files in a URL-encoded form.

    The server must support every configuration declared in the manifest.

    The server must include the returned meta in the response. With a web transport and for the exchange and upload steps, a returned meta must be inserted in a JavaScript object in the body of the response (for example: {"internal-id":"0X001242"}).

  3. The client must then begin an interaction between the user and the server. The client must send the first request, which must be authenticated and which must contain the internal-id meta. The client must support one of the following transport methods for the first request:

    • An HTTP GET method with metadata stored in the header of the request or in the URL as query string.

    • An HTTP POST method containing a URL-encoded form. The metadata could be stored in the header, in the URL as query string or in the form.

    • An HTTP POST method containing a multipart form. The metadata could be stored in the header, in the URL as query string or in the form.

    The server must support every configuration declared in the manifest.

    The server must include an HTML page in the body of the response. The client must show this page in a web frame. The user could now interact directly with the server through this web frame.

    At the end of the interaction (for example, when the user clicks a "validation" button), the HTML page must send a custom event called "cid-interaction-ended". The body of this event must embed a JavaScript object containing the returned metadata (the public URL). The web page could also send a custom event called "cid-interaction-aborted" to signal the failure of the interaction. This second event does not require any body.
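The handling of returned metas at the end of the exchange and upload steps (step 2 above) can be sketched as follows in Python. The sketch assumes that the response body can be read as JSON, as in the {"internal-id":"0X001242"} example; the function name read_returned_metas is hypothetical, and how the body is obtained from the HTTP response is left out.

```python
import json


def read_returned_metas(body, expected):
    """Extract the metas listed in returnMetas from a response body.

    body is the text of the HTTP response; expected is the list of meta
    names that the step declares in its returnMetas attribute.
    """
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("response body is not an object")
    missing = [name for name in expected if name not in data]
    if missing:
        raise ValueError("server did not return: " + ", ".join(missing))
    # Keep only the declared metas; ignore anything else in the object.
    return {name: data[name] for name in expected}


returned = read_returned_metas('{"internal-id":"0X001242"}', ["internal-id"])
```

The value obtained for internal-id is what the client sends back in the first request of the interact step.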