Key: OFFICE-1826
Type: Bug
Status: Applied
Resolution: Fixed
Priority: Major
Assignee: Svante Schubert
Reporter: Robert Weir
Watchers: 0

OASIS Open Document Format for Office Applications (OpenDocument) TC

ISO/IEC JTC 1/SC 34 N 1078 : DEFECT REPORT NUMBER JP2-35

Created: 28/May/09 04:24 PM   Updated: 19/Nov/10 08:54 PM
Component/s: Packaging
Affects Version/s: ODF 1.0 (second edition), ODF 1.1, ODF 1.2
Fix Version/s: ODF 1.0 Errata CD 5

Environment: This issue is applicable to section 17.5 to various degrees in all versions of ODF, and to the corresponding section of ODF 1.2 Part 3.

Resolution:
17.5

Delete the paragraph starting with:
"A relative-path reference"

Delete the paragraph starting with:
"All other kinds"

Insert after the list item that ends "from the file system or another package." the paragraph:
"A *relative-path* reference (as defined in §4.2 of [RFC3986]) that occurs in a file that is contained in a package has to be resolved exactly as it would be resolved if the whole package gets unzipped into a directory at its current location. The base URI for resolving relative-path references is the one that has to be used to retrieve the (unzipped) file that contains the relative-path reference."

Insert after the final paragraph, the following note:
"Note: URI references that are not a relative-path reference do not need any special processing. This especially means that absolute-paths do not reference files inside the package, but within the hierarchy the package is contained in, for instance the file system."

Appendix B
Delete the reference entry for [RFC2396] and insert this reference for [RFC3986]:
[RFC3986] T. Berners-Lee, R. Fielding, L. Masinter, Uniform Resource Identifier (URI): Generic Syntax, http://www.ietf.org/rfc/rfc3986.txt, IETF, January 2005.
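
The 17.5 rule in this resolution can be illustrated with a minimal sketch. It assumes a package reachable at a hypothetical URL (`http://example.org/docs/report.odt`, not taken from the issue) and uses Python's standard `urllib.parse.urljoin` as the generic RFC 3986 resolver. The base URI of a file in the package is formed as if the package were a directory.

```python
from urllib.parse import urljoin

# Hypothetical package location; an assumption made only for illustration.
PACKAGE_URI = "http://example.org/docs/report.odt"

# Base URI of a file inside the package, formed as if the package had been
# unzipped into a directory at its current location.
base = PACKAGE_URI + "/content.xml"

# Relative-path references resolve against the containing file's base URI.
inside = urljoin(base, "Pictures/logo.png")
sibling = urljoin(base, "../other.odt")

# An absolute-path reference gets no special processing: it points into the
# hierarchy that contains the package, not into the package itself.
outside = urljoin(base, "/Pictures/logo.png")

print(inside)
print(sibling)
print(outside)
```

Only the first reference stays inside the package. The second leaves it for a sibling document, and the third ignores the package entirely.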


Description
Transcribed from http://www.itscj.ipsj.or.jp/sc34/open/1078.htm

Original author: "MURATA Makoto (FAMILY Given)" <eb2m-mrt@asahi-net.or.jp>
DEFECT REPORT NUMBER JP2-35

QUALIFIER clarification required

REFERENCES IN DOCUMENT Clause 17.5

NATURE OF DEFECT The last paragraph in 17.5 should be a non-normative note, since the behaviour of absolute IRI references is specified in RFC 3986 and RFC 3987.

SOLUTION PROPOSED BY THE SUBMITTER Rewrite this paragraph as a note.


Comments
Patrick Durusau added a comment - 06/Jul/09 01:54 PM
I'm ok with the suggested change, but isn't the last sentence of this paragraph normative for ODF?

Michael Brauer added a comment - 16/Jul/09 06:25 AM - edited
I think we may indeed turn this paragraph into a note. But we may combine the resolution of this issue with the other defect reports we got for 17.5.

The paragraphs in question read:

-----
A relative-path reference (as described in §6.5 of [RFC3987]) that occurs in a file that is contained in a package has to be resolved exactly as it would be resolved if the whole package gets unzipped into a directory at its current location. The base IRI for resolving relative-path references is the one that has to be used to retrieve the (unzipped) file that contains the relative-path reference.
All other kinds of IRI references, namely the ones that start with a protocol (like http:), an authority (i.e., //) or an absolute-path (i.e., /) do not need any special processing. This especially means that absolute-paths do not reference files inside the package, but within the hierarchy the package is contained in, for instance the file system. IRI references inside a package may leave the package, but once they have left the package, they never can return into the package or another one.
-----

One issue is that the term "relative-path reference" is not formally defined by §6.5 of [RFC3987], but by §4.2 of [RFC3986], which we reference in the ODF 1.0 OASIS standard. As a result, "relative-path" is mostly interpreted as "relative URI", which it isn't.
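
To make the distinction concrete: §4.2 of RFC 3986 names three kinds of relative reference (network-path, absolute-path and relative-path), so "relative-path reference" is narrower than "relative URI". The sketch below classifies a reference by syntax alone. The helper name `classify_reference` and its return strings are hypothetical, not taken from ODF or the RFC.

```python
import re

# Scheme syntax per RFC 3986 section 3.1:
#   ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) followed by ":"
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def classify_reference(ref: str) -> str:
    """Name the RFC 3986 kind of a URI reference (hypothetical helper)."""
    if _SCHEME.match(ref):
        return "URI with scheme"
    if ref.startswith("//"):
        return "network-path reference"
    if ref.startswith("/"):
        return "absolute-path reference"
    # Everything else, including "../x" and "#frag", is a relative-path reference.
    return "relative-path reference"


for ref in ("Pictures/logo.png", "../other.odt", "/logo.png",
            "//example.org/x", "http://example.org/x"):
    print(ref, "->", classify_reference(ref))
```

Under the paragraph in question, only the last category returned by this helper is subject to the package-specific base URI rule.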

My suggestion how to resolve this was some time ago:
----
A relative-path reference (as defined in §4.2 of [RFC3986], except
that it may contain the additional characters that are allowed in IRI
references [RFC3987]) that occurs in a file that is contained in a
package has to be resolved exactly as it would be resolved if the whole
package gets unzipped into a directory at its current location. The base
IRI for resolving relative-path references is the one that has to be
used to retrieve the (unzipped) file that contains the relative-path
reference.

Every IRI reference that is not a relative-path reference does not
need any special processing. This especially means that absolute-paths
do not reference files inside the package, but within the hierarchy the
package is contained in, for instance the file system. IRI references
inside a package may leave the package, but once they have left the
package, they never can return into the package or another one.
----

As of today, I would keep only the first two sentences of this paragraph as a note.
The intention of the last sentence was to clarify that ODF does not specify how to reference files in a package from "the outside", but this actually seems to cause more confusion than it resolves.

I further suggest that we replace "IRI" with "URI/IRI". This solves the issue that ODF 1.0 uses the term URI, while ISO 26300 says IRI.

My proposed resolution therefore is:

----
A *relative-path* reference (as defined in §4.2 of [RFC3986], except
that it may contain the additional characters that are allowed in IRI
references [RFC3987]) that occurs in a file that is contained in a
package has to be resolved exactly as it would be resolved if the whole
package gets unzipped into a directory at its current location. The base
URI/IRI for resolving relative-path references is the one that has to be
used to retrieve the (unzipped) file that contains the relative-path
reference.

*Note:* URI/IRI references that are not a relative-path reference do not
need any special processing. This especially means that absolute-paths
do not reference files inside the package, but within the hierarchy the
package is contained in, for instance the file system.
-----

Michael Brauer added a comment - 16/Jul/09 06:28 AM
Marked as resolved.

Dennis Hamilton added a comment - 19/Jul/09 10:03 PM
I DON'T THINK THIS RESOLUTION IS COMPLETE ENOUGH

1. How does the URI encoding actually show up in the manifest:full-path attribute that has such path segments?

2. How does the URI encoding actually show up in the Zip content item name (although I presume it and the manifest:full-path value are the same, that's never said anywhere).

3. What directory system are we to assume for normative and interoperability purposes? (I am also concerned that the package might not have a definite file-system location depending on how it is delivered to a consumer.) How are various URI/IRI encodings expected to be handled or not? (I also wonder about case sensitivity, allowance of various special characters including spaces, etc.) Finally, the Zip specifications that I've looked at don't appear to specify how one maps a Zip content item name to a file-system name in a hierarchical file system. (I'd like to be mistaken about that.)

4. There are specific passages in RFC3986 (not in 4.2) about "path segments" that are in various technologies (I assume a Zip package counts as one of those) and that how those path segments are processed must be specified in addition to what RFC3986 provides.

My concern is that we need more than a patch, at least for ODF 1.2, and that we can then look at an appropriate way to make errata for 1.0 through 1.1.

Michael Brauer added a comment - 20/Jul/09 02:34 AM
Re #1: manifest:full-path does not contain an IRI or URI. It contains a path name. We may want to clarify this, but this is a different issue. (Actually, neither does its description contain the term IRI/URI, nor does it have the anyURI datatype.)
Re #2: URI encodings do not show up as Zip content item names (see #1).
Re #3: Zip files aren't really extracted in order to resolve URIs (or IRIs). URIs are only constructed and processed on a syntactical level. Insofar, we don't need to define a "directory system", nor does a file system have to exist. All that is said is that URI/IRI references that have a particular syntax are considered to be references within the package.
Dennis: Actually, I think the intention of the paragraph in question is clear. If you have a suggestion how to phrase this differently, then I'm glad to discuss this proposal.
Re: #04: To which passages of RFC3986 are you referring?
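
The purely syntactic processing described in Re #3 can be sketched as follows: no Zip extraction and no file system, only URI construction and a prefix comparison. The function `package_entry` is hypothetical. Percent-decoding the result with `unquote` is an assumption, since how URI encodings map to package entry names is exactly the point this thread leaves open.

```python
from typing import Optional
from urllib.parse import unquote, urljoin, urlsplit


def package_entry(package_uri: str, containing_file: str, ref: str) -> Optional[str]:
    """Return the package-internal path that `ref` resolves to,
    or None if the reference leaves the package (hypothetical helper)."""
    # Treat the package as a directory: its root URI ends with "/".
    root = package_uri.rstrip("/") + "/"
    base = urljoin(root, containing_file)   # base URI of the containing file
    target = urlsplit(urljoin(base, ref))   # generic RFC 3986 resolution
    r = urlsplit(root)
    if (target.scheme, target.netloc) != (r.scheme, r.netloc):
        return None
    if not target.path.startswith(r.path):
        return None
    # Assumption: entry names are the percent-decoded path below the root.
    return unquote(target.path[len(r.path):])


pkg = "http://example.org/docs/report.odt"  # hypothetical location
print(package_entry(pkg, "content.xml", "Pictures/logo.png"))
print(package_entry(pkg, "Object1/content.xml", "../styles.xml"))
print(package_entry(pkg, "content.xml", "/logo.png"))
```

A result of `None` corresponds to a reference that needs no package-specific handling.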

Patrick Durusau added a comment - 28/Sep/09 10:07 AM
Michael, this issue is now in part 3.

Michael Brauer added a comment - 29/Sep/09 02:44 AM
This issue has two targets, ODF 1.0 errata and ODF 1.2. In ODF 1.2 part 3, the full section regarding IRIs has been rewritten. The issue is not applicable to the revised text.

For ODF 1.0 errata, the proposal has been included already in the draft response to N1078 (except that a paragraph sign appears as a square).

Patrick, I'm removing the ODF 1.2 target from the issue and assign it back to you. I believe you can advance the status of the issue to applied.

Patrick Durusau added a comment - 29/Sep/09 09:09 AM
Not applicable to ODF 1.2 text.

Dennis Hamilton added a comment - 05/Oct/09 06:44 PM
PROPOSE-DISCUSS

Michael, I understand that the manifest:full-path attribute is not an IRI or URI. However, I assume that the IRI is somehow resolved to the names in the Zip entries for individual files of the package when an IRI is used as a relative reference. The question is, how are the special IRI features handled in this context, one that is unique to ODF.

So, if there is IRI encoding in the IRI, the question is, how does it map to the actual name in a manifest:full-path and also to the name in the Zip structure for the file having the same name (or matching initial part, for a manifest:full-path ending in "/")

Re #1: manifest:full-path does not contain an IRI or URI. It contains a path name. We may want to clarify this, but this is a different issue. (Actually, neither does its description contain the term IRI/URI, nor does it have the anyURI datatype.)
 *** But we have to be able to figure out the right one given an IRI/URI in a reference from within one of the files.
Re #2: URI encodings do not show up as Zip content items names (see #1).
 *** That is not clear. If I am resolving a relative IRI into the same package, are the IRI encodings and URI encodings to be processed first? Maybe so. But as ODF is the authority on whatever the protocol is for path segments that refer into the package, I think we are supposed to say what the game is.
Re #3: Zip files aren't really extracted in order to resolve URIs (or IRIs). URIs are only constructed and processed on a syntactical level. Insofar, we don't need to define a "directory system", nor does a file system have to exist. All that is said is that URI/IRI references that have a particular syntax are considered to be references within the package.
  *** I strongly agree with this. However, the definition is given in terms "as if the Zip file had been extracted into a file system" and something has to be said about what that as-if is.
Dennis: Actually, I think the intention of the paragraph in question is clear. If you have a suggestion how to phrase this differently, then I'm glad to discuss this proposal.
Re: #04: To which passages of RFC3986 are you referring?
 *** I have to look this up. There is a passage in the specifications about how special applications need to define what their rules are for path segments and resource names.

Dennis Hamilton added a comment - 05/Oct/09 06:46 PM
Added the component and expanded the affected versions.

Dennis Hamilton added a comment - 23/Mar/10 10:46 PM
THIS DEFECT DOES NOT APPLY TO THE ODF 1.0 OASIS STANDARD. IT CANNOT BE CARRIED OUT AS WRITTEN. (There is no reference to [RFC3987] in 17.5 of ODF 1.0, and there are no bibliographic references to [RFC3986] and [RFC3987].)

IT CAN'T BE FIXED WITH AN ERRATUM AGAINST ODF 1.0. I assert that, furthermore, this is a substantive change and not merely an erratum in the case of ODF 1.0.

(I also don't think the remedy is sufficient for any other version, but that is a different problem.)

I recommend that we leave the action from the existing Approved Errata.

I think the way to gain alignment on this is *after* ODF 1.1 transposition and amendment occurs. We can then bundle this in an Errata and Corrigenda against ODF 1.1 and the aligned IS 26300:2006/AMD1.

Before that, I think we need to nail this for ODF 1.2 Part 3 and then see how to retrofit it appropriately to ODF 1.1 and the amended IS 26300.




Svante Schubert added a comment - 19/May/10 02:40 PM
@Dennis:
True words. I mistakenly based the patch on the ODF 1.0 2nd version, but have adjusted the patch so that it can be applied to both: ODF 1.0 and the ISO version.

Dennis Hamilton added a comment - 19/May/10 05:06 PM
You still have to do something about the fact that the ODF 1.0 standard does not make a normative reference to [RFC3986] or [RFC3987]. Or did you not update the resolution here? In that case, I should wait to see your CD04-rev01.

Furthermore, I don't think this qualifies as a non-substantive change for the OASIS ODF 1.0 Standard.

This part has been worked over rather heavily in ODF 1.2, so we need to be more careful here. It would be interesting to see what ODF 1.1 says in its section 17.5.


Svante Schubert added a comment - 19/May/10 07:21 PM
@Dennis: The references to [RFC3986] or [RFC3987] are to be made explicitly by whoever integrates the errata into the spec (as with internal section references, they cannot easily be added to the errata itself; any suggestion is welcome, as always).
Since RFC names are unique, the references are unambiguous. In addition, these are the RFCs for URIs and IRIs, not technologies that would be new features of the standard.
I see no perfect solution to solve this defect, but I prefer the integration.

ODF 1.1 is based on ODF 1.0 2nd edition and would take over the given patch.

OpenDocument-v1.2-part3-cd01-rev06.odt defines Usage of IRIs Within Packages at chapter 3.7. No contradiction on first sight.

Dennis Hamilton added a comment - 19/May/10 08:37 PM
Svante, it is my considered opinion that there is no IRI provision in OASIS Standard 1.0 and, in fact, section 17.5 of ODF 1.0 refers explicitly to IETF Draft Standard [RFC2396], approved in 1998. The IETF Proposed Standard [RFC3987] on IRIs was not published until January 2005.

To introduce a dependency on [RFC3987] is a substantive change and inappropriate for an Errata against OASIS Standard 1.0. [RFC3986] on URIs was not approved as an IETF Standard until January 2005 and it was not referenced in ODF 1.0 (although [RFC3986] does obsolete [RFC2396]). IETF Proposed Standard [RFC3987] on IRIs was not approved as an IETF Standard until January 2005 and it never existed before.

The change to support IRIs was made in ODF 1.0 ed.2 (and IS 26300:2006) but never approved in an OASIS Standard until ODF 1.1 (and, for reasons I don't understand, there is no IRI reference in ODF 1.2 Part 3 at the moment - only [RFC3986] is referenced).

When we do Errata for the OASIS ODF 1.1 Standard as part of the alignment of IS 26300 and ODF 1.1, we can resolve this defect report. It is inappropriate to apply that IS 26300 defect against the OASIS ODF 1.0 Standard.

Svante Schubert added a comment - 19/May/10 08:49 PM
Now I finally understand what your objections were, they only refer to the IRI part.
I changed the resolution to refer only to URI and removed all naming of IRI.
ODF 1.1 should later expand this patch by using IRIs as well.

Dennis Hamilton added a comment - 15/Jul/10 10:41 PM
Svante,

You got it.

We need to be sure to use the complete "relative-path reference" term in the context of [RFC3986]. By the way, notice that for a relative-path reference, the first segment can't contain a ":" in its name, so that is the one case where use of a leading "./" is relevant to avoid confusing the ":" with the delimiter after a scheme name. (Note that this has nothing to do with the manifest:full-path value since that is neither a URI nor a URI reference.)
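
The colon case can be shown with a small sketch using Python's standard `urljoin`. The base URL and the file name `fig:1.png` are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base URI of a file inside a package.
base = "http://example.org/docs/report.odt/content.xml"

# Without "./", the text before the first ":" is parsed as a scheme name,
# so the reference is treated as a URI in its own right and is not resolved
# against the base.
as_scheme = urljoin(base, "fig:1.png")

# With a leading "./", the first segment is "." and the colon is harmless:
# the reference is a relative-path reference and resolves inside the package.
as_path = urljoin(base, "./fig:1.png")

print(as_scheme)
print(as_path)
```

Only the second form yields a URI below the package's base.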

Dennis Hamilton added a comment - 19/Nov/10 08:42 PM - edited
Errata 02 - ODF 1.2 Reconciliation:
  There is a completely different approach to providing resolution of IRIs and relative references to other files of the same package. No reconciliation is required.
  The Errata 02 Appendix B reference to [RFC3986] is already provided in Part 3 of ODF 1.2.