What is USFX?
What is different between USFM and USFX?
Why create yet another Bible file format?
What is the USFX philosophy?
How do USFM, USFX, and OSIS differ?
What are the USFX tags?
Where is the USFX schema?
Copyright and Permissions
Who should I contact with comments on USFX?
Unified Scripture Format XML (USFX) is derived from USFM, a file format for publishing and interchanging basic Scripture texts in multiple languages. It is XML, and has an XML schema. It is not intended to be used by itself for all aspects of Bible layout and publishing, but just for representation of the Scripture text itself, and a small amount of accompanying material. USFX exists primarily to bring the advantages of XML to USFM, with minimal additional changes. USFX can be quickly and easily converted to and from USFM. Like USFM, USFX is not designed to be used for general books, theological works, dictionaries, or any other type of data. USFX is not intended to totally replace other Bible formats, but since it is XML, it can be converted to other formats with tools like XSLT.
USFX is derived from USFM, but it is not USFM. It can, however, encode everything in a USFM document.
The most obvious difference between USFM and USFX is that USFM
is based on backslash codes, whereas USFX is XML. Another difference is that
elements representing things like the words of Jesus Christ and Old Testament
quotations in the New Testament must be explicitly closed with their own
corresponding XML closing tags. Furthermore, these elements may be nested in
any way that can actually occur in Scripture, provided that the resulting XML
is well-formed. Unlike the way Paratext interprets USFM, such elements are not
assumed to be closed when a verse marker is encountered or when another style
starts.
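The closing and nesting rule above can be illustrated with a small well-formedness check. This is only a sketch: the element names `p`, `v`, `wj`, and `qt` are assumptions that mirror the USFM markers of the same names, and they have not been checked against the USFX schema. The check uses Python's standard XML parser and tests well-formedness only, not schema validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments; the tag names are illustrative assumptions.
# A quotation nested inside words of Jesus, each explicitly closed.
nested = (
    '<p><v id="4"/><wj>It is written, '
    "<qt>Man shall not live by bread alone</qt>.</wj></p>"
)

# Overlapping elements: <wj> is closed while <qt> is still open.
overlapping = (
    "<p><wj>It is written, <qt>Man shall not live</wj> "
    "by bread alone</qt></p>"
)

# <wj> is never closed; a verse marker does not close it implicitly.
unclosed = '<p><wj>It is written<v id="5"/></p>'

for name, fragment in [
    ("nested", nested),
    ("overlapping", overlapping),
    ("unclosed", unclosed),
]:
    print(name, is_well_formed(fragment))
```

Only the first fragment parses. The other two show the kind of implicit closing that a USFM reader might tolerate but that XML does not.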
“Necessity is the mother of invention.” The need for USFX was first felt in the process of converting Scriptures from one “standard” format to another, and in editing some Scriptures in the process of Bible translation work. The first application is to embed a simple XML schema in a Microsoft Word 2003 (or later) XML document that is both easy to work with in Microsoft Word and easy to convert back to USFM. There are several other XML Bible schemas in existence that I’m aware of, but these don’t map very cleanly to USFM. USFX is very easy to convert to and from USFM, because it is based on USFM. It is also much simpler to embed in WordML than complex schemas like OSIS, saving me a great deal of time, and making some applications possible that would otherwise be impossible.
More on the philosophy of USFX and how it compares with other Scripture file formats is available here.
USFM is an attempt to unify the many variations in usage of backslash (\)
codes to mark Scripture texts. It is not XML. There are many Scriptures encoded
in some form close to this format, mostly for minority languages. USFM is
preferable to the many similar, but slightly different, implementations of SFM
codes used by different organizations to represent Scriptures, because it is
well thought out, and because it is easier to support one standard way to mark
Scripture files with backslash codes than many ways. This makes these files
more portable among organizations and branches, and makes software support for
them easier, less error-prone, and less costly. USFM is currently the format
that I recommend for practical Bible translation work.
USFX is primarily an expression of what USFM would look like as proper XML
instead of a set of backslash codes. Every USFM backslash code has a
corresponding USFX XML tag. USFX is more verbose than USFM, as that is the
nature of XML, but it is easier to parse with XML software libraries and XSL
transformations. Because USFX and USFM are so similar, it is very easy to
convert between the two.
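As a rough illustration of this one-to-one correspondence, the sketch below rewrites USFM character-style markers such as `\wj ...\wj*` as XML elements of the same name. It is a toy under stated assumptions, not a real converter: the function name `char_styles_to_xml` is hypothetical, the XML tag is assumed to be spelled like the USFM marker, and paragraph markers, chapter and verse markers, footnotes, and nested `\+` markers are not handled.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Closing marker such as \wj* and opening marker such as "\wj ".
_CLOSE = re.compile(r"\\(\w+)\*")
_OPEN = re.compile(r"\\(\w+) ")


def char_styles_to_xml(usfm: str, wrapper: str = "p") -> str:
    """Rewrite USFM character-style markers as same-named XML elements.

    Assumes every opening marker has an explicit closing marker and that
    the XML tag name equals the marker name (an illustrative assumption).
    """
    text = escape(usfm)                # escape &, <, > in the Scripture text
    text = _CLOSE.sub(r"</\1>", text)  # \wj*  ->  </wj>
    text = _OPEN.sub(r"<\1>", text)    # "\wj " ->  <wj>
    return f"<{wrapper}>{text}</{wrapper}>"


usfm = r"\wj It is written, \qt Man shall not live by bread alone\qt*.\wj*"
xml = char_styles_to_xml(usfm)
print(xml)

# The result can then be handled with ordinary XML tooling.
root = ET.fromstring(xml)
print("".join(root.itertext()))
```

Once the text is XML, a standard parser can walk it directly, which is the practical benefit described above. The reverse direction would map each element back to its backslash marker in the same way.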
OSIS is another
proposed XML Scripture interchange standard. The OSIS XML schema and
documentation view Scriptures differently than USFM and USFX, so a fully
automatic and lossless transformation between the two is currently not possible.
Not only are the metadata sections of OSIS different, but to be fully compliant
with the OSIS standard, some punctuation in the Bible text itself must be
converted to markup in such a way that it cannot be recovered without language-
and style-dependent processing. This conversion is language-dependent and
labor-intensive. Because of differences in the kinds of things that are encoded
and the ways they are encoded, the current version of OSIS is not suitable for
many applications that USFM works well for.
It is here. Documentation is available. Currently the USFX schema is not as restrictive as it might be. Validation of a USFX document against the schema is necessary, but not sufficient, to ensure compliance with the USFX standard.
The USFX Schema is copyright © 2005-2006 SIL International and EBT. It is released under the GNU Lesser General Public License or the Common Public License, as explained in LICENSING.txt.

Comments on USFX should be directed to Kahunapule Michael Johnson. You may use his web contact form or email Michael Paul Johnson at the address you get by replacing the spaces in that name with underscores and appending an at sign then sil.org.