One method of linebreaking that works reasonably well is sometimes referred to as a "best-fit" algorithm. It works by computing a "penalty" for each potential break point on a line. The break point with the smallest penalty is chosen and the algorithm then works on the next line. Four useful factors in a penalty calculation are:
- How much of the line width (after subtracting the indent) is unused? The more unused, the higher the penalty.
- How deeply nested is the break point in the expression tree? The expression tree's depth is roughly similar to the nesting depth of mrows. The more deeply nested the break point, the higher the penalty.
- Does a linebreak here make layout of the next line difficult? If the next line is not the last line and if the indenting style uses information about the linebreak point to determine how much to indent, then the amount of room left for linebreaking on the next line must be considered; i.e., linebreaks that leave very little room to draw the next line result in a higher penalty.
- Whether linebreak has been specified: nobreak effectively sets the penalty to infinity, badbreak increases the penalty, goodbreak decreases the penalty, and newline effectively sets the penalty to 0.
This algorithm takes time proportional to the number of token elements times the number of lines.
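One way to make the penalty factors above concrete is sketched below. The additive form, the weights α, β, γ, and the constants c_bad and c_good are assumptions made purely for illustration; the text does not prescribe how the factors are combined.

```latex
% Illustrative sketch only; requires amsmath for the cases environment.
%   W     = line width available after subtracting the indent
%   u(b)  = width left unused if the line is broken at candidate b
%   d(b)  = nesting depth of b (roughly, the number of enclosing mrows)
%   n(b)  = how little room a break at b leaves for the next line
%   \alpha, \beta, \gamma, c_{bad}, c_{good} = assumed tuning weights
\[
B(b) = \alpha\,\frac{u(b)}{W} + \beta\,d(b) + \gamma\,n(b)
\]
\[
P(b) =
\begin{cases}
\infty & \text{if linebreak is nobreak} \\
0 & \text{if linebreak is newline} \\
B(b) + c_{\mathrm{bad}} & \text{if linebreak is badbreak} \\
\max\bigl(0,\; B(b) - c_{\mathrm{good}}\bigr) & \text{if linebreak is goodbreak} \\
B(b) & \text{otherwise}
\end{cases}
\qquad
b^{*} = \arg\min_{b} P(b)
\]
```

Under these assumptions, the break point b* with the smallest penalty is chosen for the current line, and the same calculation is then repeated for the next line.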
A common method for breaking inline expressions that are too long for the space remaining on the current line is to pick an appropriate break point for the expression and place the expression up to that point on the current line and place the remainder of the expression on the following line. This can be done by:
1. Querying the text processing engine for the minimum and maximum amount of space available on the current line.
2. Using a variation of the automatic linebreaking algorithm given above, and/or using hints provided by linebreak attributes on mo or mspace elements, to choose a line break. The goal is that the first part of the formula fits "comfortably" on the current line while breaking at a point that results in keeping related parts of an expression on the same line.
3. The remainder of the formula begins on the next line, positioned both vertically and horizontally according to the paragraph flow; MathML's indentation attributes are ignored in this algorithm.
4. If the remainder does not fit on a line, steps 1 - 3 are repeated for the second and subsequent lines. Unlike for the first line, some part of the expression must be placed on each of these lines so that the algorithm terminates.
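As an illustration of the linebreak hints mentioned in the steps above, the following sketch marks up a = b + cd + e with hints on mo elements. The expression and the choice of hints are assumptions made for the example; only the attribute name and its values come from the text.

```xml
<mrow>
  <mi>a</mi>
  <!-- A break at the relation is preferred -->
  <mo linebreak="goodbreak">=</mo>
  <mrow>
    <mi>b</mi>
    <mo>+</mo>
    <mrow>
      <mi>c</mi>
      <!-- Invisible times (U+2062): never separate c from d -->
      <mo linebreak="nobreak">&#x2062;</mo>
      <mi>d</mi>
    </mrow>
    <!-- A break here is allowed but discouraged -->
    <mo linebreak="badbreak">+</mo>
    <mi>e</mi>
  </mrow>
</mrow>
```

A renderer that honors these hints would be expected to prefer breaking at the equals sign if the whole expression does not fit in the space remaining on the current line.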
Some use cases require precise control of the math layout and presentation. Several MathML elements and attributes expressly support such fine-tuning of the rendering. However, MathML rendering agents exhibit wide variability in their presentation of the same MathML expression due to differences in platforms, font availability, and requirements particular to the agent itself (see the Introduction to Presentation MathML). The overuse of explicit rendering control may yield a ‘perfect’ layout on one platform but give much worse presentation on others. The following sections clarify the kinds of problems that can occur.
For particular expressions, authors may be tempted to use the mpadded, mspace, mphantom, and mtext elements to improve (“tweak”) the spacing generated by a specific renderer.
Without explicit spacing rules, various MathML renderers may use different spacing algorithms. Consequently, different MathML renderers may position symbols in different locations relative to each other. Say that renderer B, for example, provides improved spacing for a particular expression over renderer A. Authors are strongly warned that “tweaking” the layout for renderer A may produce very poor results in renderer B, very likely worse than without any explicit adjustment at all.
Even when a specific choice of renderer can be assumed, its spacing rules may be improved in successive versions, so that the effect of tweaking in a given MathML document may grow worse with time. Also, when style sheet mechanisms are extended to MathML, even one version of a renderer may use different spacing rules for users with different style sheets.
Therefore, it is suggested that MathML markup never use mpadded
or mspace
elements to tweak the rendering of specific expressions, unless the
MathML is generated solely to be viewed using one specific version of
one MathML renderer, using one specific style sheet (if style sheets
are available in that renderer).
In cases where the temptation to improve spacing proves too strong, careful use of mpadded, mphantom, or the alignment elements may give more portable results than the direct insertion of extra space using mspace or mtext. Advice given to the implementers of MathML renderers might be still more productive, in the long run.
MathML elements that permit “negative spacing”, namely mspace, mpadded, and mo, could in theory be used to simulate new notations or “overstruck” characters by the visual overlap of the renderings of more than one MathML sub-expression.
This practice is strongly discouraged in all situations, for the following reasons:
- it will give different results in different MathML renderers (so the warning about “tweaking” applies), especially if attempts are made to render glyphs outside the bounding box of the MathML expression;
- it is likely to appear much worse than a more standard construct supported by good renderers;
- such expressions are almost certain to be uninterpretable by audio renderers, computer algebra systems, text searches for standard symbols, or other processors of MathML input.
More generally, any construct that uses spacing to convey mathematical meaning, rather than simply as an aid to viewing expression structure, is discouraged. That is, the constructs that are discouraged are those that would be interpreted differently by a human viewer of rendered MathML if all explicit spacing were removed.
Consider using the mglyph element for cases such as this. If such spacing constructs are used in spite of this warning, they should be enclosed in a semantics element that also provides an additional MathML expression that can be interpreted in a standard way. See Annotating MathML for further discussion.
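A minimal sketch of the semantics wrapping described above follows. It assumes, purely for illustration, an author who overstrikes “=” with “/” using negative space; the width value is arbitrary, and the annotation supplies the standard “not equal to” character (U+2260) as the interpretable alternative.

```xml
<semantics>
  <!-- Discouraged construct: "/" pulled back over "=" with negative space -->
  <mrow>
    <mo>=</mo>
    <mspace width="-0.75em"/>
    <mo>/</mo>
  </mrow>
  <!-- Standard expression that other processors can interpret -->
  <annotation-xml encoding="MathML-Presentation">
    <mo>&#x2260;</mo>
  </annotation-xml>
</semantics>
```

In this particular case the standard character could simply be used directly, which is the preferable choice; the wrapping is only relevant when no standard construct exists.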
The above warning also applies to most uses of rendering attributes to alter the meaning conveyed by an expression, with the exception of attributes on mi (such as mathvariant) used to distinguish one variable from another.
The reasons for using specific mo
elements for invisible operators include:
- such operators should often have specific effects on visual rendering (particularly spacing and linebreaking rules) that are not the same as either the lack of any operator, or spacing represented by mspace or mtext elements;
- these operators should often have specific audio renderings different than that of the lack of any operator;
- automatic semantic interpretation of MathML presentation elements is made easier by the explicit specification of such operators.
For example, an audio renderer might render f(x)
(represented as in the above examples) by speaking “f of x”, but use
the word “times” in its rendering of xy.
Although its rendering must still be different depending on the structure
of neighboring elements (sometimes leaving out “of” or
“times” entirely), its task is made much easier by the use of
a different mo
element for each invisible
operator.
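For reference, the two expressions mentioned here could be marked up as in the sketch below, using numeric character references for the function application (U+2061) and invisible times (U+2062) characters. The exact nesting is an assumption for illustration.

```xml
<!-- f(x): an audio renderer might speak this as "f of x" -->
<mrow>
  <mi>f</mi>
  <mo>&#x2061;</mo>
  <mrow>
    <mo>(</mo>
    <mi>x</mi>
    <mo>)</mo>
  </mrow>
</mrow>

<!-- xy: an audio renderer might speak this as "x times y" -->
<mrow>
  <mi>x</mi>
  <mo>&#x2062;</mo>
  <mi>y</mi>
</mrow>
```

Both expressions contain an operator that draws nothing, yet the distinct mo content lets a processor tell function application apart from multiplication.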
MathML also includes DifferentialD
(U+2146) for use
in an mo
element representing the differential
operator symbol usually denoted by “d”. The reasons for
explicitly using this special character are similar to those for using
the special characters for invisible operators described in the
preceding section.
Note that there are other special characters that convey more meaning than their ASCII look-alike characters, such as ExponentialE (U+2147).
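As a sketch of how the differential character might appear in practice, the following marks up the integral of f(x) dx. The surrounding structure (the integral sign U+222B, the invisible times between integrand and differential, and the grouping) is an assumption chosen for illustration.

```xml
<mrow>
  <mo>&#x222B;</mo>
  <mrow>
    <mrow>
      <mi>f</mi>
      <mo>&#x2061;</mo>
      <mrow>
        <mo>(</mo>
        <mi>x</mi>
        <mo>)</mo>
      </mrow>
    </mrow>
    <mo>&#x2062;</mo>
    <mrow>
      <!-- DifferentialD (U+2146) rather than an ordinary letter "d" -->
      <mo>&#x2146;</mo>
      <mi>x</mi>
    </mrow>
  </mrow>
</mrow>
```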
The following notes are included as a rationale for certain aspects of the above definitions, but should not be important for most users of MathML.
An mfrac
is included as an
“embellisher” because of the common notation for a
differential operator:
<mfrac>
  <mo> &DifferentialD; </mo>
  <mrow>
    <mo> &DifferentialD; </mo>
    <mi> x </mi>
  </mrow>
</mfrac>
Since the definition of embellished operator affects the use of the
attributes related to stretching, it is important that it includes
embellished fences as well as ordinary operators; thus it applies to
any mo
element.
Note that an mrow
containing a single argument
is an embellished operator if and only if its argument is an embellished
operator. This is because an mrow
with a single
argument must be equivalent in all respects to that argument alone (as
discussed in https://w3c.github.io/mathml/#presm_mrow).
This means that an mo element that is the sole argument of an mrow will determine its default form attribute based on that mrow's position in a surrounding, perhaps inferred, mrow (if there is one), rather than based on its own position in the mrow in which it is the sole argument.
Note that the above definition defines every
mo
element to be “embellished” — that is,
“embellished operator” can be considered (and implemented in
renderers) as a special class of MathML expressions, of which
mo
is a specific case.
In some cases, text embedded in mathematics could be more appropriately
represented using mo
or mi
elements.
For example, the expression 'there exists δ > 0 such that f(x) < 1' is equivalent to ∃ δ > 0 ∋ f(x) < 1 and could be represented as:
<mrow>
  <mo> there exists </mo>
  <mrow>
    <mrow>
      <mi> δ </mi>
      <mo> &gt; </mo>
      <mn> 0 </mn>
    </mrow>
    <mo> such that </mo>
    <mrow>
      <mrow>
        <mi> f </mi>
        <mo> &ApplyFunction; </mo>
        <mrow>
          <mo> ( </mo>
          <mi> x </mi>
          <mo> ) </mo>
        </mrow>
      </mrow>
      <mo> &lt; </mo>
      <mn> 1 </mn>
    </mrow>
  </mrow>
</mrow>
An example involving an mi element is:
x + x² + ··· + xⁿ.
In this example, the ellipsis should be represented using an mi element, since it takes the place of a term in the sum (see https://w3c.github.io/mathml/#presm_mi).
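The series above might be marked up as in the following sketch, with the ellipsis (here the midline horizontal ellipsis, U+22EF, an assumed choice of character) placed in an mi element.

```xml
<mrow>
  <mi>x</mi>
  <mo>+</mo>
  <msup>
    <mi>x</mi>
    <mn>2</mn>
  </msup>
  <mo>+</mo>
  <!-- The ellipsis stands in for the omitted terms, so it is an mi -->
  <mi>&#x22EF;</mi>
  <mo>+</mo>
  <msup>
    <mi>x</mi>
    <mi>n</mi>
  </msup>
</mrow>
```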
On the other hand, expository text within MathML is best
represented with an mtext
element. An example
of this is:
Theorem 1: if x > 1, then x² > x.
However, when MathML is embedded in HTML, or another document markup language, the example is probably best rendered with only the two inequalities represented as MathML, letting the text be part of the surrounding HTML.
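For the case where the whole statement is kept inside MathML, the theorem might be marked up roughly as follows. The use of no-break spaces (U+00A0) at the edges of the mtext content is an assumption made so that the spacing around the words survives whitespace trimming.

```xml
<mrow>
  <mtext>Theorem 1: if&#xA0;</mtext>
  <mrow>
    <mi>x</mi>
    <mo>&gt;</mo>
    <mn>1</mn>
  </mrow>
  <mtext>, then&#xA0;</mtext>
  <mrow>
    <msup>
      <mi>x</mi>
      <mn>2</mn>
    </msup>
    <mo>&gt;</mo>
    <mi>x</mi>
  </mrow>
  <mtext>.</mtext>
</mrow>
```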
Another factor to consider in deciding how to mark up text is the
effect on rendering. Text enclosed in an mo
element is unlikely to be found in a renderer's operator dictionary,
so it will be rendered with the format and spacing appropriate for an
“unrecognized operator”, which may or may not be better than the
format and spacing for “text” obtained by using an
mtext
element. An ellipsis entity in an
mi
element is apt to be spaced more appropriately
for taking the place of a term within a series than if it appeared in
an mtext
element.