You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
535 lines
25 KiB
535 lines
25 KiB
Box tree dynamism
|
|
=================
|
|
|
|
The first building block needed for event-driven layout is for the box tree
|
|
construction and subsequent maintenance to become event driven and for the
|
|
box tree management implementation to emit events marking parts of the tree
|
|
changed so that the layout calculator can react to them.
|
|
|
|
We expect the box tree to be in normal form at all times (and certainly
|
|
whenever an event is emitted). The box tree manager must ensure that this is
|
|
the case.
|
|
|
|
[Note: the box tree manager could be plumbed into the existing content handler
|
|
with a little effort: have the box tree constructed dynamically during the
|
|
parse, but ignore box tree events and perform layout computation at the same
|
|
place as currently. As box tree events are ignored, subsequent changes to the
|
|
tree will not result in updated layout computations -- this might produce weird
|
|
effects, so an option might be to forcibly relayout the entire tree on receipt
|
|
of box tree change events received after the initial layout has been performed.
|
|
This would not be substantially different from the existing reformatting on
|
|
viewport dimension changes, and similarly heavyweight]
|
|
|
|
The box tree manager needs to know when any of the various inputs that
|
|
determine the shape of the box tree changes. These inputs comprise:
|
|
|
|
1. Style changes
|
|
2. DOM changes
|
|
3. Dynamic pseudo classes
|
|
4. Other non-DOM changes (predominantly Media Queries)
|
|
|
|
The following sections consider the each of these inputs and attempts to scope
|
|
out the impact of changes made to them.
|
|
|
|
|
|
Style changes
|
|
-------------
|
|
|
|
Style information can change as the source document is parsed and as external
|
|
stylesheets are fetched. Script may also modify style information via the CSS
|
|
Object Model or simply injecting its own style attribute values into Elements.
|
|
Changes to any of this information has the potential to cause wide-ranging
|
|
ripples across the box tree structure.
|
|
|
|
|
|
DOM changes
|
|
-----------
|
|
|
|
Given CSS 3 selectors and combinators, the impact of DOM changes is thus:
|
|
|
|
1. Adding a new Element node:
|
|
Selection must be re-run for all children of the new node's parent
|
|
and their descendants (i.e. all trees derived from the new node and
|
|
its siblings).
|
|
[XXX: is this correct?] If it is known that neither the :nth-last-child()/:nth-last-of-type()
|
|
pseudo-classes nor the sibling combinators are in use, the selection
|
|
re-run can be limited to the preceding siblings and the new node (and
|
|
their descendants).
|
|
2. Removing a new Element node:
|
|
Selection must be re-run for all children of the node's former parent
|
|
and their descendants (i.e. all trees derived from the node's siblings)
|
|
3. Moving an Element node:
|
|
This is effectively a removal (see: 2) followed by an addition (see: 1)
|
|
There is an optimisation to be had if the node is attached to one of
|
|
its ancestors: in that case, selection need not be re-run on removal
|
|
(as all of the affected nodes will be in a subtree of the node's
|
|
siblings on re-insertion).
|
|
4. Addition/Removal/Modification of an Element node's Attributes:
|
|
Selection must be re-run for all children of the Element node's parent
|
|
and their descendants (i.e. all trees derived from the node and its
|
|
siblings).
|
|
|
|
|
|
Dynamic pseudo classes
|
|
----------------------
|
|
|
|
These require dynamic re-selection based on user activity. The most
|
|
aggressive of which is :hover (as, in theory, causes changes whenever the
|
|
user moves the mouse). So this probably requires the presence of some
|
|
debouncing logic to avoid excessive reselection every time the mouse
|
|
pointer moves 1px in any direction.
|
|
|
|
The set of affected nodes is as for node addition (where the directly
|
|
affected node serves as the added node).
|
|
|
|
|
|
Other Non-DOM changes
|
|
---------------------
|
|
|
|
Media queries will require re-selection to happen whenever any of the
|
|
queries properties (e.g. viewport dimension) changes. The impact in this
|
|
case, however, is completely unconstrained -- potentially the entire document
|
|
tree may require reselection.
|
|
|
|
|
|
|
|
Change detection
|
|
----------------
|
|
|
|
In general we want, wherever possible, to minimise the amount of work performed
|
|
to make the box tree reflect reality. This will be achieved through aggressive
|
|
filtering out of unnecessary work and reuse of existing results where possible.
|
|
|
|
The following sections consider change detection for each of the information
|
|
sources.
|
|
|
|
|
|
Detecting selection context changes
|
|
-----------------------------------
|
|
|
|
Style selection for a node is performed using a selection context
|
|
(simplistically: an ordered list of stylesheets). If a suitable hashing
|
|
scheme can be devised such that we can detect:
|
|
|
|
a. changes to the contents of a stylesheet (e.g. as a result of rules
|
|
or properties being added/removed/changed via the CSS Object Model)
|
|
b. addition/removal/reordering of stylesheets within a selection context
|
|
|
|
and we record the aggregation of this information as a single hash value
|
|
within the selection context and also record this hash value alongside the
|
|
computed style for a node, then it becomes trivial to compare the recorded
|
|
value with the live value within the selection context to determine if there
|
|
is any need to do any work.
|
|
|
|
If work is necessary, and assuming no ability to determine the differences
|
|
between the prior and current set of input styles, a change to the selection
|
|
context effectively invalidates the computed styles for the entire tree. This
|
|
may require further optimisation, however.
|
|
|
|
|
|
Detecting DOM changes
|
|
---------------------
|
|
|
|
Changes to a given Element node that might affect style selection rules are:
|
|
|
|
1. The Attributes associated with the Element
|
|
|
|
(no part of the node's QName can be altered after it is created).
|
|
|
|
In addition, the Element's position within the DOM also affects the set of
|
|
selection rules that apply to it. Specifically, given the existence of
|
|
ancestor and sibling combinators, changes to any part of the tree above and
|
|
to the left of an Element can affect its styling.
|
|
|
|
In other words, for a given Element node, we care about changes to:
|
|
|
|
1. the Element itself
|
|
2. any preceding sibling Element of the Element
|
|
3. any of its ancestor Elements
|
|
4. any of the preceding sibling Elements of its ancestor Elements
|
|
|
|
In the case where the tree structure itself does not change (i.e. only changes
|
|
to Attributes are in play), it is sufficient to mark the modified Element as
|
|
dirty and then proceed on the basis that the subtree of the Element's parent
|
|
element is dirty as a result (and, given that changes to an Element cannot
|
|
affect its preceding siblings, the processing of the parent's children can
|
|
skip over any that occur before a dirty child is encountered).
|
|
|
|
Where an Element (potentially the root of a subtree) is inserted into the DOM,
|
|
the Element itself should be marked dirty and its parent's subtree should be
|
|
processed (again, any child of the parent preceding a dirty child can be
|
|
skipped).
|
|
|
|
Where an Element is removed from the DOM, its (pre-removal) subsequent sibling
|
|
should be marked dirty and its (pre-removal) parent's subtree should be
|
|
processed (again, any child of the parent preceding a dirty child can be
|
|
skipped).
|
|
|
|
Under the (old) Mutation event model, we can use the following events to mark
|
|
nodes as dirty (and enqueue their parents for consideration):
|
|
|
|
1. DOMAttrModified
|
|
- Dispatched after the modification (addition/modification/removal) has
|
|
happened
|
|
- The event target is the Element node to mark dirty
|
|
- The relatedNode field identifies the modified Attr
|
|
- Its parent Element must be searched for
|
|
2. DOMNodeInserted
|
|
- Dispatched after the insertion is complete
|
|
- The event target is the inserted node (which may not be an Element)
|
|
- The relatedNode field identifies the parent node (again, may not be
|
|
an Element)
|
|
3. DOMNodeRemoved
|
|
- Dispatched before the removal has begun
|
|
- The event target is the removed node (which may not be an Element)
|
|
- The relatedNode field identifies the parent node (again, may not be
|
|
an Element)
|
|
|
|
Determining whether styling for descendants of the dirty node will change
|
|
requires an efficient way of evaluating whether the change to the dirty node
|
|
causes a change in the applicable selectors. One way might be to consider
|
|
ancestor/parent combinators where the dirty node appears on the left hand side
|
|
(i.e. as the ancestor/parent). If we can efficiently detect no change in the
|
|
potential set of applicable selectors (which will necessarily require knowing
|
|
which ones matched last time), then we can minimise the number of descendant
|
|
nodes for which we need to perform a full reselection.
|
|
|
|
Similarly, for subsequent siblings of the dirty node (and thus, their
|
|
descendants), we want to consider the (previous) sibling combinators where the
|
|
dirty node appears on the left hand side (i.e. as the preceding sibling).
|
|
Again, if there is no change to the set of applicable selectors, there is no
|
|
need to reselect for any subsequent sibling of the dirty node or their
|
|
descendants.
|
|
|
|
Note, however, that all dirty nodes in a tree need to be processed, so event
|
|
processing should not try to short circuit here.
|
|
|
|
In addition to DOM changes which might affect the application of styling, we
|
|
are also interested in changes to text content which may cause the creation,
|
|
destruction, modification, or invalidation of boxes intended to contain the
|
|
content. To do this, we will use the following events, marking the containing
|
|
Element as requiring reflow (this is distinct from marking it dirty for the
|
|
purposes of style selection):
|
|
|
|
1. DOMCharacterDataModified
|
|
- Dispatched after the modification has happened
|
|
- The event target is the Text (or other, uninteresting) node
|
|
- The relatedNode field identifies the parent node (which may not be an
|
|
Element)
|
|
- Its parent Element must be searched for
|
|
2. DOMNodeInserted
|
|
- Dispatched after the insertion is complete
|
|
- The event target is the inserted node (which may not be a Text node)
|
|
- The relatedNode field identifies the parent node (again, may not be
|
|
an Element)
|
|
3. DOMNodeRemoved
|
|
- Dispatched before the removal has begun
|
|
- The event target is the removed node (which may not be a Text node)
|
|
- The relatedNode field identifies the parent node (again, may not be
|
|
an Element)
|
|
|
|
Detecting changes to pseudo classes
|
|
-----------------------------------
|
|
|
|
Changes to structural pseudo classes should fall out of the generic DOM change
|
|
detection described above: the DOM tree will have to change for these to do so.
|
|
|
|
Dynamic pseudo classes, however, are a right pain, particularly in the case
|
|
where you have rules containing selectors that include combinators in
|
|
conjunction with pseudo classes. In these cases, the pseudo class might become
|
|
engaged on any other element in the tree (be that ancestral or siblings, or
|
|
both) surrounding the target element. Thus, there's no trivial way to
|
|
pre-compute an answer and test against it efficiently.
|
|
|
|
Consider something like this:
|
|
|
|
div:first-child + :hover wibble :last-child > foo { ... }
|
|
|
|
This will match a "foo" element whose parent is the last child of "foo"'s
|
|
grandparent and "foo"'s parent has an ancestor element, "wibble" and "wibble"
|
|
has an ancestor element that the user is hovering over and that ancestor
|
|
element is immediately preceded by a "div" that is the first child of their
|
|
parent. If the user isn't hovering over the intermediate ancestor, this rule
|
|
will not match, regardless of the shape of the document tree.
|
|
|
|
To make matters even more involved, the user may be considered to be hovering
|
|
over the intermediate ancestor as a result of hovering over some other
|
|
descendant of that ancestor. Also, because any ancestor of "wibble" with a
|
|
preceding first child "div" will do, the identity of that ancestor could
|
|
equally well change but the rule still match.
|
|
|
|
Perhaps, therefore, it becomes necessary to gather the set of nodes for which
|
|
each dynamic pseudo class is active and record this on the document element.
|
|
Then, for each subsequent selection, gather the set of nodes to which these
|
|
pseudo classes apply at that time. This gives two sets:
|
|
|
|
a. The previous set of nodes for which each dynamic pseudo class was active
|
|
b. The current set of nodes for which each dynamic pseudo class is active
|
|
|
|
The disjunctive union of these two sets produces a set of nodes for which the
|
|
active state of each dynamic pseudo class has changed between selections. Any
|
|
descendant of any node in this set, or of any sibling of any node in this set,
|
|
will require reselection.
|
|
|
|
The current set of nodes for which each dynamic pseudo class is active will be
|
|
recorded on the document element each time. [Note: it may be possible to
|
|
minimise the set sizes by taking account of the rules involving the relevant
|
|
pseudo class -- in the extreme case, no rules use the relevant pseudo class and
|
|
so the set will always be empty].
|
|
|
|
|
|
Detecting other non-DOM changes
|
|
-------------------------------
|
|
|
|
Media Queries Level 4 permits testing against a variety of mostly-static
|
|
properties of the rendering device. Some properties may change dynamically
|
|
(e.g. viewport width/height when the window is resized). Assuming that all
|
|
property values are always provided and they do not change regularly, the
|
|
simple approach would be to hash the information provided (i.e. property
|
|
names and values) and store the hash alongside computed styles. If the
|
|
hash value changes, then reselection must occur.
|
|
|
|
If the hash value changes, a more granular additional test would be to apply
|
|
the input properties to all known media queries and cache the results, thus
|
|
ensuring that range-based properties continue to match until they don't even
|
|
if there are slight changes to their declared values. The assumption is that
|
|
this matching will be performant (if it is not, more intelligence is likely
|
|
required).
|
|
|
|
|
|
Construction nuances
|
|
--------------------
|
|
|
|
Avoiding repetitious selection during document construction (assuming no
|
|
scripting causing DOM changes) would be desireable. Given the rules above,
|
|
performing selection for the Element node children of a node is best done
|
|
after the last child of the parent has been added to the document (i.e. when
|
|
the parser encounters the parent Element's closing tag in well-formed
|
|
document source).
|
|
|
|
This may, however, be difficult to detect reliably from the DOM event stream
|
|
(where only node insertion/removal events are emitted). In a well-formed
|
|
document, we know that a node's subtree is complete once either:
|
|
|
|
a) a subsequent sibling is inserted into the node's parent; or
|
|
b) a subsequent relation is inserted into one of the node's ancestors
|
|
|
|
In the case of the document root node, the completion of the parse is the
|
|
relevant point (as, until then, children may be changing).
|
|
|
|
Where the source document is not well-formed (i.e. the likes of the HTML5
|
|
adoption agency are in play), it is possible for nodes to be moved around
|
|
the tree or inserted in unlikely locations in the first place. Where this
|
|
happens, it will be difficult to avoid repetitious selection in a reliable
|
|
manner.
|
|
|
|
In either case, if repetitious selection is avoided at all, the box tree
|
|
will not be complete until the entire document parse has ended (and the
|
|
event-based box tree builder must take great care to ensure the tree is
|
|
in normal form at all times).
|
|
|
|
Assuming a well-formed input, all stylesheet references and inline styles
|
|
should be known when the Body Element is inserted into the document. There
|
|
is no point attempting to build any part of the box tree until all the style
|
|
information is available, so waiting until all of the external stylesheets
|
|
have been fetched is probably sensible (perhaps with some timeout to avoid
|
|
waiting forever -- if style information arrives late, then too bad, you get
|
|
to enjoy a load of repetitious selection).
|
|
|
|
|
|
Migration considerations
|
|
------------------------
|
|
|
|
The existing box tree structure conflates a variety of things. Specifically,
|
|
the box tree is not "just" a structure containing nodes to be laid out but it
|
|
also contains large swathes of information that is already stored in the DOM,
|
|
and also pieces of information that have no bearing on how the document is
|
|
laid out or displayed (e.g. form gadgets, which really belong on the DOM, and
|
|
object parameters).
|
|
|
|
Additionally, there are special-case box types for floated boxes (even using
|
|
different types to distinguish between left and right floated boxes),
|
|
linebreaks, and flex-layout boxes. There are also custom INLINE_CONTAINER
|
|
boxes (which serve as a transparent container for INLINE boxes when the
|
|
containing block only contains INLINE children, but act as a BLOCK box when
|
|
the containing block has both BLOCK and INLINE children). There's also the
|
|
magical INLINE_END box type which represents the far end of a fragmented
|
|
INLINE box.
|
|
|
|
Retaining the myriad box types (including the special cases) is possible, but
|
|
care will need to be taken to ensure that the box tree remains in normal form
|
|
as and when the type of a box changes (i.e. because the display or float
|
|
property values change). However, retaining the existing box types does
|
|
restrict what we can do (as they effectively lose information we would
|
|
otherwise want to retain). Therefore, the most sensible course of action
|
|
would likely be to hide the box type information behind some API surface
|
|
which the layout calculator can use to obtain it. This would allow the
|
|
existing layout calculator to obtain the legacy box type without needing to
|
|
bake this into the modern box model.
|
|
|
|
Boxes also contain various pieces of information created by the layout
|
|
calculator and stashed away for future use. Some generic mechanism to permit
|
|
the layout calculator to associate state with a box is is probably needed
|
|
(with such information being invalidated whenever the relevant section of the
|
|
tree changes).
|
|
|
|
There is some magic handling of trailing space characters in text elements
|
|
(simplistically: they're flagged on the preceding text box). This was probably
|
|
an optimisation at the time (as the box structure is relatively large), but
|
|
makes many things more complex. This optimisation should be removed entirely
|
|
and, instead, the boxes should reference the underlying text data attached to
|
|
the source DOM nodes (Note: some consideration needs to be given to whitespace
|
|
squashing here, to ensure that continues to happen in a sensible way,
|
|
preferably without duplicating the entire textual contents of the DOM into the
|
|
box tree).
|
|
|
|
Ideally, then, the box tree contents will be reduced to solely the information
|
|
needed to lay the document out and render it. Where possible, anything that is
|
|
already stored on the DOM should not be duplicated (as it is easy to obtain the
|
|
corresponding DOM node from the box and then query that).
|
|
|
|
It would be nice to retire talloc, too, as object lifetimes are now
|
|
well-defined.
|
|
|
|
|
|
Outwith the HTML content handler, there are several parts of the code that
|
|
expect to be able to access box objects and their contents:
|
|
|
|
1. iframe handling
|
|
|
|
This involves unpleasant collusion between the HTML content handler and
|
|
the desktop UI. This layering violation should be cleaned up by
|
|
introducing a proper orthogonal interface (if needed, or something else,
|
|
if not). The public "html_redraw_a_box" API should die with it.
|
|
|
|
2. save as text
|
|
|
|
This is a similar layering violation -- the code is simply in the wrong
|
|
place. It should be moved down into the HTML content handler and the
|
|
higher level code should simply dispatch the operation through the content
|
|
system.
|
|
|
|
There are also the following things that grub around inside the DOM to fish
|
|
out the box pointer:
|
|
|
|
1. The JavaScript Canvas binding
|
|
|
|
This is done so it can then call html__redraw_a_box, which is an API
|
|
private to the HTML content implementation. Bad Daniel, no biscuit.
|
|
|
|
The fix here is probably to have the binding fire a "redraw needed" event
|
|
at the canvas dom_node (which it already knows) and for that to be caught
|
|
and handled by the HTML content handler itself.
|
|
|
|
Additionally there are a few places that leak box pointers through APIs:
|
|
|
|
1. The content_html_object struct contains the box pointer
|
|
|
|
This is used by:
|
|
|
|
a. save complete
|
|
|
|
This simply checks that the pointer is non-NULL to determine whether
|
|
to save the corresponding content for the object. Like save as text
|
|
(described above), this arises as a result of a layering violation.
|
|
The implementation should be pushed down into the HTML content
|
|
handler and the higher level code should simply dispatch the
|
|
operation through the content system. (This would also permit
|
|
retirement of the similarly anachronistic html_get_document API).
|
|
|
|
b. reload
|
|
|
|
This cares not about the box pointer, but does process the objects to
|
|
invalidate their caching status. Again, a layering violation: push
|
|
the implementation down and invoke it via the content system.
|
|
|
|
2. The textsearch_bounds content interface accepts box pointers (as does
|
|
content_textsearch_add_match)
|
|
|
|
While this is internal to the content infrastructure it would be better
|
|
all round for the exposure of box pointers here to go away and be replaced
|
|
with something more orthogonal.
|
|
|
|
There are also places where box pointers might once have been used and no
|
|
longer are, but where vestiges remain:
|
|
|
|
1. The core text selection object contains a pointless box pointer field
|
|
(it is initialised to NULL and never used)
|
|
2. The Windows frontend's browser_mouse object contains a pointless box
|
|
pointer field (it is never used).
|
|
|
|
These should just be cleaned up.
|
|
|
|
|
|
Layout calculation
|
|
------------------
|
|
|
|
The layout calculators will be notified about changes to the box tree. Change
|
|
events will report the containing block-boxes in which the alterations were
|
|
made. This is analogous to the DOM's SubtreeModified event which is targetted
|
|
at the parent Element containing the modified subtree. (TBD whether one event
|
|
per containing block or one fat event, or some combination of the two -- in the
|
|
situation where positioned and/or floated boxes are involved, changes to the
|
|
in-flow box tree structure may result in *multiple* parent containing blocks
|
|
being affected).
|
|
|
|
The layout calculators will operate on the box tree to lay it out for a given
|
|
viewport width (where possible, only needing to reflow the altered containing
|
|
block contents, thus performing the minimal amount of work). The resulting
|
|
layout information should be cached on each box tree element (potentially with
|
|
some magic for inlines, as they get (de)fragmented depending upon available
|
|
space).
|
|
|
|
Once the position and dimensions of all boxes have been determined, the entire
|
|
document should be rendered *unclipped* to produce the intermediate structure
|
|
used for redraw. We expect this step to produce a stack of device-independent
|
|
render operations (one stack level per layer).
|
|
|
|
"Pictorally":
|
|
|
|
[canvas] -> [command, ...]
|
|
[ 1] -> [command, ...]
|
|
[ 2] -> [command, ...]
|
|
[ 3] -> [command, ...]
|
|
[ 4] -> [command, ...]
|
|
|
|
where the document is then drawn layer-by-layer (canonically: starting at the
|
|
canvas and moving down the stack, overdrawing each layer on top of what has
|
|
been drawn before). [Note: the layers in this structure may not directly
|
|
reflect the z-indexes specified in the source stylesheets in part as a result
|
|
of the magical handling of elements which can create new stack levels -- see
|
|
Appendix E of CSS2.1 for the gory details].
|
|
|
|
With the exception of text plotting commands, the render operations are
|
|
entirely independent of the originating document and box tree. Text commands
|
|
will necessarily reference the source text data by holding references to the
|
|
relevant DOM string data, and whatever metadata is needed to index into it.
|
|
(As this data is referenced, if the DOM changes, the old text will continue
|
|
to be rendered safely until the redraw stack is updated).
|
|
|
|
To ensure that the document may be redrawn at any time, regardless of the
|
|
underlying state of the DOM, box tree, or layout calculation, and to ensure
|
|
that redraw output is consistent the redraw stack must be updated atomically
|
|
(simplistically: build a new one, flip the old and new pointers, and destroy
|
|
the old one; more optimised solutions are likely available).
|
|
|
|
[Note: we omit the precise detail about how the layout calculation is perfomed
|
|
as this is left for future work. The overall event flow and decoupling of
|
|
layout from rendering is, however, important to record].
|
|
|
|
|
|
Redraw
|
|
------
|
|
|
|
Redraw operates solely using the render stack. This is the point that clipping
|
|
is applied (e.g. to skip over any commands that would render outside the clip
|
|
boundary). Additionally, if output scaling is desired, it can be applied at
|
|
this point, too (i.e. by transforming the device-independent render operations
|
|
into device-dependent ones -- just apply a transformation matrix, right?).
|
|
|
|
Multiple buffering, if desired, is left up to the plotter implementation
|
|
provided by the client (as different frontends may wish to do things
|
|
differently).
|
|
|
|
[Note: again, the precise detail is omitted here].
|