2019 Toronto Wednesday

Planning!

End of 2019 goals:

all platforms (a bit of everything)
- Linux (wayland prioritized)
- MVP Android
- Laptops
- more Windows (include Win7/8)
- Beta

Android:

glyph zooming $
PLS optimizations
GLES 2.0
RGBA/swizzling
disable array textures $
tests
fix getScaledFont crash

Linux:

shader binary support
blacklisting
Wayland support
fix vsync

Mac:

Core Animation (CA) presentation
CA document splitting
CA WebGL
blob recording of native themes:
- don't rasterize themes in content process
texture uploads
testing coverage

Picture caching:

version 2.0
universal picture interning
cache filter outputs (blur)
cache and share clip masks
use blits for tiles $

Direct composition:

scrolling
document rendering
WebGL
video
Windows 7 presentation
ANGLE subpx extension
test suites run with WR
replace D2D with WR/Canvas2D

Threading:

use fewer threads
multi-thread scene building
parallel task scheduling that isn't Rayon $$$
non-blocking hit-testing
remove IPC channel support

Display list:

delta encoding
spatial/clip trees
make items tighter $
reduce scene building times
data pipe:
- better way to pass data through IPC
more animated properties $$

Blobs:

recoordination
bounds changing invalidation
SVG filters
(some form of) path rendering
image font performance
- global locks in Skia
- single global context
clip paths on GPU

Software WR:

test llvmpipe
ship SwiftShader
pick low-hanging fruit

Performance:

enable document splitting
gradient fast path
make box shadows first-class primitives
improve opaque pass fragment count
optimize resource bindings
optimize clip mask renderings
local space raster scale $$
SIMD optimizations
render task graph 2.0
proof of concept Vulkan/D3D12/Metal $$$
better primitive culling
animation jank at 60fps (frame scheduling)
consider spatial culling structure
optimize Intel GPU perf
make FPS shooter fast
BGRA8 and swizzling support
glyph cache optimizations
- mipping
- size/scaling re-use
- sharing between windows

Tooling:

Android mobile profiling tools
multi-frame WR captures
WR capture tiled blobs
picture caching debugging infrastructure $

Correctness:

WR 67 bugs
WR 68 bugs
snapping!

Refactor:

remove Cairo
rename Document and Pipeline terms
tech debt cleanup
rename some modules: tiling.rs, clip_scroll_tree.rs

Other:

hire more engineers
compile the list of websites we are good at (better than Chrome)

Fission:

move ImageLib to WR process
move font management to WR process

Security:

fuzzing
font sanitization

APZ discussion

(??)

Better clip mask rendering

Goals:

avoid doing work more than once (when a clip affects multiple primitives)
avoid doing work on fully opaque areas of the clip
simplify the cs_clip shaders

Ideas to explore:

Clip mask inversion if we know that it's more 0 than 1.
Use stencil. Potentially, test stencil for each clip.
Share clips between items (under conditions).

Gather data about:

number of clips affecting items
- average ratio of a clip area to the sum area of all clips
how widely image masks are used
how often the clip is shared between primitives
- what is the ratio of total primitive area versus the clip area
- are sub-pixel offsets of the primitives different?
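The first and third measurements above could be collected with something like the following. This is a minimal sketch, not WebRender code: `Primitive`, `ClipStats` and `gather_clip_stats` are hypothetical names, and it assumes each primitive lists a given clip id at most once.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a primitive: only the ids of the clips affecting it.
pub struct Primitive {
    pub clip_ids: Vec<u32>,
}

/// Hypothetical summary of how clips are used across a frame.
#[derive(Debug, PartialEq)]
pub struct ClipStats {
    /// Average number of clips affecting a primitive.
    pub avg_clips_per_primitive: f64,
    /// Number of distinct clips referenced by at least one primitive.
    pub unique_clips: usize,
    /// Number of distinct clips referenced by more than one primitive.
    pub shared_clips: usize,
}

pub fn gather_clip_stats(prims: &[Primitive]) -> ClipStats {
    // Map each clip id to the number of primitives that reference it.
    let mut users: HashMap<u32, usize> = HashMap::new();
    let mut total_refs = 0usize;
    for prim in prims {
        total_refs += prim.clip_ids.len();
        for &id in &prim.clip_ids {
            *users.entry(id).or_insert(0) += 1;
        }
    }
    // Avoid dividing by zero when there are no primitives.
    let avg = if prims.is_empty() {
        0.0
    } else {
        total_refs as f64 / prims.len() as f64
    };
    ClipStats {
        avg_clips_per_primitive: avg,
        unique_clips: users.len(),
        shared_clips: users.values().filter(|&&n| n > 1).count(),
    }
}
```

The area ratios and sub-pixel offsets would need real primitive and clip rectangles, which this sketch leaves out.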

Fast clears

We need to get back to a point where clips are fast-cleared to 1. This requires disabling scissor and re-evaluating performance against the current path that tries to render the first clip without blending. We can still render the corners of the first clip without blending. We don't need to render the opaque areas at all.

Snapping

Need automated infrastructure that modifies the reftests:

scaling both ref and image
switch some of the pictures to have their own surfaces
change the node in spatial tree where we switch to screen space rasterization

Document splitting overview

BGRA8 on Mac and Android

On Android, we don't always have a BGRA8 internal format with glTexStorage. On macOS, we never have one.

Choices are:

use glTexImage2D to make BGRA8 our internal format. Pay for mipmap allocation in VRAM. (Currently used on Android)
use glTexStorage(RGBA8) and pay for conversion of data from BGRA8. (Currently used on Mac).
- we can convince ImageLib to produce RGBA8 data in the first place
use glTexStorage(RGBA8) and pretend the data is in RGBA8, but use a swizzling sampler state when reading from it.
- as a follow-up, we can make some of the cached render tasks produce BGRA8 right away, so that texture cache entries have more consistent swizzling
use texture rectangles with BGRA8 internal format. Requires us to remove texture arrays.
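To make the difference between the conversion and swizzling choices concrete, here is a CPU-side sketch. It assumes tightly packed 4-byte pixels; `Swizzle`, `convert_bgra8_to_rgba8` and `sample` are hypothetical names for illustration only. On the GPU, the swizzled read would presumably map to texture swizzle state (for example `GL_TEXTURE_SWIZZLE_R`/`GL_TEXTURE_SWIZZLE_B`) or to a shader-side swap rather than a function like this.

```rust
/// Channel order of the bytes stored in a texture (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Swizzle {
    /// Bytes are stored as R, G, B, A: read them as they are.
    Rgba,
    /// Bytes are stored as B, G, R, A: swap red and blue when reading.
    Bgra,
}

/// Conversion choice: rewrite BGRA8 pixels as RGBA8 before upload.
/// Costs one pass over the data per upload.
pub fn convert_bgra8_to_rgba8(pixels: &mut [u8]) {
    for px in pixels.chunks_exact_mut(4) {
        px.swap(0, 2); // exchange the blue and red bytes
    }
}

/// Swizzling choice: leave the stored bytes untouched and reorder on read.
/// Models what a swizzling sampler state would do for one texel.
pub fn sample(texel: [u8; 4], swizzle: Swizzle) -> [u8; 4] {
    match swizzle {
        Swizzle::Rgba => texel,
        Swizzle::Bgra => [texel[2], texel[1], texel[0], texel[3]],
    }
}
```

With swizzling, each texture cache entry has to remember which order its bytes are in, which is why producing BGRA8 from cached render tasks would make the entries more consistent.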

DL interning API

Core idea:

move interning logic from scene building to the API side
the current DL builder would just intern everything as an implementation detail
another DL builder would work with primitive handles and update vectors
- there is a benefit of providing structure of update arrays, especially if those don't have any variable-encoded enums inside
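A rough sketch of what an API-side interner with handles and update vectors might look like. All names here (`Handle`, `Update`, `Interner`) are hypothetical and the real design may differ substantially; the point is only that identical items map to one handle and the consumer receives a flat list of changes.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Opaque handle given to the display list builder (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Handle(pub u32);

/// One entry of the update vector sent to the consumer.
#[derive(Debug, PartialEq)]
pub enum Update<T> {
    Insert(Handle, T),
    Remove(Handle),
}

pub struct Interner<T> {
    /// Item -> (handle, reference count).
    map: HashMap<T, (Handle, usize)>,
    next: u32,
    updates: Vec<Update<T>>,
}

impl<T: Clone + Eq + Hash> Interner<T> {
    pub fn new() -> Self {
        Interner { map: HashMap::new(), next: 0, updates: Vec::new() }
    }

    /// Return the existing handle for `item`, or allocate one and record an insert.
    pub fn intern(&mut self, item: T) -> Handle {
        if let Some(entry) = self.map.get_mut(&item) {
            entry.1 += 1;
            return entry.0;
        }
        let handle = Handle(self.next);
        self.next += 1;
        self.map.insert(item.clone(), (handle, 1));
        self.updates.push(Update::Insert(handle, item));
        handle
    }

    /// Drop one reference; record a removal when the last reference goes away.
    pub fn release(&mut self, item: &T) {
        let last = match self.map.get_mut(item) {
            Some(entry) => {
                entry.1 -= 1;
                entry.1 == 0
            }
            None => false,
        };
        if last {
            if let Some((handle, _)) = self.map.remove(item) {
                self.updates.push(Update::Remove(handle));
            }
        }
    }

    /// Hand the accumulated updates to the consumer and start a fresh list.
    pub fn take_updates(&mut self) -> Vec<Update<T>> {
        std::mem::take(&mut self.updates)
    }
}
```

The existing builder could call `intern` internally for every item, while a handle-based builder would expose `Handle` values directly to the caller.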

DL restructuring:

provide spatial tree, clip tree, picture tree and potentially a hit test tree

Picture caching improvements

Picture cache slices:

Introduced by:
- WebGL, canvas, video elements
- Scroll roots (if using for performance within WR / low-end GPUs)
Don't want to do component alpha blend, because:
- It's not supported by OS compositors.
- If we are doing slices for internal WR reasons (performance) we probably don't want to render twice anyway.
For each slice:
- Try to determine if opaque.
  - If yes, enable subpixel AA.
  - Otherwise, use grayscale AA.
Various possible options for switching between subpx / gray AA:
- Consider a sticky downgrade where an interned text run stays gray after downgrading.
- Might be OK to switch between them.
Consider using framebuffer fetch as a follow up.
- Interpolate between subpx / grayscale based on fragment alpha.
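The per-slice AA choice and the "sticky downgrade" option could be sketched as below. `AaMode`, `Slice` and `TextRun` are hypothetical types for illustration, and this models only one of the options listed, not a decision.

```rust
/// Anti-aliasing mode used for text in a slice (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AaMode {
    Subpixel,
    Grayscale,
}

/// Hypothetical picture cache slice; only opacity matters for this sketch.
pub struct Slice {
    pub is_opaque: bool,
}

/// Opaque slices can use subpixel AA; anything else falls back to grayscale.
pub fn aa_mode_for_slice(slice: &Slice) -> AaMode {
    if slice.is_opaque {
        AaMode::Subpixel
    } else {
        AaMode::Grayscale
    }
}

/// Hypothetical interned text run that remembers whether it was ever downgraded.
pub struct TextRun {
    downgraded: bool,
}

impl TextRun {
    pub fn new() -> Self {
        TextRun { downgraded: false }
    }

    /// Sticky downgrade: once a run has been drawn in grayscale it stays gray,
    /// even if a later slice would allow subpixel AA again.
    pub fn resolve(&mut self, slice_mode: AaMode) -> AaMode {
        if slice_mode == AaMode::Grayscale {
            self.downgraded = true;
        }
        if self.downgraded {
            AaMode::Grayscale
        } else {
            AaMode::Subpixel
        }
    }
}
```

The alternative option, switching freely between modes, would simply return `slice_mode` from `resolve` and drop the `downgraded` flag.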
