2019 Toronto Wednesday

Dzmitry Malyshau edited this page Apr 5, 2019 · 2 revisions

Planning!

End of 2019 goals:

  • all platforms (a bit of everything)
    • Linux (wayland prioritized)
    • MVP Android
    • Laptops
    • more Windows (including Win7/8)
    • Beta

Android:

  • glyph zooming $
  • PLS optimizations
  • GLES 2.0
  • RGBA/swizzling
  • disable array textures $
  • tests
  • fix getScaledFont crash

Linux:

  • shader binary support
  • blacklisting
  • Wayland support
  • fix vsync

Mac:

  • Core Animation (CA) presentation
  • CA document splitting
  • CA WebGL
  • blob recording of native themes:
    • don't rasterize themes in content process
  • texture uploads
  • testing coverage

Picture caching:

  • version 2.0
  • universal picture interning
  • cache filter outputs (blur)
  • cache and share clip masks
  • use blits for tiles $

Direct composition:

  • scrolling
  • document rendering
  • WebGL
  • video
  • Windows 7 presentation
  • ANGLE subpx extension
  • test suites run with WR
  • replace D2D with WR/Canvas2D

Threading:

  • use fewer threads
  • multi-thread scene building
  • parallel task scheduling that isn't Rayon $$$
  • non-blocking hit-testing
  • remove IPC channel support

Display list:

  • delta encoding
  • spatial/clip trees
  • make items tighter $
  • reduce scene building times
  • data pipe:
    • better way to pass data through IPC
  • more animated properties $$

Blobs:

  • recoordination
  • bounds changing invalidation
  • SVG filters
  • (some form of) path rendering
  • image font performance
    • global locks in Skia
    • single global context
  • clip paths on GPU

Software WR:

  • test LLVM pipe
  • ship SwiftShader
  • pick low-hanging fruit

Performance:

  • enable document splitting
  • gradient fast path
  • make box shadows first-class primitives
  • improve opaque pass fragment count
  • optimize resource bindings
  • optimize clip mask renderings
  • local space raster scale $$
  • SIMD optimizations
  • render task graph 2.0
  • proof of concept Vulkan/D3D12/Metal $$$
  • better primitive culling
  • animation jank at 60fps (frame scheduling)
  • consider spatial culling structure
  • optimize Intel GPU perf
  • make FPS shooter fast
  • BGRA8 and swizzling support
  • glyph cache optimizations
    • mipping
    • size/scaling re-use
    • sharing between windows

Tooling:

  • Android mobile profiling tools
  • multi-frame WR captures
  • WR capture tiled blobs
  • picture caching debugging infrastructure $

Correctness:

  • WR 67 bugs
  • WR 68 bugs
  • snapping!

Refactor:

  • remove Cairo
  • rename Document and Pipeline terms
  • tech debt cleanup
  • rename some modules: tiling.rs, clip_scroll_tree.rs

Other:

  • hire more engineers
  • compile the list of websites we are good at (better than Chrome)

Fission:

  • move ImageLib to WR process
  • move font management to WR process

Security:

  • fuzzing
  • font sanitization

APZ discussion

(??)

Better clip mask rendering

Goals:

  • avoid doing work more than once (when a clip affects multiple primitives)
  • avoid doing work on fully opaque areas of the clip
  • simplify the cs_clip shaders

Ideas to explore:

  1. Invert the clip mask when we know that it is more 0 than 1.
  2. Use stencil. Potentially, test stencil for each clip.
  3. Share clips between items (under conditions).

Gather data about:

  • number of clips affecting items
    • average ratio of a clip area to the sum area of all clips
  • how widely image masks are used
  • how often the clip is shared between primitives
    • what is the ratio of total primitive area versus the clip area
    • are sub-pixel offsets of the primitives different?
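
The sharing idea (item 3 under "Ideas to explore") and the sub-pixel question can be pictured as a cache keyed on the clip plus a quantized sub-pixel offset. This is only a minimal sketch: `ClipId`, `MaskKey`, `MaskCache` and the quarter-pixel quantization are hypothetical names and choices for illustration, not existing WR types, and "rendering" is reduced to a counter.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a clip; not a WR type.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct ClipId(pub u32);

/// Cache key: the clip plus the primitive's sub-pixel offset,
/// quantized to quarter pixels (an assumed granularity).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct MaskKey {
    clip: ClipId,
    subpx: (u8, u8),
}

impl MaskKey {
    pub fn new(clip: ClipId, origin: (f32, f32)) -> Self {
        // Fractional part of the coordinate, mapped to 0..=3.
        let q = |v: f32| ((v - v.floor()) * 4.0).floor() as u8 % 4;
        MaskKey { clip, subpx: (q(origin.0), q(origin.1)) }
    }
}

#[derive(Default)]
pub struct MaskCache {
    masks: HashMap<MaskKey, usize>,
    /// How many masks actually had to be rendered.
    pub renders: usize,
}

impl MaskCache {
    /// Returns a mask index, "rendering" (here: counting) only on a miss.
    pub fn request(&mut self, clip: ClipId, origin: (f32, f32)) -> usize {
        let key = MaskKey::new(clip, origin);
        if let Some(&index) = self.masks.get(&key) {
            return index;
        }
        let index = self.masks.len();
        self.masks.insert(key, index);
        self.renders += 1;
        index
    }
}
```

Comparing `renders` with the total number of requests over real pages would give a rough answer to "how often the clip is shared between primitives" under this particular keying.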

Fast clears

We need to get back to a point where clips are fast-cleared to 1. This requires disabling scissor and re-evaluating performance against the current path that tries to render the first clip without blending. We can still render the corners of the first clip without blending. We don't need to render the opaque areas at all.
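
To get a feel for how much work this could save, here is a rough sketch that splits a rounded-rect clip into the four corner rects that still need the clip shader, assuming the mask was fast-cleared to 1 and covers exactly the clip rect. `Rect`, `corner_rects` and `shaded_fraction` are hypothetical helpers, and a single uniform corner radius is assumed for simplicity.

```rust
/// Axis-aligned rectangle in device pixels (hypothetical helper type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect {
    pub x: f32,
    pub y: f32,
    pub w: f32,
    pub h: f32,
}

impl Rect {
    pub fn area(&self) -> f32 {
        self.w * self.h
    }
}

/// Corner rects of a rounded-rect clip with a uniform radius.
/// Everything outside these rects is fully opaque in the mask,
/// so a clear to 1 already produces the right value there.
pub fn corner_rects(clip: Rect, radius: f32) -> [Rect; 4] {
    // Clamp the radius so corners never overlap.
    let r = radius.min(clip.w * 0.5).min(clip.h * 0.5).max(0.0);
    let right = clip.x + clip.w - r;
    let bottom = clip.y + clip.h - r;
    [
        Rect { x: clip.x, y: clip.y, w: r, h: r },
        Rect { x: right, y: clip.y, w: r, h: r },
        Rect { x: clip.x, y: bottom, w: r, h: r },
        Rect { x: right, y: bottom, w: r, h: r },
    ]
}

/// Fraction of the mask area that still needs shading after the clear.
pub fn shaded_fraction(clip: Rect, radius: f32) -> f32 {
    if clip.area() <= 0.0 {
        return 0.0;
    }
    let shaded: f32 = corner_rects(clip, radius).iter().map(Rect::area).sum();
    shaded / clip.area()
}
```

For a 100x50 clip with a radius of 10, the corners cover 400 of 5000 pixels, so under these assumptions only 8% of the mask would go through the clip shader.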

Snapping

Need automated infrastructure that modifies the reftests:

  • scaling both ref and image
  • switch some of the pictures to have their own surfaces
  • change the node in spatial tree where we switch to screen space rasterization

Document splitting overview

BGRA8 on Mac and Android

On Android, a BGRA8 internal format is not always available with glTexStorage. On macOS, it is never available.

Choices are:

  1. use glTexImage2D to make BGRA8 our internal format. Pay for mipmap allocation in VRAM. (Currently used on Android.)
  2. use glTexStorage(RGBA8) and pay for conversion of data from BGRA8. (Currently used on Mac.)
    • we can convince ImageLib to produce RGBA8 data in the first place
  3. use glTexStorage(RGBA8) and pretend the data is in RGBA8, but use a swizzling sampler state when reading from it.
    • as a follow-up, we can make some of the cached render tasks produce BGRA8 right away, so that texture cache entries have more consistent swizzling
  4. use texture rectangles with BGRA8 internal format. Requires us to remove texture arrays.
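
Option 3 can be illustrated on the CPU: BGRA8 bytes are stored untouched in a texture the GPU believes is RGBA8, and the sampler swizzle puts the channels back in order on read. This is a sketch of the idea only; `Channel` and `Swizzle` are hypothetical types, and the real mechanism would be sampler/texture swizzle state in the graphics API.

```rust
/// Which stored channel to read (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Channel {
    R,
    G,
    B,
    A,
}

/// A sampler swizzle: for each output component (r, g, b, a),
/// the stored channel that feeds it.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Swizzle(pub [Channel; 4]);

impl Swizzle {
    /// Data really is RGBA8: read channels as stored.
    pub const IDENTITY: Swizzle =
        Swizzle([Channel::R, Channel::G, Channel::B, Channel::A]);
    /// Data is BGRA8 stored in an RGBA8 texture: swap red and blue on read.
    pub const BGRA: Swizzle =
        Swizzle([Channel::B, Channel::G, Channel::R, Channel::A]);

    /// Apply the swizzle to a texel as the GPU sees it (RGBA order).
    pub fn sample(&self, texel: [u8; 4]) -> [u8; 4] {
        let pick = |c: Channel| match c {
            Channel::R => texel[0],
            Channel::G => texel[1],
            Channel::B => texel[2],
            Channel::A => texel[3],
        };
        [pick(self.0[0]), pick(self.0[1]), pick(self.0[2]), pick(self.0[3])]
    }
}
```

The cost of this approach is bookkeeping: every texture cache entry has to remember which swizzle it needs, which is why having cached render tasks produce BGRA8 directly would make things more uniform.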

DL interning API

Core idea:

  • move the interning logic from scene building to the API side
  • the current DL builder would just intern everything as an implementation detail
  • another DL builder would work with primitive handles and update vectors
    • there is a benefit to providing a structure of update arrays, especially if those don't have any variable-encoded enums inside
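
A minimal sketch of what an API-side interner could look like, assuming items are hashable and that updates are shipped as a flat list of newly added entries. `Handle`, `Interner`, `intern` and `take_updates` are hypothetical names for illustration, not the existing WR interning code.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Stable handle to an interned item (hypothetical type).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Handle(pub u32);

/// API-side interner: deduplicates items and records which ones are new
/// since the last transaction, so only those need to be sent over.
pub struct Interner<T> {
    map: HashMap<T, Handle>,
    updates: Vec<(Handle, T)>,
}

impl<T: Clone + Eq + Hash> Interner<T> {
    pub fn new() -> Self {
        Interner { map: HashMap::new(), updates: Vec::new() }
    }

    /// Returns the existing handle for an equal item, or allocates a new
    /// handle and queues the item in the update list.
    pub fn intern(&mut self, item: T) -> Handle {
        if let Some(&handle) = self.map.get(&item) {
            return handle;
        }
        let handle = Handle(self.map.len() as u32);
        self.map.insert(item.clone(), handle);
        self.updates.push((handle, item));
        handle
    }

    /// Drains the pending updates, e.g. once per display list submission.
    pub fn take_updates(&mut self) -> Vec<(Handle, T)> {
        std::mem::take(&mut self.updates)
    }
}
```

In this picture the current DL builder would call `intern` internally for every item, while a handle-based builder would expose the handles and the update vector to the caller. Removal of unused entries is left out of the sketch.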

DL restructuring:

  • provide spatial tree, clip tree, picture tree and potentially a hit test tree

Picture caching improvements

Picture cache slices:

  • Introduced by:

    • WebGL, canvas, video elements
    • Scroll roots (if using for performance within WR / low-end GPUs)
  • Don't want to do component alpha blend, because:

    • It's not supported by OS compositors.
    • If we are doing slices for internal WR reasons (performance) we probably don't want to render twice anyway.
  • For each slice:

    • Try to determine if opaque.
      • If yes, enable subpixel AA.
      • Otherwise, use grayscale AA.
  • Various possible options for switching between subpx / gray AA:

    • Consider a sticky downgrade where an interned text run stays gray after downgrading.
    • Might be OK to switch between them.
  • Consider using framebuffer fetch as a follow up.

    • Interpolate between subpx / grayscale based on fragment alpha.
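
The per-slice AA choice, the sticky downgrade and the framebuffer-fetch interpolation can be sketched as below. All names are hypothetical. Two assumptions are made that the notes do not state: the grayscale coverage is taken as the average of the three sub-pixel coverages, and "fragment alpha" is read as the alpha fetched from the framebuffer.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum AaMode {
    Subpixel,
    Grayscale,
}

/// Per-slice choice: sub-pixel AA is only safe on an opaque slice.
pub fn aa_mode_for_slice(is_opaque: bool) -> AaMode {
    if is_opaque { AaMode::Subpixel } else { AaMode::Grayscale }
}

/// Sticky downgrade: once a text run has been drawn gray, it stays gray.
pub struct TextRunAa {
    mode: AaMode,
}

impl TextRunAa {
    pub fn new() -> Self {
        TextRunAa { mode: AaMode::Subpixel }
    }

    /// Updates the run for the slice it currently lands in and
    /// returns the mode to use this frame.
    pub fn update(&mut self, slice_is_opaque: bool) -> AaMode {
        if aa_mode_for_slice(slice_is_opaque) == AaMode::Grayscale {
            self.mode = AaMode::Grayscale;
        }
        self.mode
    }
}

/// Framebuffer-fetch variant: blend per-channel coverage between
/// grayscale (alpha = 0) and full sub-pixel (alpha = 1).
pub fn blended_coverage(subpx: [f32; 3], alpha: f32) -> [f32; 3] {
    // Assumption: grayscale coverage is the mean of the three channels.
    let gray = (subpx[0] + subpx[1] + subpx[2]) / 3.0;
    let t = alpha.clamp(0.0, 1.0);
    [
        gray + (subpx[0] - gray) * t,
        gray + (subpx[1] - gray) * t,
        gray + (subpx[2] - gray) * t,
    ]
}
```

The sticky variant trades some sharpness for stability, since a run never flips back to sub-pixel AA, while the interpolated variant avoids a hard switch altogether.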