The initial implementation works well for text that is all on one page (Chrome, or apps with WebViews like Gospel Library), broken down by paragraph into small AccessibilityNodeInfos that can be accessed from rootInActiveWindow. (https://github.com/KyleFin/live-scroll-transcript/issues/6 tracks communicating this limitation to users.)
There are many use cases I would like to support. Here are some thoughts:
Use cases
(See sample code for an example of how to print a11y tree)
- Scrolling apps where text is in one giant node or not exposed in a11y tree (Google Docs, Drive, Gmail)
  - If we can't access the a11y tree, we might use OCR and swipe gestures to scroll. One difficulty is knowing whether we should continue scrolling when images fill the whole screen, or how to recover if we miss a scroll and the audio goes beyond the current screen.
  - Create a generic "ctrl + f" macro functionality to find text in the current app?
- PDFs
  - Could be very useful. OCR and swipe gestures. In addition to images, another complication is knowing which way to swipe (how to handle multiple columns on the same page).
- Page-turning apps (Hoopla, Libby, Google Play Books)
  - Some may provide good a11y info, but only for the current page. We can probably turn the page quite reliably with just a tap or swipe gesture. Some may also support a NEXT_PAGE action.
  - How do we know which apps support page turning (can an AccessibilityService query support for NEXT_PAGE)?
  - How do we decide when to turn the page? (Once we've matched text on the page, turn immediately to stay ahead? Wait until the audio goes to the next page and try to keep up? What if there are images?)
- Kindle (page-turn or continuous)
  - This may be lower priority because some Kindle books have Whispersync with Audible to automatically sync text and audio. Live Scroll would extend support to mixing different versions of the media and would work for virtually any title.
  - Page-turn mode is the same as other page-turning apps and provides a11y info for the current page.
  - Continuous scroll provides no text a11y info (confirm if implementing), but we could scroll with OCR and gestures.
  - We can detect page-turn vs. continuous mode from the package name (Kindle) and the content description of the KRFView (it has text in page mode and is empty in continuous mode). Confirm if implementing.
  - Same concerns as above about images and how to recover if we fall behind.
- Photos (panning instead of scrolling)
  - Very low priority, but it could be neat.
  - Requires OCR and gestures.
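The generic "ctrl + f" idea above could start as plain text matching over whatever chunks the a11y tree (or OCR) exposes in reading order. A minimal sketch; the class and method names are hypothetical:

```java
import java.util.List;

// Hypothetical helper: find the first visible text chunk containing a query.
public class FindInPage {
  /** Returns the index of the first chunk containing query (case-insensitive), or -1. */
  public static int indexOfChunkContaining(List<String> chunks, String query) {
    String needle = query.toLowerCase();
    for (int i = 0; i < chunks.size(); i++) {
      if (chunks.get(i).toLowerCase().contains(needle)) {
        return i;
      }
    }
    return -1; // Not on screen; the caller could scroll or turn the page and retry.
  }
}
```

For a11y-backed apps, AccessibilityNodeInfo.findAccessibilityNodeInfosByText already does case-insensitive containment matching over the node tree, so the helper above would mainly matter for the OCR path.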
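For the multi-column PDF concern above, one rough approach is to bucket OCR blocks by which half of the page their left edge falls in, then read each column top to bottom. This is a sketch for the two-column case only; Block is a made-up stand-in for whatever geometry the OCR library returns:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ColumnOrder {
  // Stand-in for an OCR text block with its bounding-box position.
  public static class Block {
    public final String text;
    public final int left;
    public final int top;

    public Block(String text, int left, int top) {
      this.text = text;
      this.left = left;
      this.top = top;
    }
  }

  /** Orders blocks for a two-column page: left column top-to-bottom, then right column. */
  public static List<String> readingOrder(List<Block> blocks, int pageWidth) {
    List<Block> leftCol = new ArrayList<>();
    List<Block> rightCol = new ArrayList<>();
    for (Block b : blocks) {
      (b.left < pageWidth / 2 ? leftCol : rightCol).add(b);
    }
    Comparator<Block> topDown = Comparator.comparingInt(b -> b.top);
    leftCol.sort(topDown);
    rightCol.sort(topDown);
    List<String> ordered = new ArrayList<>();
    for (Block b : leftCol) ordered.add(b.text);
    for (Block b : rightCol) ordered.add(b.text);
    return ordered;
  }
}
```

Knowing the reading order also answers the swipe-direction question: only once the last block in reading order has been matched do we need to swipe to the next page.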
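For the question above of when to turn the page, one simple starting point is a threshold heuristic: turn once the last word matched to the audio is far enough through the page's words. The names and the full-page-image fallback here are assumptions, not settled design:

```java
public class PageTurnHeuristic {
  /**
   * Decides whether to turn the page, given the index of the last word matched
   * to the audio and the number of words on the current page.
   */
  public static boolean shouldTurnPage(int lastMatchedWordIndex, int wordsOnPage, double threshold) {
    if (wordsOnPage == 0) {
      // Assumption: a page with no recognizable text (e.g. a full-page image)
      // should be advanced so we don't stall behind the audio.
      return true;
    }
    return (lastMatchedWordIndex + 1) >= threshold * wordsOnPage;
  }
}
```

A threshold near 1.0 approximates "wait until the audio reaches the next page"; a lower one approximates "turn early to stay ahead".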
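The Kindle mode check above could be as small as the sketch below. Both the package name and the empty-content-description behavior are unconfirmed assumptions to verify before implementing:

```java
public class KindleMode {
  // Assumed package name for the Kindle Android app.
  public static final String KINDLE_PACKAGE = "com.amazon.kindle";

  /**
   * True if we appear to be in Kindle's page-turn mode, based on the (unconfirmed)
   * observation that KRFView's content description has text in page mode and is
   * empty in continuous-scroll mode.
   */
  public static boolean isPageTurnMode(String packageName, CharSequence krfViewContentDescription) {
    return KINDLE_PACKAGE.equals(packageName)
        && krfViewContentDescription != null
        && krfViewContentDescription.length() > 0;
  }
}
```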
Solutions
Sample code
// Logging current a11y tree (very similar to getNodesContainingWord)
private fun printAccessibilityTree(root: AccessibilityNodeInfo?, level: Int) {
  // Parameter must be nullable for the null check (and rootInActiveWindow can be null).
  if (root == null) return
  Log.d(tag, "Node at level %s with childCount %s: %s".format(level, root.childCount, root))
  for (i in 0 until root.childCount) {
    root.getChild(i)?.let { printAccessibilityTree(it, level + 1) }
  }
}
// From AccessibilityService:
printAccessibilityTree(this.rootInActiveWindow, 0)
private GestureDescription advanceTextGestureDescription() {
  if (currGestureRegion.equals(paginatedAppGestureRegion)) {
    return tapRightSideOfScreen(); // or swipeLeftGestureDescription();
  } else if (currGestureRegion.equals(scrollableAppGestureRegion)) {
    return swipeUpGestureDescription(currGestureRegion.bottom);
  }
  return null;
}
/** Swipe up (e.g. to scroll down). */
private GestureDescription swipeUpGestureDescription(int initialY) {
  Path path = new Path();
  path.moveTo(currGestureRegion.left, initialY);
  path.lineTo(currGestureRegion.left, currGestureRegion.top);
  StrokeDescription strokeDescription =
      new StrokeDescription(path, /*startTime=*/ 0L, /*duration (in ms)=*/ 500L);
  return new GestureDescription.Builder().addStroke(strokeDescription).build();
}
/** Swipe left (e.g. to turn to next page). */
private GestureDescription swipeLeftGestureDescription() {
  Path longSlowPath = new Path();
  longSlowPath.moveTo(900, 1000);
  longSlowPath.lineTo(200, 1000);
  Path flickPath = new Path();
  flickPath.moveTo(200, 1000);
  flickPath.lineTo(100, 1000);
  StrokeDescription strokeDescription =
      new StrokeDescription(
          longSlowPath, /*startTime=*/ 0L, /*duration (in ms)=*/ 400L, /*willContinue=*/ true);
  // continueStroke returns a NEW StrokeDescription; discarding the return value
  // (as the original code did) drops the flick entirely.
  StrokeDescription flickStroke =
      strokeDescription.continueStroke(
          flickPath, /*startTime=*/ 0L, /*duration (in ms)=*/ 100L, /*willContinue=*/ false);
  // TODO: dispatch flickStroke in a follow-up GestureDescription (e.g. save it to
  // a field and dispatch it from the first gesture's onCompleted callback).
  return new GestureDescription.Builder().addStroke(strokeDescription).build();
}
/** Tap right side of screen (e.g. to turn to next page). */
private GestureDescription tapRightSideOfScreen() {
  Path path = new Path();
  // Path coordinates are (x, y): near the right edge, at mid-height.
  // (The original code had the arguments swapped.)
  path.moveTo(5 * (screenWidth / 6), screenHeight / 2);
  StrokeDescription strokeDescription =
      new StrokeDescription(path, /*startTime=*/ 0L, /*duration (in ms)=*/ 10L);
  return new GestureDescription.Builder().addStroke(strokeDescription).build();
}