Daniel sent us this one — he's mapping an 8BitDo Micro gamepad to control a dictation app on Android. Record, Pause, Stop — three buttons, three actions, sounds simple. But Android doesn't exactly roll out the red carpet for custom hardware shortcuts. He's got four specific questions: how do you identify the UI elements you want to target, do you need ADB or is there an app for that, does the action need to be exposed as an intent, and if it isn't, is there any workaround or are you just out of luck? I love this prompt because it's one of those things where the hardware is fully capable and the software is technically capable, but the bridge between them is held together with hope and accessibility APIs.
That bridge is the whole episode. This is the impedance mismatch between physical buttons and Android's touch-first paradigm. Android was designed assuming your primary input is a finger on glass. When you press a button on a Bluetooth gamepad, the operating system generates a KeyEvent — KeyEvent dot ACTION DOWN, standard Android input pipeline. But here's the thing — most apps don't listen for arbitrary key codes. They listen for taps on specific View objects. A Record button isn't waiting for someone to press F13. It's waiting for a touch event inside a bounding box at coordinates five-forty by twelve-hundred.
The button press arrives, Android knows it happened, and the app just shrugs.
The event exists, it's dispatched, and it falls into a void because nothing is registered to receive it. The app's UI thread never hears about it. So the entire challenge is building a translation layer: physical button press to KeyEvent to something the app actually responds to. That something is almost always a simulated tap on a UI element. And to simulate a tap, you first need to identify what you're tapping.
Which is question one. How do you identify the UI elements?
This is where Android's accessibility infrastructure becomes your best friend, completely by accident. The Accessibility Service API was introduced way back in Android one point six, API level four, but the real game-changer came in Android four point zero, API level fourteen, when they added performAction — the ability for an accessibility service to actually simulate clicks on behalf of a user. That's the primitive that makes everything we're about to discuss possible.
The thing designed for screen readers is also the thing that lets you remote-control a dictation app with a tiny Nintendo-looking gamepad.
The assistive technology to automation pipeline is one of the great unintended side effects in mobile operating systems. Every UI element in Android has something called an AccessibilityNodeInfo. This is a data structure that exposes information about what's on screen — the content description, the resource ID, the class name, the bounding box coordinates, whether it's clickable, whether it's enabled. An accessibility service can traverse this entire tree and find specific nodes.
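For the show notes, here is a minimal sketch of what "traverse the tree and find specific nodes" means. It is plain Python over nested dictionaries, not the Android API; the field names and the sample screen are illustrative assumptions that mirror the attributes an AccessibilityNodeInfo exposes.

```python
from typing import Any, Callable, Dict, Iterator, List

UiNode = Dict[str, Any]  # illustrative stand-in for one accessibility node


def walk(node: UiNode) -> Iterator[UiNode]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)


def find_nodes(root: UiNode, predicate: Callable[[UiNode], bool]) -> List[UiNode]:
    """Collect every node in the tree for which predicate(node) is true."""
    return [n for n in walk(root) if predicate(n)]


# Hypothetical screen: a timer label plus one labelled and one unlabelled button.
screen: UiNode = {
    "class_name": "android.widget.FrameLayout",
    "clickable": False,
    "children": [
        {"class_name": "android.widget.TextView", "clickable": False},
        {
            "class_name": "android.widget.LinearLayout",
            "clickable": False,
            "children": [
                {
                    "class_name": "android.widget.ImageButton",
                    "clickable": True,
                    "resource_id": "com.dictation.app:id/btn_record",
                    "content_description": "Start recording",
                },
                {
                    "class_name": "android.widget.ImageButton",
                    "clickable": True,
                    "resource_id": None,
                    "content_description": None,
                },
            ],
        },
    ],
}

clickable = find_nodes(screen, lambda n: bool(n.get("clickable")))
```

Both buttons come back as clickable, but only the first carries a label or an ID to target it by.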
This tree is always there, regardless of whether the app developer thought about accessibility?
It's always there. The question is how well-populated it is. Every View object in Android gets an AccessibilityNodeInfo by default. The system auto-generates basic information — the class name, the bounds, whether it's clickable. But the content description field — that's the human-readable label like "Record" or "Pause" — that's only populated if the developer explicitly sets it. And this is where things get interesting.
Or frustrating, depending on how lazy the developer was.
A well-coded app will have content descriptions on every interactive element. You tap the Record button with Accessibility Scanner running, and it tells you this is a button with content description "Start recording," resource ID "com dot dictation dot app colon id slash btn underscore record." That's the gold standard. You can target that reliably.
A badly-coded app?
You get an unlabeled button. The node exists, it's clickable, but there's no content description. You might still get the resource ID if the developer assigned one, but many don't. In the worst case, you get a generic node with no label and a resource ID like "button seven" that changes every build. At that point you're down to coordinates.
Which is fragile.
Fragile but functional. We'll get there. So let's walk through the three identification methods, because this answers question two about whether you need ADB. Method one: Accessibility Scanner. This is a free app from Google on the Play Store, no ADB required, no desktop needed. You install it, enable it as an accessibility service, and it overlays information on every UI element you tap. You open your dictation app, tap the Record button, and it shows you the content description, resource ID, and whether it's clickable. That's your first stop.
For question two — do you need ADB — the answer is no, at least for basic identification.
For basic identification, no. But there's a catch. Accessibility Scanner shows you what the node tree looks like, but it doesn't give you the full XML dump. Method two is UI Automator Viewer, which does require ADB and a desktop connection. You run a command — adb shell uiautomator dump — and it generates an XML file of the entire screen hierarchy. You pull that file to your desktop and open it in the viewer. You get every node, every attribute, every coordinate. This is where you find resource IDs that Accessibility Scanner might miss, and critically, you get the exact bounding box coordinates.
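Once the dump is on your desktop, a few lines of Python can pull out the clickable nodes. This is a rough sketch that assumes the usual dump layout (a hierarchy element containing nested node elements with resource-id, content-desc, clickable and bounds attributes); the sample XML is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Bounds are written as "[left,top][right,bottom]" in the dump.
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")

# Hypothetical, heavily trimmed dump for illustration only.
SAMPLE_DUMP = """<hierarchy rotation="0">
  <node index="0" class="android.widget.FrameLayout" clickable="false" bounds="[0,0][1080,2400]">
    <node index="0" class="android.widget.ImageButton" resource-id="com.dictation.app:id/btn_record" content-desc="Start recording" clickable="true" bounds="[420,1100][660,1300]" />
    <node index="1" class="android.widget.ImageButton" resource-id="" content-desc="" clickable="true" bounds="[700,1100][940,1300]" />
  </node>
</hierarchy>"""


def clickable_targets(xml_text):
    """Return resource ID, label and tap center for every clickable node."""
    root = ET.fromstring(xml_text)
    targets = []
    for node in root.iter("node"):
        if node.get("clickable") != "true":
            continue
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue  # skip nodes whose bounds we cannot parse
        left, top, right, bottom = map(int, match.groups())
        targets.append({
            "resource_id": node.get("resource-id") or None,
            "content_desc": node.get("content-desc") or None,
            "center": ((left + right) // 2, (top + bottom) // 2),
        })
    return targets


targets = clickable_targets(SAMPLE_DUMP)
```

The center of each bounding box is the coordinate you would tap if the node turns out to have no usable ID or label.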
Method three: Developer options, Pointer Location. You enable it in settings, and it overlays a crosshair with pixel coordinates at your fingertip. That's purely visual, with no IDs and no node information, just raw coordinates. It's useful as a quick check but not for building reliable automation. So the tool hierarchy is: Accessibility Scanner for quick looks, UI Automator Viewer for deep inspection when you need resource IDs and exact bounds, Pointer Location for rough coordinate estimation.
This all works on any app, even ones that were never designed to be automated.
That's the beauty of the accessibility node tree — it's not opt-in. The system builds it regardless. The app developer can make it richer by adding content descriptions, but they can't prevent the tree from existing. Even a completely unlabeled app still has a tree of clickable nodes with coordinates.
We've identified the button. Now how do we actually press it programmatically? That's where question three comes in — does it need to be exposed as an intent?
This is the fork in the road. An intent in Android is a messaging object you can use to request an action from another app component. A well-designed app might expose an intent like "com dot dictation dot action dot RECORD" that you can broadcast, and the app's broadcast receiver picks it up and starts recording. That's the cleanest possible integration. You send an intent, the app responds. No UI simulation needed.
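If an app did expose such an intent, you could test it from a desktop with the activity manager's broadcast command. A small sketch that only builds the command line; the action name is the hypothetical one from the conversation, and the helper name is invented for illustration.

```python
import shlex
from typing import Dict, List, Optional


def broadcast_command(action: str, extras: Optional[Dict[str, str]] = None) -> List[str]:
    """Build an `adb shell am broadcast` argument list for a given action."""
    cmd = ["adb", "shell", "am", "broadcast", "-a", action]
    for key, value in (extras or {}).items():
        cmd += ["--es", key, value]  # --es attaches a string extra
    return cmd


# Hypothetical action name; a real app would document its own, if it has one.
record_cmd = broadcast_command("com.dictation.action.RECORD")
record_line = shlex.join(record_cmd)  # printable form of the same command
```

Passing the list to subprocess.run would send the broadcast; whether anything happens depends entirely on the app having a receiver for that action.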
How common is that?
For dictation apps? Not very. Most apps don't expose custom intents for every UI action. They expose intents for launching the app, maybe for sharing content into the app, but recording controls tend to stay internal. There's no incentive for developers to expose these, and it creates a support burden if the intent API changes. So in practice, you should check the app's manifest or documentation for custom intents, but expect to find nothing.
Which brings us to question four. If it isn't exposed as an intent, is there any workaround, or are you out of options?
You are never out of options. This is the misconception I want to demolish right now. The idea that no intent means no automation is completely wrong. There are three workaround strategies, and they form a reliability hierarchy — you trade elegance for universality as you go down the list.
Give me the hierarchy.
Strategy A: AccessibilityService performAction. This is the cleanest workaround. An app like Tasker or MacroDroid, once you've granted it accessibility service permissions, can traverse the node tree, find a node by resource ID or content description, and call performAction with ACTION CLICK. This simulates a tap on that specific element. It's reliable because it targets the logical element, not a screen position. If the button moves because of a layout change, it still works as long as the resource ID or content description stays the same.
This is why we bothered identifying the resource ID in the first place.
The resource ID is the most stable target. Content descriptions are also good but can change with localization. Resource IDs tend to be stable across versions because they're tied to the layout XML. So in Tasker, you'd create a task with an Accessibility Action, point it at "com dot dictation dot app colon id slash btn underscore record," action type Click, and that's it. Tasker finds the node and clicks it.
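The stability ordering described here can be written down as a tiny selection rule. This is a sketch in plain Python, not Tasker configuration; the UiNode class and pick_selector function are illustrative names, and the rule cannot detect the "ID changes every build" case, which you would still have to spot by hand.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class UiNode:
    """Illustrative subset of the fields an accessibility node exposes."""
    class_name: str
    bounds: Tuple[int, int, int, int]  # left, top, right, bottom in pixels
    resource_id: Optional[str] = None
    content_description: Optional[str] = None


def pick_selector(node: UiNode) -> Tuple[str, str]:
    """Prefer resource ID, then content description, then the bounds center."""
    if node.resource_id:
        return ("resource_id", node.resource_id)
    if node.content_description:
        return ("content_description", node.content_description)
    left, top, right, bottom = node.bounds
    return ("coordinates", f"{(left + right) // 2},{(top + bottom) // 2}")


labelled = UiNode(
    "android.widget.ImageButton",
    (420, 1100, 660, 1300),
    resource_id="com.dictation.app:id/btn_record",
    content_description="Start recording",
)
unlabelled = UiNode("android.view.View", (420, 1100, 660, 1300))
```

The labelled node is targeted by its resource ID, while the unlabelled one falls through to the center of its bounding box.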
What if the app has no resource IDs? You mentioned some developers just don't assign them.
Then we move to Strategy B: coordinate-based tapping via ADB shell. The command is "input tap x y," for example "input tap five-forty twelve-hundred." Tasker can run shell commands, so you create a task that executes "input tap five-forty twelve-hundred" and it taps those exact screen coordinates. One caveat: injecting taps into another app needs shell-level privileges, so on an unrooted phone the command generally has to go through an ADB connection rather than a plain app-level shell. With that in place, this works on everything. There is no app that can prevent the input subsystem from injecting touch events. But it's fragile: if the app updates and the button moves by ten pixels, your automation breaks.
This is the nuclear option.
The nuclear option. Works everywhere, breaks on UI changes. If you go this route, I recommend saving your coordinates in a document and retesting after app updates. Some people build a calibration routine — they map coordinates as percentages of screen dimensions rather than absolute pixels, which helps with different screen sizes but not with layout changes within the app.
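The percentage trick is simple arithmetic. A minimal sketch, assuming the five-forty by twelve-hundred coordinates were measured on a 1080 by 2400 screen (that screen size is an assumption, not something from the conversation):

```python
from typing import Tuple


def to_fraction(x: int, y: int, width: int, height: int) -> Tuple[float, float]:
    """Convert absolute pixels to fractions of the screen size."""
    return (x / width, y / height)


def to_pixels(fx: float, fy: float, width: int, height: int) -> Tuple[int, int]:
    """Convert screen fractions back to pixels on a given screen."""
    return (round(fx * width), round(fy * height))


def tap_command(x: int, y: int) -> str:
    """Build the shell command that taps the given pixel."""
    return f"input tap {x} {y}"


# Calibrate once on the assumed 1080x2400 screen...
fx, fy = to_fraction(540, 1200, 1080, 2400)
# ...then reuse the fractions on a hypothetical 1440x3200 device.
command = tap_command(*to_pixels(fx, fy, 1440, 3200))
```

This only compensates for screen size. If the app moves the button within its own layout, the fractions are just as stale as the pixels were.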
Strategy C is keycode injection. Instead of simulating a tap, you inject a key event that the app might already be listening for. The command is "input keyevent KEYCODE underscore MEDIA underscore PLAY underscore PAUSE" or similar. If the dictation app happens to respond to media keys — some do, because they integrate with Bluetooth headset controls — then this works without any UI targeting at all. But most dictation apps don't listen for media keys on their recording controls. They might listen for play-pause to control audio playback of recorded files, but not to start and stop recording. It's worth testing, but don't count on it.
The practical advice is: try Strategy A first with accessibility actions, fall back to Strategy B with coordinate taps if the app is accessibility-hostile, and treat Strategy C as a bonus if the app happens to support media keys.
And now let's talk about the 8BitDo Micro specifically, because this device has a feature that most people don't know about, and it's the key to making this whole workflow sing.
The keyboard mode.
The keyboard mode. The 8BitDo Micro can pair in multiple modes — D-Input, X-Input, and keyboard mode. The manual buries this, but you switch modes by holding Start plus a face button during pairing. In keyboard mode, the device sends standard HID keycodes — F13 through F24, media keys, even regular alphanumeric keys. This is crucial because Android treats these as keyboard input events, which are much easier to intercept than game controller events.
If you pair it in gamepad mode?
In gamepad mode, it sends Android game controller events — BUTTON A, BUTTON B, and so on. These are dispatched through the InputDevice API for game controllers, and most automation apps can't intercept them. Tasker's key event trigger doesn't see BUTTON A. It sees keycodes. So keyboard mode is the difference between your 8BitDo being a fully programmable macro pad and being a paperweight for productivity.
The full chain is: press a button on the 8BitDo, it sends F13 as an HID keycode, Android generates a KeyEvent for F13, Tasker has a profile listening for F13, and that profile triggers the accessibility action that clicks the Record button.
That's the chain. And here's a detail that matters: Tasker can intercept any keycode from F1 through F24, plus media keys. Android twelve and later also has a built-in Key Mapper feature under Settings, System, Keyboard, Physical Keyboard Shortcuts, but it only works with devices Android recognizes as external keyboards. The 8BitDo in keyboard mode technically is one, but mileage varies. I've seen reports that it works and reports that it doesn't. Tasker is the reliable path.
MacroDroid can do the same thing?
MacroDroid can do the same thing with its key event trigger. The choice between Tasker and MacroDroid mostly comes down to preference. Tasker is more powerful but has a steeper learning curve. MacroDroid is more approachable. Both can intercept keycodes and execute accessibility actions. Both can run shell commands for coordinate taps. The underlying primitives are the same.
Let's make this concrete. I've got my 8BitDo Micro, I've paired it in keyboard mode, I've mapped the three buttons to F13, F14, and F15 using the 8BitDo configuration software. I've identified that my dictation app's Record button has resource ID "com dot dictation dot app colon id slash btn underscore record." What does my Tasker setup look like?
You create three profiles. Profile one: Event, Key, F13. Linked task: Accessibility Action, target by resource ID "com dot dictation dot app colon id slash btn underscore record," action Click. Profile two: F14 mapped to the Pause button's resource ID. Profile three: F15 mapped to Stop. That's the ideal setup. You press the top button on the 8BitDo, Tasker intercepts F13, finds the Record button in the accessibility tree, clicks it. The dictation app starts recording. It has no idea a physical button was involved.
If the app has no resource IDs?
Then your Tasker task changes from Accessibility Action to Run Shell — "input tap five-forty twelve-hundred" — where those coordinates came from UI Automator Viewer. Same profiles, different action. The F13 trigger stays the same. This is the beauty of the architecture: the input interception is decoupled from the action execution. You can swap out the action without touching the trigger.
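The decoupling of trigger and action can be sketched as a lookup table from key name to action. This is plain Python standing in for Tasker profiles; the actions only log what they would do, and every function name here is illustrative.

```python
from typing import Callable, Dict, List

log: List[str] = []  # records what each action would have done


def click_by_resource_id(resource_id: str) -> Callable[[], None]:
    """Action factory: stands in for an accessibility click on a node."""
    def action() -> None:
        log.append(f"click {resource_id}")
    return action


def tap_at(x: int, y: int) -> Callable[[], None]:
    """Action factory: stands in for a coordinate tap via the shell."""
    def action() -> None:
        log.append(f"input tap {x} {y}")
    return action


# Trigger table: the key name is the only thing the trigger side knows about.
bindings: Dict[str, Callable[[], None]] = {
    "F13": click_by_resource_id("com.dictation.app:id/btn_record"),
}


def on_key(key: str) -> bool:
    """Run the bound action, if any. Returns True when the key was handled."""
    action = bindings.get(key)
    if action is None:
        return False
    action()
    return True


on_key("F13")                        # logs an accessibility click
bindings["F13"] = tap_at(540, 1200)  # swap the action, keep the trigger
on_key("F13")                        # now logs a coordinate tap
```

Swapping Strategy A for Strategy B is a one-line change to the table, and nothing on the trigger side has to know.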
There's something almost subversive about this. The app developer didn't build keyboard shortcut support, didn't expose intents, didn't think about accessibility beyond the bare minimum, and yet here you are, controlling it with a gamepad anyway.
It's the universal solvent of accessibility APIs. They were designed so that users with disabilities could interact with apps through alternative input methods. But the side effect is that any app becomes automatable, whether it wants to be or not. The accessibility tree is mandatory infrastructure. You can't opt out. And that's a good thing — it means users have agency over their own devices even when app developers don't provide explicit integration points.
Let's talk about reliability, because this is where the real-world experience diverges from the theory. You set this up, it works beautifully for two weeks, and then the dictation app updates and everything breaks. What actually happens?
The failure mode depends on which strategy you're using. If you're using Strategy A with resource IDs, you're relatively safe. Resource IDs rarely change between minor updates. The developer might redesign the UI, but the resource ID for the Record button probably stays the same because it's tied to the function, not the visual design. I've had accessibility-based automations survive multiple app updates without modification.
If you're using coordinates?
The developer shifts the button by twenty pixels, or adds a new toolbar that pushes everything down, and your tap lands on empty space or, worse, on a different button. This is why I recommend building what I call a UI element map — a document where you save the resource IDs, content descriptions, and coordinates for every button you automate. When an app updates, you can quickly check whether your targets still work and update the coordinates if needed.
A maintenance document for your automation setup.
It sounds tedious, but it takes five minutes per app, and it saves you from having to rediscover everything from scratch. I keep a note with entries like "Dictation App, version four point two, Record button: resource ID equals com dot dictation dot app colon id slash btn underscore record, coordinates five-forty comma twelve-hundred, content description equals Start recording." If the app updates to version four point three, I open Accessibility Scanner, check if the resource ID still exists, and if the coordinates changed. Usually the resource ID survives and the coordinates shift slightly.
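If you would rather keep the UI element map as structured data than as a free-form note, something like this works. A sketch only: the layout of the record and the changed_fields helper are assumptions, and the values are the example ones from the conversation.

```python
import json
from typing import Any, Dict, List

# One entry of the UI element map, using the example values from the episode.
ELEMENT_MAP: Dict[str, Any] = {
    "app": "Dictation App",
    "version": "4.2",
    "buttons": {
        "record": {
            "resource_id": "com.dictation.app:id/btn_record",
            "content_description": "Start recording",
            "coordinates": [540, 1200],
        }
    },
}


def changed_fields(saved: Dict[str, Any], observed: Dict[str, Any]) -> List[str]:
    """List the fields whose saved value no longer matches what is on screen."""
    return sorted(key for key in saved if saved[key] != observed.get(key))


# Hypothetical re-inspection after an update: same ID, button moved down a bit.
after_update = dict(ELEMENT_MAP["buttons"]["record"], coordinates=[540, 1240])
drift = changed_fields(ELEMENT_MAP["buttons"]["record"], after_update)

serialized = json.dumps(ELEMENT_MAP, indent=2)  # what you would save to disk
```

An empty list means nothing moved. In this made-up case only the coordinates drifted, which matches the usual pattern of the resource ID surviving an update.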
What about apps that are actively hostile to accessibility inspection? Banking apps, for instance, sometimes block the accessibility tree for security reasons.
That's the one genuine limitation. Some apps, particularly banking and payment apps, set the flag "accessibilityDataSensitive" on their views, or they override the accessibility delegate to return empty node trees. The stated reason is security — they don't want malware using accessibility services to read sensitive information off the screen. The practical effect is that Strategy A doesn't work, and you're forced to Strategy B — coordinate taps — which still works because the input subsystem doesn't care about accessibility flags. You can't read the screen, but you can still tap it.
Which is a weird security model. You can't inspect the vault, but you can still press buttons inside it.
It's security theater, honestly. A malicious app with accessibility permissions can still inject taps. Blocking the node tree doesn't prevent tap injection. It just makes life harder for legitimate automation users while doing almost nothing to stop determined attackers. But that's a rant for another episode.
Let's circle back to the 8BitDo Micro specifically. You mentioned it's popular for productivity even though it's marketed for gaming. What makes it better than a dedicated macro pad?
First, price — it's about thirty dollars, which undercuts most dedicated macro pads by a factor of two or three. Second, portability — it's tiny, it fits on a keychain, it has a rechargeable battery that lasts for weeks. Third, the keyboard mode — most dedicated macro pads also use HID keycodes, but they're less flexible about remapping. The 8BitDo configuration software lets you assign any keycode to any button, including the F13 through F24 range that almost nothing else uses, which means no conflicts with your regular keyboard.
F13 through F24 is the sweet spot because no physical keyboard has those keys anymore.
They're vestigial keycodes from the IBM terminal keyboard era, still present in the HID specification, still recognized by every operating system, but never used by modern keyboards. They're the perfect channel for macro pads because there's zero chance of accidentally triggering your dictation shortcut while typing an email.
The glockenspiel of keyboard inputs — technically present, rarely heard.
And I want to address a misconception that I see constantly in forums: the idea that you need root access to remap hardware buttons. You absolutely do not. Tasker and MacroDroid can intercept key events without root. The interception typically goes through the accessibility service: the automation app asks to filter key events, and Android delivers them to it before they reach the foreground app. No root required. That hook has been available since Android four point three, API level eighteen.
What about the built-in Android key remapping? You mentioned Android twelve added something.
Android twelve added physical keyboard shortcut remapping under Settings, System, Keyboard. If your 8BitDo is in keyboard mode, it might show up there as a physical keyboard, and you can remap keys to launch apps or trigger shortcuts. But the functionality is limited — you can map a key to open an app or send a basic intent, but you can't trigger an accessibility action or run a shell command. It's fine for simple use cases but not for the kind of targeted UI automation we're discussing.
Tasker or MacroDroid is the real answer.
Tasker or MacroDroid is the real answer. And between the two, I slightly prefer Tasker for this specific use case because its accessibility action support is more mature and it handles shell commands more gracefully. But MacroDroid is perfectly capable, and if you're already using it for other automations, there's no reason to switch.
Let's talk about debugging. You set all this up, you press the button, and nothing happens. Where do you start?
First, verify the key event is being received. In Tasker, you can add a Flash action — just a brief toast message — to your profile. Press the button, see if "F13 received" pops up on screen. If it doesn't, the problem is in the pairing mode or the keycode mapping. Check that your 8BitDo is actually in keyboard mode. Check that the button is actually mapped to F13 and not something else.
If the key event is being received but the action isn't firing?
Then the problem is in the action configuration. If you're using accessibility actions, open Accessibility Scanner and verify that the resource ID or content description you're targeting still exists. App updates can change these. If you're using coordinates, verify that the button is still at those coordinates — Pointer Location in developer options is your friend here.
There's also the permission layer. Accessibility services need to be explicitly enabled in settings.
That's the most common failure point for new users. Android requires you to manually enable each accessibility service in Settings, Accessibility. Tasker and MacroDroid both have accessibility services that need to be toggled on. Without that, the accessibility actions silently fail. There's no error message, no crash — the task just doesn't do anything. It's maddening.
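The debugging order from this stretch of the conversation fits in one small function. A sketch with illustrative names; the three flags are things you check by hand (the Flash toast, the accessibility settings screen, Accessibility Scanner), not values read from the device.

```python
def first_failure(key_event_seen: bool, service_enabled: bool, target_found: bool) -> str:
    """Return the first thing to fix, checking the chain from input to action."""
    if not key_event_seen:
        return "check pairing mode and button mapping"
    if not service_enabled:
        return "enable the automation app's accessibility service"
    if not target_found:
        return "re-inspect the resource ID, label, or coordinates"
    return "chain looks intact"


# Example: the toast appears, but the accessibility service was never toggled on.
advice = first_failure(key_event_seen=True, service_enabled=False, target_found=True)
```

Checking in this order matters because a disabled service fails silently, which otherwise looks exactly like a stale target.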
Which is a design choice by Android to prevent malware from silently enabling accessibility services, but it also means legitimate users get tripped up.
The security versus usability tradeoff in accessibility services is a genuine tension. Android has been tightening the screws on accessibility service permissions for years because they're powerful and have been abused by malware. But the collateral damage is that legitimate automation users have to jump through more hoops.
Before we distill this into a practical checklist, I want to zoom out for a moment. What does the existence of this entire workflow say about the state of mobile operating systems? We're in twenty twenty-six, and to map a physical button to a Record function, we're using accessibility APIs designed for screen readers, debugging with XML dumps, and injecting synthetic tap events.
It says that mobile operating systems were designed around a single interaction model — direct touch manipulation — and everything else is an afterthought. Desktop operating systems have had keyboard shortcut APIs for decades. On a Mac, you can map any key combination to any menu item through System Preferences. On Windows, AutoHotkey has been doing this since two thousand three. On mobile, the assumption is that you're holding the device and touching the screen, and if you're not doing that, you're a weird edge case.
Yet here we are, building workflows that prove the edge case is more productive than the default.
The edge case is more productive than the default. A dedicated Record button that you can press without looking at the screen is faster and more reliable than hunting for a touch target while you're trying to capture a thought. The hardware is cheap and capable. The software is just barely accommodating. The gap between what's possible and what's supported is where this whole podcast lives.
Alright, let's turn this into something actionable. If someone listening wants to do this tonight with their own 8BitDo and dictation app, what's their checklist?
Step one: pair your 8BitDo Micro in keyboard mode. Hold Start plus the appropriate face button during pairing — check the manual for the exact combination, it varies by firmware version. Step two: use the 8BitDo configuration software to map three buttons to F13, F14, and F15. Step three: install Accessibility Scanner from the Play Store and enable it in accessibility settings. Step four: open your dictation app, tap the Record button with Accessibility Scanner active, and note the resource ID and content description. Repeat for Pause and Stop.
Step five: if Accessibility Scanner gives you clean resource IDs, set up Tasker profiles for F13, F14, and F15, each triggering an Accessibility Action targeting the corresponding resource ID. If Accessibility Scanner gives you nothing useful, meaning no resource IDs and no content descriptions, then move to step five-b: connect your phone to a computer, run "adb shell uiautomator dump," pull the XML file, and extract the coordinates for each button. Then set up Tasker profiles with Run Shell actions using "input tap x y."
Step six is testing and debugging.
Step six: test each button. If nothing happens, check that Tasker's accessibility service is enabled. Add a Flash action to verify the key event is being received. Verify the resource ID or coordinates are still correct. Step seven: create your UI element map document with all the IDs and coordinates so you can recover quickly after app updates.
That's a solid checklist. And I want to emphasize something you said earlier — the solution is always possible. You might have to go all the way down to coordinate-based tapping, but that still works. There is no Android app that is completely immune to input injection.
Completely immune, no. There are apps that make it harder, but the input subsystem is below the application layer. The shell command "input tap" talks directly to the system's input dispatcher. The app can't block it. The app doesn't even know it's a synthetic tap — it looks identical to a real finger touch.
Which is both powerful and slightly unsettling.
It's the same mechanism that automated testing frameworks use. UI Automator, Espresso, Appium — they all use the same primitives. The accessibility tree and the input injection subsystem are the foundation of Android test automation. We're just repurposing them for productivity instead of testing.
One last question before we wrap up. You mentioned Android sixteen might have something called Adaptive Input API. What's the rumor, and does it change any of this?
The rumor, and I want to be clear this is speculative, is that Android sixteen will introduce an API that lets apps declare supported input actions — essentially, a formalized version of the intent system we were wishing for. An app could declare "I support a Record action, here's how you trigger it," and any input device could map to it without going through the accessibility tree. If that ships and developers adopt it, this entire workflow becomes obsolete in the best possible way.
You're skeptical about adoption.
I'm skeptical because it requires developers to do extra work for a feature that most users don't know they want. The accessibility tree approach works regardless of developer cooperation. A formal input API only works if developers implement it. I think we'll be using accessibility-based automation for a long time.
The workarounds are here to stay.
The workarounds are here to stay. But they work, and they're getting easier as tools like Tasker and MacroDroid improve their accessibility action support. The barrier to entry is lower than it's ever been.
Now: Hilbert's daily fun fact.
Hilbert: In the year twelve hundred and twelve, the Arab geographer Yaqut al-Hamawi recorded the existence of a large inhabited island in the Aral Sea called Barsa-Kelmes, meaning "the land of no return," which he claimed was populated by a tribe of people who had never been conquered. The island appears on multiple medieval Islamic maps of the Khwarazm region. There is no geological evidence it ever existed.
...right.
A non-existent island called "the land of no return." That's on brand.
So here's what I'm left thinking about: as these hardware shortcut devices proliferate — the 8BitDo, the Stream Deck mobile app, custom macro pads — the pressure on platform developers to provide first-class input mapping APIs is only going to grow. The demand is real, the hardware is cheap, and the workarounds are increasingly well-understood. The question is whether Google and Apple will meet users where they are, or keep treating physical buttons as a niche gaming concern.
Until they do, we've got accessibility services and shell commands. Try this workflow yourself — grab a free app like MacroDroid, install Accessibility Scanner, and see if you can map a button to something useful. If you build something weird, send it our way. This has been My Weird Prompts. Thanks to our producer Hilbert Flumingtop. Find us at myweirdprompts dot com.
We'll be here.