Non-printables.. #3
Comments
Re: "different key codes for different layouts", see: http://stackoverflow.com/a/16125341/1123094. Note that there is an enum of codes that do change and some that don't. Not sure if this helps; it looks old. |
Correction to the above... I get it, but how do we get a shifted_codedict with a list of all the shifted forms too? Is that possible? |
First, I'm not sure what you mean by "used in tostring". That function has to return a single name for whatever keycode you gave it; if you define multiple aliases for each keycode (like
As for keycodes for different layouts: classic Mac guaranteed that the special keys (return, F1, left arrow, etc.) were the same on all keyboards, but the normal keys (including things like KP -) weren't. As far as I know, OS X doesn't offer any such guarantees anywhere in the documentation, but the Carbon.h headers have the exact same list, with the same comment, so presumably you can still count on that as long as the Carbon framework exists (so at least up to 10.10). This means that you can build a table (in both directions) for the special keys and use that, and only fall back to the TIS functions for normal keys. (Notice that you still can't distinguish between, e.g.,
Finally, for "shifted forms", that 65535 is actually an error. (Notice that the newer
If you can explain what you're trying to do, I might be able to help more. But I can try to give you some background on the basic concepts.
When you hit a physical key, the keyboard driver emits a keycode. That keycode has no meaning other than what physical key you happened to press. If you're on a US MacBook Pro, and you hit the key labeled
Meanwhile, there's always a current keyboard layout (I think these are called keyboard input sources nowadays… I learned this stuff on System 6, so you'll have to forgive me…). A keyboard layout is a mapping from modifier states (and dead key states) to keyboard tables. A keyboard table is a mapping from keycodes to characters or a few other things. So, the US keyboard layout's default table maps keycode
If you're interested in the "few other things", my knowledge is probably way too out of date to be accurate, but I can at least give you a vague idea.
First, when I said "characters", that was cheating a bit; it's really "strings of
Anyway, if you look at the function I'm calling,
Also, notice that there can often be multiple ways to type the same character. Most layouts have (or at least they did, in the old days) separate tables for shift and capslock, but most of the entries were the same, so shift+a and capslock+a were both
Now that I've explained way too much and probably confused you even more, maybe if you explain what you're actually trying to do and why you think pykeycode is the answer, I can either help you move forward, or explain why you're doing the wrong thing, or admit that I'm as lost as you. :) |
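For what it's worth, the "table in both directions for the special keys" idea could be sketched roughly like this. The key codes are the `kVK_*` values from Carbon's `Events.h`; the names, the `ALIASES` dict, and both helper functions are made up for illustration and are not part of pykeycode. Aliases only apply in the name-to-keycode direction, so the reverse lookup still returns a single name per keycode.

```python
# A few layout-independent special keys (kVK_* values from Carbon's Events.h).
# The names on the left are illustrative, not pykeycode's own.
SPECIAL_KEYS = {
    "return": 0x24,
    "tab": 0x30,
    "space": 0x31,
    "delete": 0x33,
    "escape": 0x35,
    "f1": 0x7A,
    "left": 0x7B,
    "right": 0x7C,
    "down": 0x7D,
    "up": 0x7E,
}

# Hypothetical aliases: extra spellings that resolve to a canonical name.
ALIASES = {"esc": "escape", "backspace": "delete"}

# Reverse direction: exactly one canonical name per keycode.
KEYCODE_TO_NAME = {code: name for name, code in SPECIAL_KEYS.items()}


def special_keycode(name):
    """Return the keycode for a special-key name or alias, or None."""
    canonical = ALIASES.get(name.lower(), name.lower())
    return SPECIAL_KEYS.get(canonical)


def special_name(keycode):
    """Return the canonical name for a special keycode, or None."""
    return KEYCODE_TO_NAME.get(keycode)
```

Anything that returns `None` here would fall through to the layout-dependent lookup.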
Sorry for being a little vague! I'm way out of my depth..! OK - here's the end plan:
In essence what would make my life super great was if pykeycode (or something else):
Maybe this is just too tricky! Thanks for the explanation of how this all works. It's actually fairly straightforward, but the solution doesn't look like it is! I hope this helps clarify the situation. Thanks for your time! |
The good news is, changing our
The bad news is that the naive way of building the dict to go in the opposite direction is completely infeasible, because there are 2^32 possible modifier states for each of the 128 key codes. So, the right way to solve this is to use the
But there's a nice shortcut that I think will work for you. I'm pretty sure the only modifier combinations that have ever produced printable characters in any keyboard layout provided by Apple are none, shift, option, shift+option, and control. And I'm pretty sure the modifiers 0x2, 0x8, and 0x10 have been shift, option, and control on every Apple keyboard forever. So, all you have to do is handle modifier states 0x0, 0x2, 0x8, 0xa, and 0x10.
So I checked in code that does this. (I left out control, because I didn't think you wanted it, but it's trivial to add it in.) This is of course relying on undocumented behavior, but if I'm right and it's behavior that's remained unchanged from System 6 through OS X 10.10, I think it's safe. The new
(I used a list comprehension to try to map the whole string in advance, so
If you want this to work for dead keys as well (after all,
Getting back to your original point: As you can see, all we're doing is building up a dict of 512 Unicode characters to keycode-and-modifier-state pairs. If you want to add aliases for non-printables, and add names for non-character special keys (which, as you discovered, are fixed across all keyboards), that's trivial:
And now:
|
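A minimal sketch of the shortcut described above, for anyone following along: probe only the four modifier states (none, shift, option, shift+option) for each of the 128 key codes, which gives at most 512 entries. This is not the code that was checked in; `char_for` is a hypothetical stand-in for the real layout lookup, and the toy table only exists so the example runs on its own.

```python
# Modifier states to probe: none, shift, option, shift+option.
# (Control, 0x10, is left out here, as in the comment above.)
MOD_STATES = (0x0, 0x2, 0x8, 0xA)


def build_reverse_map(char_for, keycodes=range(128), mod_states=MOD_STATES):
    """Map each character to the first (keycode, mod) pair that produces it.

    `char_for(keycode, mod)` is a hypothetical lookup that returns the string
    typed by that combination, or None/'' when nothing is produced.
    """
    reverse = {}
    for mod in mod_states:  # earlier modifier states win ties
        for keycode in keycodes:
            ch = char_for(keycode, mod)
            if ch and ch not in reverse:
                reverse[ch] = (keycode, mod)
    return reverse


# Toy stand-in for two keys of a US layout (keycode 0 is 'a', 22 is '6').
_FAKE_TABLE = {(0, 0x0): "a", (0, 0x2): "A", (22, 0x0): "6", (22, 0x2): "^"}


def fake_char_for(keycode, mod):
    return _FAKE_TABLE.get((keycode, mod))


reverse = build_reverse_map(fake_char_for)
```

Iterating modifier states in the outer loop means that when a character can be typed more than one way, the combination with the "simplest" modifier state is the one that gets recorded.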
Just for fun, I decided to see how hard it would be to parse the structures directly. And it's not that bad. See layout.py. It obviously needs a lot of cleanup if you want to use it for real, but it handles modifiers, and even simple dead-key states (like those on the US keyboard), and extending it to handle range-based dead-key states wouldn't be too hard. I didn't bother with the functions; just look at the dicts directly. In the case of characters that can be typed multiple ways, it prefers the earlier modifier tables to later, and it prefers non-dead-key sequences (e.g., on a US keyboard, a bare umlaut is shift-option-u, not option-u dead-key followed by space). Notice that I switched from ctypes to struct for parsing this stuff, because ctypes is no good at dealing with C libraries that use the struct hack and/or treat structures as byte buffers, both of which are all over the Unicode Utilities types. In real-life code you'd want to add some comments explaining the formats rather than just unpacking 'IIHHH' into five cryptically-named variables, but I think you should be able to figure this out. In fact, it's a lot simpler; I'm just using ctypes to call a handful of functions and then never touching it again. |
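To illustrate the `struct` approach mentioned above, here is a minimal sketch of unpacking a five-field header and giving the fields names. The field names and the sample bytes are invented for illustration; they are not the names used in layout.py or in the Unicode Utilities headers, and the `=` prefix (native byte order, no padding) is an assumption on my part.

```python
import struct

# Hypothetical header: two 32-bit fields followed by three 16-bit fields.
HEADER_FORMAT = "=IIHHH"  # '=' means native byte order with no padding
HEADER_FIELDS = ("first_type", "last_type", "table_count", "table_size", "flags")


def parse_header(buf, offset=0):
    """Unpack one header from `buf` at `offset` into a dict of named fields."""
    values = struct.unpack_from(HEADER_FORMAT, buf, offset)
    return dict(zip(HEADER_FIELDS, values))


# Build a sample buffer so the example runs without any real layout data.
sample = struct.pack(HEADER_FORMAT, 0, 17, 4, 128, 0)
header = parse_header(sample)
```

Because `unpack_from` takes an offset, the same helper can walk a buffer that contains several variable-length structures back to back, which is the situation where ctypes structures get awkward.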
That's ruddy amazing. Thanks a million. I have to admit I've only briefly looked at it (this is a hobby project!) but I'll try and find some time to properly digest all that. I think we may have worked out a better system for the whole of pyKeyCode actually. Thanks @abarnert :) |
two (I'm sure stupid) questions:
Should that not be mod 2 (it's shift-6 on my layout), not mod 8? Can I just confirm as well: ? |
The first one is odd. I'll take a look and see what happens on my Mac when I get home. Does the same thing happen with the newer layout.py (the one that parses the keyboard layout structures), or only the older keycode.py version (the one that asks the Unicode services to do it for you)? Anyway, my first guess would be that on your keyboard, opt-6 is the caret dead-key prefix (which at least my older code would treat the same way as the caret itself), and therefore shift-opt 6 is probably a normal caret as well as shift-6 (in the same way that opt-E is the acute dead-key prefix, and shift-opt-E is an acute accent). If there are multiple ways to type the same key, it has to pick one of them. IIRC, layout.py iterates the modifier maps in the order you probably want (no mod, shift, opt, shift-opt, then everything else in numerical order), but the older code may have done so in whatever order the tables were defined or something stupid. Anyway, if the code is still treating dead-key-caret the same as plain-caret, and picking it even when plain-caret exists elsewhere, I should fix that.
For the second one: I think I forgot to explain that mod is a bitmap, so 00000001 (decimal 1) is… control or fn or command or something, 00000010 (2) is shift, 00000100 (4) is caps-lock, 00001000 (8) is option, 00010000 (16) is command or control. If two or more modifiers are active, you | them together, so 00001010 (decimal 10) is shift and option together. And 0 means none of them are active.
Finally, I don't think there's any Objective-C in either the original version (the C extension) or the latest version (the one that parses the layouts manually), just plain old C. The one in between did use PyObjC to let me use Python strings instead of ctypes strings, and make Python automatically release some objects instead of manually calling CFRelease, but I later decided it wasn't worth dragging PyObjC into it just for that. |
Thanks @abarnert. Re: the bucky bit... Just to confirm, this is what I thought it was (from Windows):
MOD_SHIFT = 0b00000001 # == 1
But it's different on the Mac?
MOD_CONTROL = 0b00000001 # == 1
Apologies, I totally forgot layout.py (pah! Sorry after all your hard work!). So I'm firing it up but getting a Unicode error on layout.py:
https://github.com/abarnert/pykeycode/blob/ctypes/layout.py#L136 |
Oops, I was carefully writing everything to work with both Python 2.x and 3.x, but apparently I only tested the last set of changes in 3.x… Yeah, you don't want to format a Unicode string into a str like that, because in 2.x it means trying to encode it to ASCII. Plus, revmapping is supposed to be a mapping from keycode/mod pairs to unicode anyway. Also, 2.7 didn't have str.isprintable (or unicode.isprintable). Try it now. Meanwhile, control is 16, not 1; I'm pretty sure command is 1. But the others are right. They're not the same on Windows and OS X because each has a long chain of backward compatibility running back to the late 70s/early 80s, and Apple ][ keyboards and IBM PC keyboards weren't even remotely similar (different set of keys in a different layout, different electronics and a different way of talking to the computer, etc.). |
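Putting the bit values from this exchange together (command 1, shift 2, caps-lock 4, option 8, control 16), a small sketch of combining and decoding the `mod` bitmap might look like this. The constant names and `describe_mod` are hypothetical helpers for illustration, not something layout.py exports.

```python
# Modifier bits as described in this thread; the names are illustrative.
MOD_COMMAND = 0b00000001  # 1
MOD_SHIFT = 0b00000010    # 2
MOD_CAPS = 0b00000100     # 4
MOD_OPTION = 0b00001000   # 8
MOD_CONTROL = 0b00010000  # 16

_MOD_NAMES = (
    (MOD_COMMAND, "command"),
    (MOD_SHIFT, "shift"),
    (MOD_CAPS, "caps-lock"),
    (MOD_OPTION, "option"),
    (MOD_CONTROL, "control"),
)


def describe_mod(mod):
    """Return a readable name such as 'shift+option' for a modifier bitmap."""
    names = [name for bit, name in _MOD_NAMES if mod & bit]
    return "+".join(names) if names else "none"


shift_option = MOD_SHIFT | MOD_OPTION  # two active modifiers are OR-ed together
```

So a `mod` of 10 in the dicts decodes to shift and option together, and 0 means no modifiers at all.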
I've done some tweaks to the layout.py code now - really just some wrapper functions to make it a bit more accessible.
I have to admit (embarrassed to admit) that whole chunk is like black magic to me! I've tried looking at it drunk and it's not helped. Any tips on bringing in an alias lookup table at all? I can't quite figure out how that should work. One more thing: I'm slowly getting my head around Unicode, and it's making a bit more sense now. Does isprintable ever return true? I can't see how it would, so everything gets returned hexified, right? |
Hey @abarnert, Not sure if you still care too much about this project, but I'm looking for some guidance as I'm planning on using it as a base for a fix on a project called Plover. This issue might not be the best place to have this discussion, but here you get very close to what I need. So, I'll give you the brief. I'm working on Plover, stenography software. Part of the stenography process is emulating a keyboard, so that given a string, like "Katz are awesome!" , we'll get the simulated keypresses no matter the user's layout. Right now, we just send key code 0, and set the Unicode string. This proves to be problematic on software that ignores the Unicode string, like VNC. What I need is the ability to read in the user's layout, then be able to process a string, and if the character is available to write on the layout, then output it (I'm assuming using only key codes, no Unicode string), and if the letter is not on the current layout (other locales, or special symbols like 🐱), then fall back to the Unicode string method. I'm all right with doing the work, just wondering if there's a place I can start. I realize you guys had a lot of issues with the extended key presses, I'm trying to consider at what point to cut off and just fall back to Unicode. Any guidance would be thoroughly appreciated, cheers, |
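As a rough sketch of the approach described above (key codes when the character exists on the current layout, Unicode string otherwise), assuming Python 3 and a dict from characters to (keycode, mod) pairs like the one discussed earlier in this thread. `plan_keystrokes` and the action tuples are hypothetical; actually posting the events is left out.

```python
def plan_keystrokes(text, reverse_map):
    """Decide, per character, between a real key press and a Unicode fallback.

    Returns a list of ('key', keycode, mod) or ('unicode', char) tuples.
    `reverse_map` maps characters to (keycode, mod) pairs for the current
    layout; how it gets built is outside this sketch.
    """
    actions = []
    for ch in text:
        if ch in reverse_map:
            keycode, mod = reverse_map[ch]
            actions.append(("key", keycode, mod))
        else:
            actions.append(("unicode", ch))
    return actions


# Toy map with a single key of a US layout (keycode 0 is 'a').
toy_map = {"a": (0, 0x0), "A": (0, 0x2)}
plan = plan_keystrokes("aA\U0001F431", toy_map)
```

The cut-off point then becomes a question of what goes into `reverse_map`: leaving dead-key sequences out of it simply sends those characters down the Unicode path.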
Hm, actually looking into your code on PyUserInput, I think I could work off of that branch. The only bit that concerns me is the use of Carbon APIs instead of Cocoa. That, and I guess I'd try to port it over to PyObjC in the process. |
Hey @morinted - sorry for the lack of interest from me. I have to admit it's taking me a little while to get my head around that code again! @abarnert was the true author; I simply bundled it together for https://github.com/SavinaRoja/PyUserInput. It's not the nicest, but it's the best I have found to do this in Python. (Doesn't help that Python's Unicode handling is a headache.) I've actually since wondered about other alternatives, e.g. NodeJS, but I'm not sure. Will watch how your Plover stuff comes along :) |
Yeah, the narrow unicode in Python 2.7 kind of sucks, but luckily we've managed to get around most of these issues. I played with the code and got something nice going, now in a pull request, so thanks for the code. It works a lot better than just setting the string on every stroke, so thanks, guys! |
@willwade @morinted : I haven't looked at this code in a couple years (since Will opened this bug), but…
That's exactly what the layout-parsing code does. IIRC, I never merged that to master (because I never wrote the code to use the dict that it creates, but I think Will used it somewhere else?), but it's checked in on some other branch? Anyway,
I remember I did one stupid thing with the API that makes it not quite that nice, but I can't remember what it was; anyway, if you run into that stupid thing and can't fix it, let me know and I'm sure I can. :)
IIRC, it's only using Text Input Services from Carbon, which (a) is still supported, and available in 64-bit/El Capitan land (and even has Swift bindings), and (b) has no more modern equivalent (at least in public userland).
I'm pretty sure the original person who wanted this 4 years ago wanted a pure C extension with no dependencies. The stuff I added later for Will uses PyObjC, but still only partly. Most of what we're accessing is Carbon/CoreFoundation stuff—and, while PyObjC has nice wrappers for parts of CoreFoundation, they don't include most of the stuff I'm calling here. The right thing to do is to write your own PyObjC wrappers for those bits (and submit them upstream to Ronald Oussoren), but I was lazy and did it the quick&dirty way instead.
There's a dead-easy solution there: stop using Python 2. Python 3's Unicode handling is great (at least in 3.3 and later, but I doubt 3.2 is a problem for anyone nowadays).
JavaScript defines strings as sequences of UTF-16 code units, not code points or characters or anything else sensible. So Node.JS is effectively stuck with the equivalent of narrow-build Python 2 forever, and all the headaches you're trying to escape will never go away, while in Python they're already solved. Also, of course, porting from Python 2 to Python 3 is a much smaller job than porting to Node, and doesn't require you to use a language with horrible syntax. And NodObjC is nowhere near as mature as PyObjC—last I checked, there was no way to use it with Cocoa classes with missing/incomplete/incorrect BridgeSupport, much less with CoreFoundation types, while PyObjC not only gives you hooks for such types, but comes with 90% of what you want pre-hooked (although, unfortunately, this project falls in the remaining 10%). |
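A quick way to see the code-unit versus code-point difference from Python 3, using the cat emoji mentioned earlier in the thread. This only demonstrates the counting; it says nothing about any particular JavaScript API.

```python
import struct

cat = "\U0001F431"  # cat-face emoji, outside the Basic Multilingual Plane

# Python 3 strings are sequences of code points, so this is one character.
code_points = len(cat)

# Encoded as UTF-16, the same character needs two 16-bit code units.
encoded = cat.encode("utf-16-le")
utf16_units = len(encoded) // 2

# The two code units are a surrogate pair.
surrogates = [hex(unit) for unit in struct.unpack("<2H", encoded)]
```

A narrow-build Python 2 or a UTF-16-based string type sees the two surrogates as separate "characters", which is why per-character lookups against a layout dict go wrong there.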
Thanks Andrew. I did get the code working for my use case in the PR I
|
Hey again, This past week I took some time to fix some bugs in the code and at the same time turned it into a class and added a thread that watches for layout changes (then reloads the keyboard layout accordingly). There is a little bit of Plover-specific stuff in there, but it should be easy to pull out. The list of changes:
https://github.com/openstenoproject/plover/blob/master/plover/oslayer/osxkeyboardlayout.py This might be useful for anyone who needs to do similar work. Here's a sample of the beginning of the test code output:
|
I haven't had time to properly look at this but it looks neat.. nice one |
Non-printable characters..
I was thinking of creating a lookup table of aliases e.g:
which could then be used in tostring..
My question, though, is whether this would work across different keyboard layouts, i.e. is it only the main letter keys that are fixed? Do you know at all?
Also - why is it pykeycode always gives 65535 for (some?) shifted forms e.g:
thanks again
will