Consider dropping permission for captured surface control APIs #48
Comments
It's not clear to me which heavy machinery is required. If a browser wishes to have a trivial, always-granted permission of the
|
Stepping back slightly - I feel like the risk here is that we are encouraging the user to view the captured surface through the machinery of the video call app (say) but interact with it semi-directly. There is no certainty that the VC will faithfully render the capture to the local user. What an attacking remote user sees is not necessarily what the local user sees (until they uncover the captured surface). It might have scrolled down through your emails but still be showing you and the rest of the conference the 3rd page of the first email whilst rendering the whole thing to the attacker. So yes, I think we need informed user consent (unless you tell me how deceptive zoom/scroll is prevented otherwise). |
Indeed. |
We should work to help UAs ensure this, as I propose in #49. I'd rather address the risk than slap a permission on it. |
Appeal to Triviality arguments for implementing something run afoul of § 1.7. Add new capabilities with care. A new permission adds:
One might even say a new permission without good reason is a failure to standardize. If there's a way to avoid this extra level of complexity for web developers (not vendors), I'd like to exhaust those ideas first. |
That is desirable. Think embedding a third-party video-conferencing tool into an application and limiting its capabilities to control the user experience.
That's not a problem.
This point is purely academic, because Chrome deems the permission policy as required, and that means we have to introduce this implementation-defined part of the API (your own citation discusses that the policies are impl-defined). Luckily, we can more easily reach consensus here, because our belief that a permission policy is needed, does not force you to also implement things this way - you are free to infer user intent and skip a prompt.
There are good reasons and they have been presented to you in this thread as well as the earlier threads ([1], [2]).
For iframes - see earlier in this comment.
The onus is on you to present a way to avoid this "level of complexity". |
Most of the permissions I saw are high level permissions like camera, location... Note that this feature is very particular since it is already gated by screen share permission. I do not think other permissions are usually gated by super permissions. To be noted that it would be convenient for the user/UA to enable/disable/reenable gesture forwarding during a capture. It would seem better to expose the fact that gesture forwarding for a particular media element is on or off. |
In our experience, it is clear and it does help.
Yes.
PTZ is gated on camera.
The spec mandates that every forwarded event is permission-checked before being forwarded. By the way, you have just made a great point. You want the UA to allow users to revoke this capability, be it during a capture session or outside of one. I want that too. And this perfectly illustrates the need for the established and meticulously spec-ed and implemented mechanism of permissions policies. We should not "roll our own". |
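A minimal sketch of the per-event check described in this comment, using a toy model rather than any real browser API. `ToyPermission`, `ToyGestureForwarder`, and `forward` are hypothetical names introduced only for illustration; the point is that revoking the permission mid-session stops forwarding at the very next event.

```javascript
// Toy model: the permission is consulted for every forwarded event,
// so revoking it mid-capture takes effect immediately.
// All names here are hypothetical; none of them come from the spec.
class ToyPermission {
  constructor(state = "granted") {
    this.state = state; // "granted" | "denied"
  }
  revoke() {
    this.state = "denied";
  }
}

class ToyGestureForwarder {
  constructor(permission) {
    this.permission = permission;
    this.delivered = []; // events that reached the captured surface
  }
  // Returns true if the event was forwarded, false if it was dropped.
  forward(event) {
    if (this.permission.state !== "granted") {
      return false; // checked per event, not once per session
    }
    this.delivered.push(event);
    return true;
  }
}

const permission = new ToyPermission();
const forwarder = new ToyGestureForwarder(permission);

forwarder.forward({ type: "wheel", deltaY: 120 }); // forwarded
permission.revoke();                               // user revokes mid-capture
forwarder.forward({ type: "wheel", deltaY: 120 }); // dropped

console.log(forwarder.delivered.length); // 1
```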
I doubt this is possible - you'd have to disable SVG, CSS, layout, z-order and a tonne of things to ensure that the video tag is a faithful rendering of the capture. |
Indeed again. The result is that it breaks any app that ever wants to draw in front of the video element. And what would be achieved? A way to avoid a prompt, which is optional to begin with if UAs wish to skip it? What's the benefit? And what would replace the Permission Policy machinery in allowing users to revoke this capability if they don't desire it? I don't understand this thread at all. Nor do I understand the basis for the insulting accusations that Chrome has just "slapped" this prompt on without thinking it through. |
@eladalon1983 My "slap a permission on it" refers to what you asked other browser vendors to do in #48, which you said was "trivial". Thank you very much, but I'd prefer not to take that advice. I'll try to choose my words more carefully next time when rejecting an idea. I want to correct the record that I have not accused the Chrome team of anything. We are here to standardize functionality for all browsers. This inherently involves scrutiny and disagreements over designs and ideas, but let's keep it to that and not people. With my co-chair hat, I ask that we not "assign intent or interpretations to other contributors' comments". This is part of our work mode, which I encourage everyone to reread every so often. |
We'd disable forwarding, not those things. UAs already do all sorts of mitigations like this. So that doesn't seem unusual. We should also be considering these mitigations whether we add permission or not. |
I think you have misread that comment. Please read it again. It said that you could have a "trivial" implementation that avoids a prompt, so it's the opposite of slapping a permission (prompt) - it said it would be "trivial" for you to avoid the permission (prompt). As to adding the permission - we thought about it long and hard in Chrome and carefully decided it was necessary. If you think long and hard about this non-trivial issue, and come to the opposite conclusion, it would still be trivial for you to avoid this prompt which we non-trivially added.
Given these points - especially 1 and 4 - I don't see what this discussion aims to achieve. |
Jan-Ivar, when you read this message by Tim, please consider the possibility of drawing two video elements, one with (1) an unfaithful representation of the capture at full opacity (fully visible), then another (2) faithful representation at minimum opacity (practically invisible). The app places (2) on top and forwards scrolls from it, proving your prompt-replacement mitigations useless. Now consider that you are asking the browser to employ heuristics to avoid that, and Tim is telling you that there might just be too many ways to do it, and that it's infeasible to both enumerate them as well as mitigate them all. |
You don't even need multiple video tags to be usefully deceptive. The simplest example I can come up with is to lie about the amount of scroll by clipping (in CSS) the rendered video to a small subsection of the captured video, then un-scrolling the viewport at a rate that partially counteracts the user's attempt to scroll. Most users would just spin the scroll wheel a bit further - possibly revealing things they didn't intend to. |
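To make the arithmetic of this attack concrete, here is a small model, not real attack code. It assumes the app counteracts a fixed fraction of each scroll by shifting its clipped view back; `counterFraction` and `simulateScroll` are hypothetical names chosen for illustration.

```javascript
// Toy model of the deceptive-scroll attack: the captured surface scrolls by
// the full amount, but the app shifts its clipped view back so the local
// user perceives only part of that movement.
// `counterFraction` and `simulateScroll` are hypothetical names.
function simulateScroll(wheelDeltas, counterFraction) {
  let actual = 0;    // how far the captured surface really scrolled (px)
  let perceived = 0; // how far the local user sees it scroll (px)
  for (const delta of wheelDeltas) {
    actual += delta;
    perceived += delta * (1 - counterFraction);
  }
  return { actual, perceived, hidden: actual - perceived };
}

// The user wants to move about 300px and keeps spinning the wheel until the
// view appears to have moved that far.
const result = simulateScroll([100, 100, 100, 100, 100, 100], 0.5);
console.log(result); // { actual: 600, perceived: 300, hidden: 300 }
```

In this model the user believes they scrolled 300px, while the captured surface, and therefore what a remote viewer could receive, moved 600px.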
I would add that Jan-Ivar is right that the problem stems from the terrible UX of having the user view and interact with a re-render of their own app. Unfortunately the modern combination of screen shares and the expectation that a VC app will fill the screen makes this inevitable. (in times gone by we'd have re-parented the 'captured' window and added it to the window-tree of the VC app. We live in more complicated times now ;-) ) |
If you think my example is absurd, consider the value to an attacker of being able to see a couple of slides ahead in an earnings call. |
Summarizing things:
I move to close this issue. |
PTZ is not a permission, it is a descriptor.
The only protection that the Google Chrome prompt will provide is that most users may ignore it. I am not sure they would be able to understand the potential threats; this is pretty hard. One disconnect seems to be that one approach is to start small and very protective, and another approach is to support all potential use-cases.
I might have missed my explanation. That is not to say there are no pros for the permission model or permission policy. |
I see this in Chromium's Translation Console (database of localized strings with some visual context for the translator). Possibly I am missing some later development and/or a nuance?
Nobody has said this would be the only protection. We have already discussed multiple other protections; see the move from
We are not proposing a synchronous prompt.
If this is about a permission prompt, then starting small is to have it and potentially remove it.
I think you mean "you", and if so, then yes, I have missed it. Could you please link me to the relevant comment?
Multiple concurrent screen-captures by one application is an edge-case that I don't expect to tackle in Chrome. (But the spec does not prohibit it, so other browsers should not be bothered.) |
This is click-jacking, which UAs already deal with on the daily. UAs should be given a lot of leeway to fight such behavior when detected, and we should build APIs that support them in this (like tying forwarding to playback as a start). So these are click-jacking mitigations. Calling them "prompt-replacement" mitigations wrongly suggests a permission prompt adequately addresses click-jacking (or that mitigation can be skipped because the user was warned), which is false. Permission prompts have been shown to be useless in explaining click-jacking threats to users. If users can't understand the risk then we have not obtained meaningful consent.
Remote control is mitigated. A company using a malicious VC app for earnings calls seems a bit unrealistic. I do not wish to diminish the risk. On the contrary, I'm arguing for mitigation and against permission as panacea. |
The permission policy and prompt are NOT a click-jacking prevention mechanism. This issue started with a claim that a permission prompt is unnecessary, and a suggestion that its benefits could be better provided with other mechanisms; namely, with a limitation of the element types. In response, Tim and I have shown that element-type-limiting is easy to circumvent, which means it cannot be used as a substitute for anything, because it provides nothing. This is the correct context of this exchange. The claims that (1) a permission prompt is undesirable, and that (2) other mechanisms are sufficient substitutes, both remain unsubstantiated. Moreover, the counter-claim that if a permission prompt is truly undesirable, the spec does not prevent UAs from skipping it, has not been addressed.
And I am claiming that the mitigation you proposed (limiting element types) confers no security benefits. Further, the permission was not presented as a panacea, so let's please not characterize that claim as such. |
The only mitigation currently under discussion is captured-surface-control/issues/28. However, my questions about the attack-vector involved, and the effectiveness of the mitigation, remain unanswered. (If you are referring to a different mitigation, however, please advise which.) |
Permissions are necessary when undesirable behaviors are indistinguishable from desirable ones (microphone capture for example). Undesirable behaviors:
If UAs detect these behaviors, they can simply disable forwarding. No trust needed. Desirable behaviors:
This seems to satisfy the desirable behaviors except emojis. It seems implementable without relying on users trusting the website, provided the UA detects the undesirable behaviors that remain possible (last-second moving of the element, pausing playback etc.). I'm open to extending MVP to solving emojis. As I proposed in #49: "CSS [or]... a div.enableGestureForwarding to forward gestures to the video element underneath." |
I also would like to see whether we can do without prompts.
That would be ideal.
is it even needed for the initial MVP? FWIW, we started very cautious for getDisplayMedia and we added some features progressively, maybe we can do the same here. |
The established TAG design principle is: "If a useful feature has the potential to cause harm to users, make sure that the user can give meaningful consent for that feature to be used, and that they can refuse consent effectively." A call to Prompts might not be perfect, but they are better than guessing. It is perfectly spec-compliant for the user agent to (i) augment the permission policy with heuristics, (ii) skip the additional prompt, or (iii) modify the
Chrome usually uses async prompts.
Yes, it is. The MVP is informed by the real-world requirements of real-world Web developers, and the real-world scenarios they inform us that they need to solve.
This claim is unsubstantiated; the counterarguments are available earlier in the thread, which explained that the representation in the video element might not be faithful. ([1], [2]) To name another counterargument - malicious sites can get the user to scroll somewhere, then either:
Reminder - I am NOT saying that the prompt is intended to stop clickjacking (recall here). Rather, I am saying that the proposed alternative of limiting-scrolling-to-video-element is susceptible to clickjacking, and it therefore fails to add any value, let alone obviate any security measures.
The MVP is informed by Web developers' stated needs. As explained, this includes a video completely obscured by a canvas, div or other element on top of which developers introduce whichever other elements. |
Thank you all for the continued and thoughtful discussion on this important issue. I'm glad we agree that serious click-jacking concerns remain with this API. To address them, I've filed #41 so we can collaboratively work on mitigating them. I agree with @youennf that adopting a cautious and protective approach by adding features progressively is prudent.
In the long term, I believe that user agents shouldn't rely solely on user consent, especially when that consent may not be fully informed due to the complexities of risks like click-jacking. Instead, we should aim to build robust protections directly into the technology. However, I understand that browsers may need time and practical experience to develop effective safeguards against click-jacking attacks introduced by these new features.
As a compromise, I'm open to considering the inclusion of a permission prompt in the short term, provided we can agree that our long-term goal is to eliminate the need for it once adequate mitigations are in place. @eladalon1983 makes a good point that vendors can choose to grant this permission by default once they feel confident in their protective measures. Once all browsers reach that level of confidence, we should be able to deprecate the permission or at least reduce its implementation cost.
That said, I would prefer if the inclusion of permission doesn't dictate the API shape. E.g. if prompting on first scroll is undesirable, what if @youennf's proposed API triggered a prompt and instant NotAllowedError?
try {
  videoElement.enableGestureForwarding = true; // triggers a prompt and fails instantly
} catch (e) {
  if (e.name !== "NotAllowedError") throw e; // rethrow anything unexpected
  console.log(videoElement.enableGestureForwarding); // false
} |
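The snippet above can be exercised against a mock to see the proposed behavior. This is a sketch under assumptions: `MockVideoElement`, `promptsShown`, and `grantPermission` are hypothetical stand-ins, and `enableGestureForwarding` is only a proposal in this thread, not a shipped attribute.

```javascript
// Mock of the proposed attribute: setting it to true without permission
// triggers a prompt and throws NotAllowedError immediately.
// `MockVideoElement`, `promptsShown`, and `grantPermission` are hypothetical.
class MockVideoElement {
  #enabled = false;
  #permissionGranted = false;
  promptsShown = 0;

  get enableGestureForwarding() {
    return this.#enabled;
  }
  set enableGestureForwarding(value) {
    if (value && !this.#permissionGranted) {
      this.promptsShown += 1; // stands in for the UA showing a prompt
      const error = new Error("Permission not granted yet");
      error.name = "NotAllowedError";
      throw error;
    }
    this.#enabled = Boolean(value);
  }
  // Stands in for the user accepting the prompt at some later time.
  grantPermission() {
    this.#permissionGranted = true;
  }
}

const videoElement = new MockVideoElement();
try {
  videoElement.enableGestureForwarding = true; // prompts and fails instantly
} catch (e) {
  if (e.name !== "NotAllowedError") throw e;
}
const afterFirstAttempt = videoElement.enableGestureForwarding; // false

videoElement.grantPermission();
videoElement.enableGestureForwarding = true; // succeeds on retry
const afterRetry = videoElement.enableGestureForwarding; // true
```

One thing the mock makes visible: the setter alone gives the app no signal for when the user has answered the prompt, so a retry is needed.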
Thank you for proposing this compromise. It works for me. But to be perfectly clear - we agree to explore ways to eliminate the policy and/or prompt in the long-term, once we gain real-life data on users' and Web apps' behavior. If the policy and/or prompt are proven unnecessary, we will gladly remove them. But this long-term goal cannot block short-term API shapes that return a Promise, which is necessary for now (see below).
I agree.
That is a completely unworkable solution. Please see this comment for a list of the benefits of an async API that returns a Promise. (There are three quote-response pairs in that comment. I am referring to the middle one.)
We do not agree about that, but since you propose a compromise which I support - starting with a prompt and considering its removal later - we can bench this discussion. |
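A minimal sketch of the Promise-returning shape this comment argues for, run against a mock. `MockController`, `enableForwarding`, `answerPrompt`, and `tryEnable` are hypothetical names, not the spec's API; the sketch only shows that the promise can stay pending while a prompt is open and then settle with the user's decision.

```javascript
// Mock of an async shape: the returned promise settles only once the
// (simulated) prompt has been answered. All names are hypothetical.
class MockController {
  #pending = null;

  // Returns a promise that stays pending until answerPrompt() is called.
  enableForwarding() {
    return new Promise((resolve, reject) => {
      this.#pending = { resolve, reject };
    });
  }

  // Stands in for the user answering the prompt.
  answerPrompt(allowed) {
    if (!this.#pending) return;
    const { resolve, reject } = this.#pending;
    this.#pending = null;
    if (allowed) {
      resolve();
    } else {
      const error = new Error("User denied forwarding");
      error.name = "NotAllowedError";
      reject(error);
    }
  }
}

// App-side helper: maps the promise outcome to a simple status string.
async function tryEnable(controller) {
  try {
    await controller.enableForwarding();
    return "enabled";
  } catch (e) {
    if (e.name !== "NotAllowedError") throw e;
    return "denied";
  }
}

const granted = new MockController();
const grantedOutcome = tryEnable(granted);
granted.answerPrompt(true); // user accepts

const refused = new MockController();
const refusedOutcome = tryEnable(refused);
refused.answerPrompt(false); // user declines

Promise.all([grantedOutcome, refusedOutcome]).then((outcomes) => {
  console.log(outcomes); // [ 'enabled', 'denied' ]
});
```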
Issue transferred; heads up to those discussion participants who might otherwise be looking for it elsewhere: @jan-ivar, @youennf, @steely-glint |
Let's continue discussion from #27 here where other members can contribute.
The choice of requiring permission influences API design, like requiring methods over attributes, but this is of secondary concern.
We should first agree on whether permission is required or not for scrolling and/or zoom features. This should be based on threat vectors and UX concerns, and what the guidelines say around that. I mention some of them in #27 (comment):
Should we work on mitigating these risks directly?
Do we need to pull in the full permission machinery with delegated permissions, query and the like, or might giving UAs the option to throw NotAllowedError suffice?