Recent research has shown that control of what happens to data entered in the browser is gradually diminishing — whether that’s due to the JavaScript or “tags” on a webpage, or the behaviour of in-app browsers. Regulators expect organisations to protect their customers from browser data leaks. Do we have the risk management and security controls to comply, and if not, what should we do?
Video Transcript
>> ANNOUNCER: Please welcome John Elliot.
>> JOHN ELLIOT: Good morning. Good morning, ladies and gentlemen. Oh, gosh, this is a popular session, isn't it? I would like to thank you all for coming this morning. I feel, I don't know, that the middle of the morning on the last day of RSA Conference is a bit like, I don't know, the twentieth mile when you’re running a marathon. I mean, you are almost done and yet there’s still a little bit further we’ve got to push on. And hopefully I'm going to make that pretty easy today because my message is dead simple, which is this: I'm really worried we broke the internet.
And when I say that we broke the internet, I don’t mean that we’ve actually broken the internet as it is because it worked this morning when I was using it. But we’ve broken the security model of the internet. And we’ve broken it for this reason. When we started building web applications, they used to look like this. It used to be a one to one conversation between a web browser and a webserver. And if there was any business logic or intelligence that was required, that was in the webserver. And if we wanted to have any interaction or get any data from anywhere else, either within our own organizations or across the internet, that was mediated by a webserver.
And the problem is now that we have slowly changed the architecture of how we build web apps, and now it looks like this. The business intelligence has moved into the consumer browser powered by JavaScript and distributed APIs and microservices. And so, now that the intelligence is sitting in the browser, if we want some functionality, we pull JavaScript from somewhere on the internet.
And the problem with that is that we have created a scenario where, in what is effectively an unsafe environment, we have our business intelligence, and we have no segmentation between any bits of application that are running in that environment.
So, my agenda today is pretty simple, again, cognizant of the fact that we are halfway through the last day of RSA and there has been – gosh, a lot of information.
I'm going to give you a quick technical introduction and I’m going to talk about what the big problem is and we’ll look at the four threat vectors, threat actors that are leaking information from the consumer browser. Going to talk about what regulators think about this. And then finally, ask what should we do as an organization about it?
So, I'm going to kick off with a very quick technical introduction. Now, cognizant of the fact that it is the last day, I want to make sure we are all on the same page and up to speed on this. So, I'm going to go quite quickly through just the basic fundamentals of how the internet works. When your browser wants to get information from a website or a web application, it goes there, it says hey, I want a website or a webpage. You are going to get a bundle of information, HTML, CSS, images, videos, and JavaScript, which will then render the page. And then as you want more information, you know, the browser will say I need some more JavaScript libraries from the website and then that's delivered and then you might get some more from another webserver in your domain.
And then maybe you need some information, you need some business partners, you need some functionality from somewhere else on the internet so you bring in some information, some JavaScript from third parties.
And then we get to something called a tag manager. Most websites nowadays use a tag manager. I don't know if you are familiar with the word tag or a tag manager. Let's find a quick definition before we go on.
Let's use the one from Adobe. Adobe says a tag is a piece of JavaScript code. So, whenever you hear the word tag, you think it’s a program in JavaScript language. And tag managers help marketers put tags, JavaScript, through a nice user interface, onto your organization's website. And it means it doesn't involve the developers or anyone else putting JavaScript code on the website that your people, your users will use.
So, in other words, the tag manager is essentially designed to completely defeat change control and information security oversight by allowing anyone to put random JavaScript code into your website from people you don't know and you have never met before.
So, if we look – if we think about it as a tag manager, you are going to get some JavaScript which is the tag manager which is a container and then that's going to load lots of other JavaScript from any of the advertising, marketing, tracking, and analytics websites.
So, at the end, your webpage or your web application now is a collection of JavaScript from a number of different places on the internet. So, that’s the small technical training bit.
So, what's the big problem? And the problem is there is no segmentation in the JavaScript land. So, any piece of JavaScript has access to any element on the page or every piece of JavaScript has access to every element on the page.
Now, I have deliberately not included cross domain iFrames here. Because they’ve got access, let’s make this even simpler. It means they can read it. It means they can alter any information on the page. And it means they can steal or exfiltrate any information on the page.
So, every single piece of JavaScript, no matter where it comes from, whether it’s come from your domain, another service on your domain, or from third parties can access any element on the page. And so, the question I'm asking really is do we all trust every bit of JavaScript that we've loaded into our consumer browser? Because every piece of JavaScript, no matter where it’s come from, can access any data that's been entered by the user. It can access anything that’s been rendered on the page in terms of text or images. It can access mouse events. It can access keyboard events. It can access the URL. And in some cases, if you don't configure your cookies properly, it can access the contents of other people's cookies.
And what can it do with that? Well, it can send the data anywhere onto the internet. It can hijack mouse events or keyboard events. It can add extra code and change the complete functionality of a website.
You think, well, what is all this JavaScript doing, John? And the answer that it is doing an awful lot nowadays. It might be a library that you’re using to control and build user interfaces. It might be part of a framework you’re using like Angular. It might be for personalization.
It might be for one of those really helpful things that pop in the bottom right-hand corner of your website that says, “Hello, I'm a chatbot. Would you like to chat to me?” It could be that sort of JavaScript.
It could be one of those brilliant bits of, thank you GDPR, cookie and privacy notices that pop up every time we visit a website. It could be used for AB testing. It could be doing user behavior recording so that the user experience can make the website better. It could be for business functionality, working out shipping costs or which shipper to use. It could be working out tax if you’ve got multijurisdictional sales tax. It could be doing location services. It could even be managing payments. Or it could be increasing the reach of your website by using marketing, social media, analytics, and advertising.
Because every time you see one of those little logos with that little bird or the little F or something like that, that's a bunch of JavaScript that's making that happen. And that JavaScript can do anything to the website.
So, the question is how big an issue is this. Some research recently done by Jscrambler looked at the top eighteen – top nineteen websites, e-commerce websites in the U.S. On average, and this is the figure that scares the heck out of me, each website had a hundred scripts on it, a hundred different individual JavaScripts on it.
About half of them came from the organization that ran the website. We call those first party JavaScripts. And about half of them came from some other random – sorry, not some other random place on the internet, John – don't be pejorative.
Half of them came from somewhere else on the internet. So, about half were yours, half were somebody else's.
When we talk about that somebody else's, the average was it contacted thirteen domains and the largest number was thirty-five different domains where JavaScript was being pulled to assemble the webpage.
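As a rough illustration of the kind of count described above, the sketch below splits the external scripts referenced in a page's HTML into first and third party and lists the distinct third party hosts. It is not the method used in the Jscrambler research; the page URL, the hostnames, and the names `ScriptCollector` and `summarize_scripts` are all hypothetical. It also only sees `<script src>` tags present in the static HTML, so anything a tag manager loads at runtime would be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ScriptCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def summarize_scripts(page_url, html):
    """Split external scripts into first and third party by hostname."""
    collector = ScriptCollector()
    collector.feed(html)
    page_host = urlparse(page_url).hostname
    first_party, third_party = [], []
    for src in collector.sources:
        absolute = urljoin(page_url, src)
        host = urlparse(absolute).hostname
        if host is None:
            continue  # e.g. data: URLs have no host; ignored in this sketch
        # Simplification: the page's host or any subdomain of it is first party.
        if host == page_host or host.endswith("." + page_host):
            first_party.append(absolute)
        else:
            third_party.append(absolute)
    third_party_domains = sorted({urlparse(u).hostname for u in third_party})
    return first_party, third_party, third_party_domains


# Hypothetical page with two first party and three third party scripts.
SAMPLE_HTML = """
<html><head>
<script src="/js/app.js"></script>
<script src="https://static.shop.example/js/cart.js"></script>
<script src="https://tags.tagmanager.example/container.js"></script>
<script src="https://cdn.chat.example/widget.js"></script>
<script src="https://cdn.chat.example/loader.js"></script>
</head></html>
"""

first, third, domains = summarize_scripts("https://shop.example/checkout", SAMPLE_HTML)
print(len(first), "first party,", len(third), "third party, from", domains)
```

A more faithful measurement would load the page in a real browser and record every script request, since much of the third party JavaScript only appears after the tag manager container runs.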
When I looked into this in a great deal of depth in the past few months, I thought how come I have not noticed this before? And the fact is, I guess like me, a lot of us run Brave and we run uBlock Origin and we run Adblock Plus because we’re that type of people, which actually blocks a lot of the JavaScript we would see. But now if you turn off all your protection and press F12 and look at the amount of JavaScript on a modern website, it is amazing.
I remember maybe six years ago, I used to use an extension called – anyone remember, it like stopped JavaScript running. And every time you got a JavaScript, it said, “Hello, there’s a new JavaScript. Would you like to run it?” And you went yes or no, right? You can’t do that now because the whole web experience will be going yes, yes, yes, yes. So, you just can’t do that anymore.
Okay, so let’s look at the threats, and there are four threats or threat actors or threat companies that are leaking our data, and they are these happy three characters and their camera. And remember, any JavaScript can leak any data.
So, I’m going to talk about the first one which I call covert. I’m not 100% happy with calling it covert. But basically, advertising, marketing, and tracking companies are in the business of knowing as much about every user as they can because then they can target better adverts at that person. So, their incentive, when you put their little logo or icon or include their code in your website, their incentive is to take as much information as they possibly can from the user’s browser. And a lot of people don’t understand how much information is being taken by default.
I’m going to give you one really great example, which was that a large number of hospitals in the U.S. included the Facebook or Meta pixel on their website for doing – the little F icon then – and then it was really good so they could do remarketing and targeting.
And what some security and health researchers found was that if you went to book an appointment, say, say you are worried about, I don't know, maybe you had a lump somewhere. And so you made an appointment at the hospital with a, I don’t know, a lumpologist, yeah?
And what happened was, a couple of days later, you would be browsing the internet and there would be this great snake oil advert saying, hey, are you worried about a lump today? Then get some lump tablets. Right? And so – I am staying away from proper medical problems but these unregulated products, they're not FDA approved, they’ve got no efficacy. But when you are actually worried about health conditions, you are not – you are not always a really great – I mean, sometimes you just buy anything because you want to get rid of the lump, okay?
And so, these researchers discovered that masses of this health data was being sent by these hospitals to Meta’s advertising network. And in terms of the scale of it – and I know you can’t copy the URLs down because the fashion is for writing the longest URLs as possible to make it hard for presenters now in conference sessions. So, I know you can't copy these down but the session is recorded and slides are available on – once you’re logged into RSAC. Or you could just search for this. It’s very – it’s a pretty famous paper.
In terms of size of it, like 98.5% of hospitals are using Google's advertising and about half of them are using Meta's. This wasn't a small problem.
The second threat actor is your marketing department, and I’m sorry, marketing department. Because in a lot of cases, these tag managers are configurable. You can say what data they can take. Some, you can't do that with. There’s a lot you can restrict the amount of data that the tag manager can take. And that's quite – or the advertising and tracking JavaScript can take and that is quite interesting, but you have to go and do it. If you don't do it, then you have got the default which is that they will try and take as much information as possible.
And you might have seen last year, two really interesting reports that came out that when you have – and I understand Americans, you have to fill out tax returns using some software online. It's how you do taxes in America. And so, they all have tracking, advertising, and marketing JavaScript on them, which took information about how much money you earned and how much tax you were going to pay, all sorts of things like that, and sent it to advertising networks. Which actually, if you’re an advertising network, that's really good because now I know we will have to segment you as a customer because I know how rich you are. And also, it took in information from people who went to abortion advice clinics, which is probably not a great thing, especially in today’s climate.
So, that’s the configuration problem, the marketing department threat actor.
The third is what I call capture tools. Now, I'm going to ask you a question. You actually have to answer. Actually, there’s so few people, we could answer it. Do you think it’s reasonable that an advertising or marketing or anything takes data from your browser and exfiltrates it if you haven't pressed the submit button and sent it to the website? I think that's really – because I'm not actually sure that the owner of that website or application has any – and I come from Europe, so we’ve got this weird GDPR thing which is the way I think – I’m not sure that they’ve actually got any right to that data at that stage. I still think that’s the consumer’s own data until they press submit. Because the expectation of the consumer is it wouldn't have gone anywhere outside your browser until you press the submit button.
There was some great research done by Leuven University where they found that a lot of the tracking software was actually reading the form fields before anyone pressed submit, and reading the deleted form field and what you typed in the second time or the third time. Scary if you are filling out an insurance claim form.
Also, a lot of websites use behavior replay, so they can track what the user is doing so that they can then make sure they can tailor their user interface to make it a better user experience. Why did someone drop out of our payments channel? Why did they book this and not book that? A lot of those applications were found to be again, exfiltrating anything that was typed into a form field before it was submitted, but also usernames and passwords. The paper is called “Leaky Forms.” Great title for a paper. And actually really interesting.
And then finally, our final threat actor is one we might consider to be a threat actor which is a criminal. Criminals have been skimming payment card data from websites for about six or seven years. And they’ve been doing that because payment card data is something that can be monetized. This is something that’s not going away. Skimming payment card data increased by 174% in the last half of last year, according to Visa. So, this is still a very, very viable and happening threat.
So, let's look at how that would work. Here is our kitten website, our kitten pictures website. Let’s imagine we want to make an e-commerce website. So, we're going to charge people for kitten pictures. So, we need an arrangement with a payment processor or an acquirer that's going to be now delivered by a piece of JavaScript. So, it’s like, give me the payment form. Here comes the payment form. The payment form appears. Type in my cardholder data, press the submit button, and now it’s with the payment processor and now I get my kitten picture. Everyone clear with that, okay?
So, what's the attack? The attack is really simple. The criminal needs to get some criminal JavaScript onto the consumer's browser. And then when I request the payment form from the payment processor, the form is rendered. All of the form fields are now hooked by the criminal's JavaScript. I type in the cardholder data and I press the submit button. It goes through the acquirer process as normal. It is transparent to the consumer. It is transparent to the merchant. The transaction happens as normal. But also, the data is exfiltrated to the criminal somewhere on the internet.
And you’ve probably heard of this, called a Magecart attack on e-commerce, skimming attack. It’s really popular. It’s a terrible thing to say, isn’t it? But it is really popular for criminals.
The question is then how does the criminal get their JavaScript into the consumer's page? And the answer is well, you know, what's our attack surface? Where can they attack? They can attack first parties, third parties, and the tag manager.
And so, a first party attack would be to hack the merchant's website and add their own skimming code to it, and that's what happened to British Airways, very famously, where one of their secondary webservers that served their JavaScript library was attacked by the criminals. Thirteen lines – thirteen lines of JavaScript code was added that skimmed cardholder data, including my cardholder data actually, from the BA website.
And that’s a pretty easy attack and it's quite hard to notice, especially if it is in a secondary webserver and it is just a little code change.
Third party, now remember that you might have up to fifty third party – fifty bits of third party JavaScript on average coming to a website. So, the criminal could attack any one of those third parties, which is why I asked if you trusted them all and you’ve done your third party risk assessment on them and you’ve worked out that they were pretty secure and criminals wouldn’t be able to attack the JavaScript on it. Because again, if the criminal can attack one of those third parties, they can get their skimmer into a webpage.
And the most famous example of that is Ticketmaster, who had one of those, “Hello, I'm a chatbot. Would you like to chat to me?,” on its payment page. Would you believe it? And the criminal deliberately attacked the third party provider because they realized Ticketmaster's in-house security was probably too high a bar for them to go over. But this small startup chat provider, they managed to compromise, add their skimming code to the chat provider, and then every time that, “Hello, would you like to talk to me?” chat box pops up, it was happily skimming the cardholder data from all of Ticketmaster’s worldwide customers. That was a bit sad.
And then the third place is the tag manager. And the ability here is like can we use the tag manager to get criminal JavaScript into the consumer browser? And you’re thinking, that's not a real attack, John, is it? Well, yes, it is, because all you need to do is compromise any of the third – the many locations where tags or JavaScript are coming from and hopefully get some that would include that in your website. That’s what happened with what was called the breach a few years ago.
But also, a really awesome attack that Gemini Advisory posted on recently. And this is great. What you do is you write a skimmer and then you get Google or whichever tag manager company you are using to add it to the library of tags that marketing are allowed – that marketing can choose from to put on the website.
Now, that's quite hard because all the tag manager companies are quite aware that this might be an attack vector, so they try hard not to let criminals put skimming tags into their tag managers. But some criminals succeeded because JavaScript is quite hard to analyze if you obfuscate it. And remember, a script running can then just go and load another script and completely change its behavior. So, it is quite hard to work out what the behavior of JavaScript is.
And so, then what we have is marketing departments being compromised for their usernames and passwords into the tag interface by a typical phishing type of attack, and then the skimming is appearing on the website and it is really hard to find how it got there.
Another great – I shouldn't use the word great, should I? Another innovative criminal attack was criminals found a large number of websites that were using a defunct – defunct analytics piece of JavaScript. It was that defunct, not only was it no longer maintained, right – pardon me. But the company, the company doing it actually let the domain expire.
And so, the criminals went and bought the domain. You know where this is going, don't you? Yeah. The criminals bought the domain, wrote their own skimmer, put it as the same name as the old tracking and started skimming cardholder data. I thought that was pretty clever.
And the third one that I thought was really clever, and this was reported by Akamai recently, and this is a really good attack, is that rather than actually preconfiguring the skimmer, what the criminals did was they had a very small bit of JavaScript that created a single web socket, and they sent the URL of the page they’d landed on back to the command and control server. And said, is there anything you want me to do?
Oh, it’s like, I didn’t know we were on this – I didn’t know we were on this website. That’s cool. Let’s go and take a look at it, write some custom skimmer code, so the next time that something lands on it and says this is the URL it’s on, it’s like, oh now, now we’ve written the skimmer code for that. So, now we’ve got quite a large, extensible skimmer.
And again, the security writeups – pardon me – for all of these, you can see the URLs there.
So, we have effectively four threat vectors we need to be worried about. Things that we legitimately have on the website that take far too much information, things that could have been configured not to take information, things that record stuff the user does before they press the submit key, and I really think that’s quite a hard thing. And then criminals.
Criminals, as I said, have mainly been into taking payment card data. And so, the first thing I would like to say – I'm going to make six predictions. I'm here. I can make predictions today. Okay?
So, I'm going to make six predictions. The first is that hostile threat actors will start to use JavaScript skimming to steal things other than payment card data. It’s a really great way of attacking because I don’t need persistence in your environment. I need to get my JavaScript on your customer's – into your customer’s browser once. I don’t need persistence. I am not coming back. It is a really good attack which is why it is being used by criminals. It will start to be used for stealing other forms of data, both by organized crime and nation-states.
So, what's the solution? We're all information security professionals. We know the solution. The solution is things like this. Let's have an inventory of all JavaScript that's running on our consumers’ websites. Apart from – that's assembled in real-time most of the time. And it’s coming from a number of third parties. Maybe we cannot do that.
What about this? Why don’t we just do like code review? Because we don’t have to do – like, we can do static analysis, we can do dynamic analysis, we could even do the analysis that doesn't really work where somebody looks at the code analysis.
And that might work for your first party JavaScript code but maybe not for third party code, so I’m not sure that’s very good.
We could add all those vendors, those third parties where JavaScript comes, into our third party risk management programs which are currently not overloaded and don't have enough – hang on. That's going to be quite hard, isn’t it, to add them to our third party risk management.
I know. We'll put everything through very strict change and release control. That might work for your first party. You can put it into your CI/CD pipeline. But for third parties, Google, every time you change any of your tag management code, can you let us know so we can put it – you mean you change it dynamically? What? Without doing anything? Oh, well, that's gone as well.
Okay. So, some of our traditional things we might do as information security professionals might not actually work to fix this problem. And my worry is we've stuck this problem into what I am going to call the too hard box. We have got other things to worry about, like how on earth are we going to buy – I went to the show floor yesterday. What are we going to buy? AI powered XDR I think is what we're supposed to be buying this year. Right.
How am I going to buy AI powered XDR? This JavaScript problem is a silly problem. It will go away.
Unfortunately, the role of regulators is to make us open boxes that we think are the too hard box and look at it. And I want to very quickly look at four bits of regulatory action that we've seen in the past twelve months that describes why this – that I think this is a thing that regulators have noticed, and if you are in an industry that is regulated or you process personal data somewhere in California or Utah or anywhere with a personal data law, or Europe, or you’re a regulated financial institution or a regulated healthcare, you are processing that type of data, your regulator knows about this. And that's what I think the – that’s really the story of my talk today.
So, I’m looking at four regulators, the payment card industry, the UK's implementation of GDPR, the Danish implementation of GDPR, and in the U.S., the health and human services. And I know PCI DSS is a security standard and not a regulator but it’s imposed by contracts on people who do things with payment cards, and I’ll start with that because that’s what I know quite well.
Back in 2015 when we wrote PCI DSS version 3.0, we were very aware – sorry, when I say we, I was Visa's representative on the PCI Standards Council at the time. Excuse the we bit, okay.
So, but we were really aware that this attack was an interesting attack. And it was because of this, that merchants were loading JavaScript from their payment processor, collecting cardholder data, and then sending it back to the payment processor. And so, the merchant said, hello. I'm terribly sorry, PCI people, but we don't store, process, or transmit cardholder data anymore. We don’t store it. We don’t process it. And we don’t transmit it. So, bye. We’re not doing PCI DSS anymore.
And of course, the payments industry went, well hang on. The attacks are really obvious. You’re not doing PCI DSS. We just put a JavaScript on your website. We get ahold of the cardholder data.
So, we created a new self-assessment questionnaire for smaller merchants that said if you do this model, you have to do ninety requirements. But if you redirect it to a payment processor or you put the fields into an iFrame and the iFrame gives you separation between the JavaScript and the parent page and anything running in the iFrame, you can do ten requirements.
Guess what happened? The industry moved. The payment processors moved to giving solutions that did iFrames. And so, no one ever used the new standard because no one wanted to really do that.
Of course, that only fixes the first party JavaScript problem. It doesn’t fix the third party JavaScript problem.
So, in PCI DSS 4, we actually put two new requirements into PCI DSS 4. A protection requirement and a detection requirement. And again, when I say we, I was unlucky to be MasterCard's representative to the Payment Card Industry Security Standards Council, so I spent two years on this problem.
And we actually took a very standard view of having an inventory, making sure that all scripts were authorized. I.E., somebody knew that script had been put on the page, making sure that it was necessary for the payment functionality and that it was validated. And then having some change detection there so if a piece of JavaScript that we’d said that's okay, you can put it on the website, changed either because the vendor changed it or because it was being attacked by a criminal and some malicious script had been inserted, that you noticed that change within seven days and you generated an alert and you worked out it was reasonable.
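The inventory, authorization, and change detection idea described above can be sketched roughly as follows. This is a minimal illustration, not the PCI DSS requirement text or any vendor's implementation: the `AuthorizedScript` record, its field names, the URLs, and the `check_page` function are all hypothetical, and it assumes something else has already captured the script bodies actually served to the payment page.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class AuthorizedScript:
    url: str
    justification: str  # why the payment page needs this script
    approved_by: str    # who authorized it
    sha256: str         # hash of the content that was reviewed


def content_hash(body: bytes) -> str:
    """Hex SHA-256 of a script body."""
    return hashlib.sha256(body).hexdigest()


def check_page(inventory, observed):
    """Compare scripts observed on the page against the authorized inventory.

    inventory: dict mapping URL -> AuthorizedScript
    observed:  dict mapping URL -> bytes actually served
    Returns a list of (url, reason) alerts for someone to review.
    """
    alerts = []
    for url, body in observed.items():
        entry = inventory.get(url)
        if entry is None:
            alerts.append((url, "unauthorized script"))
        elif content_hash(body) != entry.sha256:
            alerts.append((url, "content changed since approval"))
    return alerts


# Hypothetical example: one approved script has changed, one script is unknown.
approved_body = b"function renderPaymentForm() { /* reviewed version */ }"
form_url = "https://js.pay.example/form.js"
chat_url = "https://cdn.chat.example/widget.js"

inventory = {
    form_url: AuthorizedScript(
        url=form_url,
        justification="renders the payment form",
        approved_by="security-team",
        sha256=content_hash(approved_body),
    ),
}
observed = {
    form_url: approved_body + b" /* extra code */",
    chat_url: b"console.log('chat')",
}

alerts = check_page(inventory, observed)
for url, reason in alerts:
    print(url, "->", reason)
```

Run at least weekly, a check like this would surface a change within the seven days mentioned above. It cannot tell a vendor's legitimate update from a criminal's insertion; a person still has to work out whether each alert is reasonable.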
Coming now to GDPR, the Information Commissioner has given us some written guidance as to how they think as a result of the Ticketmaster data breach. What they said was Ticketmaster should have reasonably – see, that's the word we use – reasonably been aware of the risk of putting third party JavaScript on a website that processes personal data, including payment card data. And that therefore, because of that, they should have done a risk assessment. They didn't say you shouldn't have done it. They didn't say it was a bad idea, although actually, if you read all fifty pages of their decision notice, they do say it wasn't a great idea at the time, but they should have at a minimum done a risk assessment.
The Danish Data Protection Regulator, who I am not going to try and embarrass myself by saying their name, but the Danish Data Protection Regulator took some action against a very small e-commerce shop. And it said look, and this e-commerce shop had had an attack and there had been some skimming JavaScript inserted into its website and criminals had stolen cardholder data.
And the Danish Data Protection Regulator said like, it was obvious that that's what happened, that there was a JavaScript skimmer put on your website.
And the company, a small e-commerce company, said this. They said, well, we can't even spell JavaScript. I mean, how on earth can you expect us to understand JavaScript and to be able to decipher what's being put on the website? And the regulator said it's your website, it’s your problem. Right? That really is your problem. If you run an e-commerce website, you should be able to deal with this.
The fourth piece of regulation, and this is I think the most powerful for us all to be interested in, is as a result of that research report I talked about earlier about hospitals and lumps and anti-lump cream or whatever being advertised, about sending protected healthcare information to advertising, tracking, and marketing networks, in November, the Office for Civil Rights of the U.S. Department of Health and Human Services said, hey, if you are using tracking technologies, remember – and you are covered by HIPAA, remember you have got to comply with the HIPAA rules.
Which means that the disclosure needs to be permitted by the privacy rule but you have to have a formal HIPAA Business Associate Agreement in place with Meta. That you have to do a risk assessment and comply with the security rule. And otherwise, if you don't do that, that's a breach notification. And a lot of hospitals last November issued breach notifications and there are some class action lawsuits because of this breach that happened last year.
This is the – you know, like, talks are supposed to have a funny bit in? It's coming.
Right. The hospitals said, oh, it's okay because Meta said if they got any protected healthcare information, they’d just delete it. So that makes it fine. And the OCR said well, actually, that’s not really – the OCR’s written thing was that's not really up to the tracking technology vendor to agree to do it.
But secretly, I think they were rolling on the floor laughing when the hospital said that.
So, my second prediction is this, that managing the risk associated with JavaScript that executes in a customer's browser will become an explicit (i.e., the regulator said you have to do this) or an implicit (because it's reasonable or appropriate) regulatory requirement.
Regulators all talk to each other. So, because HHS has issued this guidance to all covered entities, other regulators in the U.S. that regulate any personal data will be aware of it. International regulators will be aware of it. And so, their attention, like the eye of Sauron, will start landing on it.
Now, regulators do things, different regulators do things very differently. Some will just say, do reasonable things or do appropriate things. Some will say, make sure this happens or more effectively make sure this doesn't happen. And some, like PCI DSS, will write five pages of prescriptive requirements that tell you exactly what to do and what not to do.
Which means we may be back to this, that we have to have SBOMs and an inventory and do code reviews and third party risk and change and release control. My argument actually is we are not really back to this. What we are back to is that I think anyone in information security would agree that these were reasonable things for us to do. None of them are practical, okay, but they're all reasonable things for us to do. And what a regulator would expect us to do is to have done a risk assessment and a business assessment of which of these things would help and which of these things wouldn't help.
So, I'm not suggesting we need to do this. What I'm suggesting is we need to do a risk assessment to understand what we are going to do that compensates for this or why we're not going to do this.
Now, I know some of the audience, some of you will have been thinking, John, you have not mentioned content security policy and subresource integrity. The tools of the internet, they are actually designed to stop this happening. Content security policy or CSP says what scripts can run and also where scripts can exfiltrate data to. Subresource integrity attaches a hash to any piece of JavaScript that you may want to run. And then when the browser loads the JavaScript from wherever it is loading from, it calculates the hash of the JavaScript it’s just loaded and compares it with the hash that's in the script tag, and if they don't match, it stops the script running. Those are two techniques that we think should or maybe could fix the problem.
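A minimal Python sketch of the SRI comparison just described may make the mechanism concrete. It assumes a single hash in the integrity value (real integrity attributes may list several), and the function names `sri_hash` and `integrity_matches` are hypothetical, chosen only for illustration.

```python
import base64
import hashlib


def sri_hash(script_bytes: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI integrity value of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, script_bytes).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def integrity_matches(script_bytes: bytes, integrity: str) -> bool:
    """Imitate the browser-side check: recompute the hash and compare.

    Assumption: 'integrity' holds exactly one hash value.
    """
    algorithm = integrity.partition("-")[0]
    return sri_hash(script_bytes, algorithm) == integrity
```

In this sketch, any change to the script bytes, even a single added comment, makes `integrity_matches` return False, which is the point at which a browser would refuse to run the script.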
But the issue is that CSP is quite hard to do. There are quite a few bypasses to it and it is very hard to manage in a dynamic environment where your marketing departments are adding things to your website on a regular basis. CSP will stop stuff it doesn't know about, stuff the maintainer doesn’t know about, from running.
From an SRI perspective, the code that most of us use on our websites is changing daily, and so you can’t create a hash every day. And the great report that’s being presented next month called it unfeasible to use SRI. And the other problem with SRI is it fails silently. So, if the hashes don't match, the script doesn't load. Now that's great if it was criminal script, but if it was your payment processor who just added a new line to the payment processing script, you are not taking payments on your website anymore. And having been in the position of a CISO, I know that if I did that, my Head of Sales would come to me and have a rather strong conversation that we’d stopped taking payments for half a day while I ran around working out what was going on.
So, SRI is not great from a business practicality perspective. It's not bad if you only have first party JavaScript. If you only have first party JavaScript, you could probably put it in your CI/CD pipeline to calculate the hashes, to do a good content security policy, and that would work. But as soon as you introduce third party JavaScript, these two techniques don't work.
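For the first-party-only case, a build step could look roughly like the Python sketch below. It assumes the pipeline has the script bytes to hand and that the site only needs three CSP directives. The helper names `script_tag` and `build_csp` and the example hosts are hypothetical.

```python
import base64
import hashlib


def script_tag(src: str, content: bytes) -> str:
    """Emit a script tag whose integrity attribute pins the given content."""
    digest = base64.b64encode(hashlib.sha384(content).digest()).decode("ascii")
    return (
        f'<script src="{src}" integrity="sha384-{digest}" '
        'crossorigin="anonymous"></script>'
    )


def build_csp(script_sources: list[str], connect_sources: list[str]) -> str:
    """Build a Content-Security-Policy header value.

    script-src limits where scripts may load from; connect-src limits where
    they may send data with fetch/XHR-style requests.
    """
    directives = {
        "default-src": ["'self'"],
        "script-src": ["'self'", *script_sources],
        "connect-src": ["'self'", *connect_sources],
    }
    return "; ".join(
        f"{name} {' '.join(values)}" for name, values in directives.items()
    )
```

For example, `build_csp([], ["https://api.example.com"])` yields a policy that only allows same-origin scripts and only allows data to be sent to the site itself and to one named API host. Each third-party script added later would need its own entry, and a new hash every time its content changes, which is where the approach becomes hard to maintain.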
Now, and I guess one of you might be thinking, yeah, but you know, this will change. The browser vendors will fix this problem for us.
Here is sadly what I think my ideal view of the world looks like. Which is this. There should be two sorts of companies. Companies that make browsers and then companies who like to track users and sell online advertising based on the profiles of the users they have been tracking. That would be a good idea, wouldn't it? But unfortunately, this is the model we have at the moment, that the people who make browsers also do all the tracking and advertising and marketing.
So, maybe there is not an incentive for the browser vendors to fix this problem because half of their business model is based on this problem.
Now, if you are from Google, you will be saying, John, that's very disingenuous. And it is a little, I know, because Google are looking at doing something called fenced frames to try and fix this problem but that’s at the very early stages of development.
The other thing I'm worried about with browsers – this is like a browser tangent moment – is in-app browsers, which I classify as pirates that steal your data, because normally when you open a link in an application, it launches the browser from your mobile device, whether that’s Safari or Chrome. TikTok, Facebook, and Instagram actually wrote their own browser. So, when you launch the browser from a TikTok – whether it’s a hyperlink in TikTok, Facebook, or Instagram, it launches their own browser that rewrites all of the JavaScript on the page. So, it interprets the JavaScript and rewrites it. So, it tracks – sorry – I have to be very careful because no one said they were doing this. They just said they could do this.
The in-app browsers were tracking everything you clicked on and looked at and scrolled and everything and could send and exfiltrate that information back to the people who made that browser.
I asked myself, and again, the reference for the paper on this is on the slide. And I asked myself this, which is, what would a regulator think about that? What would a regulator think whose problem that was? Would it be the problem of TikTok, Facebook, and Instagram? Or would it be the problem of the person whose website has just been rewritten?
When I first did this presentation, there is a highlight around the pirates because I thought the regulator would definitely think it was the people who were stealing – sorry – no, stealing data's problem. But the more I thought about it and remembered on internet banking when internet banking first started and we didn’t have browser protection from malicious plugins, all the financial regulators were pretty clear, it was the bank’s responsibility to see if that browser was a safe place to conduct financial transactions.
And so, I’m not convinced now. I have not got an answer to this question. But I’m going to make a third prediction. It’s that over time, some organizations will have to consider that anything that is loaded in a browser where they can’t trust the browser is a hostile, unsafe environment, and they will need to do things like code obfuscation and other tamper resistant or tamper detection techniques to make sure that their webpage is operating in a non-hostile browser, not a hostile browser.
So, maybe another thing is there will be technical solutions to fix this problem. And the answer is yes, there are. If I was a lawyer, I would say this list is including but not limited to. Right? These are all the people I found who do some technical fixes for this JavaScript problem. The ones in green are in the expo. I know it’s Thursday so you don't have much time at the expo, but the ones in green, you can go and talk to them on their booth today about what their solutions to this problem are. And there are some very different, innovative ways of doing it.
But none of them are going to be a silver bullet because again, looking from a regulatory perspective, and having read just about every decision notice on data leakage from most regulators in the world, what they would expect you to do is to have risk assessed each stage of the process. So, when did somebody decide to add some script to the page? Did you do a code and behavior review? If it was from a third party, did you have a – did you add them to your vendor risk assessment program? Because you probably have a vendor risk assessment program so their question would be why didn't you use it. If you don't have a vendor risk assessment program, that's fine. They wouldn't ask that question. And then who decided it was okay to add to the site?
And then secondly, when you detected change, what did you do? Did you block it? Did you go through some authorization process? Did you do another code review? Who said that change was good? Because that's the way that regulators think. So, no matter what technology you have, what you have to decide is how you are going to risk assess each step of those two processes. And remember, that's for an average of a hundred scripts. Half of which come from you and half of which come from people you have never met and you might not trust.
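The change-detection questions above can be sketched in Python as a comparison of observed scripts against a recorded baseline of approvals. This is only an illustration: the `Approval` record, the `review_scripts` function, and the idea of storing one SHA-256 per script URL are all assumptions, not a description of any particular product.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    """Hypothetical record of who approved which version of a script."""
    sha256: str
    approved_by: str


def fingerprint(content: bytes) -> str:
    """Hex SHA-256 of the script content."""
    return hashlib.sha256(content).hexdigest()


def review_scripts(
    observed: dict[str, bytes], baseline: dict[str, Approval]
) -> dict[str, str]:
    """Classify each observed script as unchanged, changed, or new."""
    findings: dict[str, str] = {}
    for url, content in observed.items():
        approval = baseline.get(url)
        if approval is None:
            findings[url] = "new: no approval on record"
        elif approval.sha256 != fingerprint(content):
            findings[url] = f"changed: last approved by {approval.approved_by}"
        else:
            findings[url] = "unchanged"
    return findings
```

The output only flags what needs a decision. Whether a "changed" finding means block, re-review, or post-approve is the business process question, and the record of who made that call is what a regulator would ask to see.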
So, my fourth prediction is the worst, but managing JavaScript is going to be a bit of a pain for the next few years until we get some consensus between what the regulatory view of how we manage JavaScript is and the practicality of doing it.
I mean, maybe we’ll stop loading third party JavaScript from all over the internet. Maybe that’s going to be the answer. I don’t know. But there will be a disconnect between what regulators expect and what's actually practical of how the internet works. And you can see that from the way that PCI DSS said this is what you need to do because practically that’s almost impossible to implement.
So, my one message is that documented risk assessments will be the key to give you regulatory and risk coverage not just against regulators, but against class action lawsuits or anything where you end up leaking data you shouldn't have leaked. And so, we will end up with hopefully something like this.
So, my sixth and final prediction is that tools and ways of doing this will catch up but it might take us three years. In the meantime, our only hope is to do really good risk assessments, which I know is the most exciting thing on the planet.
Anyway, there is some hope. Some vendors are working on script isolation architectures which will isolate third party JavaScript from being able to access the document object model.
Cloudflare has a product called Zaraz which takes the JavaScript and runs it into a Cloudflare worker in Cloudflare’s network at the edge. And I believe will probably use a service worker between the browser and the Cloudflare edge to give the same functionality. It’s almost like creating a remote API for a piece of current JavaScript you have. But that means that that JavaScript doesn't have access into the document object model.
Google had a great thing called Google Caja which tried to rewrite the page and create JavaScript isolation boundaries. That project was terminated, unfortunately, but you can see what they were trying to do. The documentation is still on the Google website.
Oh gosh, I was never allowed – they said if you press the button too early, you can't go back. So, could someone go back a slide for me? Because they also said if I said that, it would happen within seconds. They fibbed, didn't they?
The other product is – so, Jscrambler has got a product called Webpage Integrity or WPI, which monkey patches the browser to create separate – to basically create a sandbox for each separate piece of JavaScript and creates the mediation layer between it.
And then the last one on the list is – I’m going to have to do this from memory now and that is really terrible, isn’t it? Does anyone know what the last one was?
It was Partytown. Thank you. From – and Partytown is a really clever thing because it takes the JavaScript and it moves it into a worker. So, again, because it is sitting in a worker, you’ll get an improvement because it’s running on a separate thread to the browser tab but again, it does not have any direct access into the document object model once it has been moved into a worker.
So, those are some technical solutions that might save us.
I would like to finish on what can we do today, so finish on a positive note, shouldn't we? And the first thing I would encourage you to do is in the next week, is to go back and have a look at your own website and see how much first party JavaScript and third party JavaScript is on it.
Saying next week because if you have been to lots of these, you know we have to say that as presenters. And if you have been to ten or fifteen sessions, you have ten or fifteen things to do next week. So, in the near future.
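A first pass at that inventory can be sketched in a few lines of Python using only the standard library. The class name `ScriptInventory` is hypothetical, and two simplifying assumptions apply: "first party" means exactly the same hostname as the page, and only script tags present in the static HTML are counted, so anything injected later by a tag manager will not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ScriptInventory(HTMLParser):
    """Collect script tags from a page and split them by origin."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).hostname
        self.first_party: list[str] = []
        self.third_party: list[str] = []
        self.inline_count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if not src:
            # Inline script: no URL to classify, so just count it.
            self.inline_count += 1
            return
        absolute = urljoin(self.page_url, src)
        if urlparse(absolute).hostname == self.page_host:
            self.first_party.append(absolute)
        else:
            self.third_party.append(absolute)
```

Usage is to create `ScriptInventory("https://shop.example.com/checkout")`, call `feed()` with the page's HTML, and read the three attributes. A browser-based crawl would be needed to see scripts that other scripts load at runtime.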
Secondly, that when you have got an inventory, go around your organization and work out who owns that JavaScript. Who added it to the webpage? And what I mean by that is if you have to take it away, who would you ask? Work out who you would ask for permission to do that. Find out is there a current business process for adding JavaScript to your website. And if there is, draw it out. Do you use a tag manager? Do you use two tag managers like some weird organizations?
Thirdly, once you have done that, you should do this risk assessment. Here are the four threat actors or threat vectors you should consider as threats of data leakage from the browser through JavaScript. Is it advertising and tracking networks that are taking too much data? Is it the fact that marketing haven’t configured things? Do you have things that capture data before it is submitted to you, which I think is a really serious data protection risk? And what is the attractiveness of what you are doing to a criminal?
In ninety days, work out what you will need to do if you need to manage JavaScript. What will be the business process for your organization? Will you need to look at changes and then post approve them or will you be able to have a change CI/CD pipeline where you make approvals before stuff is released? Can you segment your site into like high risk areas and low risk areas and take different approaches?
Obviously, if you are building a single page web app, that doesn’t work because you have a single page with a single set of JavaScript for it.
And then after that, work out how to do that within how your business operates. This is the hardest bit I have found working with other companies of how to manage JavaScript. It becomes a business process problem, not a technology problem because people are used to working a certain way.
And if you buy technology to support any of those risk decision points, make sure that technology works with you by doing a proof of concept. There are a lot of vendors who promise the Earth and deliver slightly less than the Earth, I guess.
So, to summarize, do an inventory, work out who owns the JavaScript, risk assess it, work out the business processes, and then work out what technology fits with your business processes.
To summarize my predictions, because I know we are at the wall and so I thought if I did a summary at the end, it would help wall people: hostile threat actors are going to use skimming techniques to steal more than cardholder data from your browser in the next three years, and you can't see it happening.
Managing this risk is going to become something that regulators either implicitly or explicitly expect you to do.
Some organizations, depending on your risk view, will need to think of how you treat in-app browsers – and that might change as more browsers become in-app – as a hostile environment. I think, if the browser was a hostile environment, what would I do, in the way that people who wrote banking apps back in, I don’t know, 1998 or 1999, early banking apps, had to treat the browser as a hostile environment and put other checks and balances in place to make sure that it wasn't fraudsters.
This is my worst prediction but I do think managing JavaScript risk is going to be painful for the next few years and there will be a disconnect between what regulators think we should do and technically what we can do.
And then finally, this is my Pollyanna prediction, that everything will be awesome because tools and frameworks are being taught how to do this and will eventually catch up and it will all be great.
Finally, I would like to acknowledge the fact that most of this presentation was based on the work of lots of other people, of security researchers, of privacy researchers, of academics, of healthcare activists researchers working with security researchers who produced that paper that made the Department of Health and Human Services issue some guidance on it, and also from investigative journalists, which I think goes to prove the fact – the message of this year's RSA conference is that as an information security community, we are much stronger together.
Thank you very much.