
June 13 2018


Hiring for the Cambridge Cybercrime Centre

We have three open positions in the Cambridge Cybercrime Centre: https://www.cambridgecybercrime.uk.

We wish to fill at least one of the three posts with someone from a computer science, data science, or similar technical background.

BUT we’re not just looking for computer science people: to continue our multi-disciplinary approach, we wish to fill at least one of the three posts with someone from a criminology, sociology, psychology or legal background.

Details of the posts, and what we’re looking for are in the job advert here: http://www.jobs.cam.ac.uk/job/17827/.

June 01 2018


Bitcoin Redux: crypto crime, and how to tackle it

Bitcoin Redux explains what’s going wrong in the world of cryptocurrencies. The bitcoin exchanges are developing into a shadow banking system: they do not give their customers actual bitcoin but rather display a “balance” and allow them to transact with others. However, if Alice sends Bob a bitcoin, and they’re both customers of the same exchange, the exchange just adjusts their balances rather than doing anything on the blockchain. This is an e-money service, according to European law, but is the law enforced? Not where it matters. We’ve been looking at the details.
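As a rough illustration of the off-chain bookkeeping described above, here is a minimal sketch in Python. The ExchangeLedger class, its fields and the customer names are hypothetical and not taken from any real exchange; amounts are whole units to keep the arithmetic simple.

```python
from dataclasses import dataclass, field


@dataclass
class ExchangeLedger:
    """Toy model of an exchange that tracks what it owes each customer."""
    balances: dict = field(default_factory=dict)      # customer -> units owed
    on_chain_txs: list = field(default_factory=list)  # transactions broadcast to the blockchain

    def deposit(self, customer, amount):
        self.balances[customer] = self.balances.get(customer, 0) + amount

    def transfer(self, sender, recipient, amount):
        """Move value between two customers of the same exchange."""
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount
        # Nothing is appended to on_chain_txs: only the internal book changes.


ledger = ExchangeLedger()
ledger.deposit("alice", 2)
ledger.transfer("alice", "bob", 1)
```

After the transfer both customers see a balance of 1, yet the list of on-chain transactions is still empty, which is why such a transfer leaves no trace on the blockchain.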

In March we wrote about how to trace stolen bitcoin, describing new tools that enable us to track crime proceeds on the blockchain with more precision than before. We waited for victims of bitcoin theft and fraud to come to us, so we could test our tools on real cases. However in most of them it was not clear that the victims had ever owned any bitcoin at all.

There are basically three ways you could try to hold a bitcoin. You could buy one from an exchange and get them to send it to a wallet you host yourself, but almost nobody does that.

You could buy one from an exchange and get the exchange to keep the keys for you, so that the asset was unique to you and they were only guarding it for you – just like when you buy gold and the bullion merchant then charges you a fee to guard your gold in his vault. If the merchant goes bust, you can turn up at the vault with your receipt and demand your gold back.

Or you could buy one from an exchange and have them owe you a bitcoin – just as when you put your money in the bank. The bank doesn’t have a stack of banknotes in the vault with your name on it; and if it goes bust you have to stand in line with the other creditors.

It seems that most people who buy bitcoin think that they’re operating under the gold merchant model, while most exchanges operate under the bank model. This raises a whole host of issues around solvency, liquidity, accounting practices, money laundering, risk and trust. The details matter, and the more we look at them, the worse it seems.

This paper will appear at the Workshop on the Economics of Information Security later this month. It contains eight recommendations for what governments should be doing to clean up this mess.

May 29 2018


FIPR 20th birthday

The FIPR 20th birthday seminar is taking place right now in the Cambridge Computer Lab, and the livestream is here.


I may or may not find time to liveblog the sessions in followups…

May 25 2018


IDAPython: wrappers are only wrappers

Intended audience

IDAPython developers who enjoy the occasional headache, leaky abstraction enthusiasts, or simply the curious.


IDAPython wraps C++ types, and the lifecycle of C++ objects (and in particular members of larger objects) is not necessarily the same as that of the Python wrapper object that is wrapping it.

The problem

One of our users reported IDA crashes when running an IDAPython script of theirs. The user came up with a very simple way to reproduce the issue (thank you!), showing that this had to do with accessing the parents member of a ida_hexrays.ctree_visitor_t instance.

Here is (an even more simplified version of) the script the user sent us:

from ida_hexrays import *

my_parents = None

class my_visitor_t(ctree_visitor_t):
    def __init__(self, func):
        ctree_visitor_t.__init__(self, CV_PARENTS)

    def visit_expr(self, i):
        global my_parents
        if self.parents is not None:
            my_parents = self.parents
        return 0

def my_cb(event, *args):
    if event == hxe_print_func:
        f = args[0]
        my_visitor_t(f).apply_to(f.body, None)
        import gc
        gc.collect() # make the crash more likely
        my_parents.front() # will crash
    return 0


Note: I threw a gc.collect() in there, to make crashes more likely.

The script above is provided in its entirety for the sake of completeness, but really the important lines are only the following:

    def visit_expr(self, i):
        global my_parents
        if self.parents is not None:
            my_parents = self.parents


        my_visitor_t(f).apply_to(f.body, None)
        my_parents.front() # will crash

Details, details, details

Since this issue is non-trivial, I’ll try and provide a step-by-step explanation, hopefully as clear as can be, by annotating the important lines of code mentioned above:


        my_visitor_t(f)

Create a my_visitor_t instance. That is a subclass of the ctree_visitor_t type, which means it eventually extends a C++ object of type ctree_visitor_t.

When the underlying C++ ctree_visitor_t object is created, its member named parents (a ctree_items_t vector) is initialized. For the sake of the example, let’s say the C++ ctree_visitor_t instance is located at memory 0x1000 and the parents member is placed at memory 0x100C.

                       .apply_to(f.body, None)

Call ctree_visitor_t::apply_to. Thanks to SWIG “magic”, C++ virtual method calls will be properly redirected and our my_visitor_t.visit_expr method will be called for each cexpr_t in the tree, as expected.

        if self.parents is not None:

Access self.parents. This will create a Python wrapper object. The key here is to understand that it’s a wrapper object which is backed by the real, C++ ctree_items_t instance.

For example, any access to the object returned by self.parents, will in fact translate to an access into the C++ ctree_items_t vector, so if one were to write, e.g., self.parents.size() (or even len(self.parents)), it’s actually the real underlying C++ ctree_items_t instance’s size() method that will end up being called.

            my_parents = self.parents

Another access to self.parents, and another Python wrapper will be created (once again backed by the actual ctree_items_t vector).

[Note: the fact that another wrapper is created is not a problem (in fact since it went out of scope, the previous wrapper might already have been garbage collected!)]

Once again, for the sake of the example, let’s say the wrapping PyObject instance is placed in memory, at 0xB000.
That wrapper is then bound to the global variable my_parents, causing its Python refcount to increase to 2. Past that line, the refcount will drop back to 1 (again, because of scope logic), which means that the Python wrapper object will remain alive.

[...apply_to() returns, and we are now back to the `my_cb` function...]

At this point, it’s likely my_visitor_t(f) has just been garbage collected since nobody keeps a reference to it.

That means:

  • the my_visitor_t instance has been destroyed, which means
  • the underlying ctree_visitor_t C++ object located at memory 0x1000 has been deleted, which in turn means
  • its parents object, which was located at memory 0x100C, is now invalid


        my_parents.front() # will crash

We are now calling front() on the my_parents Python object. If you recall, that my_parents object is a Python wrapper object located in memory at 0xB000. That wrapper object still has a refcount of (at least) 1, and is thus alive.

What is not quite alive anymore, however, is the actual C++ ctree_items_t vector, which was deleted as part of deleting the C++ ctree_visitor_t it belonged to.

In other words, we have a perfectly valid Python wrapper object, that has a dangling pointer to a member of a freshly-deleted C++ object.

The solution

The solution is, in terms of effort, rather simple: make a copy of the vector:

-            my_parents = self.parents
+            my_parents = ctree_items_t(self.parents)

Since the copy doesn’t belong to the C++ ctree_visitor_t object, it won’t be trashed when that object is deleted.
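The failure and the fix can be modeled without IDA. The sketch below is only an analogy in plain Python: NativeHeap, VectorWrapper and Visitor are hypothetical stand-ins for C++ memory, the SWIG wrapper and ctree_visitor_t, and an explicit destroy() call plays the role of the C++ destructor. Where real code would crash or read garbage, the model raises an exception.

```python
class NativeHeap:
    """Stand-in for C++ memory: maps fake addresses to live objects."""
    def __init__(self):
        self._live = {}
        self._next = 0x1000

    def alloc(self, value):
        addr = self._next
        self._next += 0x10
        self._live[addr] = value
        return addr

    def free(self, addr):
        del self._live[addr]

    def deref(self, addr):
        if addr not in self._live:
            raise RuntimeError("dangling pointer: 0x%X was freed" % addr)
        return self._live[addr]


class VectorWrapper:
    """Wrapper that stores only an address, not the data itself."""
    def __init__(self, heap, addr):
        self.heap, self.addr = heap, addr

    def front(self):
        return self.heap.deref(self.addr)[0]

    def copy(self):
        # An owning copy gets storage of its own, independent of the visitor.
        data = list(self.heap.deref(self.addr))
        return VectorWrapper(self.heap, self.heap.alloc(data))


class Visitor:
    def __init__(self, heap):
        self.heap = heap
        self._parents_addr = heap.alloc(["root"])

    @property
    def parents(self):
        # Every access builds a fresh wrapper around the same member.
        return VectorWrapper(self.heap, self._parents_addr)

    def destroy(self):
        # Deleting the visitor deletes its members with it.
        self.heap.free(self._parents_addr)


heap = NativeHeap()
visitor = Visitor(heap)
borrowed = visitor.parents        # like: my_parents = self.parents
owned = visitor.parents.copy()    # like: my_parents = ctree_items_t(self.parents)
visitor.destroy()

owned_front = owned.front()       # still valid
try:
    borrowed.front()
    borrowed_ok = True
except RuntimeError:
    borrowed_ok = False           # the wrapper is alive, its target is not
```

Both wrappers are perfectly valid Python objects after destroy(); only the one that owns its own storage can still be used safely.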

May 24 2018


Security and Human Behavior 2018

I’m at the 2018 Workshop on Security and Human Behavior which is being held this year at Carnegie Mellon University. For background, the workshop liveblogs and websites from 2008–17 are linked here.

As usual, I will try to liveblog the sessions in followups to this post.

May 22 2018


Deobfuscating xor’ed strings

A few days ago a customer sent us a sample file. The code he sent us was using a very simple technique to obfuscate string constants by building them on the fly and using ‘xor’ to hide the string contents from static disassembly:

The decompiler recovered most of the xor’ed values but some of them were left obfuscated:

After some investigation it turned out that it is a shortcoming of the decompiler: the value propagation (or constant folding) cannot handle the situation when an unusual part of a value is used in another expression. For example, if an instruction defines a four byte value, the second byte of the value cannot be propagated to other expressions. More standard cases, like the low or high two bytes, or even just one byte, are handled well.

It seems that compilers never leave such constants unpropagated; this is why we did not encounter this case before.

Let us write a short decompiler plugin that would handle this situation and propagate a part of a constant into another expression. The idea is simple: as soon as we find a situation when a constant is used in a binary operation like xor, we will try to find the definition of the second operand, and if it is a constant, then we will propagate it. Graphically it will look like this:

mov #N, var.4           ; put a 4 byte constant into var
xor var@1.1, #M, var2.1 ; xor the second byte of var

is converted into

mov #N, var.4
xor #N>>8, #M, var2.1

The resulting xor will then automatically get optimized by the decompiler. However, to speed up things (to avoid another loop of optimization rules), we will call the optimize_flat() function ourselves.

Please note that we do not rely on the instruction opcode: the xor opcode can be replaced by any other binary operation, our logic will still work correctly.

Also we do not rely on the operand sizes (well, to speed up things we do not handle operands wider than 1 byte because they are handled fine by the decompiler).

Also we can handle not only the second byte, but any byte of the variable.
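The arithmetic behind this propagation can be sketched in a few lines of Python. This is not the plugin's code: extract_byte and fold_partial are hypothetical helper names, the constants are made up, and the sketch assumes a little-endian layout, so that the byte at offset 1 of N is (N >> 8) & 0xFF, matching the "#N>>8" shown above.

```python
import operator


def extract_byte(value, index):
    """Return the byte at the given offset of a little-endian constant."""
    return (value >> (8 * index)) & 0xFF


def fold_partial(const, byte_index, other, op=operator.xor):
    """Fold 'op byte_of_const, other' into a single one-byte constant."""
    return op(extract_byte(const, byte_index), other) & 0xFF


N = 0x11223344   # mov #N, var.4
M = 0x5A         # xor var@1.1, #M, var2.1

folded = fold_partial(N, 1, M)                      # 0x33 ^ 0x5A
folded_add = fold_partial(N, 1, M, operator.add)    # same idea with another opcode
```

Here folded is 0x69, the character 'i', and the same helper works for any byte offset and any binary operation, which mirrors the two generalizations mentioned above.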

The final version of the plugin can be downloaded here. It is fully automatic, you just need to drop it into the plugins/ directory.

And the decompiler output looks nice now:

We could further improve the output and convert these assignments into a call to the strcpy() function, but this is left as an exercise for our dear readers 😉

P.S. Naturally, we will improve the decompiler to handle this case. The next version will include this improvement.

May 21 2018


New security lecturer

We’re delighted to announce that the new security lectureship we advertised has been offered to Alice Hutchings, and she’s accepted. We had 52 applicants of whom we shortlisted three for interview.

Alice works in the Cambridge Cybercrime Centre and her background is in criminology. Her publications are here. Her appointment will build on our strengths in research on cybercrime, and will complement and extend our multidisciplinary work in the economics and psychology of security.

May 14 2018


Failure to protect: kids’ data in school

If you care about children’s rights, data protection or indeed about privacy in general, then I’d suggest you read this disturbing new report on what’s happening in Britain’s schools.

In an ideal world, schools should be actively preparing pupils to be empowered citizens in a digital world that is increasingly riddled with exploitative and coercive systems. Instead, the government is forcing schools to collect data that are then sold or given to firms that exploit it, with no meaningful consent. There is not even the normal right to request subject access so you can check whether the information about you is right and have it corrected if it’s wrong.

Yet the government has happily given the Daily Telegraph fully-identified pupil information so that it can do research, presumably on how private schools are better than government ones, or how grammar schools are better than comprehensives. You just could not make this up.

The detective work to uncover such abuses has been done by the NGO Defenddigitalme, who followed up some work we did a decade and more ago on the National Pupil Database in our Database State report and our earlier research on children’s databases. Defenddigitalme are campaigning for subject access rights, the deletion of nationality data, and a code of practice. Do read the report and if you think it’s outrageous, write to your MP and say so. Our elected representatives make a lot of noise about protecting children; time to call them on it.

May 09 2018


Leaving on a jet plane: the trade in fraudulently obtained airline tickets

Over the years, I’ve had friends and acquaintances ask me about unauthorised transactions for flight bookings made with their credit cards. The question is usually along the lines of, if the airlines know what flight is being travelled, why don’t the police go and meet the passenger?

This is a great question, but it’s often not quite so straightforward. Although Europol co-ordinates regular Global Airline Action Days, during which those travelling may be detained, this does not create disincentives for those actually obtaining the airline tickets.

A few years ago, Professor Nicolas Christin at Carnegie Mellon University mentioned to me that he was aware of cheap airline tickets being advertised on an online black market. This comment led to an in-depth research project, covering all corners of the globe, to understand how these tickets were being obtained, and why.

You can read more about my research here, including how some of these tickets are connected to other crime types, such as human smuggling and trafficking; theft (including pickpocketing and shoplifting from airport stores); smuggling cash and contraband, such as drugs, cigarettes and tobacco; facilitating money laundering (such as opening bank accounts in other countries); and credit card fraud, including making transactions with compromised cards, and operating skimmers.

May 02 2018


Happy Birthday FIPR!

On May 29th there will be a lively debate in Cambridge between people from NGOs and GCHQ, academia and Deepmind, the press and the Cabinet Office. Should governments be able to break the encryption on our phones? Are we entitled to any privacy for our health and social care records? And what can be done about fake news? If the Internet’s going to be censored, who do we trust to do it?

The occasion is the 20th birthday of the Foundation for Information Policy Research, which was launched on May 29th 1998 to campaign against what became the Regulation of Investigatory Powers Act. Tony Blair wanted to be able to treat all URLs as traffic data and collect everyone’s browsing history without a warrant; we fought back, and our “big browser” amendment defined traffic data to be only that part of the URL needed to identify the server. That set the boundary. Since then, FIPR has engaged in research and lobbying on export control, censorship, health privacy, electronic voting and much else.

After twenty years it’s time to take stock. It’s remarkable how little the debate has shifted despite everything moving online. The police and spooks still claim they need to break encryption but still can’t support that with real evidence. Health administrators still want to sell our medical records to drug companies without our consent. Governments still can’t get it together to police cybercrime, but want to censor the Internet for all sorts of other reasons. Laws around what can be said or sold online – around copyright, pornography and even election campaign funding – are still tussle spaces, only now the big beasts are Google and Facebook rather than the copyright lobby.

A historical perspective might perhaps be of some value in guiding future debates on policy. If you’d like to join in the discussion, book your free ticket here.

April 24 2018


Euro S&P

I am at the IEEE Euro Security and Privacy Conference in London.

The keynote talk was by Sunny Consolvo, who runs Google’s security and privacy UX team, and her topic was user-facing threats to privacy and security. Her first theme was browser warnings, which try to stop users doing what they want to; it’s an interruption, it’s technical and there’s no obvious way forward other than clicking through the warning. In 2013 their SSL warning had a clickthrough rate of 68% while their more explicit and graphic malware warning had only 23% clickthrough. Mozilla’s SSL warning had a much lower 33%, with an icon of a policeman and more explicit text. After four years of experimenting with watching eyes, corporate styling / branding and extra steps – none of which worked very well – they tried a strategy of clear instruction, attractive preferred choice, and unattractive alternative. The text had less jargon, a low reading level, brevity, specifics, an illustration and colour. Her CHI15 paper shows that the new design did much better, from 69% CTR to 41%. It turns out that many factors are at play; a strong signal is site quality, but this leads many people to continue anyway to sites they have come to trust. The malware clickthrough rate is now down to 5%, and SSL to 21%. That cost five years of a huge team effort, with back-end work as well as UX. It involved huge internal fights, such as with a product manager who wanted the warning to say “this site contains malware” rather than “the site you’re trying to get to contains malware” as it was shorter. Her recent papers are here, here, and here.

A second thread of work is a longitudinal survey of public opinion on privacy ranging from government surveillance to cyber-bullying. This has run since 2015 in sixteen countries. 84% of respondents thought limiting access to online but not public data is very or extremely important. 84% were concerned about hackers vs 55% worried about governments and 53% companies. 20% of Germans are very angry about government access to personal data versus 10% of Brits. Most people believe national security justifies data access (except in South Korea) while no country’s people believe the government should have access to police non-violent crime. Most people everywhere support targeted monitoring but nowhere is there majority support for bulk surveillance. In Germany 53% believed everyone should have the right to send anonymous encrypted email while in the UK it’s 39%. Germans were pessimistic about technology with only 4% believing it was possible to be completely anonymous online. Over 88% believe that freedom of expression is very or extremely important and less than 1% unimportant; but over 70% didn’t believe that cyberbullying should be allowed. Opinions are more varied on extremist religious content, with 10.9% agreeing it should be allowed and 21% saying “it depends”.

Her third thread was intimate partner abuse, which has been experienced by 27% of women and 11% of men. There are typically three phases: a physical control phase where the abuser has access to the survivor’s device and may install malware, or even destroy devices; an escape phase which is high-risk as they try to find a new home, a job and so on; and a life-apart phase when they might want to shield location, email address and phone numbers to escape harassment, and may have lifelong concerns. Risks are greater for poorer people who may not be able to just buy a new phone. Sunny gave some case stories of extreme mate guarding and survivors’ strategies such as using a neighbour’s phone or a computer in a library or at work. It takes seven escape attempts on average to get to life apart. After escape, a survivor may have to restrict children’s online activities and sever mutual relationships; letting your child post anything can leak the school location and lead to the abuser turning up. She may have to change career as it can be impossible to work as a self-employed professional if she can no longer advertise. The takeaway is that designers should focus on usability during times of high stress and high risk; they should allow users to have multiple accounts; they should design things so that someone reviewing your history should not be able to tell you deleted anything; they should push 2-factor authentication, unusual activity notifications, and incognito mode. They should also think about how a survivor can capture evidence for use in divorce and custody cases while minimising the trauma. Finally she suggests serious research on other abuse survivors of different age groups and in different countries. For more see her paper here.

I will try to liveblog the rest of the talks in followups to this post.


What you get is what you C

We have a new paper on compiler security appearing this morning at EuroS&P.

Up till now, writers of crypto and security software not only have to fight the bad guys. We also have to deal with compiler writers, who every so often dream up some new optimisation routine which spots the padding instructions that we put in to make our crypto algorithms run in constant time, or the tricks that we use to ensure that sensitive data will be zeroised when a function returns. All of a sudden some critical code is optimised away, your code is insecure, and you scramble to figure out how to outwit the compiler once more.

So while you’re fighting the enemy in front, the compiler writer is a subversive fifth column in your rear.

It’s time that our toolsmiths were our allies rather than our enemies. We have therefore worked out what’s needed for a software writer to tell a compiler that a loop really must be executed in constant time, or that a variable really must be set to zero when a function returns. Languages like C have no way of expressing programmer intent, so we do this by means of code annotations.

Doing it properly turns out to be surprisingly tricky, but we now have a working proof of concept in the form of plugins for LLVM. For more details, and links to the code, see the web page of Laurent Simon, the lead author; the talk slides are here. This is the first technical contribution in our research programme on sustainable security.

April 20 2018


Ethics of mathematics

I’m at the world’s first conference on ethics in mathematics and will be speaking in half an hour. Here are my slides. I will be describing the course I teach to second-year computer scientists on Economics, Law and Ethics. Courses on ethics are mandatory for computer scientists while economics is mandatory for engineers; my innovation has been to combine them. My experience is that teaching them together adds real value. We can explain coherently why society needs rules via discussions of game theory, and then of network effects, asymmetric information and other market failures typical of the IT industry; we can then discuss the limitations of law and regulation; and this sets the stage for both principled and practical discussions of ethics.

April 13 2018


Don’t blame Cambridge for Facebook’s privacy crisis

Mark Zuckerberg tried to blame Cambridge University in his recent testimony before the US Senate, saying “We do need to understand whether there was something bad going on in Cambridge University overall, that will require a stronger action from us.”

The New Scientist invited me to write a rebuttal piece, and here it is.

Dr Kogan tried to get approval to use the data his company had collected from Facebook users in academic research. The psychology ethics committee refused permission, and when he appealed to the University Ethics Committee (declaration: I’m a member) this refusal was upheld. Although he’d got consent from the people who ran his app, the same could not be said of their Facebook “friends” from whom most of the data were collected.

The deceptive behaviour here has been by Facebook, which creates the illusion of privacy in order to get its users to share more data. There has been a lot of work on the economics and psychology of privacy over the past decade and we now understand the dynamics of advertising markets better than we used to.

One big question is the “privacy paradox”. Why do people say they care about privacy, yet behave otherwise? Part of the answer is about context; and part of it is about learning. Over time, more and more people are starting to pay attention to online privacy settings, despite attempts by Facebook and other online advertising firms to keep changing privacy settings to confuse people.

With luck, the Facebook scandal will be a “flashbulb moment” that will drive lots more people to start caring about their privacy online. It will certainly provide interesting new data to privacy researchers.

April 02 2018


PhD studentship in side-channel security

I can offer a 3.5-year PhD studentship on radio-frequency side-channel security, starting in October 2018, to applicants interested in hardware security, radio communication, and digital signal processing. Due to the funding source, this studentship is restricted to UK nationals, or applicants who have been resident in the UK for the past 10 years. Contact me for details of the project proposal.

March 31 2018


A bank statement for app activity (and thus personal data)

During my long sabbatical in 2015-2016 I had plenty of time to think about random things and come up with strange ideas. Most of these ideas are more funny than practical - their primary use is boring people that are reckless enough to have drinks with me.

This blog post describes one of these ideas. With the recent renewed interest in privacy and overreach of smart phone apps, it seems like a topic that is - at least temporarily - less boring than usual.

ML, software behavior, and the boundary between 'malicious' and 'non-malicious'

I have seen a lot of human brain power (and a vast amount of computational power) thrown at the problem of automatically deciding whether a given piece of software is good or bad. 
This is usually done as follows:
  1. Collect a lot of information about the behavior of software (normally by running the software in some simulated environment)
  2. Extract features from this information
  3. Apply some more-or-less sophisticated machine learning model to decide between "good" or "bad"
The underlying idea behind this is that there is "bad" behavior, and "good" behavior, and if we could somehow build a machine learning model that is sufficiently powerful, we could automatically decide whether a given piece of software is good or bad.
In practice, this rarely works without significant false-positive problems, or significant false-negative-problems, or all sorts of complicated corner-cases where the system fails.
In 2015, I had to deal with the fallout of the badly-phrased Wassenaar wording: Export-control legislation which tried to define "bad behavior" for software. During this, it became clear to me that the idea that behavior alone determines good/bad is flawed.
The behavior of a piece of software does not determine whether it is malicious or not. The true defining line between malicious and non-malicious software is whether software does what the user expects it to do.
Users run software because they have an expectation for what this software does. They grant permissions for software because they have an expectation for the software to do something for them ("I want to make phone calls, so clearly the app should use the microphone"). This permission is given conditionally, with context -- the user does not want to give the app permission to switch on the microphone when the user does not intend to make a phone call.
The question of malicious / non-malicious software is a question of alignment between user expectations and software behavior.
This means, in practice, that efforts in applying machine learning to separate malicious from non-malicious software are doomed to fail, because they fail to measure the one dimension through which the boundary between good and bad runs.
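The effect of a missing dimension can be reproduced with a small synthetic experiment. The sketch below uses made-up data, not real software features: the label depends only on z, while x and y are pure noise, and a single-threshold rule stands in (very crudely) for a classifier. All names and the fixed seed are choices made for this illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_points(n, label):
    # Only z carries information about the label; x and y are noise.
    z_lo, z_hi = (0.0, 0.4) if label == "green" else (0.6, 1.0)
    return [(rng.random(), rng.random(), rng.uniform(z_lo, z_hi), label)
            for _ in range(n)]


points = make_points(200, "green") + make_points(200, "red")


def best_threshold_accuracy(points, axis):
    """Best accuracy achievable by any single-threshold rule along one axis."""
    best = 0.0
    for candidate in points:
        t = candidate[axis]
        correct = sum(1 for p in points
                      if (p[axis] <= t) == (p[3] == "green"))
        acc = correct / len(points)
        best = max(best, acc, 1 - acc)  # allow the rule in either direction
    return best


acc_x = best_threshold_accuracy(points, 0)  # a measured but uninformative axis
acc_z = best_threshold_accuracy(points, 2)  # the axis that actually matters
```

Along z a single threshold separates the two colours perfectly, while along x the best rule stays close to a coin flip; no amount of modelling effort on x and y alone recovers what z would have shown.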
Intuitively, this can be illustrated with the two pictures below. They show the same set of red and green points in 3d-space from two different perspectives -- once with their z-axis projected away, and once in a 3-d plot where the z-axis is still visible:
Cloud of points from the side, with the "important" dimension projected away. It is near-impossible to draw a sane boundary between red and green points, and whatever boundary you draw won't generalize well.
Same cloud of points, with the "important" dimension going from left to right. It is much clearer how to separate green from red points now.
The question that arises naturally, then, is:

How can one measure the missing dimension (user intent)?

User intent is a difficult thing to measure. The software industry has the practice of forcing the user to agree to some ridiculously wide-reaching terms-of-services or EULA that few users read, even fewer understand, and which are often near-equivalent to giving the person you hire to clean your flat a power of attorney over all your documents, and allowing them to throw parties in your flat while you are not looking.
It is commonly argued that - because the user clicked "agree" to an extremely broad agreement - the user consented to everything the software can possibly do.
But consent to software actions is context-dependent and conditioned on particular, specific actions. It is fine for my messenger to request access to my camera, microphone and files - I may need to send a picture, I may need to make a phone call, and I may need to send an attachment. It is not OK for my messenger to use my microphone to see if a particular ultrasonic tracker sound is received, it is not OK for my messenger to randomly search through files etc.
Users do not get to tell the software vendor their intent and the context for which they are providing consent.
Now, given that user intent is difficult to measure up-front - how about we simply ask the user whether something that an app / software did was what he expected it to do?

Information and attention is a currency - but one with bad accounting

The modern ad economy runs on attention and private data. The big advertising platforms make their money by selling the combination of user attention and the ability to micro-target advertisements given contextual data about a user. The user "pays" for goods and services by providing attention and private data.
People often fear that big platforms will "sell their data". This is, at least for the smarter / more profitable platforms, an unnecessary fear: These platforms make their money by having data that others do not have, and which allows better micro-targeting. They do not make their money "selling data", they make money "monetizing the data they have".

The way to think about the relationship between the user and the platform is more of a clichéd "musician-agent" relationship: The musician produces something, but does not know how to monetize it. His agent knows how to monetize it, and strikes a deal with the musician: You give me exclusive use of your product, and I will monetize it for you - and take a cut from the proceeds.
The profits accumulated by the big platforms are the difference between what the combination of attention & private data obtained from users is worth and the cost of obtaining this attention and data.
For payments in "normal" currency, users usually have pretty good accounting: They know what is in their wallet, and (to the extent that they use electronic means for payments) they get pretty detailed transaction statements. It is not difficult for a normal household to reconstruct from their bank statements relatively precisely how much they paid for what goods in a given month.
This transparency creates trust: We do not hesitate much to give our credit card numbers to online service providers, because we know that we can intervene if they charge our credit cards without reason and in excess of what we agreed to.
Private information, on the other hand, is not accounted for. Users have no way to see how much private data they provide, and whether they are actually OK with that.

A bank statement for app/software activity

How could one empower users to account for their private data, while at the same time helping platform providers identify malicious software better?

By providing users with the equivalent of a bank statement for app/software activity. The way I imagine it would be roughly as follows:
A separate component of my mobile phone (or computer) OS keeps detailed track of app activity: What peripherals are accessed at what times, what files are accessed, etc.

Users are given the option of checking the activity on their device through a UI that makes these details understandable and accessible:
  • App XYZ accessed your microphone in the last week at the following times, showing you the following screen:
    • Timestamp 1, screenshot 1
    • Timestamp 2, screenshot 2
  • Does this match your expectations of what the app should do? YES / NO
  • App ABC accessed the following files during the last week at the following times, showing you the following screen:
    • Timestamp 3, screenshot 3
      • Filename
      • Filename
      • filename
  • Does this match your expectations of what the app should do? YES / NO
At least on modern mobile platforms, most of the above data is already available - modern permissions systems can keep relatively detailed logs of "when what was accessed". Adding the ability to save screenshots alongside is easy.
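To make the idea a bit more concrete, here is a minimal sketch of how such a statement could be assembled from an access log. Everything in it is an assumption made for illustration: the AccessEvent record, its field names, and the build_statement / render_statement helpers are hypothetical and do not correspond to any existing OS API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AccessEvent:
    """One logged access; every field name here is hypothetical."""
    app: str
    resource: str                     # e.g. "microphone" or "file:notes.txt"
    timestamp: datetime
    screenshot: Optional[str] = None  # reference to what was on screen


def build_statement(events):
    """Group events by (app, resource), oldest first, like statement line items."""
    grouped = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        grouped[(event.app, event.resource)].append(event)
    return dict(grouped)


def render_statement(statement):
    """Render the statement as plain text, with one YES / NO question per group."""
    lines = []
    for (app, resource), events in sorted(statement.items()):
        lines.append(f"App {app} accessed {resource} at the following times:")
        for event in events:
            shot = event.screenshot or "no screenshot"
            lines.append(f"  {event.timestamp.isoformat()} ({shot})")
        lines.append(
            "Does this match your expectations of what the app should do? YES / NO"
        )
    return "\n".join(lines)
```

The answers collected per (app, resource) group are what a platform provider could aggregate to spot apps that overreach.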

Yes, a lot of work has to go into a thoughtful UI, but it seems worth the trouble: Even if most users will randomly click on YES / NO, the few thousand users that actually care will provide platform providers valuable information about whether an app is overreaching or not. At the same time, more paranoid users (like me) would feel less fearful about installing useful apps: If I see the app doing something in excess of what I would like it to do, I could remove it.
Right now, users have extremely limited transparency into what apps are actually doing. While the situation is improving slowly (most platforms allow me to check which app last used my GPS), it is still way too opaque for comfort, and overreach / abuse is likely pervasive.
Changing this does not seem hard, if any of the big platform providers could muster the will.

It seems like a win / win situation, so I can hope. I can also promise that I will buy the first phone to offer this in a credible way :-).
PS: There are many more side-benefits to the above model - for example making it more difficult to hack a trusted app developer to then silently exfiltrate data from users that trust said developer - but I won't bore you with those details now.

March 30 2018


IDA on non-OS X/Retina Hi-DPI displays

The problem

Some users running IDA on Windows & Linux X11 platforms with Hi-DPI displays have reported that IDA looks rather odd: the navigator bar is too narrow, the text under it gets truncated, and there is an overall feeling of packing & clumsiness:

  • Windows:
  • Linux X11:

Looking closely, one can notice the following issues (probably not an exhaustive list)

Note: this should not happen for OS X users running IDA on Retina displays. Nor should it happen (but we didn’t get a chance to test this) on non-X11 Linux display servers, such as Wayland.

Fix / mitigation:

On Linux X11 & Windows, if you are using Hi-DPI monitors and IDA looks somewhat like it does in the above screenshots, please try setting the environment variable QT_AUTO_SCREEN_SCALE_FACTOR to 1:

E.g., on Linux/X11:

~# QT_AUTO_SCREEN_SCALE_FACTOR=1 path/to/ida my.idb

IDA should now look more pleasant:

  • Windows:
  • Linux X11:

Some things are still not perfect (e.g., checkboxes might remain small), but IDA definitely looks better.

Please give it a try!

Gory details

When one applies scaling/zooming, in either Windows or Linux X11, the OS returns scaled values for font metrics when they are queried using point sizes (which is almost always the case).

For example, when the font metrics for a font of size 12pt are requested by a Qt application, instead of returning 14 pixels as it would on a non-scaled system, the operating system will instead return 28 pixels on a 200% scaled one (in other words, this is essentially a font database-related feature).

That will, in turn, have the net effect of causing Qt to compute layout of the surrounding widgets according to those scaled font metrics, which explains why the text is (for the most part) not truncated.

However, what applying scaling does not do, is tell Qt that it should scale all other pixel measurements by that scale factor.

Consequently, paddings, margins, scrollbars, etc… all have uncomfortably small dimensions, especially when compared to text.

The QT_AUTO_SCREEN_SCALE_FACTOR environment variable is an opt-in that program users can define, in order to control how the program should look. It will in essence instruct Qt to perform automatic scaling of (non-font-related) graphical operations according to the pixel density of the screen(s).

More information can be found on Qt’s website.
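The arithmetic behind the mismatch can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Qt's actual layout code: the function name, the 4-pixel padding and the idea of a single multiplicative factor are made up for illustration, while the 14-pixel and 28-pixel font figures come from the example above.

```python
def layout_pixels(font_px_unscaled, padding_px, os_scale, qt_auto_scale):
    """Toy model returning (font height, padding) in device pixels.

    The OS scales font metrics on its own; other pixel measurements are
    only scaled when the Qt opt-in is active. Not Qt's real algorithm.
    """
    font_px = font_px_unscaled * os_scale
    padding = padding_px * os_scale if qt_auto_scale else padding_px
    return font_px, padding


# 200% scaling without the opt-in: text doubles, padding does not.
print(layout_pixels(14, 4, 2, qt_auto_scale=False))  # (28, 4)
# 200% scaling with the opt-in: both grow together.
print(layout_pixels(14, 4, 2, qt_auto_scale=True))   # (28, 8)
```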

Why is this not needed under OS X + Retina?

This is not needed under OS X + Retina, because Qt does not need to perform any kind of scaling there: the scaling is performed by the drawing primitives of the OS itself, and is entirely transparent to the application.

(In fact, an OSX application doesn’t even work with the real screen geometry, but rather with an “abstract” coordinate system, normalizing pixel sizes across screen densities.)

March 26 2018


Tracing stolen bitcoin

A new Computerphile video explains how we’ve worked out a much better way to track stolen bitcoin. Previous attempts to do this had got entangled in the problem of dealing with transactions that split bitcoin into change, or that consolidate smaller sums into larger ones, and with mining fees. The answer comes from an unexpected direction: a legal precedent in 1816. We discussed the technical details last week at the Security Protocols Workshop; a preprint of our paper is here.

Previous attempts to track tainted coins had used either the “poison” or the “haircut” method. Suppose I open a new address and pay into it three stolen bitcoin followed by seven freshly-mined ones. Then under poison, the output is ten stolen bitcoin, while under haircut it’s ten bitcoin that are marked 30% stolen. After thousands of blocks, poison tainting will blacklist millions of addresses, while with haircut the taint gets diffused, so neither is very effective at tracking stolen property. Bitcoin due-diligence services supplant haircut taint tracking with AI/ML, but the results are still not satisfactory.
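The two rules can be written down in a few lines. The sketch below is only an illustration of the example in this paragraph: the function names and the (amount, stolen_amount) input format are assumptions, and real taint trackers work on actual blockchain transactions rather than bare numbers.

```python
from fractions import Fraction


def poison(inputs):
    """inputs: list of (amount, stolen_amount) pairs paid into one address.

    Under poison, any stolen input marks the whole output as stolen.
    Returns (total, stolen_total).
    """
    total = sum(amount for amount, _ in inputs)
    stolen = sum(stolen_amount for _, stolen_amount in inputs)
    return total, (total if stolen > 0 else 0)


def haircut(inputs):
    """Under haircut, the output carries the average taint of its inputs.

    Returns (total, tainted_fraction).
    """
    total = sum(amount for amount, _ in inputs)
    stolen = sum(stolen_amount for _, stolen_amount in inputs)
    return total, Fraction(stolen, total)


# Three stolen bitcoin followed by seven freshly-mined ones.
deposits = [(3, 3), (7, 0)]
print(poison(deposits))   # (10, 10): ten stolen bitcoin
print(haircut(deposits))  # (10, Fraction(3, 10)): ten bitcoin, 30% stolen
```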

We discovered that, back in 1816, the High Court had to tackle this problem in Clayton’s case, which involved the assets and liabilities of a bank that had gone bust. The court ruled that money must be tracked through accounts on the basis of first-in, first-out (FIFO); the first penny into an account goes to satisfy the first withdrawal, and so on.

Ilia Shumailov has written software that applies FIFO tainting to the blockchain and the results are impressive, with a massive improvement in precision. What’s more, FIFO taint tracking is lossless, unlike haircut; so in addition to tracking a stolen coin forward to find where it’s gone, you can start with any UTXO and trace it backwards to see its entire ancestry. It’s not just good law; it’s good computer science too.
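As a rough illustration of the FIFO rule (and not of the actual software, which works on real blockchain transactions), an account can be modelled as a queue of labelled deposits, with withdrawals served from the front. The function name and the (amount, label) representation below are assumptions made for this sketch.

```python
from collections import deque


def fifo_withdraw(account, amount):
    """Withdraw `amount` from `account`, a deque of (amount, label) deposits.

    The oldest deposit is consumed first. Returns the withdrawn pieces as a
    list of (amount, label), so every unit keeps its own history.
    """
    pieces = []
    remaining = amount
    while remaining > 0:
        if not account:
            raise ValueError("withdrawal exceeds balance")
        head_amount, label = account[0]
        take = min(head_amount, remaining)
        pieces.append((take, label))
        if take == head_amount:
            account.popleft()                          # deposit fully used up
        else:
            account[0] = (head_amount - take, label)   # deposit partly used
        remaining -= take
    return pieces


# Three stolen bitcoin paid in first, then seven freshly-mined ones.
account = deque([(3, "stolen"), (7, "clean")])
print(fifo_withdraw(account, 2))  # [(2, 'stolen')]
print(fifo_withdraw(account, 4))  # [(1, 'stolen'), (3, 'clean')]
print(list(account))              # [(4, 'clean')]
```

Because each withdrawn piece keeps its label rather than an averaged percentage, nothing is blurred along the way, which is what makes tracing in either direction possible.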

We plan to make this software public, so that everybody can use it and everybody can see where the bad bitcoins are going.

I’m giving a further talk on Tuesday at a financial-risk conference in Paris.

March 16 2018


We will make you like our research

This is the title of a paper that appeared today in PLOS One. It describes a tool we developed initially to assess the gullibility of cybercrime victims, and which we now present as a general-purpose psychometric of individual susceptibility to persuasion. An early version was described three years ago here and here. Since then we have developed it significantly and used it in experiments on cybercrime victims, Facebook users and IT security officers.

We investigated the effects on persuasion of a subject’s need for cognition, need for consistency, sensation seeking, self-control, consideration of future consequences, need for uniqueness, risk preferences and social influence. The strongest factor was consideration of future consequences, or “premeditation” for short.

We offer a full psychometric test in STP-II with 54 items spanning 10 subscales, and a shorter STP-II-B with 30 items to measure first-order factors, but that omits second-order constructs for brevity. The scale is here with the B items marked, and here is a live instance of the survey for you to play with. Once you complete it, there’s an on-the-fly interpretation at the end. You don’t have to give your name and we don’t record your IP address.

We invite everyone to use our STP-II scale – not just in security contexts, but also in consumer and marketing psychology and anywhere else it might possibly be helpful. Do let us know what you find!

March 05 2018


IDA 7.1: Qt 5.6.3 configure options & patch

A handful of our users have already requested information regarding the Qt 5.6.3 build that is shipped with IDA 7.1.

Configure options

Here are the options that were used to build the libraries on:

  • Windows: ...\5.6.3\configure.bat "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "win32-msvc2015" "-opengl" "desktop" "-prefix" "C:/Qt/5.6.3-x64"
    • Note that you will have to build with Visual Studio 2015, to obtain compatible libs
  • Linux: .../5.6.3/configure "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "linux-g++-64" "-developer-build" "-fontconfig" "-qt-freetype" "-qt-libpng" "-glib" "-qt-xcb" "-dbus" "-qt-sql-sqlite" "-gtkstyle" "-prefix" "/usr/local/Qt/5.6.3-x64"
  • Mac OSX: .../5.6.3/configure "-nomake" "tests" "-qtnamespace" "QT" "-confirm-license" "-accessibility" "-opensource" "-force-debug-info" "-platform" "macx-g++" "-debug-and-release" "-fontconfig" "-qt-freetype" "-qt-libpng" "-qt-sql-sqlite" "-prefix" "/Users/Shared/Qt/5.6.3-x64"


In addition to the specific configure options, the Qt build that ships with IDA includes the following patch. You should therefore apply it to your own Qt 5.6.3 sources before compiling, in order to obtain similar binaries (patch -p 1 < path/to/qt-5_6_3_full-IDA71.patch)

Note that this patch should work without any modification, against the 5.6.3 release as found there. You may have to fiddle with it, if your Qt 5.6.3 sources come from somewhere else.
