Copy-and-paste sharing on Stack Overflow spreads insecure code

It’s the time-saving technique employed by many coders in a hurry – copy and paste snippets of code from crowd-sourcing ‘Q&A’ websites and forums to solve tedious or difficult programming problems.

One of the most popular sites for this is Stack Overflow, and most of the time it works out fine.

But what if some of that code introduces bugs that might compromise the security of the software it ends up being used inside?

The tricky bit, as a new study called An Empirical Study of C++ Vulnerabilities in Crowd-Sourced Code Examples shows, is working out which code is OK and which isn’t.

After analysing real code posted to Stack Overflow over a 10-year period to 2018, the researchers found a small but still significant number of snippets containing security weaknesses.

The team reviewed 72,483 C++ code snippets for weaknesses defined by the industry Common Weakness Enumeration (CWE) guidelines, finding 69 vulnerable snippets representing 29 different types of security flaw, most often CWE-150 (‘Improper neutralization of escape, meta, or control sequences’).
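To make that weakness class concrete, here is a minimal C++ sketch of one way to neutralise control sequences before untrusted text reaches a terminal or log. It is an illustration only and is not taken from the study or from any Stack Overflow post. The function name `neutralize_control_chars` and the `\xNN` output format are assumptions made for this example, and it assumes a C++17 compiler.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper (not from the study): make ASCII control characters
// visible so that untrusted input cannot inject terminal escape sequences
// or forge extra log lines.
std::string neutralize_control_chars(std::string_view input) {
    std::string out;
    out.reserve(input.size());
    for (unsigned char c : input) {
        if (c < 0x20 || c == 0x7f) {
            // Replace ESC, CR, LF and other control bytes with a printable \xNN form.
            char buf[5];
            std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(c));
            out += buf;
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

Passing user-supplied text through a filter like this before printing it means an embedded escape byte is shown as the literal characters `\x1b` rather than being interpreted by the terminal. Real code may need a different policy, for example allowing tabs or handling multi-byte encodings.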

This sounds like a small percentage (69 of 72,483 is roughly 0.1%), but those 69 vulnerable snippets found their way into a total of 2,859 projects on the Microsoft-owned software development platform, GitHub.

The idea that vulnerable code might be floating around on sites such as Stack Overflow is hardly a revelation, although this is apparently the first study that has looked closely at C++, a language that remains widely used for specialised programming tasks.

Bad snippets

One issue the researchers don’t address is whether Q&A code sharing is as good an idea as some assume it to be.

Since most developers are unlikely to ditch the advantages of code sharing over a few bad snippets, the researchers’ answer is a new class of tools to assess the quality of shared code.

This should arrive soon in the form of a Chrome extension which can be used to check copied code against the team’s database of vulnerable code:

“The extension then recommends non-vulnerable similar code snippets from other Stack Overflow posts, so that the developer can reuse those safe code snippets instead of the vulnerable code snippet.”
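How the extension matches code is not described here, so the following C++ sketch shows only the general idea of checking a copied snippet against a list of known-bad ones. Everything in it is an assumption for illustration: the `VulnerableDb` type, the function names, and the matching rule (exact match after collapsing whitespace), which is far cruder than a real tool would need. It assumes a C++17 compiler.

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>

// Collapse each run of whitespace to a single space and trim both ends,
// so that re-indented copies of the same snippet compare equal.
std::string normalize_snippet(std::string_view code) {
    std::string out;
    bool pending_space = false;
    for (unsigned char c : code) {
        if (std::isspace(c)) {
            pending_space = !out.empty();  // ignore leading whitespace
        } else {
            if (pending_space) out += ' ';
            pending_space = false;
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Hypothetical database: normalized snippet text -> weakness label.
using VulnerableDb = std::unordered_map<std::string, std::string>;

// Return the recorded weakness label if the copied code matches a known
// vulnerable snippet, or std::nullopt if there is no match.
std::optional<std::string> find_known_weakness(const VulnerableDb& db,
                                               std::string_view copied) {
    auto it = db.find(normalize_snippet(copied));
    if (it == db.end()) return std::nullopt;
    return it->second;
}
```

A caller would fill the map with normalized snippets and then pass each pasted block to `find_known_weakness`. A match is a prompt to look for a safer alternative, while no match says nothing about whether the code is safe.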

Interestingly, when the researchers gave 117 of the affected GitHub project owners the bad news about their use of borrowed code, only 15 responded.

Of those who did, several either refused to fix the issue or offered excuses as to why a vulnerability might not be as risky as it appeared.

This suggests that for some coders, bad or insecure code is either too small a problem to be worth fussing about or an acceptable downside of meeting deadlines.

And once it’s inside software, it’s someone else’s problem.