Trojan Source Attack Is Infecting Open Source Code With Ghost Bugs, Researchers Warn


Remember the old days of custom code pages and operating systems for specific languages, like DOS/V? Unicode has more or less solved the biggest problem of displaying non-ASCII glyphs on computers, although it’s still up to the operating system to implement the support, of course.

Did you know you can write source code in Unicode? That is very convenient for coders in regions where English proficiency is limited. As useful as that may be, it’s also the attack vector for the latest terrifying security flaw: “Trojan Source.” Revealed by two researchers from the University of Cambridge, Trojan Source is a way to use the text direction features of Unicode to hide malicious code so that it is invisible when the source is displayed.

It works like this: Unicode has a system called “bidi” (short for “bidirectional”) that allows the use of control characters to force text to change direction. This is important when mixing languages that read left to right (like English and Russian) and languages that read right to left (like Hebrew and Arabic). These control characters can appear anywhere in source code, including inside comments and strings.
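To make the idea concrete, here is a minimal Python sketch showing that a bidi control character is a real code point in a string even though most editors draw nothing for it. The `describe` helper and the `sample` string are made up for illustration; only the standard `unicodedata` module is used.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# Hypothetical sample: six visible letters plus two invisible control characters.
# U+202E forces right-to-left display; U+202C ends that override.
sample = "abc\u202edef\u202c"

print(len(sample))  # counts the invisible characters too
for code_point, name in describe(sample):
    print(code_point, name)
```

How the string is drawn depends on the terminal or editor, but the compiler or interpreter always sees the characters in the stored order shown by this listing.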

The thing is, while a given language usually has delimiters that mark where comments and strings start and end, bidi overrides almost never respect those boundaries. This means that by placing these control characters exclusively in comments and strings, it is possible to create code that appears to do one thing to a human reader, but actually says something completely different to the compiler.
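One way a reviewer might guard against this is to scan source text for the directional formatting characters before trusting what the editor shows. The following is a rough sketch, not any vendor's actual patch: `find_bidi_controls` and `snippet` are hypothetical names, and the check relies on the bidirectional class that Python's `unicodedata` module reports for each character.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override, and isolate
# characters (U+202A to U+202E and U+2066 to U+2069).
BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def find_bidi_controls(source):
    """Return (line, column, name) for each bidi control character, 1-indexed."""
    findings = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.bidirectional(ch) in BIDI_CLASSES:
                findings.append((line_no, col, unicodedata.name(ch)))
    return findings


# Hypothetical snippet with two control characters hidden inside a comment.
snippet = 'access = "user"  # \u202e check admin \u2066\nprint(access)\n'

for line_no, col, name in find_bidi_controls(snippet):
    print(f"line {line_no}, column {col}: {name}")
```

A check like this only flags suspicious characters. Legitimate right-to-left text in comments or strings may contain some of them, so a human still has to decide whether each hit is benign.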

It’s rather insidious: the resulting code will look flawless to any human code reviewer, but it could contain carefully crafted backdoors just waiting to be exploited. As one of the authors explains, “If the change in logic is subtle enough to go unnoticed in later testing, an adversary could introduce targeted vulnerabilities undetected.”

That’s bad enough, but what’s even more worrying is that most modern operating systems and editors will preserve bidi control characters through copy-and-paste operations. It’s common for coders to joke about their reliance on StackOverflow and similar sites, where helpful programmers produce “example” code for askers who then simply copy and paste it into their applications. It’s entirely possible that a bad actor could carefully embed bidi control characters to make their “sample” code look innocuous while it actually delivers a malicious payload.
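For pasted snippets, one cautious habit is to strip the directional formatting characters before reading the code, so that what is displayed matches what the compiler will receive. This is only a sketch under that assumption: `strip_bidi_controls` and the `pasted` string are invented for illustration, and ordinary right-to-left letters are deliberately left untouched.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override, and isolate characters.
BIDI_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def strip_bidi_controls(text):
    """Return text with bidi control characters removed; all other characters kept."""
    return "".join(
        ch for ch in text if unicodedata.bidirectional(ch) not in BIDI_CLASSES
    )


# Hypothetical pasted line: control characters inside the string literal make
# the trailing text look like a comment in many editors.
pasted = "if role != 'admin\u202e \u2066# guests only\u2069 \u2066':"

cleaned = strip_bidi_controls(pasted)
print(repr(cleaned))
print(len(pasted) - len(cleaned), "control characters removed")
```

Stripping does not make a snippet safe. It only removes the display trick, so the cleaned text still needs to be read and reviewed like any other untrusted code.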

The researchers granted a 99-day embargo period to allow tool authors to patch their software, but apparently only nine of the nineteen vendors they contacted actually committed to releasing a patch. Considering the flaw has already been demonstrated in JavaScript, Java, Rust, Go, Python and most C variants, hopefully this public disclosure will spur some of the other vendors into action.

The only silver lining to this story is that the researchers have not been able to find a single instance of this vulnerability being exploited in the wild. Of course, that doesn’t mean it hasn’t been exploited, or won’t be, just that no one has noticed it yet. For now, you might want to go ahead and manually retype any “example” code you implement rather than pasting it.

