Useful Scripts: Scanning source code for profanity


Profanity has no place in source code. It can be viewed as unprofessional or offensive, and should be avoided whenever possible. Using this custom Python script with Understand, you can quickly scan your project for instances of profanity and scrub them clean.


Warning: This article contains some profanity in its code samples.

As a software developer, you are usually given a fair amount of freedom: freedom to design your code structure, freedom to write comments that help you understand what the code is doing, and freedom to name your variables whatever you’d like. This freedom is a good and necessary thing, don’t get me wrong, but it comes with the responsibility of maintaining a certain level of professionalism.

While profanity or inappropriate jokes may be used as a way to vent frustration with a coworker’s overly verbose function or annoyance with some legacy code, the source code is simply not the place for it. This type of thing can wind up deeply offending someone, or at the very least making you look bad.

Over the course of a code base’s lifetime, there’s a good chance you won’t be the only developer who sees or uses it. The obvious exception is a personal project that lives only on your local machine. Coworkers could see the code, and it could eventually be made open source or released as a sample. It’s impossible to predict who will see it in the future!

With this in mind, I’ve written a script using Understand’s Python API to scan every single lexeme in a given project and report all cases of profanity. This script was written with a package called better-profanity that uses string comparison to find swear words (and most leetspeak variations of them) in strings.

Note: The package only works with Python 3.4+ and PyPy3.
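To give a rough idea of what this kind of matching involves, here is a minimal sketch in plain Python. It is not the actual script, and it is not how better-profanity is implemented. The word list (deliberately mild stand-ins), the substitution table, and the names normalize and find_flagged are all assumptions made for illustration. A simple regex split also stands in for the lexemes that Understand would provide.

```python
import re

# Hypothetical stand-in word list (mild on purpose); a real scan would
# rely on the word list that ships with the profanity package.
BLOCKLIST = {"darn", "heck"}

# Illustrative subset of leetspeak substitutions, not the package's table.
LEET_MAP = str.maketrans({
    "4": "a", "@": "a",
    "3": "e",
    "1": "i",
    "0": "o",
    "5": "s", "$": "s",
    "7": "t",
})


def normalize(word):
    """Lowercase a word and undo the leetspeak substitutions above."""
    return word.lower().translate(LEET_MAP)


def find_flagged(lines):
    """Yield (line_number, original_word) for each blocklisted word.

    Lines are split on every character that is not a letter, digit,
    '@' or '$'. Underscores therefore act as separators, so each part
    of a snake_case identifier is checked on its own.
    """
    for lineno, line in enumerate(lines, start=1):
        for chunk in re.split(r"[^A-Za-z0-9@$]+", line):
            if chunk and normalize(chunk) in BLOCKLIST:
                yield lineno, chunk


sample = [
    "int main() {",
    "  // what is this d4rn string for??",
    "  int heck_number = 1000;",
    "  return 0;",
    "}",
]

for lineno, word in find_flagged(sample):
    print(f"line {lineno}: {word}")
```

This sketch only catches exact matches after normalization, so a variation such as "darned" would slip through. That is one reason to lean on a dedicated package with a larger word list rather than a hand-rolled check.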


$ python3 "/path/to/" "/path/to/my_project.und"

Here’s a simple C++ file we could run the script on:

#include <iostream>
#include <string>

using namespace std;

int main() {
  // what is this fucking string even for??
  string str1 = "hello world...";
  int shitty_number = 1000;
  cout << "program ending" << endl;
  return 0;
}

Running the script on this file would give us the following output in the console:

[Screenshot: console output]

From here, we could dig into the source code, clean up these unnecessary instances of foul language, and go about our day feeling good about our virtuous deeds.

